Noteworthy Datasets on Diabetes Mellitus

Deepti Das
April 13, 2022

The ‘Monthly Dataset Roundup’ series features datasets on Polly that are of scientific value, with the aim of promoting data sharing and reuse of biomolecular data. Polly OmixAtlases contain highly curated, ML-ready datasets from diverse public data repositories, covering both omics data (transcriptomics, proteomics, metabolomics, single-cell data, etc.) and non-omics data (flow cytometry, lab measurements, immunological assays, etc.). This lets users access, use, and integrate diverse data types to perform a truly multi-dimensional analysis of their research question. This month, we are featuring datasets that capture the molecular landscape of ‘Diabetes Mellitus’; curated versions of these datasets can be found and analyzed on Polly.

A visual summary of Diabetes and related diseases datasets on Polly

1) Transcriptomics data was used to study circadian rhythmicity of clock genes associated with metabolic dysfunction in Type 2 Diabetes (T2D)

Dataset ID: GSE182121_GPL17586

Year of Publication: 2021

Total Samples: 49

Experiment type: Transcriptomics

Organism: Homo sapiens

Reference link: Publication

The dataset compares circadian rhythm-related genes in age-matched subjects with T2D and subjects with normal glucose tolerance (NGT). Subjects with T2D have a higher BMI and show lower insulin sensitivity (M-values).


Circadian rhythms are generated by an autoregulatory feedback loop of transcriptional activators and repressors. Circadian rhythm disruption contributes to type 2 diabetes mellitus (T2DM) pathogenesis.

The researchers studied whether altered circadian rhythmicity of clock genes is associated with metabolic dysfunction in T2DM. Transcriptional cycling of the core-clock genes BMAL1, CLOCK, and PER3 was altered in skeletal muscle from individuals with T2DM, and this was coupled with a reduced number and amplitude of cycling genes and disturbed circadian oxygen consumption. They observed that inner mitochondria-associated genes were enriched for rhythmic peaks in normal glucose tolerance, but not in T2DM, and positively correlated with insulin sensitivity.
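To make "amplitude of cycling genes" concrete, here is a minimal sketch of a single-component cosinor fit, one common way to estimate the amplitude and peak time of a 24-hour rhythm from a time series. This is an illustration only, not the method used in the study, which may rely on different rhythmicity tests. The function name `cosinor_fit` and the synthetic expression values are hypothetical.

```python
import numpy as np

def cosinor_fit(times_h, values, period_h=24.0):
    """Least-squares fit of y = mesor + A * cos(2*pi*t/period - phi).

    Hypothetical helper for illustration; returns (mesor, amplitude,
    acrophase in hours), where the acrophase is the time of the fitted peak.
    """
    t = np.asarray(times_h, dtype=float)
    y = np.asarray(values, dtype=float)
    omega = 2.0 * np.pi / period_h
    # Linearized model: y = mesor + beta*cos(omega*t) + gamma*sin(omega*t)
    X = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    (mesor, beta, gamma), *_ = np.linalg.lstsq(X, y, rcond=None)
    amplitude = float(np.hypot(beta, gamma))
    acrophase_h = float((np.arctan2(gamma, beta) / omega) % period_h)
    return float(mesor), amplitude, acrophase_h

# Synthetic example (made-up values): a gene sampled every 4 h over 48 h,
# oscillating around 10 with amplitude 2 and a peak 6 h into each cycle.
t = np.arange(0, 48, 4)
y = 10 + 2 * np.cos(2 * np.pi * (t - 6) / 24)
mesor, amplitude, acrophase_h = cosinor_fit(t, y)
print(mesor, amplitude, acrophase_h)
```

Under this model, a gene with dampened cycling would show a smaller fitted amplitude, and a gene with no rhythm would have an amplitude close to zero.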

Heatmap of gene expression in subjects with T2DM and normal glucose tolerance (NGT)

2) Single-cell RNA sequencing of murine islets shows high cellular complexity at all stages of autoimmune diabetes

Dataset ID: GSE141784_GPL24247

Year of Publication: 2020

Total Samples: 86959 cells from 7 samples

Experiment type: Single cell

Organism: Mus musculus

Reference link: Publication

The dataset studies the transcriptome of various islet-infiltrating immune cells using single-cell sequencing at different time points.


Tissue-specific autoimmune diseases are driven by the activation of diverse immune cells in the target organs. However, the molecular signatures of the immune cell populations over time in an autoimmune process remain poorly defined. Using single-cell RNA sequencing, the researchers performed an unbiased examination of diverse islet-infiltrating cells during autoimmune diabetes in the non-obese diabetic (NOD) mouse at 4, 8 and 15 weeks.

The data revealed a landscape of transcriptional heterogeneity across the lymphoid and myeloid compartments. Memory CD4 and cytotoxic CD8 T cells appeared early in islets accompanied by regulatory cells with distinct phenotypes. The study observed a dramatic remodeling in the islet microenvironment, in which the resident macrophages underwent a stepwise activation program. This process resulted in the polarization of the macrophage subpopulations into a terminal pro-inflammatory state. This study provides a single-cell atlas defining the staging of autoimmune diabetes and reveals that diabetic autoimmunity is driven by transcriptionally distinct cell populations specialized in divergent biological functions.

The dataset captures changes in the immune landscape of the pancreatic islets with the progression of T1D. Here, NOD.IFNGR-/- and NOD.RAG1-/- mice are used as controls.

3) Metabolomics data identifies changes in nucleotide and methylamine metabolism in humans with type 2 diabetes compared to a control group

Dataset ID: MTBLS1_m_live_mtbl1_rms_metabolite_profiling_NMR_spectroscopy

Year of Publication: 2007

Total Samples: 132

Experiment type: NMR-based metabolic analysis

Organism: Homo sapiens

Reference link: Publication

The dataset consists of urinary metabolite readings from subjects with T2D along with healthy subjects.


In this study, NMR-based metabolomic analysis in conjunction with uni- and multivariate statistics was applied to examine the urinary metabolic changes in human type 2 diabetes mellitus patients compared to the control group. The human cohort comprised unmedicated diabetic patients who maintained good daily dietary control over their blood glucose concentrations by following the dietary guidelines issued by the American Diabetes Association.

This study demonstrates metabolic responses associated with general systemic stress, changes in the TCA cycle, and perturbations in nucleotide metabolism and in methylamine metabolism. Type 2 diabetes patients showed profound changes in nucleotide metabolism, including that of N-methylnicotinamide and N-methyl-2-pyridone-5-carboxamide, which may provide unique biomarkers for following the progression of type 2 diabetes mellitus.

The researchers identified N-methylnicotinamide and N-methyl-2-pyridone-5-carboxamide as potential biomarkers of T2DM progression.

4) Transcriptomics data defines microRNA profile of human Type 2 diabetic islets

Dataset ID: GSE52314_GPL10999

Year of Publication: 2013

Total Samples: 9

Experiment type: Transcriptomics

Organism: Homo sapiens

Reference link: Publication

The dataset consists of samples from non-diabetic and diabetic subjects.


This study focuses on identifying miRNAs involved in the pathogenesis of human T2DM. Samples of mature islets were collected from T2DM donors and non-diabetic donors to determine the miRNA transcriptome.

Strikingly, most of the miRNAs that were significantly down-regulated in T2DM islets derive from an imprinted cluster of non-coding RNAs at the MEG3/GTL2 locus on human chromosome 14q32. Genomic imprinting refers to the biased expression of genes from either the paternally or maternally inherited chromosome, rather than the more common biallelic expression. Repression of this miRNA cluster is strongly correlated with hyper-methylation of the MEG3-differentially methylated region in T2DM islets, demonstrating an epigenetic alteration associated with T2DM.

Boxplot displaying expression of miRNAs derived from the imprinted MEG3/GTL2 locus, showing that these miRNAs are significantly down-regulated in T2DM islets

Volcano plot showing differential expression of genes between the non-diabetic and diabetic states

5) Transcriptomics data defines lncRNA profiles of lean non-diabetic, obese non-diabetic, and obese diabetic humans

Dataset ID: GSE121344_GPL20301

Year of Publication: 2020

Total Samples: 12

Experiment type: Transcriptomics

Organism: Homo sapiens

Reference link: Publication

The dataset compares transcriptomes of lean and obese non-diabetic subjects against obese diabetic patients to study the role of long intergenic non-coding (LINC) RNAs in T2DM.


In this study, the researchers report that an unexpectedly high fraction of lncRNAs, but not protein-coding mRNAs, is repressed during diet-induced obesity (DIO) and refeeding, whilst nutrient deprivation specifically induced lncRNAs in mouse liver. Similarly, lncRNAs are lost in diabetic humans. LncRNA promoter analyses, global cistrome, and gain-of-function analyses confirmed that increased MAFG signaling during DIO curbs lncRNA expression. Silencing Mafg in primary hepatocytes and in vivo elicited a fasting-like expression profile, improved glucose metabolism, derepressed lncRNAs, and prevented mammalian target of rapamycin (mTOR)-driven protein translation.

The key observation was that obesity-repressed lincIRS2 is controlled by MAFG, and that genetic and RNAi-mediated loss of lincIRS2 causes hyperglycemia, insulin resistance, and aberrant glucose output in lean mice. Taken together, the research led to the identification of a novel MAFG-lncRNA axis controlling hepatic glucose metabolism in health and metabolic disease.

Volcano plot of differentially expressed genes in obese subjects with diabetes compared to obese subjects without diabetes

Boxplots showing upregulated and downregulated LINC RNAs in T2DM patients
