The ‘Monthly Dataset Roundup’ series features datasets on Polly that are of scientific value, with the aim of promoting data sharing and the reuse of useful cancer data. This month, we feature a list of large colorectal cancer (CRC) datasets, curated versions of which can be found and analyzed on Polly.
The Cancer Genome Atlas colorectal adenocarcinoma dataset (TCGA-COAD)
Dataset ID: COAD-*
Year of Publication: 2012
Total Samples: 2549 samples from 460 patients
Experiment type: Multiomics (CNV, miRNA, Transcriptomics, Proteomics, Methylation, Mutation)
Organism: Homo sapiens
Reference link: Publication
TCGA-COAD is a large collection of multi-omics data of colorectal adenocarcinoma from CRC patients, published as part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes.
The collection consists of multiple omics studies carried out on tissue from the same patient, along with samples studying the response of CRC to various drugs.
RNA-seq of tumor-educated platelets enables blood-based pan-cancer, multiclass, and molecular pathway cancer diagnostics.
Dataset ID: GSE68086_GPL16791
Year of Publication: 2015
Total Samples: 283
Experiment type: Single cell RNA-Seq
Organism: Homo sapiens
Reference link: Publication
Tumor-educated blood platelets (TEPs) are implicated as central players in the systemic and local responses to tumor growth, thereby altering their RNA profile.
The researchers report RNA-sequencing data of 283 blood platelet samples, including 228 tumor-educated platelet (TEP) samples collected from patients with six different malignant tumors (non-small cell lung cancer, colorectal cancer, pancreatic cancer, glioblastoma, breast cancer, and hepatobiliary carcinomas). RNA-sequencing data of blood platelets isolated from 55 healthy individuals is also reported. This dataset highlights the potential of TEP RNA-based 'liquid biopsies' in patients with several types of cancer, including for pan-cancer, multiclass cancer, and companion diagnostics.
Using this data, the researchers distinguished 228 patients with localized and metastasized tumors from 55 healthy individuals with 96% accuracy. Across six different tumor types, the location of the primary tumor was correctly identified with 71% accuracy.
Also, MET or HER2-positive, and mutant KRAS, EGFR, or PIK3CA tumors were accurately distinguished using surrogate TEP mRNA profiles.
The results indicate that blood platelets provide a valuable platform for pan-cancer, multiclass cancer, and companion diagnostics, possibly enabling clinical advances in blood-based "liquid biopsies".
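To make the accuracy figures quoted above concrete, here is a minimal Python sketch of how classification accuracy is defined: the fraction of samples whose predicted label matches the true label. The cohort sizes (228, 55, 283) come from the text; the helper `accuracy` and the toy labels are hypothetical illustrations and are not the study's data, classifier, or code.

```python
from typing import Sequence

# Cohort sizes as reported in the dataset description above.
TEP_SAMPLES = 228       # platelet samples from cancer patients
HEALTHY_SAMPLES = 55    # platelet samples from healthy individuals
TOTAL_SAMPLES = TEP_SAMPLES + HEALTHY_SAMPLES


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy labels for illustration only (NOT taken from the study).
toy_true = ["cancer", "cancer", "cancer", "healthy", "healthy"]
toy_pred = ["cancer", "cancer", "healthy", "healthy", "healthy"]
toy_accuracy = accuracy(toy_true, toy_pred)  # 4 of 5 labels match
```

The same function applies to the multiclass case (for example, predicting the primary tumor location), since it only compares labels pairwise.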
Gene expression profiles of breast, colorectal, prostate, and non-small cell lung cancer
Dataset ID: GSE103512_GPL13158
Year of Publication: 2017
Total Samples: 280
Experiment type: Transcriptomics
Organism: Homo sapiens
Reference link: Publication
The tumor microenvironment is an important factor in cancer immunotherapy response. To further understand how a tumor affects the local immune system, the researchers analyzed immune gene expression profiles from 280 formalin-fixed, paraffin-embedded normal and tumor samples of four cancer types.
Regulatory T cells (Tregs) were found to be one of the main drivers of immune gene expression differences between normal and tumor tissue, leading to the conclusion that Treg gene expression is highly indicative of the overall tumor immune environment.
Gene expression profiling of colorectal cancer liver metastases (CRLM).
Dataset ID: GSE159216_GPL17586
Year of Publication: 2021
Total Samples: 283
Experiment type: Transcriptomics
Organism: Homo sapiens
Reference link: Publication
Gene expression-based subtyping has the potential to form a new paradigm for stratified treatment of colorectal cancer. However, the established frameworks are based on the transcriptomic profiles of primary tumors, and metastatic heterogeneity is a challenge. Here the researchers aimed to develop a de novo metastasis-oriented framework.
High-resolution microarray gene expression profiling was performed on 283 liver metastases from 171 patients treated by hepatic resection, including multiregional and/or multi-metastatic samples from each of 47 patients.
Using this dataset, they developed a de novo liver metastasis subtype (LMS) framework that recapitulated the main distinction between epithelial-like and mesenchymal-like tumors, with a strong immune and stromal component only in the latter.
LMS1 metastases had several transcriptomic features of cancer aggressiveness, including secretory progenitor cell origin, oncogenic addictions, and microsatellite instability in a microsatellite stable background, as well as frequent RAS/TP53 co-mutations.
LMS5 showed a mesenchymal phenotype with higher immune system activation while LMS1-4 showed epithelial characteristics.
Profiling of CD8+ T cells upon treatment with extracellular vesicles derived from colorectal cancer and normal patients with different body mass index
Dataset ID: GSE152508_GPL20844
Year of Publication: 2020
Total Samples: 15
Experiment type: Transcriptomics
Organism: Homo sapiens
Reference link: Publication
Colorectal cancer (CRC) is one of the most widely diagnosed cancers worldwide. It has been shown that the body mass index (BMI) of patients could influence the tumor microenvironment, treatment response, and overall survival rates. Nevertheless, the mechanism by which BMI affects tumorigenesis, particularly the tumor microenvironment, is still elusive.
Here the researchers postulated that extracellular vesicles (EVs) from CRC patients and non-CRC volunteers with different BMI could affect immune cells differently, particularly CD8+ T cells.
The changes in CD8+ T cells upon treatment with different types of extracellular vesicles, isolated from obese and non-obese volunteers with and without CRC, were studied using RNA-seq.
This study highlights possible differences in the regulatory mechanisms of cancer patient-derived EVs, especially on CD8+ T cells.
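The GEO-derived dataset IDs listed in this roundup appear to combine a GEO Series accession (GSE...) with a GEO Platform accession (GPL...), joined by an underscore. Assuming that pattern holds for other IDs as well, the following Python sketch splits such an ID into its two parts, which can be handy when looking up the original record on GEO. The helper `parse_geo_dataset_id` is hypothetical and is not part of any Polly API; the TCGA-style ID (COAD-*) does not follow this pattern and is deliberately rejected.

```python
import re
from typing import NamedTuple


class GeoDatasetId(NamedTuple):
    series: str    # GEO Series accession, e.g. "GSE68086"
    platform: str  # GEO Platform accession, e.g. "GPL16791"


# Assumed pattern: "<GSE accession>_<GPL accession>", as seen in this roundup.
_GEO_ID_PATTERN = re.compile(r"^(GSE\d+)_(GPL\d+)$")


def parse_geo_dataset_id(dataset_id: str) -> GeoDatasetId:
    """Split an ID such as 'GSE68086_GPL16791' into series and platform."""
    match = _GEO_ID_PATTERN.match(dataset_id.strip())
    if match is None:
        raise ValueError(f"not a GSE_GPL style dataset ID: {dataset_id!r}")
    return GeoDatasetId(series=match.group(1), platform=match.group(2))


# The GEO-derived dataset IDs featured in this roundup.
ROUNDUP_IDS = [
    "GSE68086_GPL16791",
    "GSE103512_GPL13158",
    "GSE159216_GPL17586",
    "GSE152508_GPL20844",
]

parsed_ids = [parse_geo_dataset_id(dataset_id) for dataset_id in ROUNDUP_IDS]
```

Each entry in `parsed_ids` exposes `.series` and `.platform`, so the series accession can be used on its own to search GEO for the original submission.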