How to Perform Patient Stratification on Polly

Anurag Srivastava, Shruti Malavade

November 6, 2023

Patient stratification involves categorizing a patient population into subgroups based on the presence or absence of a disease. This approach plays a crucial role in understanding the underlying pathology of a disease, enabling physicians to customize therapeutic interventions for individuals. Patient stratification is key to precision medicine and the development of novel therapeutic targets. While AI models and multi-omics approaches have simplified patient stratification, significant challenges persist.

This blog will discuss the significant challenges faced in performing patient stratification and how users can achieve it using the Polly harmonization engine.

Challenges Faced by the Industry in Performing Patient Stratification

The ideal situation for successful personalized medicine would be for clinicians to know beforehand the patient’s risk classification and which drug to administer. In reality, patient stratification is difficult even with multi-omics datasets at our disposal. Common hurdles include poor data quality, small sample sizes, and limited data availability.

More than 50% of public repository datasets like Gene Expression Omnibus (GEO) lack annotations, and just 2% are harmonized.

Nearly 80% of the available data are unstructured and unFAIR (not Findable, Accessible, Interoperable, and Reusable), which limits their usefulness. Poor data quality, unFAIR data, missing metadata, and small sample sizes can produce a faulty predictive model and suboptimal results.

One common strategy for patient stratification relies on cell type differentiation, which has proven effective in classifying autoimmune and cancer patients. However, implementing it at scale is challenging because of data-related issues. Another significant challenge is the lack of reliable biomarkers, as exemplified by pancreatic cancer. Disease heterogeneity adds another layer of complexity, as primary tumor sites vary among patients. The critical breakthrough in overcoming these challenges lies in the quality and harmonization of data.

The solution to this is Polly by Elucidata. Polly's harmonization engine provides the means to enhance data quality, harmonize multi-modal datasets, and train patient classifiers.

Polly's capabilities include data harmonization, metadata annotation (providing essential information like tumor site), and seamless integration of various data types. Polly further aids in tackling quality-related issues by harmonizing multi-modal data into an ML-ready resource. This ensures that all data is clean, consistently processed, linked to critical metadata, and statistically robust.

Our Approach:

1. Curate a Disease-Specific Atlas

The first step in patient stratification using the cell type differentiation method at scale involves aggregating a large multi-omic data corpus to gain a comprehensive view of the disease. This corpus provides a holistic perspective, enhancing both model robustness and clinical relevance. To create this data warehouse, we use the Polly harmonization engine, which can build disease-specific atlases of ML-ready datasets. Researchers can merge and harmonize multi-modal datasets from diverse sources to meet common standards, and integrating multiple omics types and samples in this way makes the models more robust.
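As a rough illustration of what harmonizing to a common standard can mean in practice, the sketch below maps sample metadata from two sources with different field names and vocabularies onto one shared schema. The field aliases, synonym table, and the `harmonize_sample` helper are hypothetical and are not Polly's API; they only illustrate the idea.

```python
# Hypothetical field aliases: source-specific key -> common schema key.
FIELD_ALIASES = {
    "tissue": "tissue",
    "tissue_type": "tissue",
    "organ": "tissue",
    "disease": "disease",
    "diagnosis": "disease",
    "cell_type": "cell_type",
    "celltype": "cell_type",
}

# Hypothetical controlled vocabulary: free-text value -> standard term.
VALUE_SYNONYMS = {
    "aml": "acute myeloid leukemia",
    "acute myeloid leukaemia": "acute myeloid leukemia",
    "bm": "bone marrow",
    "hsc": "hematopoietic stem cell",
}

REQUIRED_FIELDS = ("tissue", "disease", "cell_type")


def harmonize_sample(raw):
    """Map one sample's raw metadata onto the common schema."""
    clean = {}
    for key, value in raw.items():
        field = FIELD_ALIASES.get(key.strip().lower().replace(" ", "_"))
        if field is None:
            continue  # unknown fields are dropped in this sketch
        term = str(value).strip().lower()
        clean[field] = VALUE_SYNONYMS.get(term, term)
    # Make missing annotations explicit instead of silently absent.
    for field in REQUIRED_FIELDS:
        clean.setdefault(field, None)
    return clean


# Two toy records that describe similar samples in different ways.
source_one = {"Tissue Type": "BM", "Diagnosis": "AML", "celltype": "HSC"}
source_two = {"organ": "bone marrow", "disease": "Acute Myeloid Leukaemia"}

harmonized = [harmonize_sample(source_one), harmonize_sample(source_two)]
```

After harmonization both records share the same keys and disease term, and the missing cell type in the second record shows up as an explicit `None` that can be flagged for curation.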

2. Define Genetic Signatures

The next step involves defining genetic signatures for each stage of cell differentiation using the harmonized data from the disease-specific atlas. We use the cell types and ranking genes from each dataset to build the classifier model, which classifies cell types by comparing gene pairs. Because the cell differentiation stage cannot be determined by pairwise comparisons alone, we apply additional modeling techniques. Once the genetic markers for each cell type at each differentiation step are defined, we acquire patient samples from public sources like TCGA.
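To make the gene-pair idea concrete, here is a minimal sketch of a pairwise-comparison classifier: each rule compares two genes within the same sample and casts a vote for a label. The gene names, rules, and labels are placeholders invented for illustration, and this is not the model used on Polly. In practice the rules would be learned from the harmonized atlas.

```python
from collections import Counter

# Hypothetical rules: (gene_a, gene_b, label) means
# "if gene_a is expressed higher than gene_b, vote for label".
PAIR_RULES = [
    ("GENE_A", "GENE_B", "stem-like"),
    ("GENE_C", "GENE_D", "stem-like"),
    ("GENE_B", "GENE_A", "differentiated"),
    ("GENE_D", "GENE_C", "differentiated"),
    ("GENE_E", "GENE_A", "differentiated"),
]


def classify_by_pairs(expression, rules=PAIR_RULES):
    """Return (label, votes) for one sample from within-sample comparisons."""
    votes = Counter()
    for gene_a, gene_b, label in rules:
        # Skip rules whose genes were not measured in this sample.
        if gene_a in expression and gene_b in expression:
            if expression[gene_a] > expression[gene_b]:
                votes[label] += 1
    if not votes:
        return None, votes
    label, _ = votes.most_common(1)[0]
    return label, votes


# Toy expression values (made up for illustration).
sample_1 = {"GENE_A": 9.1, "GENE_B": 2.3, "GENE_C": 7.0, "GENE_D": 1.5, "GENE_E": 3.0}
sample_2 = {"GENE_A": 1.2, "GENE_B": 6.4, "GENE_C": 0.8, "GENE_D": 5.9, "GENE_E": 7.7}

label_1, votes_1 = classify_by_pairs(sample_1)
label_2, votes_2 = classify_by_pairs(sample_2)
```

Because each rule only asks which of two genes is higher within one sample, this kind of classifier depends on relative ordering instead of absolute expression values, which is one reason pairwise approaches are often used when combining datasets from different sources.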

3. Train Classifier Model

The classifier model is trained on harmonized datasets to categorize patients based on their cell differentiation stage and to classify them into low- and high-risk groups. Performing differential expression analysis on these two patient cohorts generates a list of differentially expressed genes, which serves as the foundation for a genetic signature for each patient population. Users can then apply transcription factor enrichment analysis to refine these genetic signatures and define potential drug targets.
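The differential expression step can be sketched with a simple per-gene comparison between the two risk groups. The example below computes a log2 fold change and Welch's t-statistic using only the Python standard library. The expression values are made up, and a real analysis would typically rely on a dedicated differential expression tool with multiple-testing correction; treat this as an illustration of the idea only.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two samples with unequal variances.

    Assumes each group has at least two values and nonzero variance overall.
    """
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def differential_expression(high_risk, low_risk):
    """Compare log2 expression per gene; return rows sorted by |t|."""
    rows = []
    for gene in sorted(set(high_risk) & set(low_risk)):
        a, b = high_risk[gene], low_risk[gene]
        rows.append({
            "gene": gene,
            # Values are already log2, so the difference is the log2 fold change.
            "log2_fc": mean(a) - mean(b),
            "t": welch_t(a, b),
        })
    return sorted(rows, key=lambda r: abs(r["t"]), reverse=True)


# Toy log2 expression values per gene (made up for illustration).
high_risk = {
    "GENE_A": [8.0, 8.4, 7.9, 8.3],
    "GENE_B": [5.0, 5.2, 4.9, 5.1],
}
low_risk = {
    "GENE_A": [5.1, 5.0, 5.3, 4.8],
    "GENE_B": [5.1, 4.9, 5.0, 5.2],
}

results = differential_expression(high_risk, low_risk)
```

Genes at the top of `results` differ most between the cohorts and are the natural candidates for a group-specific signature.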

4. Target Prioritization

To obtain precise targets from patient stratification, the candidate gene targets must be prioritized further. Our experts collaborate with your team to prioritize drug targets based on druggability scores and supporting literature evidence.
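One simple way to combine the two criteria is a weighted score. The sketch below is only an assumption about how such a ranking could be set up: the weights, the cap on publication counts, the `priority_score` helper, and the candidate values are all hypothetical, and expert review would still decide the final list.

```python
import math


def priority_score(druggability, n_publications, w_drug=0.6, w_lit=0.4, lit_cap=50):
    """Combine a 0-1 druggability score with literature support into one 0-1 score."""
    # Log-scale the publication count so a few heavily studied genes do not dominate,
    # and cap it so the literature term never exceeds 1.
    lit = min(math.log1p(n_publications) / math.log1p(lit_cap), 1.0)
    return w_drug * druggability + w_lit * lit


# Hypothetical candidates with made-up scores and publication counts.
candidates = [
    {"gene": "GENE_A", "druggability": 0.9, "n_publications": 3},
    {"gene": "GENE_B", "druggability": 0.4, "n_publications": 50},
    {"gene": "GENE_C", "druggability": 0.8, "n_publications": 20},
]

ranked = sorted(
    candidates,
    key=lambda c: priority_score(c["druggability"], c["n_publications"]),
    reverse=True,
)
```

With these toy numbers, a candidate that is both reasonably druggable and well supported in the literature outranks one that is strong on only a single criterion.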

Case Study: Identifying Potential Drug Targets for AML

We used Polly OmixAtlas and patient stratification to identify potential drug targets for acute myeloid leukemia (AML). To do this, 10k+ multi-omics datasets related to AML and normal hematopoiesis were consolidated from public and proprietary sources. Polly cleaned the datasets and linked them to harmonized metadata such as stage of differentiation, cell line, cell type, and more. This curation enabled the integration of multiple datasets to create high-quality multi-omics signatures.

Figure: Overcoming Challenges in Multi-Omics Patient Stratification: The Polly Platform

Results

2+ data-centric, patient stratification-based targets in AML were identified using an integrative multi-omics approach.
6 months to identify and validate targets with Polly, significantly faster than the average 2-year time period.

Read the case study here.

Polly For Enhanced Quality of Patient Stratification

By utilizing Polly's capabilities, researchers can streamline the multi-omics analysis process, from data retrieval to downstream analysis and interpretation. Its assistance can save time, provide expert guidance, and simplify the complex tasks involved in multi-omics analysis, ultimately enhancing the efficiency and accuracy of research.

Polly aims to empower researchers by augmenting their capabilities, accelerating the pace of discovery, and facilitating breakthroughs in various scientific fields.
