Elucidata Delivers 99% Accurate Oncology Metadata Curation for ICI Therapy Research

Introduction: The Metadata Challenge in Immunotherapy

Cancer remains one of the greatest threats to human health and a leading cause of mortality worldwide (1). While advances in chemotherapy and targeted therapies have improved survival, many cancers remain refractory to treatment. Immune checkpoint inhibitors (ICIs) have ushered in a new era of cancer treatment, delivering profound, durable remissions for some patients. In recent years, blocking immune checkpoint pathways - particularly programmed cell death protein 1 (PD-1) and programmed death-ligand 1 (PD-L1) - has led to remarkable improvements in clinical outcomes. PD-1/PD-L1 inhibitors such as nivolumab, pembrolizumab, and atezolizumab have become frontline therapies for multiple cancers, including non-small cell lung cancer (NSCLC), metastatic melanoma, and renal cell carcinoma (2,3).

Despite these advances, response rates to immune checkpoint inhibitors remain modest overall and highly variable across cancer types. A large-scale meta-analysis of more than 16,000 patients reported an average objective response rate (ORR) of just 19.6%, with wide variation depending on indication (4). For example, ICI combinations in metastatic melanoma can achieve ORRs above 40%, whereas lung and gastrointestinal cancers show much lower rates of durable response.

This variability highlights a critical unmet need: the ability to predict who will respond to ICIs. Doing so requires access to large, harmonized, and well-annotated transcriptomics and clinical trial datasets. Yet, for many biopharma teams, this is easier said than done. Patient metadata is often fragmented across repositories, structured inconsistently, or annotated without standardized ontologies. These gaps make it difficult to integrate datasets, slowing the identification of biomarkers that could explain ICI response and resistance.

The Impact: Structured Metadata Driving Faster Insights

By partnering with Elucidata, a leading U.S. biopharma company transformed its oncology research workflow. The team reduced the time required to create high-quality data products by a factor of six, saved nearly 500 hours of manual wrangling, and cut data discovery and curation costs by 70%.

Most importantly, researchers gained access to structured, harmonized, and AI-ready transcriptomics datasets that enabled efficient analysis of ICI therapy response. As their Director of Computational Biology summarized:

“We ingest large volumes of oncology datasets, but without structured metadata curation and harmonization, it becomes challenging to efficiently scout, integrate, and utilize the most relevant data for research and clinical insights.”

The Challenge: Fragmented Metadata Slowing Discovery

The company’s scientific mission was clear: investigate the molecular mechanisms driving response and resistance to ICI therapies. But their data infrastructure told another story. Metadata was scattered across publications, repositories like GEO, and clinical trial reports, and none of it followed a common standard. Searching for relevant datasets was time-consuming, and integration was limited by poor interoperability. Manual expert-driven curation, though accurate, was too slow to keep up with research demands.

This fragmentation meant that downstream analyses - patient stratification, biomarker discovery, or validation studies - were often delayed. The absence of harmonized metadata wasn’t just a technical nuisance; it directly slowed the pace of discovery.

The Solution: Metadata Curation and Harmonization with Polly

To address these challenges, Elucidata conducted a comprehensive data audit and deployed its Polly platform to harmonize transcriptomics metadata for ICI response studies.

The process began by identifying approximately 390 relevant samples across 11 datasets, using strict inclusion and exclusion criteria. From there, a detailed data dictionary was developed with more than 150 metadata fields per sample, creating a consistent framework for annotation. Metadata from publications, GEO, and clinical trials was curated and harmonized, ensuring interoperability across sources.

Once metadata was structured, the raw transcriptomics data was processed through an optimized STAR pipeline, and the final curated metadata was delivered as CSV files that could be deployed on the company’s in-house infrastructure. The result was a reliable, high-quality resource for investigating biomarkers of ICI response.
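To make the idea of dictionary-driven harmonization concrete, here is a minimal Python sketch. It is not Polly's implementation: the field names, aliases, and value mappings below are hypothetical and stand in for the 150+ fields of the real data dictionary. It shows how records that label the same information differently across sources can be mapped to one canonical vocabulary and written out as CSV.

```python
import csv
import io

# Hypothetical two-field data dictionary (illustrative only).
# Each canonical field lists the source column names it may appear under
# and a mapping from raw values to a controlled vocabulary.
DATA_DICTIONARY = {
    "response": {
        "aliases": ["response", "best_response", "recist"],
        "values": {
            "cr": "complete response",
            "complete response": "complete response",
            "pr": "partial response",
            "partial response": "partial response",
            "sd": "stable disease",
            "stable disease": "stable disease",
            "pd": "progressive disease",
            "progressive disease": "progressive disease",
        },
    },
    "treatment": {
        "aliases": ["treatment", "drug", "therapy"],
        "values": {
            "nivolumab": "anti-PD-1",
            "pembrolizumab": "anti-PD-1",
            "atezolizumab": "anti-PD-L1",
        },
    },
}


def harmonize(record):
    """Map one raw sample record onto the canonical fields."""
    # Normalize column names so "Drug" and "drug" are treated alike.
    lowered = {key.strip().lower(): value for key, value in record.items()}
    out = {"sample_id": lowered.get("sample_id", "")}
    for field, spec in DATA_DICTIONARY.items():
        raw = next((lowered[a] for a in spec["aliases"] if a in lowered), None)
        if raw is None:
            out[field] = "not available"
        else:
            # Unknown values are flagged for expert review, not guessed.
            cleaned = raw.strip()
            out[field] = spec["values"].get(cleaned.lower(), "unmapped: " + cleaned)
    return out


def to_csv(rows):
    """Serialize harmonized rows as CSV text with a fixed column order."""
    buffer = io.StringIO()
    fieldnames = ["sample_id"] + list(DATA_DICTIONARY)
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    raw_records = [
        {"sample_id": "S1", "Drug": "Nivolumab", "RECIST": "PR"},
        {"sample_id": "S2", "therapy": "atezolizumab", "best_response": "progressive disease"},
    ]
    print(to_csv([harmonize(r) for r in raw_records]))
```

Flagging unrecognized values as "unmapped" rather than silently dropping them keeps a human curator in the loop, which matters when accuracy is the goal.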

Why Metadata Harmonization Matters in Oncology R&D

For teams working at the intersection of immunotherapy and translational research, the lesson is clear: metadata harmonization is not optional. Without structured, AI-ready metadata, oncology programs risk drowning in noise rather than uncovering actionable insights.

By standardizing metadata across repositories and creating reusable data products, organizations can accelerate their ability to identify responder and non-responder populations, explore resistance mechanisms, and make faster, evidence-driven decisions in ICI therapy development.

Interested in making your oncology data AI-ready? Talk to our team about metadata harmonization and scalable data curation.

References:

  1. Sung H, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–249.
  2. Topalian SL, Taube JM, Pardoll DM. Immune checkpoint blockade: a common denominator approach to cancer therapy. Cancer Cell. 2015;27(4):450–461.
  3. Robert C, et al. Nivolumab in previously untreated melanoma without BRAF mutation. N Engl J Med. 2015;372(4):320–330.
  4. Haslam A, Prasad V. Estimation of the percentage of US patients with cancer who are eligible for and respond to checkpoint inhibitor immunotherapy drugs. JAMA Netw Open. 2019;2(5):e192535.
