Data on Polly

Polly provides ML-ready biomolecular data

Adopt a data-centric approach to formulating new hypotheses, validating existing ones, and integrating diverse data types with ML-ready biomolecular data on Polly.

Curating biomolecular data at scale

Leverage more than 1.5 million datasets and 4.1 million samples aggregated from 32 publicly available sources.

Data from licensed sources


Aggregate-level clinical data contributed longitudinally by 1,500+ participants. The PPMI OmixAtlas on Polly contains 11,341 samples of methylation and proteomics data.


Multi-omics, multi-cohort human cancer-specific molecular datasets. The CCLE OmixAtlas comprises 5,209 ML-ready datasets of the following data types: proteomics, metabolomics, transcriptomics, miRNA, and mutation.

Browse our catalog

AML OmixAtlas
Liver OmixAtlas
Single Cell OmixAtlas

Making data FAIR with Polly


Polly harmonizes and structures public and premium data that arrive in heterogeneous file formats. It is built on a modern tech stack that uses ML models to automate ingestion workflows.


Researchers can access millions of curated datasets stored in Polly's OmixAtlas repositories, either through the GUI or programmatically with Polly Python, and efficiently identify and analyze genes of interest and patient cohorts.

Customer Success

Build, scale, and reach actionable insights faster with Polly's dedicated Customer Success team of expert scientists and engineers. Fast-track your journey from data generation to insight discovery.


Talk to us
