Harmonize and Curate Public Data on Polly

Polly curates and harmonizes datasets from public repositories: it processes measurements, links them to ontology-backed metadata, and transforms them into a unified data model to deliver pristine-quality data for faster insights.

Public Repositories Supported

ArrayExpress

30+

Public Repositories Supported

Technology

How Does Polly Make Data from Public Repositories ML-ready?

Polly curates datasets of your choice from public repositories, harmonizes them, and delivers them in a pristine-quality, ML-ready format fit for downstream analysis.

The Polly Difference

Deeply Curated, Highest-Quality Data

Polly retrieves datasets of your choice from public repositories, then curates and harmonizes them. Polly's harmonization engine processes measurements, links them to ontology-backed metadata, and transforms them into a Unified Data Model.

Each dataset is annotated with 6 standard metadata fields at the dataset level and 15 at the sample level; this can be customized up to about 30 fields. Commonly used ontologies are MeSH for disease, BRENDA Tissue Ontology for tissue, NCBI Taxonomy for organism, Cellosaurus for cell line, Cell Ontology for cell type, and PubChem for drug.
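To make the field-to-ontology pairing concrete, here is a minimal Python sketch. It is an illustration only, not Polly's API: the field names, the `CuratedSample` class, the `annotate` function, and the sample values are all hypothetical. Only the six field/ontology pairs come from the text above.

```python
from dataclasses import dataclass, field

# Field-to-ontology pairs as listed in the text above.
# The snake_case field names are assumptions made for this sketch.
FIELD_ONTOLOGY = {
    "disease": "MeSH",
    "tissue": "BRENDA Tissue Ontology",
    "organism": "NCBI Taxonomy",
    "cell_line": "Cellosaurus",
    "cell_type": "Cell Ontology",
    "drug": "PubChem",
}


@dataclass
class CuratedSample:
    """Hypothetical container for one sample's ontology-backed annotations."""
    sample_id: str
    annotations: dict = field(default_factory=dict)


def annotate(sample_id, raw_metadata):
    """Keep only fields that have a known ontology and tag each value with it."""
    annotations = {}
    for name, value in raw_metadata.items():
        ontology = FIELD_ONTOLOGY.get(name)
        if ontology is None:
            continue  # field has no ontology in this sketch, so it is skipped
        annotations[name] = {"label": value, "ontology": ontology}
    return CuratedSample(sample_id, annotations)


# Illustrative input; "batch" is dropped because no ontology is listed for it.
sample = annotate(
    "sample_1",
    {"disease": "Breast Neoplasms", "organism": "Homo sapiens", "batch": "A"},
)
```

A real harmonization step would also resolve each label to an ontology identifier; that lookup is left out here because the source does not describe how it is done.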

Request Demo
35+ TB

Biomedical data harmonized every month.

25+

Data types supported across multi-omics, assay and clinical.

30+

Data sources supported including GEO, PRIDE, CPTAC and more.

2500+

Diseases across oncology, metabolic, immunology and more.
