Elucidata | Public Data on Polly

Public Repositories Supported

GEO

CPTAC

PRIDE

ArrayExpress

GDC

30+

Public Repositories Supported

Technology

How Does Polly Make Data from Public Repositories ML-ready?

Polly curates the datasets you choose from public repositories, harmonizes them, and delivers them in a high-quality, ML-ready format fit for downstream analysis.

The Polly Difference

Deeply Curated, Highest Quality Data

Polly retrieves the datasets you choose from public repositories, then curates and harmonizes them. Polly's harmonization engine processes the measurements, links them to ontology-backed metadata, and transforms them into a Unified Data Model.

Datasets are annotated with 6 standard fields at the dataset level and 15 at the sample level (customizable up to ~30 fields). Commonly used ontologies are MeSH for disease, BRENDA Tissue Ontology for tissue, NCBI Taxonomy for organism, Cellosaurus for cell line, Cell Ontology for cell type, and PubChem for drug.
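
To make the field-to-ontology pairing above concrete, here is a minimal Python sketch. It is not Polly's API: the field keys (`cell_line`, `cell_type`, and so on), the `tag_fields` function, and the sample record are hypothetical names chosen for illustration. Only the six field/ontology pairs come from the text.

```python
# Field-to-ontology pairs as listed in the text above.
# The dictionary keys are illustrative, not Polly's actual field names.
FIELD_ONTOLOGY = {
    "disease": "MeSH",
    "tissue": "BRENDA Tissue Ontology",
    "organism": "NCBI Taxonomy",
    "cell_line": "Cellosaurus",
    "cell_type": "Cell Ontology",
    "drug": "PubChem",
}


def tag_fields(sample: dict) -> dict:
    """Pair each recognized metadata field with its standardizing ontology.

    Fields with no listed ontology are collected under "unmapped" so that
    nothing in the original record is silently dropped.
    """
    tagged = {}
    unmapped = {}
    for field, value in sample.items():
        ontology = FIELD_ONTOLOGY.get(field)
        if ontology is None:
            unmapped[field] = value
        else:
            tagged[field] = {"value": value, "ontology": ontology}
    return {"tagged": tagged, "unmapped": unmapped}


# Hypothetical sample-level metadata record.
sample = {"disease": "breast carcinoma", "organism": "Homo sapiens", "batch": "B7"}
result = tag_fields(sample)
```

In this sketch, `result["tagged"]["disease"]["ontology"]` is `"MeSH"`, and the unrecognized `batch` field ends up under `result["unmapped"]`. A real harmonization step would also resolve each value to a term identifier in the named ontology, which is left out here.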

35+ TBs

Biomedical data harmonized every month.

25+

Data types supported across multi-omics, assay, and clinical data.

30+

Data sources supported, including GEO, PRIDE, CPTAC, and more.

2500+

Diseases covered across oncology, metabolic disorders, immunology, and more.
