Proteomics Data on Polly

Polly delivers the highest-quality proteomics datasets to fit diverse analysis methods and pipelines. All datasets are Polly Verified, i.e., harmonized through a configurable, granular, and transparent curation process.


How Does Proteomics Data on Polly Become ML-ready?

Why Access Proteomics Data on Polly?

Find Datasets in a Fraction of the Time

Use Elucidata’s data concierge service to streamline access to high-quality proteomics datasets from PRIDE.

Our experts do the heavy lifting of ensuring that each dataset is relevant to your research and contains all necessary information, including the data matrix, associated metadata, and protein intensity tables.

Configure Curation to Fit Your Analysis Needs

Ensure precise extraction of the data matrix at source, consistent processing, and annotation with 30+ ontology-backed metadata fields using Polly’s harmonization engine.

Request additional metadata fields or customizations to the pipeline used to process raw data.

Use Data You Can Trust

Our experts run ~50 QA checks on every dataset, covering batch effect correction, metadata validation, and the removal of technical artifacts and variation.

The data normalization methods or QC metrics used on Polly are not a black box. Learn how each proteomics dataset was processed by downloading a detailed QA/QC report from your Atlas on Polly.
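To make the idea of a documented normalization step concrete, here is a minimal sketch of one common approach for intensity data: log2 transformation followed by per-sample median centering. This is an illustrative assumption, not a description of Polly's actual pipeline; the function name `log2_median_center` and the input layout are hypothetical.

```python
import math
from statistics import median


def log2_median_center(intensities):
    """Log2-transform raw intensities and center each sample at a median of 0.

    `intensities` maps sample name -> {protein id -> raw intensity}.
    Missing (None) and non-positive values are dropped, since log2 is
    undefined for them. This layout is assumed for illustration only.
    """
    normalized = {}
    for sample, proteins in intensities.items():
        logged = {
            protein: math.log2(value)
            for protein, value in proteins.items()
            if value is not None and value > 0
        }
        # Subtracting the per-sample median removes global intensity shifts.
        center = median(logged.values()) if logged else 0.0
        normalized[sample] = {p: x - center for p, x in logged.items()}
    return normalized


# Two hypothetical samples that differ only by a global 2x intensity shift.
raw = {
    "S1": {"P1": 8.0, "P2": 16.0, "P3": 32.0},
    "S2": {"P1": 16.0, "P2": 32.0, "P3": 64.0},
}
norm = log2_median_center(raw)
```

After centering, both samples have identical profiles, which is the kind of technical shift a normalization step is meant to remove. The method actually applied to a given dataset is the one stated in its QA/QC report.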

Polly Verified - Our Quality Guarantee

We run 50 QA checks on every dataset, including the following:


Data validation checks ensure that all dataset and sample-level metadata annotations contain non-NULL and non-blank values.


Rigorous QC checks ensure that metadata attributes are human-readable and accurately assigned at all levels (dataset, cell).


Normalization and batch effect correction are applied wherever necessary to eliminate technical variation and enable meaningful comparisons between samples.


Doublets, which can arise during sample preparation and confound analysis, are identified and removed.


Poor-quality samples and genes are filtered out. Genes that drive biological variation are retained and used for downstream analyses.
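
The data validation check described above (no NULL or blank metadata values) can be sketched as follows. This is a minimal illustration under stated assumptions, not Polly's implementation: the function name `find_missing_metadata` and the field names `sample_id`, `tissue`, and `disease` are hypothetical.

```python
def find_missing_metadata(records, required_fields):
    """Return {field: [record indices]} for values that are None or blank.

    `records` is a list of dicts, one per sample (an assumed layout).
    A value counts as blank if it is an empty or whitespace-only string.
    """
    problems = {}
    for idx, record in enumerate(records):
        for field in required_fields:
            value = record.get(field)  # an absent key is treated as missing
            if value is None or (isinstance(value, str) and not value.strip()):
                problems.setdefault(field, []).append(idx)
    return problems


# Hypothetical sample-level metadata; the second record fails two checks.
samples = [
    {"sample_id": "S1", "tissue": "liver", "disease": "normal"},
    {"sample_id": "S2", "tissue": "  ", "disease": None},
]
report = find_missing_metadata(samples, ["sample_id", "tissue", "disease"])
print(report)  # {'tissue': [1], 'disease': [1]}
```

An empty report means every required field is populated for every sample.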
