Single-cell RNA-seq on Polly

Harmonize in-house and public single-cell RNA-seq datasets into ML-ready formats, and leverage Polly’s suite of custom solutions for scRNA-seq data to accelerate analysis and insight generation.

Technology

Polly Makes Disparate Single-cell RNA-seq Data Actionable & Usable

Accelerate Time to Insight with Curated Single-cell Data

Configure Curation to Fit Your Analysis Needs

Polly harmonizes unstructured single-cell data through a configurable, transparent, and granular curation process that follows your inclusion/exclusion criteria, accelerating downstream analysis.

99.99% Metadata Accuracy

Polly’s datasets come with 99.99% metadata accuracy and ontology-backed metadata for 30+ fields.

100% Metadata Completeness

Polly delivers datasets with 100% metadata completeness and 0 empty metadata fields.

Work with a Unified Data Model

Make single-cell data from disparate sources interoperable & analysis-ready with our datatype-agnostic data model.

Unify and Manage Data from In-house Assays on a Single Atlas

Integrate multi-modal datasets into one central Atlas to unveil hidden patterns and expedite research breakthroughs.

Use Polly’s Custom Data Processing Pipelines to Achieve Consistently Processed Data

Access unfiltered raw counts from original publications, get consistently Polly-processed single-cell data, or replicate author-defined counts, depending on your research needs.

Custom Cell Type Annotation for Single-cell

Get custom cell type annotations for your single-cell data on Polly, using markers derived from subclusters or figures in the publication.

Custom QC Filters to Ensure Pristine Data Quality

Single-cell datasets on Polly go through a robust QA/QC process of about 50 checks covering metadata quality, filtering and normalization, batch effect correction, and the quality of measurements.

Derive Faster Insights from Your Harmonized Single-cell Data on Polly

Extract deeper insights from data at hand by using Polly’s extensive suite of ML solutions to accelerate downstream analysis.

Harness the Synergy of ML for 75% Faster Insights

Work with our experts to deploy popular foundational models like scGPT across your own harmonized data, or fine-tune existing models to improve predictions or accelerate insights.

Network Analysis and Use-cases for Bioinformatics

Work with tools for constructing and analyzing cell-cell interaction networks, providing insights into cellular communication and signaling pathways. Perform analyses such as differential expression, trajectory analysis, UMAP, clustering, and more.

Use Polly’s ML Tool, PollyGPT, for Querying and Analysis

Run queries across your harmonized data using PollyGPT, a natural-language querying interface, to perform complex statistical analyses such as PCA and differential gene expression.

Analyze and Visualize Single-cell Data on Polly

Leverage our expertise to construct tailored consumption methods, unique to your research.

Visualize Harmonized Data Using CellxGene

Use native web apps integrated on Polly, like CellxGene and CellxGene VIP, to analyze and visualize an array of single-cell data on the fly.

Build, Deploy and Maintain Custom Apps or Dashboards

Use Polly’s APIs or GUI to stream harmonized data to external tools like Spotfire, or to your preferred analysis environment, such as React or Shiny. Alternatively, work with our experts to build or customize a production-ready, scientifically validated application that caters to your research needs.

Multi-omics Integration

Develop methods for integrating single-cell data with other omics data types (e.g., genomics, proteomics) to gain a more comprehensive understanding of cellular processes.

Polly Verified – Our Quality Guarantee

We use about 50 QA checks to ensure every dataset is:

Complete

Data validation checks ensure that all cell-level and dataset-level metadata annotations contain non-NULL, non-blank values, and that all metadata attributes are human-readable and accurately assigned at all levels.
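
As a rough illustration of what a completeness check of this kind can look like, here is a minimal Python sketch. It is not Polly's actual implementation: the field names (`cell_id`, `cell_type`, `tissue`), the list of tokens treated as missing, and the helper names are all assumptions made for the example.

```python
from typing import Any

# Strings treated as "missing" (illustrative assumption, not Polly's rule set).
MISSING_TOKENS = {"", "na", "n/a", "nan", "none", "null"}


def is_missing(value: Any) -> bool:
    """Return True if a metadata value is NULL, NaN, or blank."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN is never equal to itself
        return True
    return str(value).strip().lower() in MISSING_TOKENS


def completeness_report(records, required_fields):
    """Map each required field to the indices of records where it is missing."""
    report = {}
    for field in required_fields:
        bad = [i for i, rec in enumerate(records) if is_missing(rec.get(field))]
        if bad:
            report[field] = bad
    return report


# Hypothetical cell-level metadata records.
cells = [
    {"cell_id": "c1", "cell_type": "T cell", "tissue": "blood"},
    {"cell_id": "c2", "cell_type": " ", "tissue": "blood"},    # blank cell type
    {"cell_id": "c3", "cell_type": "B cell", "tissue": None},  # NULL tissue
]

report = completeness_report(cells, ["cell_id", "cell_type", "tissue"])
print(report)  # {'cell_type': [1], 'tissue': [2]}
```

An empty report means every required field is populated for every record, which is the condition the "Complete" criterion describes.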

Consistent

Normalization & Batch Effect correction are applied wherever necessary to eliminate technical variations and enable meaningful comparisons between cells.

Relevant

Polly filters out poor-quality cells and genes and identifies the highly variable genes that drive biological variation, which supports downstream analyses and improves the robustness of results.

Case studies

Pharma Company Achieves 4x Faster Target Identification for Inflammatory Disease

Accelerating Translational Research in Immune-mediated Disorders with 5M cells from Harmonized Datasets

Whitepaper

Leveraging Machine Learning for Robust Cell Type Annotation: A Data-Driven Perspective

FAQs

What are the quality measures applied to single-cell datasets on Polly?

  1. We perform 50+ quality checks on single-cell datasets on Polly to ensure that only high-quality datasets are delivered. Broadly, the checks fall into two categories:
  • Metadata checks ensure that metadata follows the specified ontology, that no metadata or samples are missing, that source links are added, and so on.
  • Data matrix checks ensure that data is properly processed, clustered, and annotated.
  2. We share a quality report for each dataset processed on Polly, containing the processing and quality details.

What metadata fields are curated for single-cell RNA-seq data on Polly? Can users request the curation of additional metadata fields?

We offer 30+ curated fields for single-cell RNA-seq datasets on Polly. If any additional curated fields are required, they are added on request as part of custom curation.

In what formats are single-cell datasets on Polly provided, and are they compatible with common bioinformatics tools and pipelines?

Single-cell datasets are stored in the H5AD format on Polly. Our team can also support custom requests to provide data in the file formats best suited to the downstream bioinformatics tools and pipelines our clients use.

Can single-cell data on Polly be accessed and downloaded? Do you support cloud-based access or direct transfer?

  1. Single-cell data on Polly can be easily accessed and downloaded via the GUI or the Polly Python module. The downloaded file is in the H5AD format and contains both the data matrix and the metadata.
  2. On request, our team can build exporters that will transfer data on Polly to customers' cloud storage as a service.

What are the benefits of using single-cell data on Polly for single-cell analysis?

Single-cell data on Polly offers several benefits:

  1. Access high-quality data.
  2. Transparency in dataset processing.
  3. Customizable metadata harmonization.

What distinguishes Polly's processed single-cell datasets from unfiltered raw counts?

  1. Polly-processed single-cell data is consistently processed using a validated scanpy-based pipeline, which takes author-provided raw counts as input and produces filtered, normalized, clustered, and cell-type-annotated data as output. All Polly-processed single-cell datasets have cell type annotation based on author-provided markers and come with two H5AD files:
  • Polly Processed H5AD - Containing normalized counts and metadata, including cell type annotation.
  • Raw Counts H5AD - Containing raw/integer counts and sample metadata.
  2. Unfiltered raw counts on Polly are integer counts that are neither filtered nor normalized. Clustering and cell type annotation are available only if provided by the author at the source. They are not performed separately for these datasets and might not be available for all raw count datasets. Only one H5AD file is available:
  • Raw Counts H5AD - Containing raw/integer counts and sample metadata.
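
To make the difference between raw integer counts and normalized counts concrete, here is a minimal Python sketch of one common normalization scheme: scaling each cell to a fixed total and applying log(1 + x). This scheme, the `target_sum` of 10,000, and the function name are assumptions for illustration; the page does not state which normalization the Polly pipeline applies.

```python
import math


def normalize_log1p(counts, target_sum=10_000):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x).

    `counts` is a list of cells, each a list of per-gene integer counts.
    Cells with zero total counts are returned as all zeros.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            normalized.append([0.0] * len(cell))
            continue
        scale = target_sum / total
        normalized.append([math.log1p(c * scale) for c in cell])
    return normalized


# Hypothetical raw counts: 3 cells x 3 genes.
raw_counts = [
    [0, 5, 5],
    [10, 0, 30],
    [0, 0, 0],
]

processed = normalize_log1p(raw_counts)
for raw_cell, norm_cell in zip(raw_counts, processed):
    print(raw_cell, "->", [round(v, 3) for v in norm_cell])
```

After this step, cells with very different sequencing depths become comparable, which is why a processed file holds normalized values while the raw counts file keeps the original integers.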

How do you do cell type annotation for single-cell datasets on Polly? Is it customizable?

  1. We perform scType-assisted cell type annotation on Polly using the cell type markers as documented in the publication of the dataset.
  2. If required, our team can add the following customizations for cell type annotation:
  • Using a list of cell type markers specified by our clients.
  • Customizing the cell type annotation methodology as a service.

Do you provide integrated single-cell data on Polly? Is there a particular integration methodology you follow?

Yes, our team has the expertise to provide integrated single-cell datasets on Polly upon request. We don't follow a specific method for integration. The integration methodology is architected by our team of experts based on the biological question and the downstream analysis our client wants to perform.

What are the single-cell data processing pipelines used on Polly? Do you offer customization?

  1. We primarily use a validated scanpy-based pipeline to process single-cell datasets on Polly. It starts from the author-provided raw counts and performs filtering of low-quality genes and cells, doublet filtering, normalization, batch effect removal across samples within a dataset (not across datasets), clustering, and cell type annotation.
  2. We also have a Cell Ranger pipeline in place for processing FASTQ files generated using the 10x platform. It can be used on request and is restricted to datasets that have FASTQ files available at source and were generated using the 10x platform.
  3. We offer various customizations, such as:
  • Customizations to the parameters of the scanpy-based single-cell pipeline.
  • Customizations to the tools and algorithms used in the two pipelines mentioned above, as a service.
  • Building a pipeline from scratch as per client requirements, as a service.
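
The first step of the pipeline described above, filtering low-quality cells and genes, can be sketched in plain Python as follows. This is a simplified illustration rather than the Polly pipeline itself: the thresholds `min_genes` and `min_cells` are hypothetical, and in a scanpy-based workflow comparable filters are available as `scanpy.pp.filter_cells` and `scanpy.pp.filter_genes`.

```python
def filter_cells_and_genes(counts, min_genes=2, min_cells=2):
    """Filter a cells x genes matrix of integer counts.

    1. Drop cells with fewer than `min_genes` detected genes (count > 0).
    2. Drop genes detected in fewer than `min_cells` of the remaining cells.

    Returns (filtered_counts, kept_cell_indices, kept_gene_indices).
    """
    kept_cells = [
        i for i, cell in enumerate(counts)
        if sum(1 for c in cell if c > 0) >= min_genes
    ]
    n_genes = len(counts[0]) if counts else 0
    kept_genes = [
        g for g in range(n_genes)
        if sum(1 for i in kept_cells if counts[i][g] > 0) >= min_cells
    ]
    filtered = [[counts[i][g] for g in kept_genes] for i in kept_cells]
    return filtered, kept_cells, kept_genes


# Hypothetical raw counts: 4 cells x 4 genes.
raw = [
    [3, 0, 1, 0],  # 2 genes detected: kept
    [0, 0, 2, 0],  # 1 gene detected: dropped
    [5, 1, 4, 0],  # 3 genes detected: kept
    [2, 0, 0, 1],  # 2 genes detected: kept
]

filtered, kept_cells, kept_genes = filter_cells_and_genes(raw)
print(kept_cells)  # [0, 2, 3]
print(kept_genes)  # [0, 2]
print(filtered)    # [[3, 1], [5, 4], [2, 0]]
```

Customizing pipeline parameters, as mentioned in the list above, would correspond here to passing different `min_genes` and `min_cells` values.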