Bulk RNA-seq on Polly

Utilize consistently processed, ML-ready in-house and public Bulk RNA-seq datasets on Polly. Ideal for meta-analysis, rare transcript discovery, and integrative multi-omics analysis.

Technology

Polly Makes Disparate Bulk RNA-seq Data
Actionable & Usable

Accelerate Time to Insight with Curated Bulk RNA-seq Data

Configure Curation to Fit Your Analysis Needs

Polly harmonizes unstructured bulk RNA-seq data through a configurable, transparent, and granular curation process based on your inclusion/exclusion criteria, accelerating downstream analysis.

99.99% Metadata Accuracy

Polly’s datasets come with 99.99% metadata accuracy and ontology-backed metadata for 30+ fields.

100% Metadata Completeness

Polly delivers bulk RNA-seq datasets with 100% metadata completeness and 0 empty metadata fields.

Work with a Unified Data Model

Polly makes bulk RNA-seq data from disparate sources interoperable & analysis-ready by delivering it in consistent formats, usable by both Python and R users.

Build High Throughput & Cost-efficient STAR and Kallisto Pipelines

Work with ready-to-use ETL pipelines for processing bulk RNA-seq data on Polly, or build custom pipelines fit for your data and analysis requirements.

Ready-to-use ETL Pipelines

Process your bulk RNA-seq data on Polly with flexible pipelines based on STAR and Kallisto for alignment/mapping.

Custom Pipelines for Tailored Use

Draw on our experience to build additional pipelines on demand, with modular options for QC tools, feature-counting tools, aligners, and annotation.

Sequencing Data QC and Annotations

Use Polly to run QC on sequencing data and metadata for harmonization. Adjust the parameters for alignment/mapping, QC, feature counting, and other steps, along with annotations, to fit your research.

Derive Faster Insights from Your Harmonized Bulk RNA-seq Data on Polly

Extract deeper insights from data at hand by using Polly’s extensive suite of ML solutions to accelerate downstream analysis.

Harness the Synergy of ML for 75% Faster Insights

Work with our experts to deploy foundational ML models on top of your own harmonized bulk RNA-seq data.

Unlock Bulk RNA-seq Use-cases for Bioinformatics

Perform gene, pathway, or metadata-based queries to find and explore the data you need for downstream solutions such as biomarker prediction, target identification, meta-analysis, and more. Use interactive volcano plots, heatmaps, and other visualizations to explore enriched genes and pathways.

Multi-omics Integration

Develop methods for integrating bulk RNA-seq data with other omics data types (e.g., genomics, proteomics) to gain a more comprehensive understanding of cellular processes.

Analyze and Visualize Bulk RNA-seq Data on Polly

Visualize Harmonized Data Using Phantasus

Use native web apps integrated on Polly, such as Phantasus, to analyze and visualize an array of bulk RNA-seq data on the fly.

Stream Data into Applications of Choice

Use Polly’s APIs or GUI to stream harmonized data to external tools, applications like Spotfire, or your preferred analysis environment, such as React or Shiny, for unrestricted consumption.

Build, Deploy and Maintain Custom Apps or Dashboards

Work with our experts to build or customize a production-ready, scientifically validated application that caters to your research needs.

Polly Verified – Our Quality Guarantee

We run 50+ QA checks to ensure every dataset is:

Complete

Data validation checks ensure that all sample- and dataset-level metadata annotations contain non-NULL and non-blank values, and that all metadata attributes are human-readable and accurately assigned at all levels.

Consistent

Normalization and batch effect correction are applied wherever necessary to eliminate technical variation and enable meaningful comparisons between samples.

Relevant

Polly filters out poor-quality samples and genes, and identifies the highly variable genes that drive biological variation, which helps downstream analyses and improves the robustness of results.

Case studies

Polly Delivers STAR Quality Data at High Throughput, 5x Lower Cost

View Case Study

Pharma-AI Collaboration Cuts Costs by ~$3M with Curated Public Data

View Case Study

FAQs

How is bulk RNA-seq data on Polly defined?

Bulk RNA-seq datasets on Polly represent curated collections of biologically and statistically comparable samples. Each dataset is denoted by a unique ID that follows the GEO record identifier format, comprising a series ID and a platform ID. For example, the dataset ID 'GSE189190_GPL25947_raw' breaks down as follows:

  1. GSE189190 is the series ID.
  2. GPL25947 is the ID of the platform used for sequencing.
  3. The _raw suffix signifies raw counts.
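
As a rough illustration of the ID convention described above, the following Python sketch splits a dataset ID into its parts. The function name, the `DatasetId` type, and the regular expression are assumptions made for this example and are not part of any Polly library; the pattern only covers IDs shaped like the one shown here.

```python
import re
from typing import NamedTuple, Optional


class DatasetId(NamedTuple):
    """Parts of a dataset ID (hypothetical helper type for this example)."""
    series_id: str
    platform_id: str
    suffix: Optional[str]


# Assumed pattern: GEO series ID, GEO platform ID, optional suffix such as "raw".
_DATASET_ID = re.compile(r"^(GSE\d+)_(GPL\d+)(?:_(\w+))?$")


def parse_dataset_id(dataset_id: str) -> DatasetId:
    """Split an ID like 'GSE189190_GPL25947_raw' into series, platform, and suffix."""
    match = _DATASET_ID.match(dataset_id)
    if match is None:
        raise ValueError(f"Unrecognized dataset ID: {dataset_id!r}")
    series_id, platform_id, suffix = match.groups()
    return DatasetId(series_id, platform_id, suffix)


if __name__ == "__main__":
    print(parse_dataset_id("GSE189190_GPL25947_raw"))
```

IDs that do not match the assumed pattern raise a `ValueError` rather than being guessed at.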

What are the quality measures applied to bulk RNA-seq datasets on Polly?

  1. We perform 50+ quality checks on bulk RNA-seq datasets on Polly to ensure that high-quality datasets are delivered. Broadly, the checks fall into two categories:
  • Metadata checks ensure that metadata follows the specified ontology, that no metadata or samples are missing, that source links are added, and so on.
  • Data distribution checks ensure that the number of genes is within a valid range, that the data distribution across samples follows expected trends, and that features follow the recommended nomenclature.
  2. We share a quality report for each dataset processed on Polly, containing the processing and quality details.
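
To make the two categories of checks more concrete, here is a minimal Python sketch of one metadata check (no missing values) and one data distribution check (gene count within a range). The function names and the default range are hypothetical choices for illustration; the actual checks and thresholds used on Polly are not specified here.

```python
from typing import Dict, List


def find_empty_metadata(samples: Dict[str, Dict[str, object]]) -> List[str]:
    """Return 'sample:field' entries whose value is None or blank."""
    problems = []
    for sample_id, fields in samples.items():
        for field, value in fields.items():
            if value is None or str(value).strip() == "":
                problems.append(f"{sample_id}:{field}")
    return problems


def gene_count_in_range(n_genes: int, low: int = 10_000, high: int = 70_000) -> bool:
    """Check the gene count against an assumed, illustrative valid range."""
    return low <= n_genes <= high


if __name__ == "__main__":
    metadata = {
        "S1": {"tissue": "liver", "disease": ""},
        "S2": {"tissue": None, "disease": "Normal"},
    }
    print(find_empty_metadata(metadata))
    print(gene_count_in_range(58_000))
```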

What metadata fields are curated for bulk RNA-seq data on Polly? Can users request the curation of additional metadata fields?

We offer 30+ curated fields for bulk RNA-seq datasets on Polly. If any additional curated fields are required, they are added on request as part of custom curation. There are 6 standard fields mapped to ontologies at dataset and sample level:
Disease → MeSH
Tissue → BRENDA Tissue Ontology
Organism → NCBI Taxonomy
Cell Line → Cellosaurus
Cell type → Cell Ontology
Drug → PubChem
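
The six field-to-ontology mappings listed above could be represented in code along these lines. This is only a sketch: the dictionary keys and the `ontology_for` helper are hypothetical names chosen for the example, not identifiers from Polly.

```python
# Field-to-ontology mapping taken from the list above; key spelling is an assumption.
FIELD_ONTOLOGY = {
    "disease": "MeSH",
    "tissue": "BRENDA Tissue Ontology",
    "organism": "NCBI Taxonomy",
    "cell_line": "Cellosaurus",
    "cell_type": "Cell Ontology",
    "drug": "PubChem",
}


def ontology_for(field: str) -> str:
    """Look up the ontology for a standard field, ignoring case and spaces."""
    key = field.strip().lower().replace(" ", "_")
    if key not in FIELD_ONTOLOGY:
        raise KeyError(f"No standard ontology mapping for field: {field!r}")
    return FIELD_ONTOLOGY[key]


if __name__ == "__main__":
    print(ontology_for("Cell Line"))
```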

In what formats are bulk RNA-seq datasets on Polly provided, and are they compatible with common bioinformatics tools and pipelines?

Bulk RNA-seq datasets are stored in the GCT format on Polly. Our team can also support custom requests to provide data in the file formats best suited to the downstream bioinformatics tools and pipelines our clients use.
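
For readers who have not worked with GCT files, the sketch below reads a GCT file into plain Python dictionaries. It assumes the GCT 1.3 layout (a '#1.3' version line, a dimensions line, a header row, column metadata rows, then data rows); the exact GCT version Polly exports is not stated here, and the function name is hypothetical. In practice an established GCT reader may be a better choice.

```python
from typing import Dict, Tuple

Metadata = Dict[str, Dict[str, str]]
Expression = Dict[str, Dict[str, float]]


def parse_gct13(text: str) -> Tuple[Metadata, Expression]:
    """Parse GCT 1.3 text into (sample_metadata, expression).

    sample_metadata[sample_id][field] -> str
    expression[feature_id][sample_id] -> float
    """
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    if not rows or rows[0][0].strip() != "#1.3":
        raise ValueError("Expected a GCT 1.3 file (first line '#1.3')")

    # Line 2: data rows, data columns, row metadata fields, column metadata fields.
    n_rows, n_cols, n_row_meta, n_col_meta = (int(x) for x in rows[1][:4])
    first_sample = 1 + n_row_meta  # skip the id column and row metadata columns
    sample_ids = rows[2][first_sample:first_sample + n_cols]

    # Column (sample) metadata: one row per field, directly below the header.
    sample_metadata: Metadata = {sid: {} for sid in sample_ids}
    for line in rows[3:3 + n_col_meta]:
        for sid, value in zip(sample_ids, line[first_sample:]):
            sample_metadata[sid][line[0]] = value

    # Data rows: feature id, row metadata, then one value per sample.
    expression: Expression = {}
    data_start = 3 + n_col_meta
    for line in rows[data_start:data_start + n_rows]:
        values = line[first_sample:first_sample + n_cols]
        expression[line[0]] = {sid: float(v) for sid, v in zip(sample_ids, values)}
    return sample_metadata, expression


if __name__ == "__main__":
    example = (
        "#1.3\n2\t2\t1\t2\n"
        "id\tgene_symbol\tS1\tS2\n"
        "tissue\tna\tliver\tlung\n"
        "disease\tna\tnormal\tnormal\n"
        "ENSG1\tA\t1\t2.5\n"
        "ENSG2\tB\t0\t7\n"
    )
    meta, expr = parse_gct13(example)
    print(meta["S1"], expr["ENSG1"])
```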

Can bulk RNA-seq data on Polly be accessed and downloaded? Do you support cloud-based access or direct transfer?

  1. Bulk RNA-seq data on Polly can be easily accessed and downloaded via the GUI or the Polly Python module. The downloaded file is in the GCT format and contains both the data matrix and the metadata.
  2. On request, our team can build exporters that will transfer data on Polly to customers' cloud storage as a service.

What are the advantages of using Polly for bulk RNA-seq?

Bulk RNA-seq data on Polly offers several benefits:

  1. Access high-quality data.
  2. Transparency in dataset processing.
  3. Customizable metadata harmonization.
  4. Highly scalable infrastructure to store and analyze large volumes of bulk RNA-seq data.

Which pipeline is used for processing bulk RNA-seq data on Polly? Do you provide alternative pipelines to process bulk RNA-seq data on Polly?

Bulk RNA-seq data from FASTQ files can be processed using either of the following options:

  1. Kallisto pipeline
  2. STAR pipeline

The data is processed with the following reference genome, annotation, and complementary DNA (cDNA) sequence data from Ensembl for each organism:

  1. Homo sapiens, Ensembl releases 107 and 90
  • Genome sequence (FASTA)
  • Gene annotation set (GTF)
  • cDNA sequences (FASTA)
  2. Mus musculus, Ensembl releases 107 and 90
  • Genome sequence (FASTA)
  • Gene annotation set (GTF)
  • cDNA sequences (FASTA)
  3. Rattus norvegicus, Ensembl releases 107 and 90
  • Genome sequence (FASTA)
  • Gene annotation set (GTF)
  • cDNA sequences (FASTA)

Kallisto and STAR are the default choices for processing bulk RNA-seq data. However, these pipelines can be customized, for example in the normalization steps or the QC metrics used, at an additional cost.
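
To give a sense of what a Kallisto or STAR step involves, the Python sketch below assembles typical command lines for paired-end, gzipped FASTQ input. The flags are standard Kallisto and STAR options, but the function names, file paths, thread count, and choice of options are assumptions for illustration and do not describe how the pipelines on Polly are configured.

```python
from typing import List


def kallisto_quant_cmd(index: str, out_dir: str, fastq_1: str, fastq_2: str,
                       threads: int = 4) -> List[str]:
    """Build a 'kallisto quant' command for paired-end reads (pseudoalignment)."""
    return ["kallisto", "quant", "-i", index, "-o", out_dir,
            "-t", str(threads), fastq_1, fastq_2]


def star_align_cmd(genome_dir: str, out_prefix: str, fastq_1: str, fastq_2: str,
                   threads: int = 4) -> List[str]:
    """Build a STAR alignment command that also emits per-gene read counts."""
    return ["STAR",
            "--runThreadN", str(threads),
            "--genomeDir", genome_dir,
            "--readFilesIn", fastq_1, fastq_2,
            "--readFilesCommand", "zcat",  # inputs assumed to be gzip-compressed
            "--outSAMtype", "BAM", "SortedByCoordinate",
            "--quantMode", "GeneCounts",
            "--outFileNamePrefix", out_prefix]


if __name__ == "__main__":
    # Hypothetical paths; the commands are printed here, not executed.
    print(" ".join(kallisto_quant_cmd("index.idx", "out/", "r1.fastq.gz", "r2.fastq.gz")))
    print(" ".join(star_align_cmd("star_index/", "out/s1_", "r1.fastq.gz", "r2.fastq.gz")))
```

Each function returns an argument list that could be passed to `subprocess.run(cmd, check=True)` on a machine where the tool and a prebuilt index are available.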

Can I integrate bulk RNA-seq data from Polly with other omics datasets?

Yes, our team has the expertise to help with horizontal (within omics) or vertical (across omics) integration as part of Polly-enabled solutions. The choice of integration method depends on the biological question and the downstream analysis, and is finalized jointly with clients.

Request Demo