Quality control (QC) is a critical preliminary stage in single-cell RNA-seq (scRNA-seq) data analysis, serving two primary objectives:
In this solution brief, we discuss the metrics and techniques commonly used for QC and filtering of cell barcodes in scRNA-seq data on Polly, our biomedical data curation platform. These methods determine which cell barcodes are included in downstream analysis and can therefore affect clustering outcomes and visualization.
In scRNA-seq, protocols differ in how much of each transcript they cover. Some methods, such as Smart-seq and Quartz-seq, capture full-length transcript sequences, while others, like Drop-seq (3’-end only), STRT-seq (5’-end only), and Chromium (most commonly 3’-end), sequence only one end of each transcript. In each case, the protocol turns a very small amount of input RNA per cell into high-dimensional expression data, shedding light on cellular mechanisms and trajectory dynamics.
scRNA-seq analysis follows a structured workflow, divided into two main stages: pre-processing and downstream analysis. Common quality control filters act as the gatekeepers of data integrity, helping ensure that the information derived from complex datasets remains accurate and reliable.
Before embarking on the data filtering process for single-cell RNA-seq data, two essential steps should be undertaken:
In the analysis of single-cell data, the use of common metrics and filtering methods is pivotal. Below, we explore these practices in brief, providing insights into their rationale and potential caveats where applicable.
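As a rough illustration, barcode-level filtering can be sketched as follows, assuming three widely used metrics: total counts per barcode, number of detected genes, and the percentage of counts from mitochondrial genes. The function names, the `MT-` gene-symbol prefix, and the thresholds are illustrative assumptions chosen for a toy matrix, not Polly's implementation; appropriate cutoffs depend on the tissue, protocol, and sequencing depth.

```python
import numpy as np

def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Per-barcode QC metrics from a (barcodes x genes) raw count matrix.

    mito_prefix is an assumed gene-symbol convention (human "MT-");
    names are upper-cased so mouse-style "mt-" symbols also match.
    """
    counts = np.asarray(counts)
    is_mito = np.array([g.upper().startswith(mito_prefix) for g in gene_names])

    total_counts = counts.sum(axis=1)        # library size per barcode
    n_genes = (counts > 0).sum(axis=1)       # genes with at least one count
    mito_counts = counts[:, is_mito].sum(axis=1)

    # Percentage of mitochondrial counts; empty barcodes get 0 instead of NaN.
    pct_mito = np.divide(
        100.0 * mito_counts,
        total_counts,
        out=np.zeros(len(total_counts), dtype=float),
        where=total_counts > 0,
    )
    return total_counts, n_genes, pct_mito

def filter_barcodes(counts, gene_names, min_counts=5, min_genes=2, max_pct_mito=20.0):
    """Boolean mask of barcodes passing all three (hypothetical) thresholds."""
    total_counts, n_genes, pct_mito = qc_metrics(counts, gene_names)
    return (
        (total_counts >= min_counts)
        & (n_genes >= min_genes)
        & (pct_mito <= max_pct_mito)
    )

# Toy data: 4 barcodes x 4 genes, made up for illustration only.
gene_names = ["ACTB", "GAPDH", "CD3E", "MT-CO1"]
counts = np.array([
    [10, 5, 0, 1],   # reasonable depth, low mitochondrial share
    [1, 0, 0, 0],    # too few counts and genes
    [2, 2, 0, 12],   # mostly mitochondrial counts
    [0, 0, 0, 0],    # empty barcode
])

total_counts, n_genes, pct_mito = qc_metrics(counts, gene_names)
keep = filter_barcodes(counts, gene_names)
filtered = counts[keep]
print(keep)
```

In practice the same metrics are usually inspected as distributions (for example, histograms or violin plots) before fixing any cutoff, since a threshold that suits one dataset can discard genuine cell populations in another.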
Quality control is critical in scRNA-seq data analysis, ensuring that only high-quality cells and genes are used for downstream analysis. By applying common QC filters and considering the unique characteristics of your dataset, you can enhance the reliability and biological relevance of your scRNA-seq results, leading to more accurate insights into cellular heterogeneity and gene expression patterns.
Polly is a transformative asset in elevating the quality of data. It excels in curating multi-omics and assay data, rendering them ML-ready and analysis-ready. This process is driven by a Polly-verified curation engine, overseen by skilled experts who harmonize a wide spectrum of data types, enrich metadata, and ensure consistent data processing while maintaining affordability. The ML-ready data is securely stored on cloud-based Atlas data stores, optimized for efficient analysis and data management.
Polly's state-of-the-art technology caters to approximately 26 diverse R&D data types, meeting the requirements of teams involved in pre-clinical drug discovery and diagnostics R&D. It's the trusted choice for over 25 research organizations, including four of the ten largest pharmaceutical companies, who leverage Polly and its associated solutions to expedite their discovery programs. Numerous other data-driven healthcare enterprises rely on Polly-verified processes to harmonize and securely store public and proprietary biomedical data. In a nutshell, Polly, with its user-friendly interface and advanced capabilities, helps ensure high-quality scRNA-seq data.