Polly Verification Report for Clinical Trial
This report has been verified by Polly as per framework version 1.0.
| Stat Label | Stat Value |
|---|---|
| Total Drug Molecules | 3 |
| AUC Values | 21 |
| Numerical Values | 48 |
| Treatment Arms | 16 |
| Primary Route | Oral |
| nct_id | title | official_title | status | study_type | phase | start_date | completion_date | conditions |
|---|---|---|---|---|---|---|---|---|
| NCT03107988 | NANT 2015-02: A Phase 1 Study of Lorlatinib | Phase 1 Study of Lorlatinib (PF-06463922), an Oral Small Molecule Inhibitor of ALK/ROS1, for Patients With ALK-Driven Relapsed or Refractory Neuroblastoma | COMPLETED | INTERVENTIONAL | PHASE1 | 2017-09-05 | 2025-01-31 | Neuroblastoma |
| NCT03009292 | Pharmacokinetic Study of Lenvatinib | Pharmacokinetic Study of E7080/Lenvatinib in Chinese Subjects With Solid Tumor | COMPLETED | INTERVENTIONAL | PHASE1 | 2018-08-06 | 2021-08-27 | Solid Tumor |
| NCT03220295 | Putative Cognitive Enhancer VU319 | Study of the M1 Positive Allosteric Modulator VU0467319 | COMPLETED | INTERVENTIONAL | PHASE1 | 2017-07-28 | 2019-10-30 | Cognitive Impairment |
| id | docs | patent_number | patent_title | filing_date | grant_date | assignee | abstract | nct_id |
|---|---|---|---|---|---|---|---|---|
| 1 | Lorlatinib | US11299500B2 | CRYSTALLINE FORM OF LORLATINIB FREE BASE HYDRATE | Oct. 4, 2018 | Apr. 12, 2022 | Pfizer Inc., New York, NY (US) | This invention relates to a crystalline form of lorlatinib free base hydrate and pharmaceutical compositions for treating abnormal cell growth such as cancer. | NCT03107988 |
| 8 | Lenvatinib | US9006256B2 | ANTITUMOR AGENT FOR THYROID CANCER | Apr. 8, 2011 | Apr. 14, 2015 | Eisai R&D Management Co., Ltd., Tokyo (JP) | Pharmaceutical compositions and therapeutic methods for treating thyroid carcinoma using RET kinase inhibiting substances. | NCT03009292 |

Figure 1: These violin plots display the distribution of quality control metrics for each cell. Metrics include the number of genes detected, total transcript counts, and the percentage of mitochondrial transcripts.
A good-quality dataset would typically have a reasonable number of genes detected per cell and a moderate total transcript count. High mitochondrial transcript percentages can indicate low-quality, dying cells. Note: some datasets do not contain mitochondrial genes (MT-), so the panel for the percentage of mitochondrial transcripts may be empty.
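As a minimal sketch of how the three metrics in Figure 1 can be derived, the example below computes them for a toy count matrix. The gene list, the counts, and the `qc_metrics` helper are hypothetical and are not taken from the Polly pipeline; only the `MT-` prefix convention comes from the note above.

```python
# Toy count matrix: rows are cells, columns are genes (hypothetical values).
genes = ["MT-CO1", "MT-ND1", "CD3E", "IL7R", "FOXP3"]
counts = [
    [2, 1, 10, 7, 0],    # mostly non-mitochondrial transcripts
    [30, 25, 3, 2, 0],   # dominated by mitochondrial transcripts
    [0, 0, 5, 0, 1],     # no mitochondrial transcripts detected
]


def qc_metrics(row, gene_names, mito_prefix="MT-"):
    """Return (genes detected, total counts, percent mitochondrial) for one cell."""
    n_genes = sum(1 for c in row if c > 0)
    total = sum(row)
    mito = sum(c for c, g in zip(row, gene_names) if g.startswith(mito_prefix))
    pct_mito = 100.0 * mito / total if total else 0.0
    return n_genes, total, pct_mito


metrics = [qc_metrics(row, genes) for row in counts]
for i, (n_genes, total, pct_mito) in enumerate(metrics):
    print(f"cell {i}: genes={n_genes} counts={total} pct_mito={pct_mito:.1f}")
```

If no gene name starts with the prefix, every cell reports 0% mitochondrial transcripts, which corresponds to the empty panel mentioned in the note.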
Figure 2: Sample-level clustering pattern of cells, shown using UMAP embeddings.
If cells from the same sample cluster together distinctly from cells of other samples, it may indicate the presence of batch effects. Ideally, cells should mix and group based on their biological characteristics rather than their sample of origin, indicating that the data is free of significant batch effects and the samples are comparable.
Figure 3: The bar plot showcases the distribution and abundance of different cell types within each sample. Each color in a bar represents a different cell type with the height of the color segment indicating the count of that cell type in the sample.
A uniform distribution of cell types across samples may suggest that the sample preparation and preprocessing methods were effective and that there was minimal bias or variation in the processing steps. In some cases, if the experimental design enriches a particular cell type in a sample, a non-uniform distribution is also valid.
Figure 4: The bar plot showcases the distribution and abundance of different cell types within each cluster. Each color in a bar represents a different cell type, with the height of the color segment indicating the count of that cell type in the cluster.
Generally, each cluster should contain only one cell type, which indicates accurate cell-type annotation. Corner cases are observed when the authors have provided only a cell ID to cell-type mapping and no marker genes. These need to be manually rectified.
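One way to check the "one cell type per cluster" expectation is to compute, for each cluster, the fraction of cells carrying the dominant label. The sketch below assumes a simple list of (cluster, cell type) pairs; the data and the `cluster_purity` helper are hypothetical illustrations, not part of the report's tooling.

```python
from collections import Counter, defaultdict

# Hypothetical (cluster, cell type) assignments, one pair per cell.
cells = [
    (0, "CD4+ naive T cell"), (0, "CD4+ naive T cell"), (0, "regulatory T cell"),
    (1, "regulatory T cell"), (1, "regulatory T cell"),
]


def cluster_purity(assignments):
    """Map each cluster to (dominant cell type, fraction of cells with that type)."""
    by_cluster = defaultdict(Counter)
    for cluster, cell_type in assignments:
        by_cluster[cluster][cell_type] += 1
    purity = {}
    for cluster, type_counts in by_cluster.items():
        top_type, top_count = type_counts.most_common(1)[0]
        purity[cluster] = (top_type, top_count / sum(type_counts.values()))
    return purity


purity = cluster_purity(cells)
# Clusters whose dominant type does not cover every cell are candidates for review.
mixed = [cluster for cluster, (_, frac) in purity.items() if frac < 1.0]
print(purity)
print("clusters with more than one cell type:", mixed)
```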
Figure 5a: The bar plot visualizes the total count of cells detected in each sample. Each bar corresponds to a different sample, with its height representing the number of cells.
This plot provides an understanding of the sample distribution in terms of cellularity. A wide variance in cell numbers across samples might indicate inconsistencies in cell isolation, sample preparation, or sequencing depth. Consistent cell counts across samples, however, would suggest a more uniform sampling process.
Figure 5b: The bar plot illustrates the median number of genes detected in each sample. Each bar represents a different sample, and its height corresponds to the median gene counts.
Consistently low gene counts might indicate low sequencing depth or poor-quality samples. On the other hand, large variances between samples or cell types might point to technical biases or true biological differences.
Figure 5c: The bar plot showcases the median percentage of mitochondrial gene transcripts across samples.
Consistently high mitochondrial gene percentages across samples might indicate a widespread issue with cell viability, while sporadic high values could suggest sample-specific issues; the affected samples can be removed before downstream analysis.
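The per-sample check described for Figure 5c can be sketched as follows. The sample names, the values, and the 20% cutoff are all assumptions chosen for illustration; the report does not state a threshold.

```python
from statistics import median

# Hypothetical per-cell mitochondrial percentages, grouped by sample.
pct_mito = {
    "sample_A": [3.1, 4.8, 5.5],
    "sample_B": [22.0, 35.5, 41.2],
    "sample_C": [2.0, 6.3, 7.9],
}

THRESHOLD = 20.0  # assumed cutoff in percent; tune per tissue and protocol

medians = {sample: median(values) for sample, values in pct_mito.items()}
flagged = [sample for sample, m in medians.items() if m > THRESHOLD]

print(medians)
print("samples to inspect before downstream analysis:", flagged)
```

A single flagged sample points to a sample-specific problem, whereas most samples exceeding the cutoff would point to a dataset-wide viability issue.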
Figure 6: The plot provides a smoothed representation of the distribution of detected genes across cells.
This plot gives an idea about the average gene richness in cells. High variability might indicate a mix of high and low-quality cells.
Figure 7: The plot provides a smoothed representation of the distribution of UMIs across cells.
This plot offers insight into the typical transcriptomic depth of the dataset. A broad distribution might indicate variability in sequencing depth across cells.
Figure 8: The scatter plot provides a visual representation of the relationship between the number of unique molecular identifiers (UMIs) and the number of genes detected in single cells. The color intensity indicates the density of data points in a particular region of the plot, allowing for the identification of trends and patterns.
Ideally, one would expect to see a positive correlation between UMIs and genes, indicating that cells with more transcripts also express more unique genes. Areas with higher density may represent the most typical cells in the dataset, while outliers could indicate low-quality cells or potential doublets.
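The expected positive relationship between UMIs and detected genes can be quantified with a Pearson correlation coefficient. The sketch below uses hypothetical per-cell values and a hand-written `pearson` helper so that it has no external dependencies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical per-cell totals: UMIs and number of genes detected.
umis = [500, 1200, 2500, 4000, 9000]
n_genes = [300, 650, 1100, 1600, 1700]

r = pearson(umis, n_genes)
print(f"Pearson r between UMIs and genes detected: {r:.3f}")
```

In this toy data the last cell has many UMIs but comparatively few genes, the kind of point that would sit away from the dense region of the scatter plot.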
| | NMI | ARI | PCR_batch | Graph_iLISI | kBET_accept_rate | batch_correction_score |
|---|---|---|---|---|---|---|
| uncorrected | 0.7051 | 0.8182 | 0.8793 | 0.0244 | 0.0772 | 0.0000 |
| corrected | 0.9713 | 0.9910 | 0.9855 | 0.0748 | 0.7651 | 1.0000 |
Table 1: Table displaying batch mixing metrics
These metrics are adopted from a recent benchmarking study of single-cell integration methods (Lueken et al. 2022). Values closer to 1 indicate better mixing of cells from the different batches.
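For orientation, the adjusted Rand index (ARI) listed in Table 1 compares two labelings of the same cells. The sketch below is a plain implementation of the standard pair-counting formula with hypothetical labels; it is not the code used to produce the table, and which labelings the report compares is not stated here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare
    pair_counts = Counter(zip(labels_a, labels_b))
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical labels: same grouping under different names vs. an unrelated split.
same = adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])
unrelated = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, unrelated)
```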
| sc_cluster | prediction | sctype_score | sctype_confidence | diff_exp_cell_markers |
|---|---|---|---|---|
| 0 | CD4+ naive T cell | 0.3867 | 0.4287 | DACT1,EDA,FAM13A,LRRN3,PECAM1 |
| 1 | CD4+ central memory T cell | 0.1992 | 0.4951 | CCR6,KLRB1 |
| 2 | CD4+ naive T cell | -0.2365 | -0.5666 | LRRN3,PECAM1 |
| 3 | CD4+ effector memory T cell | 2.4715 | 1.8014 | CCL5,CEBPD,CST7,GZMA,GZMK |
| 4 | CD4+ naive T cell | -0.0367 | -0.2476 | LRRN3,PECAM1 |
| 5 | regulatory T cell | 3.5664 | 3.5823 | FCRL3,FOXP3,HLA-DRB1,IKZF2,TIGIT |
| 6 | CD4+ central memory T cell | 1.2004 | 2.8250 | CCR6,KLRB1,PHLDA3 |
| 7 | terminally differentiated effector memory CD4+ alpha-beta T cell | 4.8192 | 3.7242 | GZMH,NKG7 |
| 8 | CD4+ naive T cell | 0.0473 | -0.1134 | STAT1 |
| 9 | CD4+ naive T cell | 0.0363 | -0.1309 | LRRN3,PECAM1 |
| 10 | CD4+ naive T cell | 1.5244 | 2.2458 | IFI44L,MX1,STAT1 |
| 11 | CD4+ naive T cell | -0.0147 | -0.2124 | |
| 12 | CD4+ naive T cell | 1.7115 | 2.5446 | LRRN3,PECAM1,SOX4 |
| 13 | CD4+ naive T cell | 0.0420 | -0.1218 | |
| 14 | CD4+ naive T cell | -0.1009 | -0.3500 | |
Table 2: Table displaying sctype score and differentially expressed genes per cell annotation
Disclaimer: The cell type annotation for the clusters that have a negative sctype_score and/or sctype_confidence value has been manually re-annotated based on specific markers that were prominent in those clusters. The updated annotations are stored in the "uns" slot of the final h5ad file.
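The rule in the disclaimer can be applied directly to the values in Table 2. The sketch below copies the scores from the table and lists the clusters with a negative sctype_score and/or sctype_confidence; the variable names are illustrative.

```python
# (sc_cluster, sctype_score, sctype_confidence), copied from Table 2.
rows = [
    (0, 0.3867, 0.4287), (1, 0.1992, 0.4951), (2, -0.2365, -0.5666),
    (3, 2.4715, 1.8014), (4, -0.0367, -0.2476), (5, 3.5664, 3.5823),
    (6, 1.2004, 2.8250), (7, 4.8192, 3.7242), (8, 0.0473, -0.1134),
    (9, 0.0363, -0.1309), (10, 1.5244, 2.2458), (11, -0.0147, -0.2124),
    (12, 1.7115, 2.5446), (13, 0.0420, -0.1218), (14, -0.1009, -0.3500),
]

# A cluster is flagged when either value is negative.
reannotated = [cluster for cluster, score, conf in rows if score < 0 or conf < 0]
print("clusters covered by the disclaimer:", reannotated)
```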
Cell type predictions are made using the author-reported cell types. Next to the predictions, the marker genes of the assigned cell type that are differentially expressed in the corresponding cluster are also highlighted (where found). Differentially expressed genes were identified by running the Scanpy rank_genes_groups function with the following settings: log-fold change cutoff: 1.0; statistical test: t-test; adjusted p-value cutoff (Benjamini-Hochberg): 0.05. By default, the "normalized_counts" layer is used for DE testing. DE genes per cluster are identified separately within each batch, and the results from all batches are summarized at the cluster level.
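The two cutoffs can be illustrated with a small, dependency-free sketch of the Benjamini-Hochberg adjustment followed by filtering. The per-gene values are hypothetical, and the comparison operators (`>=` for the log-fold change, `<=` for the adjusted p-value) are assumptions, since the report does not say whether the cutoffs are inclusive or whether the fold change is taken as an absolute value.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


LFC_CUTOFF = 1.0    # from the settings above
PADJ_CUTOFF = 0.05  # from the settings above

# Hypothetical per-gene results: (gene, log fold change, raw p-value).
results = [
    ("FOXP3", 2.3, 0.001),
    ("TIGIT", 1.4, 0.012),
    ("STAT1", 0.6, 0.004),
    ("SOX4", 1.1, 0.30),
]

adj = benjamini_hochberg([p for _, _, p in results])
de_genes = [
    gene
    for (gene, lfc, _), q in zip(results, adj)
    if lfc >= LFC_CUTOFF and q <= PADJ_CUTOFF
]
print(de_genes)
```

Here STAT1 is dropped by the fold-change cutoff and SOX4 by the adjusted p-value cutoff.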
| Validation Check | Description | Status |
|---|---|---|
| Dose/AUC Linkage | Ensures every dose amount has a corresponding AUC exposure value. | Passed |
| Curated Compound IDs | Ensures all compounds have pubchem_cid as the unique identifier. | Passed |
| Curated Treatment Arm | Ensures all studies have a detailed treatment arm defined. | Passed |
| Treatment Arm To PK values Check | Ensures PK parameter and AUC table rows are mapped to a treatment arm. | Passed |
| Tmax Median Check | Checks whether Tmax is reported as a mean rather than a median (standard practice is median/range). | Passed |
| Unit Normalization | Validation of unit conversion to standardized ng*h/mL or µg*h/mL for AUC and µg/mL or ng/mL for Cmax. | Passed |
| Schema Adherence | Check for presence of Primary and Foreign keys in each table (e.g., the compounds table has pubchem_cid and nct_id). | Passed |
| Linkage Adherence | Check that there are no orphan Primary or Foreign Keys. | Passed |
| Data Duplicates | Check that there are no duplicated NCT IDs, Patent IDs, and Treatment Arm IDs. | Passed |
| Data Logic | Check that Start Dates < Completion Dates. | Passed |
| Human Verified | Literature and unstructured data extraction is human verified and updated. | Passed |
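Three of the checks above (Data Logic, Data Duplicates, and Linkage Adherence) can be sketched against the study and patent rows shown earlier in this report. The dictionary layout and the `find_duplicates` helper are hypothetical; only the IDs and dates are taken from the tables.

```python
from datetime import date

# Study and patent rows copied from the tables in this report.
studies = [
    {"nct_id": "NCT03107988", "start": date(2017, 9, 5), "end": date(2025, 1, 31)},
    {"nct_id": "NCT03009292", "start": date(2018, 8, 6), "end": date(2021, 8, 27)},
    {"nct_id": "NCT03220295", "start": date(2017, 7, 28), "end": date(2019, 10, 30)},
]
patents = [
    {"patent_number": "US11299500B2", "nct_id": "NCT03107988"},
    {"patent_number": "US9006256B2", "nct_id": "NCT03009292"},
]


def find_duplicates(values):
    """Return the sorted values that occur more than once."""
    seen, dupes = set(), set()
    for value in values:
        (dupes if value in seen else seen).add(value)
    return sorted(dupes)


# Data Logic: start date must be strictly before completion date.
bad_dates = [s["nct_id"] for s in studies if not s["start"] < s["end"]]
# Data Duplicates: no repeated NCT IDs.
duplicate_ids = find_duplicates(s["nct_id"] for s in studies)
# Linkage Adherence: every patent must reference a known study.
known_ids = {s["nct_id"] for s in studies}
orphans = [p["patent_number"] for p in patents if p["nct_id"] not in known_ids]

print(bad_dates, duplicate_ids, orphans)
```

All three lists come back empty for the rows shown, which is consistent with the "Passed" status of these checks.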
Figure 1: The bar chart illustrates the variance ratio explained by each of the top 10 principal components (PCs). Each bar represents the proportion of the total variance in the data attributed to the corresponding principal component.
The PCA variance plot highlights the proportion of total variance captured by each of the top 10 principal components. This helps in understanding how much of the total variance in the data is captured by the initial components.
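The variance ratio is each component's variance divided by the total variance. The sketch below uses hypothetical per-component variances and assumes, for simplicity, that the listed components account for all of the variance in the data.

```python
from itertools import accumulate

# Hypothetical variances captured by each principal component.
variances = [12.0, 6.0, 3.0, 2.0, 1.0]

total = sum(variances)
ratios = [v / total for v in variances]   # bar heights in the plot
cumulative = list(accumulate(ratios))     # running share of total variance

for i, (ratio, cum) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: ratio={ratio:.3f} cumulative={cum:.3f}")
```

The cumulative column shows how quickly the leading components account for most of the variance, which is the question the plot is meant to answer.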
Figure 2: The dot plot showcases the expression levels (typically represented by dot color intensity) and prevalence (typically represented by dot size) of specific marker genes across different cell types.
Marker genes that are predominantly expressed in specific cell types validate the identified cell populations and help in characterizing and annotating them.
Figure 3: The dot plot showcases the expression levels (typically represented by dot color intensity) and prevalence (typically represented by dot size) of specific marker genes across different clusters.
This visualization aids in understanding the heterogeneity within the dataset and can hint at different cellular states or subtypes within a cell type.

Figure 4: A Sunburst plot illustrating the distribution of data. It reflects user-defined custom fields if specified; otherwise, it represents standard fields.
Figure 5: The UMAP visualization represents cells in a reduced dimensional space, with colors indicating various categorical attributes.
Figure 6: This UMAP visualization represents cells in a reduced dimensional space, with colors indicating the Polly curated fields.
Report Powered by Polly by Elucidata | © Copyright 2024 | Elucidata.io