Healthcare systems continually need new drugs to address unmet medical needs across diverse therapeutic areas. The pharmaceutical industry works to bring new drugs to market through the complex activities of drug discovery and development. Bringing a single drug to market takes $150M to $2.6B and 10 to 15 years. Let us take a quick look at the current drug discovery process before we talk about how it can be accelerated.
There are about 4 ways drugs are discovered:
One common element across these approaches is the collection and curation of large amounts of biomedical micro-data to test various hypotheses.
Fun fact: did you know that about 10,000 compounds are tested, and about 97.5% of them are eliminated in the preclinical phase, which generally takes around 4 to 6 years? By the time you get to the clinical trial stage, you are left with just 5 compounds.
Of the roughly 15 years it takes to develop a successful compound, about 6 years go to drug discovery and the preclinical phase, 6.7 years to clinical trials, and 2.2 years to the approval phase.
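To make the numbers above easier to compare, here is a minimal Python sketch that works through the arithmetic. It uses only the figures quoted in this post; reading them as a two-step funnel (a 97.5% preclinical cut, then further attrition down to 5 clinical candidates) is our assumption, and the variable names are illustrative.

```python
# Figures quoted in the post; the two-step funnel reading is an assumption.
compounds_tested = 10_000
preclinical_elimination_rate = 0.975
compounds_reaching_clinical = 5

# Compounds left after the 97.5% preclinical cut.
survivors_after_screening = round(
    compounds_tested * (1 - preclinical_elimination_rate)
)

# Share of the original compounds that reach clinical trials.
share_reaching_clinical = compounds_reaching_clinical / compounds_tested

# Years per phase for a successful compound, as quoted in the post.
phase_years = {
    "discovery and preclinical": 6.0,
    "clinical trials": 6.7,
    "approval": 2.2,
}
total_years = sum(phase_years.values())
phase_share = {phase: years / total_years for phase, years in phase_years.items()}

print(f"Survivors after preclinical screening: {survivors_after_screening}")
print(f"Share reaching clinical trials: {share_reaching_clinical:.2%}")
for phase, share in phase_share.items():
    print(f"{phase}: {share:.0%} of {total_years:.1f} years")
```

The three phases add up to 14.9 years, which matches the "about 15 years" figure, and discovery plus preclinical work accounts for roughly two fifths of that time.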
Various elements slow down the process of drug discovery, but some of the crucial ones are:
It is safe to say that data plays a massive role in getting any drug discovered, developed, and brought to market! Biomedical data is all the more critical because it is the basis of research and hypothesis building.
Using the FAIR (Findable, Accessible, Interoperable, and Reusable) principles to store, manage, and share data is a step toward being better prepared to expedite the discovery of safer, better drugs.
It is time to adopt tools and software that store, manage, and share data according to the FAIR guiding principles so that scientists can put it to better use.
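As a rough illustration of what checking a dataset against FAIR could look like in practice, here is a minimal Python sketch. The `DatasetRecord` fields and the `fair_gaps` helper are hypothetical names we made up for this example, and the mapping of fields to the four principles is a simplified checklist, not an official FAIR assessment.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str = ""        # persistent ID, e.g. an accession number
    title: str = ""             # human-readable description
    access_url: str = ""        # where the data can be retrieved
    data_format: str = ""       # file or exchange format
    ontology_terms: List[str] = field(default_factory=list)
    license: str = ""           # terms of reuse
    provenance: str = ""        # where the data came from


def fair_gaps(record: DatasetRecord) -> Dict[str, List[str]]:
    """Return, per FAIR principle, the fields that are still empty."""
    # Simplified mapping of fields to principles (an assumption).
    checks = {
        "Findable": [("identifier", record.identifier), ("title", record.title)],
        "Accessible": [("access_url", record.access_url)],
        "Interoperable": [
            ("data_format", record.data_format),
            ("ontology_terms", record.ontology_terms),
        ],
        "Reusable": [("license", record.license), ("provenance", record.provenance)],
    }
    gaps = {}
    for principle, fields in checks.items():
        missing = [name for name, value in fields if not value]
        if missing:
            gaps[principle] = missing
    return gaps


record = DatasetRecord(
    identifier="DS-0001",
    title="Example expression dataset",
    access_url="https://example.org/datasets/DS-0001",
    data_format="csv",
    ontology_terms=["liver"],
    provenance="public repository",
)
print(fair_gaps(record))
```

A record with no gaps returns an empty dictionary, so the same helper can be used to flag datasets that need more curation before they are shared.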
We are surrounded by data but starved for insights.
Anything that reduces the time and effort users spend on discovering, cleaning, re-formatting, and analyzing data, and so makes it easier and quicker for scientific discovery to move forward, can be called curation. Combine this with the FAIR guiding principles, and you get the most efficient data for analysis!
There are a bunch of benefits of using curated FAIR data:
Curation is a necessary step in making sense of data using ML, but it takes time and can become a bottleneck. Without a dedicated curation effort, researchers have to restrict themselves to the 1 or 2 datasets they can analyze, which turns research into a process driven by resource constraints: a huge red flag!
Moreover, when datasets are assembled from multiple sources, each may have been processed differently, which makes them hard to compare. Researchers then face a long and cumbersome process of re-processing the data into the required format and running it through their own custom pipelines before they can compare it effectively. This is not a quick activity, and it can quickly become a huge overload if the team does not have the right skill sets or a scalable way of doing it.
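To give a feel for the re-processing step described above, here is a minimal Python sketch that maps records from two differently formatted sources onto one common schema. The source names, field names, and conversions in `SOURCE_SPECS` are made up for illustration; real pipelines need far more validation.

```python
# Hypothetical per-source specs: target field -> (source field, converter).
SOURCE_SPECS = {
    "source_a": {
        "sample_id": ("sample", str),
        "tissue": ("tissue_type", str.lower),
        "age_years": ("age_months", lambda months: months / 12),
    },
    "source_b": {
        "sample_id": ("id", str),
        "tissue": ("organ", str.lower),
        "age_years": ("age", float),
    },
}


def harmonize(record, source):
    """Rename and convert one record's fields into the common schema."""
    spec = SOURCE_SPECS[source]
    return {
        target: convert(record[source_field])
        for target, (source_field, convert) in spec.items()
    }


# The same kind of sample, described differently by each source.
row_a = {"sample": 17, "tissue_type": "Liver", "age_months": 30}
row_b = {"id": "B-42", "organ": "LIVER", "age": "2.5"}

combined = [harmonize(row_a, "source_a"), harmonize(row_b, "source_b")]
for row in combined:
    print(row)
```

Keeping the mapping in a single table means that adding a new source is a matter of adding one more entry, which is the kind of scalable approach a team needs once the number of datasets grows.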
Also, metadata is often not accurate or thorough enough for the question the research team is trying to answer. Metadata needs to be updated regularly by adding new fields, ideally through automation. Updating metadata consistently is extremely important for keeping datasets easy to discover in context.
Thus, biomedical data needs to be curated and formatted, ideally in a single place, for research teams to harness its power.
If you are spending time scouring datasets just to find the relevant ones for downstream analysis, now is the time to reach out.
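As a small illustration of what automated metadata updates might look like, here is a minimal Python sketch that fills in missing fields from rules while leaving existing values untouched. The `add_missing_fields` helper, the field names, and the keyword rules are hypothetical and deliberately simplistic.

```python
def add_missing_fields(metadata, derivations):
    """Return a copy of metadata with missing or empty fields filled in.

    derivations maps a field name to a function that computes its value
    from the current metadata. Existing non-empty values are never overwritten.
    """
    updated = dict(metadata)
    for field_name, derive in derivations.items():
        if updated.get(field_name) in ("", None):
            updated[field_name] = derive(updated)
    return updated


def tissue_from_title(metadata):
    """Toy rule: guess the tissue from keywords in the title."""
    title = metadata.get("title", "").lower()
    for tissue in ("liver", "lung", "kidney"):
        if tissue in title:
            return tissue
    return "unknown"


# Hypothetical rules for two new fields.
derivations = {
    "tissue": tissue_from_title,
    "curation_status": lambda metadata: "auto-annotated",
}

dataset = {"title": "Liver biopsy expression profiles", "tissue": ""}
print(add_missing_fields(dataset, derivations))
```

Because the function returns a new dictionary and skips fields that already have a value, it can be re-run safely whenever a new field is added to the schema.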
Connect with us to learn more about how to accelerate your drug discovery process using curated data.