FAIR Data

DataFAIR 2022 at a Glance

Trisha Dhawan
November 10, 2022

Until a few years ago, Batman, Iron Man, and AI were synonymous with the sci-fi movie genre. Fast-forward to 2022: very few parts of the planet are untouched by technology. And not just technology, but advanced technology: cutting-edge, groundbreaking, you get the drift. AI has not only touched our lives in multiple ways but has also improved most industries by leaps and bounds.

When I think about how AI has transformed the life science industry, many recent groundbreaking advances come to mind, such as AI-assisted disease diagnosis, structure prediction for nearly all known proteins, and faster target identification for diseases. This is just scratching the surface of what integrating AI into life science research can offer. But as with most things, a little knowledge is a dangerous thing. Successful integration and adoption of AI/ML in life science research comes with its own set of challenges and pitfalls.

To shed light on these, we recently hosted our annual event, DataFAIR 2022, with an excellent lineup of experts from the machine learning and life sciences R&D spaces. The speakers shared the advantages that integrating AI into pipelines brings, the lessons learned, and the success stories that have come from the shift in technology adoption.

This edition of Polly Bytes is a glimpse of DataFAIR 2022, so sit back, relax and enjoy the read!

Fair is Foul and Foul is Fair: May Be True in Shakespeare’s Realm, but for Us (and Data), FAIR is Fair!

Excitement about what the marriage of technology and science can deliver is natural; however, we need to be mindful of how we proceed in this endeavor. Why do I say that? AI/ML has huge potential that is largely untapped, and our understanding of how ML models work is still limited. Our understanding of complex human biology is not very advanced yet either. Bringing two unknowns together without a plan is a sure recipe for disaster.

At Elucidata, we've been advocates of data-centric approaches to AI and of the urgent need to adopt a FAIR digital transformation. Joining the technological revolution to accelerate drug discovery is the obvious course of action, with promising outcomes and proven success stories. However, the devil is in the details, and that detail is the scrutiny and expertise required to test ML models before we can fully rely on them to make accurate predictions.

With the enormous volumes of experimental and clinical data being generated each year, identifying new targets for diseases and repurposing drugs should become easier. However, scouring large volumes of semi-structured data from multiple repositories becomes a major bottleneck because of the time required to find relevant, usable data. Further, the data may be in different file formats, unannotated, and unsuitable for analysis without substantial cleaning.

ML initiatives offer a reprieve from this tedious but necessary manual activity by making it possible to automate it. Using the FAIR guidelines and a data-centric approach can help create ML models that are better, faster, and more robust. DataFAIR 2022 centered on two main themes: Data-centric AI and FAIR Digital Transformation. Here are some nuggets from a few interesting talks.

FAIR Transformation: De-risking AI Initiatives in Biopharma

What does “de-risking AI initiatives” mean? Why is there a need to de-risk AI initiatives in biopharma R&D? With an increase in partnerships between pharma and AI companies, drug discovery is becoming AI-driven.


Increasing number of partnerships between pharma and AI companies.

In his talk, Dr. Abhishek Jha, co-founder of Elucidata, takes us on the journey of biomedical data: from paper to Excel sheets, electronic lab notebooks, and finally, the cloud.

As we transition into a post-cloud world where all your data is on the cloud, any AI initiative will be at risk unless the data is clean and interlinked.

He talks in detail about the goals and intent behind the earliest AI initiatives and how they were flawed, and then about the current models that are successfully being used to aid researchers and experts. He also explains how to de-risk AI initiatives in biopharma to enable faster, more accurate, and reproducible outcomes in biological drug discovery.

An AI initiative without a focus on the right problem or data is (at best) a loss of time and money and (at worst) harmful to the pharma community and society at large.

Watch his entire talk here.

Large vs Small Models: A David and Goliath Story

In 2022, we look around and find ourselves immersed in technology. It is hard to think of a single task that doesn’t involve it. We’ve watched sci-fi movies and read books that made life seem so easy to navigate. Our favorites, Batman and Iron Man, seemed to have an instant solution to any problem, assisted by AI!


But in reality, things are not as smooth. Our understanding of how models learn and how they make predictions is limited. In this talk, ML/NLP expert Dr. Ashutosh Modi and our BioNLP Director, Shashank Jatav, take a realistic look at the technology and explore how it can be made more optimized and well-rounded.

Which is better: a small model or a large one? How can we make models better? Watch this exciting talk for the experts' take on the large vs. small model conundrum, and much more, here.

Intrigued? Naturally! Don’t wait any longer to watch all the talks and discussions and start thinking about making the shift in your organization today! Watch the talks here!
