The Lone-wolf Scientist Building Data Platforms

Swetabh Pathak
February 8, 2024

"The road to hell is paved with good intentions."

Heard about the lonesome Biopharma warrior? The one who often goes unacknowledged. It’s the discovery scientist building an end-to-end data platform!

Like a lot of you, I have been trying to lose a few pounds for a while. Let me tell you, I have the whole shebang: a food weighing scale (unused for a year and a half), an actual weighing scale (buried under layers of dust), countless hours of YouTube videos watched (yes, I have a playlist; no, I won’t share it), and the best apps on my phone (on the first screen, no less). You get the drift…

But having all the tools hasn’t taken care of the basics: number of steps, mindful eating, and so on. Or maybe it has given me a false sense of moving forward while the core, immediate action items are regularly skipped. No surprise. But hey, YOLO!

Such is the tale of the discovery scientist trying to build a complex piece of software in-house. Building software is sexy. It seems like a lot of constructive work. Unfortunately, it is a path riddled with potholes.

The expertise to build such tools rarely exists with the discovery scientists themselves. They are usually hacky scripters who can get (almost) all analysis done. But building complex software is a different ball game. It’s almost the equivalent of wanting to build a car purely on the confidence that, well, I could clean my carburetor. My apologies for the terrible car analogy. I should also stick to tools I understand.

IT and other teams are rarely (if at all) in the loop or part of the same effort. They have their own priorities. Often that means keeping critical infrastructure running, or doing larger-scale implementations of common tools and policies. After all, building software isn’t their expertise either.

So mostly, this is done in isolation. Often a vendor is hired from outside, budgets permitting. If not, a few smart PhDs looking for part-time work. Worse, freelance developers.

Fun fact: my co-founder Abhishek did the same at Agios. Needless to say, that led to the genesis of Elucidata. As you can tell, this never goes well.

Not all such efforts are doomed or unnecessary. Point solutions can solve important, pesky problems effectively. They don’t need complex development life cycles. Once developed, they can work just fine for the few users they serve. There isn’t any need to scale. No need for CI/CD or all the other sacred chants of modern software engineering. The problem comes when we allow scope creep.

As a wise man said:

"Premature optimization is the root of all evil"

If point solutions are not treated as such, they can become mammoths wandering aimlessly in the corridors of early-stage research. Then they become the dreaded word - ‘Platform’! Whenever you hear the word platform - run for your life.

The discerning reader at this point would of course guess that Elucidata’s flagship tool Polly is... a platform!

Platforms are hard to build for a reason. They have many components that need to work together. Others need to be able to not just use them like a gearbox, but adjust the levers if need be. They are supposed to serve multiple use cases and grow rapidly. Not only are they hard to build, they are harder to maintain. It is harder still to get them adopted, because their user experience is not optimized for a narrow use case but is more general. Generalizability has its costs. Premature optimization is the root of all evil.

Scientists can spend years and millions building a platform while struggling for adoption, all without achieving immediate impact. Immediate impact in early discovery is often achieved through agility, not the robustness of the tool.

And then comes the dreaded ‘change in strategic direction’. Biopharma companies change direction fast. It could be a focus on newer indications. Or a focus on molecules more advanced in the pipeline. Or just tighter discovery spending in markets such as today’s. A complex tool that requires tons of maintenance and does not serve an immediate use case is often the first one sacrificed.

I won’t offer remedies here. One short blog is too little for that. What I’ll say is this: as someone who has built software for a decade now, I can tell you that building software is hard. Writing code is the easy part. It’s the ‘all else’ that can become a proverbial scope quicksand. Scientists working in discovery have the privilege of immediate access to users and pressing timelines. Focusing on immediate outcomes is crucial. After all, that is why agile is the dominant ideology of modern software development.

See you next time. Hopefully a few pounds lighter.

This post was originally published in Polly Bits, our biweekly newsletter on LinkedIn.
