I have been traveling around for the last 2 weeks - meeting customers, friends, and colleagues alike. The journey has taken me through new cities - like The Hague - and familiar ones - like Paris. Early-stage discovery in Biotech/BioPharma is going through a winter of a kind that hasn’t been seen for a while. The good news is that the recent interest in ML is spurring some activity. There is no way out, but through. So we shall remain hopeful like Fierce Biotech in 2022.
If you are losing hope, maybe you should watch movies from the list that is officially called: The best biotech movies to watch this winter. If the idea of this list appeals to you - I am at a loss for words. I wish you…
In my conversations, I came across the increasing role of data governance in R&D. More companies are hiring for data ops, data strategy, and data steward roles. These terms are often used interchangeably. For the purpose of this article, I’ll stick to ‘data stewards’.
The term ‘steward’ is an interesting choice. It is often used in the context of flight attendants or at events where one’s role is to take care of people.
Data Stewards, therefore, by extension are there to care for the data as an asset.
As data-as-an-asset, or a product, becomes a more common paradigm, we will see more of these roles.
Discovery is becoming more data-driven. It’s increasingly common for companies - both large and small - to use public and in-house data to identify targets, for example. Machine learning as a process iterates around the data. Models often get updated on the same corpus or variations of it.
In BioPharma research, data stewards are doing a few key things:
It can be a challenging role to fill. In other industries, Data Stewards have been called ‘purple people’ - and not because they are from another species. It’s because they are the amalgamation of two other, more prominent, colors. Purple in the RYB color model is a secondary color made by combining red and blue. Data Stewards work at the intersection of ‘business’ and ‘IT’. Hence they are the amalgamation of these two, often quite siloed, areas.
In our context, ‘business’ is research. Business users are those who prioritize the data needs. IT, on the other hand, enforces standards, security policies, etc., and makes it possible to execute the whole exercise.
I wonder what the best people for this role will look like. While doing research for this blog, I realized that we at Elucidata have never had a role formally called a ‘data steward’. Our product managers and technical leaders have been playing this role. A couple of years ago, we tried to fill this role but weren’t successful in finding the right person. We hired and fired and then sort of abandoned the project.
At that time, the analogy I was working with was that of the category manager in e-commerce. If you have never heard that term, a category manager is someone who decides what products should hit the shelves. For example, an e-commerce website may have one for all kitchenware. The role of the category manager is to understand what is working - both on the supply and demand side - and hence design the strategy to bring in more revenue.
Data stewards are often also tasked with ‘data strategy’. It is their responsibility to ensure that their customers’ (scientists) needs are met most effectively, at costs and timelines that are viable for the R&D unit. They need to understand their customers’ needs. In other words, they should have a basic understanding of how the data will be used. They should also look outwards - keeping an eye on what is out there.
The need for an in-house data steward is not as critical for small organizations, as the bulk of their data might be publicly sourced or come through collaborators. If there is a large public data play, they might want to bring in outside consultants. In larger organizations, the requirement is well-defined. The data steward can connect groups that can be of use to each other. That’s one reason why larger organizations have created this role first. It is often seen as an ‘Enterprise’ role, something that only larger organizations need. This role is also sometimes called ‘Data Operations’.
I believe that for organizations taking a very data-centric approach to discovery, this role is a must-have. That doesn’t mean that it is always a full-time role. One could go a long way by clearly assigning this responsibility to a scientist who is excited about data as an asset, especially if the organization has an extensive use case for ML.
Finally, it might be a great place for individuals who want to work closely with research but are not quite on the bench or in data science, and who at the same time have business acumen and the ability to align a complex organization.
Till the next time - with hopefully a warmer Biotech sun!
This post was originally published in Polly Bits - our biweekly newsletter on LinkedIn.