Say No to Data D(u)mps with DMP

This blog post is the first in a four-part series called ‘Current Trends in Open Data’. Here we talk about the importance of data management planning (DMP). Upcoming posts will focus on data repositories and the effects of the Covid-19 pandemic on open data.

‘Generate, process, analyze, publish’ is the part of the data lifecycle most familiar to every researcher. Publishing a research article in a journal of repute is the crowning glory of a scientist’s career. But with sights set so firmly on the goal of a publication, it is easy to forget that research doesn’t take place in a vacuum. Informing the scientific community about discoveries is important, but enabling other discoveries is equally important.

Why is DMP Necessary?

This is where data management planning (DMP) comes into the picture. Data form the backbone of research. High-throughput technologies enable vast amounts of data to be generated and are now as ubiquitous as older research methods. An effective data management plan details how data will be acquired, processed, organized, stored, preserved, and made available for reuse. Though additional steps have been appended to the data lifecycle, so far, the focus has only been on the path leading up to publication. Preserving the data so that they remain accessible and sharing them for reuse by other interested parties are areas that are still lagging. Proper DMP is essential to foster collaboration, increase transparency, and accelerate the process of scientific discovery.

Challenges

One reason preservation and sharing still lag is that many scientists are wary of sharing their data. In a survey conducted by Emory University among researchers across different disciplines (1), the top reasons for not consenting to share original data were:

· fear of misuse/misinterpretation

· presence of sensitive clinical data

· absence of a system to ensure that due credits/citations are made

Scientists who are willing to share their data but do not have data management plans face their own challenges (2). The scientists:

· don’t know where to begin or how to implement it.

· don’t know whom to approach for help.

· think they lack the required skill set to manage it.

· are resistant to changes in the status quo.

Advances in Implementation

On the bright side, the general attitude towards data management is changing slowly but surely. FAIR data principles (3), which provide guidelines for making data findable, accessible, interoperable, and reusable, are being increasingly implemented. Librarians at academic research institutions and universities are beginning to serve as data management consultants and help ease scientists into the process of proper data management. Many prominent funding agencies require a robust data management plan to be put together and submitted along with grant applications (4), impressing upon the scientific community that their data will still have a life beyond a couple of publications. Specialized data management courses are also on offer for principal investigators and scientific leads looking to educate themselves in this area. It’s up to scientists to step up and give DMP the importance that it deserves. The ultimate victory, though, will be when data management principles are so deeply ingrained in research culture that ‘generate, process, analyze, publish, preserve, share and reuse’ becomes the new mantra.

References:

1. Disciplinary differences in faculty research data management practices and perspectives. http://www.ijdc.net/article/view/8.2.5/332.
2. Surkis, A. & Read, K. Research data management. doi:10.3163/1536-5050.103.3.011.
3. Wilkinson, M. D. et al. Comment: The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 1–9 (2016).
4. Michener, W. K. Ten Simple Rules for Creating a Good Data Management Plan. (2015) doi:10.1371/journal.pcbi.1004525.