Research Data Management
Science is based on the collection and analysis of data: detailed, recorded information on a specific topic. Because "data" can include anything from GIS points to audio recordings, it's important to have a plan for how you are going to collect, store, and preserve that data in the long term. Research Data Management is the plan and process for handling your data, from collection and storage through long-term preservation.
In science, reproducibility is another scientist's ability to replicate the experiment that you conducted.
In 2016, Nature published the results of a survey that asked 1,500 scientists whether they thought there was a crisis in reproducibility. 52% of participants said there was a significant crisis and 38% said there was a slight crisis.
Research data is a big part of this problem. At one time, scientists didn't let other researchers look at, much less manipulate, the data that they gathered. But this trend is changing!
If you apply for or receive a grant, especially a large grant, chances are you will have to explain how you plan to handle your data. Don't get caught unaware!
Data Management Plans
Data Management Plans (DMPs) are written documents that describe how data will be handled throughout the research process. They cover everything from collection and documentation to analysis, access, and preservation. You may have thought about all of these topics at some point, but writing down every step in data curation often reveals steps that have been overlooked.
It is important to note that your DMP is a living document, meaning that it will (probably) change over the course of your research. However, it is better to get the outline down before you start, so that some of the problems you might otherwise run into during your research are resolved in advance.
Data Management Plan Examples
Different funding organizations have different requirements for the DMPs that they expect. Be sure to write a DMP according to your funding agency's guidelines. If you are being funded through a federal organization, the Scholarly Publishing and Academic Resources Coalition (SPARC) has created a list of federal funding organizations and their requirements, which you can find on their web page.
Below are DMP examples from other organizations:
DMPTool is an online service that lets you write, share, and publish data management plans. It provides a wide range of templates that match what various funding agencies require in grant applications. DMPTool is free, and while you are at UMD you can sign in with your university credentials. DMPTool also allows you to email our data librarian directly when you are logged in.
One of the key components of a Data Management Plan is the preservation of, and long-term access to, the data you create. When you're creating data and writing your plan, try to use file formats that promote long-term use of your data (such as CSV), and always keep a copy of the raw data you create (before any analysis or cleaning). Where can you keep your data so that other scientists can find it? Find out below!
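The advice above (keep an untouched raw copy, and save working data in an open format such as CSV) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names, the folder layout, and the idea of storing a checksum next to the raw copy are hypothetical choices made for this example, not requirements of DRUM, DMPTool, or any funder.

```python
import csv
import hashlib
import shutil
import stat
from pathlib import Path


def archive_raw_copy(source: Path, raw_dir: Path) -> Path:
    """Copy an untouched data file into raw_dir, record its SHA-256
    checksum next to it, and mark the copy read-only.

    Hypothetical helper: the names and layout are assumptions.
    """
    raw_dir.mkdir(parents=True, exist_ok=True)
    raw_copy = raw_dir / source.name
    if raw_copy.exists():
        # Refuse to overwrite: the raw copy should never change.
        raise FileExistsError(f"raw copy already exists: {raw_copy}")
    shutil.copy2(source, raw_copy)
    digest = hashlib.sha256(raw_copy.read_bytes()).hexdigest()
    checksum_file = raw_dir / (source.name + ".sha256")
    checksum_file.write_text(f"{digest}  {source.name}\n", encoding="utf-8")
    # Read-only permissions guard against accidental edits.
    raw_copy.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    return raw_copy


def write_clean_csv(rows, fieldnames, destination: Path) -> None:
    """Write cleaned records (a list of dicts) to a UTF-8 CSV file."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    with destination.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

For example, you might call archive_raw_copy(Path("survey.csv"), Path("data/raw")) once, right after collection, and write every cleaned version to a separate folder with write_clean_csv. The checksum lets you confirm later that the raw file has not changed.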
The Digital Repository at the University of Maryland (DRUM) collects, preserves, and provides public access to the scholarly output of the university. Faculty and researchers can upload research products for rapid dissemination, global visibility and impact, and long-term preservation. This includes data sets, which should be deposited into DRUM separately from other research products (such as papers) so that other researchers can find them. Don't forget to cite your data in your research paper. That way, readers of the paper can easily locate the data set behind it.
Other Data Repositories
Finding the right data repository can be crucial to your data's visibility. Adding your data to DRUM is good. Also depositing your data in another repository is better. Need help finding the right repository? The resources below will help you find the right place for your data.
FAIR Data Principles
In an effort to ensure that research data is Findable, Accessible, Interoperable, and Reusable, Wilkinson et al. published a set of policies and guidelines that encouraged data repositories to standardize their data entries. Since then, many publishers, repositories, and funding agencies have adopted these principles. If you need help finding a FAIR data certified repository, contact your librarian, or use the DataCite Repository Finder.
For a complete breakdown of the guidelines, check out FORCE11's FAIR Data Principles page.
As with every research product you use, data sets must be cited appropriately, even if you created them! With a little guidance, it's not too hard to write up a proper citation. The International Association for Social Science Information Services & Technology (IASSIST) suggests that the following information be included in every data citation:
For specific citation examples, check out the University Libraries' Data Citation page.
What is Citizen Science?
Citizen Science is the crowdsourcing of data collection for scientific purposes, often through phone apps and the internet. Many research institutions have used it with great success, and it is often a fun activity that anyone, children included, can enjoy with friends. Many of the most successful citizen science projects partner with research libraries. For example, the Macaulay Library at Cornell University fully supports the Cornell Lab of Ornithology, which runs the eBird app. The STEM librarians at the University of Maryland can help you set up your next citizen science project!
Citizen Science apps are great tools for understanding large-scale data collection projects. They are often simplified forms that create spreadsheet entries. In this box, we are going to investigate some of the best data-collecting apps, which will give us a better idea of how to gather data in the field.
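To make the "simplified form that creates spreadsheet entries" idea concrete, here is a minimal Python sketch in which each form submission becomes one row in a CSV file. Everything in it is an assumption for illustration: the Observation fields and the append_observation function are hypothetical and do not describe how eBird or any other real app stores its data.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class Observation:
    """One form submission. The fields are hypothetical examples;
    a real project would define its own."""
    observer: str
    species: str
    count: int
    latitude: float
    longitude: float
    observed_at: str  # ISO 8601 timestamp, e.g. "2024-05-01T08:30:00"


def append_observation(entry: Observation, sheet: Path) -> None:
    """Append one submission as a single spreadsheet row,
    writing the header row first if the file is new or empty."""
    is_new = not sheet.exists() or sheet.stat().st_size == 0
    column_names = [f.name for f in fields(Observation)]
    with sheet.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=column_names)
        if is_new:
            writer.writeheader()
        writer.writerow(asdict(entry))
```

Because every submission has the same columns, thousands of volunteers can contribute rows that open directly in a spreadsheet or analysis tool without reformatting.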
The University of Maryland offers Data Science Workshops for free to UMD researchers. Looking to improve your Python, R, or Data Visualization skills? Register for a workshop and get connected!