This page explores different ways to think about data through an equity lens. Research data is often regulated, and many research projects must be submitted to an Institutional Review Board (IRB) for approval. However, an IRB is not always required, and even when it is, an IRB doesn't always investigate research methods from an equity standpoint. It's up to the researcher to go above and beyond to ensure that their research, whatever the topic, is equitable to all people in their community.
This looks vastly different depending on whether you are working in engineering, medicine, ecology, English literature, dance, or history. This guide is designed to get you thinking about some of the equity concerns that might affect your research data. If you need additional help on your specific topic, don't be afraid to reach out to your Subject Librarian!
Note that the information below is also available as a workshop. Please contact me, Jodi Coalter, to arrange a time. You can also view the workshop slides in Google Drive linked below.
Research Data Management (or RDM) can be defined as the process of documenting, organizing, and maintaining the processes used in the information/data lifecycle.
But what counts as "research" data? Basically, we describe research data as the data you are collecting that are used to reach your conclusions and “prove that you are right.” Research data may be experimental data, observational data, operational data, third party data, public sector data, monitoring data, processed data, or repurposed data.
Note that the collection, use, and reuse of data often follows a common cycle, called the Research Data Management Life Cycle. At each stage, there is an opportunity to incorporate more EDI into your research. Click on the tabs in this box to learn more!
The very first step to incorporating equity into your data is to acknowledge that data are not objective or neutral. From collection to use and reuse, every step is filtered through each researcher's preconceived ideas and notions about what should be counted and how it should be counted. Viewing your data management plan through an equity lens will help mitigate unconscious bias and avoid unintended harm to historically marginalized communities.
In other words, how you collect data, who collects the data, how the data is stored, and who can access the data are all human-driven decisions! That means that data themselves cannot be neutral.
There are a few key points where equity overlaps with RDM:
You don't have to look very far to discover an example of research data management gone wrong. Below are a few examples of why considering equity in data is important from the very beginning.
1. Facial Recognition
Facial recognition software grew out of computer science. Unfortunately, computer science is almost 87% white male, which means that when it came to training facial recognition software, developers didn't use a diverse set of faces. This led to facial recognition software having trouble distinguishing between Black faces. Find out more. (A short sketch of how you might check a training set's demographic balance follows these examples.)
2. Medical Research
Medicine has a long and highly problematic history, and the research devised to study medicine is perhaps the most tarnished. Modern-day medicine is no less troubled, as COVID-19 has clearly demonstrated. Medical research recognized the health disparities of COVID-19, but the data about who was dying failed to affect the vaccination program.
3. The Library of Missing Datasets
Who and what doesn't get counted is just as troublesome as who and what does get counted. The Library of Missing Datasets is an art project by MIMI ỌNỤỌHA that demonstrates how many datasets should exist but don't.
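Returning to the facial recognition example: below is a minimal, hypothetical sketch of how a researcher might check the demographic balance of a dataset's metadata before using it. The column names, categories, and counts are placeholders rather than data from any real project; the point is simply that asking "who is represented, and in what proportion?" can be a routine part of data management.

```python
# A minimal sketch of auditing demographic balance in a dataset's metadata.
# The columns ("skin_tone", "gender") and the toy values are hypothetical
# placeholders; substitute whatever demographic metadata your project records.
import pandas as pd


def demographic_breakdown(df: pd.DataFrame, column: str) -> pd.Series:
    """Return the proportion of records in each category of `column`."""
    return df[column].value_counts(normalize=True).sort_values(ascending=False)


# Toy metadata standing in for image-level records in a training set.
metadata = pd.DataFrame(
    {
        "image_id": range(8),
        "skin_tone": ["light"] * 6 + ["dark"] * 2,
        "gender": ["male"] * 5 + ["female"] * 3,
    }
)

for col in ("skin_tone", "gender"):
    print(f"Share of records by {col}:")
    print(demographic_breakdown(metadata, col))
    print()
```

Documenting a simple check like this in your data management plan makes gaps in representation visible before the data are used to draw conclusions or train a model.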
In summary, none of the examples of “when things go bad” are necessarily the result of poor data management. If anything, proper data management perpetuates a lot of the inequities presented here. However, it is important to note that data management, sharing, and access procedures can reinforce and defend other poor decisions. For example, if an instance of misuse or inequity arises but it isn’t against policy, procedure, or the documentation, then who is to say that it is an issue? Who is responsible for fixing it?
It’s easy to understand how inequality is perpetuated in large systems and big data, but what about small-scale datasets? What happens to your data after you graduate or move to a different institution? Who has it, and who gets a say in how it’s used?
Many research projects require the approval of an IRB before any type of experimentation can begin. IRB review is intended to stop the project from causing harm, both physical and mental, to research subjects. But it often doesn't go far enough where marginalized communities are concerned.
For our example, let's look at one research study that investigated performance-related musculoskeletal disorders (PRMDs) in musicians. This particular study required IRB approval because it directly applied to human health. An IRB would probably investigate the ethical implications of the actual research procedure and ensure that the procedure itself didn't impact the participants' mental wellbeing.
But what isn't covered in this IRB?
This study might not have taken into account that many research participants would be drawn from an academic community of musicians. For this group, the cultural and professional context of a disability (e.g., an injury affecting musical performance) means that disclosure may have unintended consequences for an individual’s career. Professional musical networks are dense and pedagogical lineage is important; personal details about institution, expertise, etc. may give the individual’s identity away. There are a finite number of professors on the tenure track. In this scenario, the onus and responsibility is placed on the person with the disability instead of the system.
There are other considerations, too:
There are a plethora of projects that simply do not require an IRB, or qualify for expedited IRB review because they interact with humans only glancingly. In engineering, ecology, computer science, and data science, many research projects simply don't have to delve too deeply into ethics.
But going above and beyond ensures that there are no accidental casualties of your research.