This article introduces the concept of the qualitative data package to describe the necessary components of a complete qualitative dataset that supports long-term preservation, access, and reuse.
Introduction
Qualitative research is not exempt from the growing demand for data management and sharing from funding agencies, journals, institutions, and scholarly communities. Data management and sharing policies are meant to promote research transparency and to facilitate data reuse by requiring researchers to make their data accessible to others. To achieve the goals of these policies, the data must be well-documented and organized so that they can be accessed, understood, and reused.
This guide introduces the concept of the qualitative "data package" as a framework for understanding and implementing best practices for assembling, documenting, and sharing qualitative data.
What is a Data Package for Qualitative Research?
In most cases, interpreting and reusing qualitative data requires more than access to the dataset file. A data package is a collection of files alongside the data that contain the information necessary to interpret the data and the context in which the data were collected/generated and analyzed. At a minimum, a qualitative data package should contain the following materials:
- The data or evidence underlying reported results (e.g., records, transcripts, qualitative analysis software project file)
- Documentation capturing data collection and analysis processes (e.g., methodology brief, data collection protocols, data audit trail, coding schemas)
- Explanation of data structures and files (e.g., case or document groups)
- File descriptions that explain file relationships and other relevant information about the content and purpose of files
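The file descriptions in the last item are often kept in a plain-text README. As a minimal sketch, assuming you maintain a simple mapping of file names to descriptions, the Python below builds README text from that mapping and lists any files in the package folder that have not yet been described. The function names, the `README.txt` file name, and the example file names are hypothetical and used only for illustration.

```python
from pathlib import Path


def build_readme(title: str, files: dict[str, str]) -> str:
    """Return plain-text README content listing each file with its description."""
    lines = [title, "=" * len(title), "", "Files in this package:", ""]
    for name in sorted(files):
        lines.append(f"- {name}: {files[name]}")
    return "\n".join(lines) + "\n"


def undocumented_files(package_dir: Path, files: dict[str, str]) -> list[str]:
    """List files found under package_dir that have no description in `files`."""
    present = {
        p.relative_to(package_dir).as_posix()
        for p in package_dir.rglob("*")
        if p.is_file()
    }
    # The README itself (hypothetical name) does not need an entry.
    return sorted(present - set(files) - {"README.txt"})


if __name__ == "__main__":
    # Hypothetical file names and descriptions for an interview study.
    descriptions = {
        "transcripts/interview_01.txt": "De-identified transcript, participant 01",
        "codebook.txt": "Code definitions with examples; applies to all transcripts",
    }
    print(build_readme("Interview Study Data Package", descriptions))
```

Running `undocumented_files` against the package folder before deposit is one way to catch files that were added late and never described.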
Qualitative Data Package Examples
Qualitative data packages and their contents will vary depending on the research method and techniques used to collect and analyze the data. The table below provides examples of qualitative data packages for different types of research projects to illustrate the types of materials that may be included in the package to facilitate data interpretation and use.
Qualitative data package examples
| Archival Research | Qualitative Interview Study | Mixed Methods Research (Archival + Survey) |
| --- | --- | --- |
| Textual data (.txt, .qdpx); Images (.tif) | De-identified transcripts and memos (.txt, .qdpx) | Textual data (.txt, .qdpx); Numeric data (.csv) |
| Annotations (.txt, .pdf/a) | Codebook with definitions and examples (.txt) | Annotations (.txt, .pdf/a); Numeric data codebook (.pdf/a) |
| Artifact collection strategy (.pdf/a) | Schedule of questions (.pdf/a) | Artifact collection strategy (.pdf/a); Survey instrument (.pdf/a) |
| Data use agreements (.pdf/a) | Informed consent form (.pdf/a) | Data use agreements (.pdf/a); Informed consent form (.pdf/a) |
| Methodology brief (.pdf/a) | Data collection and coding process brief (.pdf/a) | Methodology brief (.pdf/a) |
| README file (.txt) | README file (.txt) | README file (.txt) |
| Data license with terms of use | Data use agreement for non-de-identified data (.pdf/a) | Data license with terms of use |
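The file formats in the examples above can also serve as a quick checklist before deposit. The sketch below, offered as one possible approach rather than a required step, flags file names whose extensions are not among those listed in the table. The function name and the extension set are assumptions for illustration. Note that a `.pdf` extension does not confirm that a file conforms to PDF/A; that would need a separate validation tool.

```python
from pathlib import PurePath

# Extensions drawn from the example table; adjust for your own project.
# ".pdf" stands in for PDF/A, which cannot be verified from the name alone.
EXPECTED_EXTENSIONS = {".txt", ".qdpx", ".tif", ".csv", ".pdf"}


def flag_unexpected_formats(filenames: list[str]) -> list[str]:
    """Return the file names whose extension is not in EXPECTED_EXTENSIONS."""
    return [
        name
        for name in filenames
        if PurePath(name).suffix.lower() not in EXPECTED_EXTENSIONS
    ]


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    candidates = ["transcript_01.txt", "field_notes.docx", "survey.csv"]
    print(flag_unexpected_formats(candidates))
```

A flagged file is not necessarily a problem; it is a prompt to decide whether to convert it to a more preservation-friendly format or to explain the format in the README.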
Documentation for Qualitative Research Transparency
For some qualitative researchers, deciding how much documentation is too much or too little can be challenging, especially considering the amount of time and effort it can take to assemble this information. Documentation is nonetheless essential for sharing qualitative data: without sufficient information, your data may be misinterpreted or unusable. When deciding what documentation to include in the qualitative data package, a useful question to ask is: What would someone outside of my research team need to know to understand and use my data appropriately?
Some examples of documentation that provide the information needed to understand the data and their context include:
- Guidance for fieldwork or collection development
- Schedule of questions and instructions for interviews and focus groups
- Informed consent forms and approved IRB application
- Data citations and attribution if data were obtained from a data producer
- Licenses, terms of use, or permissions from data holders
- Description of analysis method
- Description of fieldwork sites and context
- Description of how derived variables or files were created
- Coding or annotation schemas/codebook
- Researcher’s positionality statement
Considerations for Sharing a Qualitative Data Package
Data sharing is not binary (open or closed); there are many ways to make data available (e.g., restricted access, dark archive) and good reasons why you may not be able to make data completely open to the public. It is your responsibility to determine whether the data raise any ethical, legal, or privacy concerns and, ultimately, whether it is appropriate to share the data and in what ways.
As you assess your data for sharing, a few considerations include:
- Do my data contain personally identifiable information (PII) and/or personal health information (PHI)? Is there any confidential or sensitive information in my data?
- How likely is someone to be identified in my data? Looking across the variables, and at combinations of variables about a participant, how easily could someone guess who this participant is?
- Do my data fall under copyright or a data license from the original data producers? If I signed a research data use agreement, are there restrictions on sharing?
- Do my informed consent process and forms describe data sharing?
- Are there any restrictions on data use (e.g., only for academic research, big data cannot be moved) that a future user needs to be aware of?
If these questions raise any red flags for you, you might need to consider alternatives to openly sharing your data. There are many options, such as de-identification, redaction, embargoes, restricted access, terms of use, and data confidentiality agreements. For more information on ways to share data, please consult our guidance on Data Access Restrictions and Sensitive Data.
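The question above about combinations of variables can be explored with a simple count. The sketch below is a rough screening aid under stated assumptions, not a full disclosure risk assessment: it assumes you keep a table of participant attributes (the field names and values shown are hypothetical) and reports attribute combinations shared by fewer than a chosen number of participants. It does not detect indirect identifiers that appear in the free text of transcripts.

```python
from collections import Counter


def rare_combinations(
    records: list[dict[str, str]], keys: list[str], threshold: int = 2
) -> dict[tuple[str, ...], int]:
    """Return attribute combinations shared by fewer than `threshold` records."""
    counts = Counter(tuple(r.get(k, "") for k in keys) for r in records)
    return {combo: n for combo, n in counts.items() if n < threshold}


if __name__ == "__main__":
    # Hypothetical participant attributes for illustration only.
    participants = [
        {"role": "nurse", "site": "A"},
        {"role": "nurse", "site": "A"},
        {"role": "director", "site": "A"},
    ]
    # Combinations held by a single participant may make that person easier to identify.
    print(rare_combinations(participants, ["role", "site"]))
```

Combinations that appear only once are candidates for generalizing (for example, replacing a specific job title with a broader category), redacting, or sharing under restricted access.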
Conclusion
Constructing a data package for qualitative research requires planning as well as thinking through any ethical, legal, and/or privacy concerns with regard to data sharing. Along with the information provided above, we've pulled together a few additional resources to help you navigate packaging and sharing your qualitative research.
Resources
The Data Curation Network (2023) has developed primers focused on preserving specific data types (e.g., Oral History, Twitter, ATLAS.ti) that offer considerations for documentation, file formats, and related topics. The Qualitative Data Repository (QDR) at Syracuse University provides valuable guidance on managing, preparing, and sharing data. UNC is a member of QDR, which gives UNC researchers access to QDR curation services. The Qualitative Data Sharing Toolkit provides guidance on planning and preparing qualitative data for sharing.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
RDM Guidance formatting was influenced by The Writing Center, University of North Carolina at Chapel Hill Tips & Tools handouts.