This article offers recommendations for formatting qualitative data that align with best practices and standards.
Introduction
File formats matter. The file format used for qualitative data affects both the current and future usability and accessibility of the data. A file format defines the structure of the information within a file so that machines and humans can interpret it. Some formats are better suited for long-term use than others. Have you ever tried to open a file that was created 10 years ago? If you have, you probably had trouble getting the file to render properly without the particular version of the software originally used to create it.
Best practices for data management and sharing recommend file formats that support long-term preservation, access, and use. These recommendations rest on sustainability factors that gauge how likely it is that information stored in a given file format will remain usable as technology changes over time. This guide focuses on preservation file formats for qualitative data.
Recommended File Formats for Qualitative Data
If you use popular computer-assisted qualitative data analysis software (CAQDAS) such as NVivo, ATLAS.ti, MAXQDA, or Dedoose, your projects can likely be exported in the REFI-QDA standard, which supports interoperability across qualitative data analysis applications. This means that users can transfer data projects between software programs that comply with the REFI-QDA standard.
Beyond CAQDAS file formats, qualitative data often include other file types, such as text, image, audio, and video files. The table below lists primary and secondary recommendations for qualitative data file formats. Primary recommendations support long-term use, while secondary recommendations are acceptable for the near term but may require migration to another format for long-term use.
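As a minimal sketch of what such an exchange file looks like, the snippet below lists the contents of a REFI-QDA project export. It assumes that a .qdpx file is a ZIP container holding a top-level .qde project description alongside the source files, which is how the format is commonly described; the function names `list_qdpx_contents` and `has_project_file` are hypothetical and not part of any CAQDAS tool.

```python
import zipfile
from pathlib import Path


def list_qdpx_contents(qdpx_path):
    """Return the sorted member names stored inside a .qdpx archive.

    Assumes the export is a ZIP container; raises ValueError otherwise.
    """
    path = Path(qdpx_path)
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a ZIP container")
    with zipfile.ZipFile(path) as archive:
        return sorted(archive.namelist())


def has_project_file(member_names):
    """Check for a top-level .qde project description (assumed convention)."""
    return any(
        "/" not in name and name.lower().endswith(".qde")
        for name in member_names
    )
```

Running `has_project_file(list_qdpx_contents("my_project.qdpx"))` on an export (the file name here is only an example) gives a quick check that the project description is present before the file is deposited or shared.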
Recommended file formats for qualitative data
| Qualitative Data Type | Primary File Format Recommendation | Secondary File Format Recommendation |
| --- | --- | --- |
| Word processing | .pdf/a; .pdf | .rdt |
| Text | .txt | |
| Structured text (markup) | .xml | .xhtml or .html; .dtd; .tex (LaTeX) |
| Image | .tif; .jp2 (JPEG2000); .png; .svg (Scalable Vector Graphics) | .gif |
| Video | .avi; .mov; .mpg or .mpeg (MPEG-2) | .mp4 |
| Audio | .bwf; .wav | .mp3; .ogg |
| Social media | .csv; .txt; .json (open, non-proprietary format); .WARC (Web Archive) | .ARC (Archive) |
| Computer-assisted qualitative data analysis software (CAQDAS) | .qdpx (REFI-QDA project); .qdc (REFI-QDA codebook) | Export as proprietary format along with export as common data format (.rtf, .txt) |
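The table can also be applied programmatically when reviewing a folder of project files. The following is a minimal sketch, not an official tool: the extension sets are transcribed from the table as written, and the names `PRIMARY`, `SECONDARY`, and `classify_format` are hypothetical. It checks only the file extension, so it cannot tell a PDF/A file from an ordinary PDF or confirm that a file's contents match its extension.

```python
from pathlib import Path

# Extensions transcribed from the table, lowercased for matching.
PRIMARY = {
    ".pdf", ".txt", ".xml", ".tif", ".jp2", ".png", ".svg",
    ".avi", ".mov", ".mpg", ".mpeg", ".bwf", ".wav",
    ".csv", ".json", ".warc", ".qdpx", ".qdc",
}
SECONDARY = {
    ".rdt",  # kept exactly as listed in the table
    ".xhtml", ".html", ".dtd", ".tex", ".gif",
    ".mp4", ".mp3", ".ogg", ".arc", ".rtf",
}


def classify_format(filename):
    """Return 'primary', 'secondary', or 'unlisted' for a file name."""
    suffix = Path(filename).suffix.lower()
    if suffix in PRIMARY:
        return "primary"
    if suffix in SECONDARY:
        return "secondary"
    return "unlisted"


if __name__ == "__main__":
    for name in ["interview01.WAV", "clip.mp4", "notes.docx"]:
        print(name, classify_format(name))
```

Files reported as "secondary" or "unlisted" are candidates for conversion or migration to a primary format before long-term storage.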
References
Library of Congress. (2023). Library of Congress Recommended Formats Statement (Table of Contents). https://www.loc.gov/preservation/resources/rfs/TOC.html
Qualitative Data Repository. (2023). Formatting Data. https://qdr.syr.edu/guidance/managing/formatting-data
Smithsonian Institution Archives. (2023). Recommended Preservation Formats for Electronic Records. https://siarchives.si.edu/what-we-do/digital-curation/recommended-preservation-formats-electronic-records
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
RDM Guidance formatting was influenced by The Writing Center, University of North Carolina at Chapel Hill Tips & Tools handouts.