RDMC Data Management Guide: Recommended Preservation File Formats for Qualitative Data

Summary

This article offers recommendations for qualitative data file formats that align with data management and sharing standards and best practices.

Body

Introduction

File formats matter. The file format used for qualitative data can affect both the current and future usability and accessibility of the data. A file format defines the structure of the information within a file so that it can be understood by machines and humans. Some formats are better suited for long-term use than others. Have you ever tried to open a file that was created 10 years ago? If you have, you probably had trouble getting the file to render properly without access to the particular version of the software originally used to create it.
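As a small illustration of how a format's structure lets machines recognize a file, many formats begin with a fixed byte signature. The sketch below is only an example, not part of the RDMC guidance: the function name `sniff_format` is hypothetical, and the signature table covers just a few widely documented signatures (PDF, PNG, TIFF, GIF).

```python
# Well-known leading byte signatures for a few formats named in this guide.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"II*\x00": "TIFF (little-endian)",
    b"MM\x00*": "TIFF (big-endian)",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def sniff_format(path):
    """Return a format name based on the file's first bytes, or None."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    for signature, name in SIGNATURES.items():
        if header.startswith(signature):
            return name
    return None  # Unknown signature; the extension alone is not checked.
```

A check like this only confirms that a file starts the way its format says it should. It does not validate the rest of the file or tell PDF/A apart from ordinary PDF.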

Best practices for data management and sharing recommend using file formats that support long-term preservation, access, and use. These recommendations are based on sustainability factors that take into account how likely it is that information stored in a given file format will be preserved as technology changes over time. This guide focuses on preservation file formats for qualitative data.

Recommended File Formats for Qualitative Data

If you use popular computer-assisted qualitative data analysis software (CAQDAS) such as NVivo, ATLAS.ti, MAXQDA, or Dedoose, your qualitative data files likely use the REFI-QDA standard, which supports interoperability across qualitative data analysis applications. This means that you can transfer data projects between software programs that comply with the REFI-QDA standard.
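To see what a REFI-QDA project export contains before depositing or transferring it, you can inspect the file directly. The sketch below assumes, as the REFI-QDA project exchange specification describes, that a .qdpx file is a ZIP container holding an XML project file with the .qde extension alongside the source documents. The function names are hypothetical and are not part of any CAQDAS product.

```python
import zipfile


def list_qdpx_contents(qdpx_path):
    """Return the names of the files stored inside a .qdpx archive."""
    with zipfile.ZipFile(qdpx_path) as archive:
        return archive.namelist()


def find_project_files(qdpx_path):
    """Return the archive members ending in .qde (the project XML)."""
    return [
        name
        for name in list_qdpx_contents(qdpx_path)
        if name.lower().endswith(".qde")
    ]
```

Listing the contents this way is a quick check that the export includes both the project description and the source files you expect, without opening the project in the original software.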

Beyond CAQDAS file formats, qualitative data often include other file types, such as text, image, audio, and video files. The table below lists primary and secondary recommendations for qualitative data file formats. Primary recommendations support long-term use. Secondary recommendations are acceptable for the near term but may require migration to another format for long-term use.

Recommended file formats for qualitative data

Columns: Qualitative Data Type; Primary File Format Recommendation; Secondary File Format Recommendation. For each data type, formats are listed in column order, with primary recommendations first and secondary recommendations after.

Word processing: .pdf/a; .pdf; .rdt

Text: .txt

Structured text (markup): .xml; .xhtml or .html; .dtd; .tex (LaTeX)

Image: .tif; .jp2 (JPEG 2000); .png; .svg (Scalable Vector Graphics); .gif

Video: .avi; .mov; .mpg or .mpeg (MPEG-2); .mp4

Audio: .bwf; .wav; .mp3; .ogg

Social media: .csv; .txt; .json (open, non-proprietary format); .WARC (Web Archive); .ARC (Archive)

Computer-assisted qualitative data analysis software (CAQDAS): .qdpx (REFI-QDA project); .qdc (REFI-QDA codebook); export as proprietary format along with export as common data format (.rtf, .txt)
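One way to apply the table to an existing project folder is to inventory the files by extension and flag any whose extension does not appear in the table at all. The following is a minimal sketch under stated assumptions: the extension set is transcribed from the table above without distinguishing primary from secondary recommendations, PDF/A is represented by the .pdf extension, and the function names are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Extensions as listed in the table above (primary or secondary).
# PDF/A files use the .pdf extension, so .pdf covers both entries.
LISTED_EXTENSIONS = {
    ".pdf", ".rdt", ".rtf", ".txt",
    ".xml", ".xhtml", ".html", ".dtd", ".tex",
    ".tif", ".jp2", ".png", ".svg", ".gif",
    ".avi", ".mov", ".mpg", ".mpeg", ".mp4",
    ".bwf", ".wav", ".mp3", ".ogg",
    ".csv", ".json", ".warc", ".arc",
    ".qdpx", ".qdc",
}


def extension_counts(folder):
    """Count the files under `folder` by lowercase extension."""
    return Counter(
        path.suffix.lower()
        for path in Path(folder).rglob("*")
        if path.is_file()
    )


def unlisted_files(folder):
    """Return files under `folder` whose extension is not in the table."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() not in LISTED_EXTENSIONS
    )
```

Files flagged by `unlisted_files` are candidates for export or migration to one of the listed formats. An extension match does not confirm that a file is valid or that a .pdf is actually PDF/A, so treat the result as a starting point for review.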

References

Library of Congress. (2023). Library of Congress Recommended Formats Statement (Table of Contents). https://www.loc.gov/preservation/resources/rfs/TOC.html

Qualitative Data Repository. (2023). Formatting Data. https://qdr.syr.edu/guidance/managing/formatting-data

Smithsonian Institution Archives. (2023). Recommended Preservation Formats for Electronic Records. https://siarchives.si.edu/what-we-do/digital-curation/recommended-preservation-formats-electronic-records

 This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

RDM Guidance formatting was influenced by The Writing Center, University of North Carolina at Chapel Hill Tips & Tools handouts.

Details

Article ID: 411
Created: Mon 3/3/25 3:11 PM
Modified: Fri 4/11/25 1:42 PM

Related Services / Offerings

The Research Data Management Core (RDMC) provides guidance and support to researchers on complying with funding agency data sharing policies and requirements.