Which file formats should I use?
Question
What are the preferred file formats to use for storage of primary and secondary data and documents?
Answer
The strongly preferred way to store all data is as tab- or comma-delimited text files with the variable names on the first line, accompanied by an R script that reads the data file. This keeps the data readable regardless of future changes in software and data file formats. For other data types, consider the file formats suggested below (based on the KNAW-DANS Preferred Formats overview, May 2013), for the same reasons of compatibility and future accessibility:
Data type | Preferred format | Acceptable format |
Documents | PDF/A (.pdf); Unicode TXT (.txt) | OpenDocument Text (.odt); MS Word (.doc, .docx); Rich Text File (.rtf); PDF (.pdf); Non-unicode TXT (.txt) |
Spreadsheets | PDF/A (.pdf); Comma separated values (.csv) | OpenDocument Spreadsheet (.ods); MS Excel (.xls, .xlsx) |
Databases | ANSI SQL (.sql); Comma separated values (.csv) | MS Access (.mdb, .accdb); dBase III or IV (.dbf) |
Statistical data | R | SPSS portable (.por); SAS transport (.sas) |
Audio | WAVE (.wav) | MP3 AAC (.mp3) |
Video | MPEG-2 (.mpg, .mpeg, …); MPEG-4 H264 (.mp4); Lossless AVI (.avi) | QuickTime (.mov) |
Pictures (raster) | JPEG (.jpg, .jpeg); TIFF (.tif, .tiff) | |
Pictures (vector) | PDF/A (.pdf); Scalable Vector Graphics (.svg) | Adobe Illustrator (.ai); PostScript (.eps); PDF (.pdf) |
Geographical data | Google Earth Keyhole Markup Language (.kml, .kmz); Geographical interchange standard geoTIFF (.geotiff) | ESRI Geodatabase; ESRI Shapefiles (.shp and accompanying files); ERDAS Imagine (.img) |
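As an illustration of the delimited-text layout recommended in the answer, the following is a minimal sketch of a tab-delimited file with the variable names on the first line, written and then read back. The answer pairs the data file with an R script; Python is used here only to show the file layout and the round trip, not as a replacement for that recommendation. The file name `measurements.tsv`, the variable names, and the example values are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical example data: variable names plus one row per observation.
variables = ["plot_id", "species", "height_cm"]
rows = [
    ["1", "Quercus robur", "152.5"],
    ["2", "Fagus sylvatica", "98.0"],
]


def write_table(path, variables, rows, delimiter="\t"):
    """Write the variable names on the first line, then the data rows."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
        writer.writerow(variables)
        writer.writerows(rows)


def read_table(path, delimiter="\t"):
    """Read the file back as a list of dicts keyed by the names on line 1."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


# Write to a temporary folder so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "measurements.tsv"  # hypothetical file name
    write_table(data_file, variables, rows)
    first_line = data_file.read_text(encoding="utf-8").splitlines()[0]
    records = read_table(data_file)

print(first_line)
print(records[0]["species"])
```

Passing `delimiter=","` to both functions gives the comma-delimited variant. All values are read back as text, so the accompanying reader script is the natural place to record which variables are numeric.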
Last modified: 01 February 2017 12.43 a.m.