Archiving file types

Peter Brewer (Laboratory of Tree-Ring Research, University of Arizona, USA), Esther Jansma (Cultural Heritage Agency and Utrecht University, The Netherlands), Version 1.1 – June 2016, Archaeology Data Service / Digital Antiquity, Guides to Good Practice

As mentioned in File types (whilst creating, working with, and processing data), a great many data formats are in use within the dendrochronological community. However, with the exception of the Tree Ring Data Standard (TRiDaS), none provides a mechanism for recording rich, standardised metadata. Wherever possible we strongly recommend using TRiDaS format files to store both your data and metadata.

If the tools being used for the dendrochronological analyses do not support TRiDaS, then one option is to maintain your data within the legacy format supported by the software, and the metadata in a TRiDaS-enabled database like TRiDaBASE[1] (Jansma et al. 2012b). The DCCD repository supports the submission of TRiDaS plus a variety of legacy formats, whereas the ITRDB supports the submission of Tucson and TRiDaS files.

Managing the legacy data files is the weakest link in a hybrid TRiDaS/legacy system. With no software to provide error checking, it is all too easy to mislabel, rename, move or delete files, especially in a multi-user environment. File handling must therefore be kept strict at all times: write access to the files should be limited to essential personnel, and folders containing site data files should be made read-only as soon as work on a site has been completed.
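As an illustration of the last recommendation, the following is a minimal sketch of how a completed site folder could be made read-only with a script. The function name `make_read_only` and the example path are hypothetical, and the sketch assumes a POSIX-style file system; on Windows, `chmod` only toggles the read-only flag, and on shared storage the access controls of the server may need to be changed instead.

```python
import stat
from pathlib import Path

# Write permission bits for owner, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(site_dir):
    """Clear the write bits on site_dir and everything beneath it.

    Returns the number of files and folders whose mode was rewritten.
    Symbolic links are skipped so that nothing outside the folder is touched.
    """
    root = Path(site_dir)
    # Build the full list first, then change permissions.
    entries = [root] + list(root.rglob("*"))
    changed = 0
    for path in entries:
        if path.is_symlink():
            continue
        mode = path.stat().st_mode
        path.chmod(mode & ~WRITE_BITS)
        changed += 1
    return changed


if __name__ == "__main__":
    # Hypothetical folder name, for illustration only.
    print(make_read_only("sites/example_site"))
```

Clearing permission bits guards against accidental edits only; users who own the files can still restore write access, so this complements rather than replaces restricting who can reach the files at all.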

If you follow this method, it is important to ensure that the legacy data format you use stores the ring-width data in an unambiguous manner. Where ambiguities exist, these should be clarified in the TRiDaS metadata. A list of formats, along with some of the potential issues they may have, is given in Table 2. Further details are available in Brewer et al. (2011).

One issue shared by many of the legacy data formats is ambiguity about the calendar used for dates. To reduce programming complexity, many formats use what is known as the astronomical rather than the Gregorian calendar. The astronomical calendar includes a year zero and denotes BC dates as zero or negative integers. This means that for years before the AD/BC transition the year numbers are offset by one (e.g. 1 AD = 1; 1 BC = 0; 2 BC = -1; etc.). This can cause complications for researchers both outside and within the dendrochronological community. It is not uncommon to find legacy data files containing BC data in which the offset has been removed, so the calendar in use cannot be assumed from the format alone.
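The offset described above can be sketched in a few lines of code. The function names below are hypothetical and are not taken from any dendrochronological software; they simply encode the mapping given in the text (1 AD = 1; 1 BC = 0; 2 BC = -1).

```python
from typing import Tuple


def astronomical_to_era(year: int) -> Tuple[int, str]:
    """Convert an astronomical year number to a (year, era) pair."""
    if year > 0:
        return year, "AD"
    # Year 0 is 1 BC, year -1 is 2 BC, and so on.
    return 1 - year, "BC"


def era_to_astronomical(year: int, era: str) -> int:
    """Convert an AD/BC year to its astronomical year number."""
    if year < 1:
        raise ValueError("AD/BC years start at 1; there is no year zero")
    if era == "AD":
        return year
    if era == "BC":
        return 1 - year
    raise ValueError("era must be 'AD' or 'BC'")


if __name__ == "__main__":
    for astro in (1, 0, -1):
        print(astro, astronomical_to_era(astro))
```

A conversion like this is only safe when the calendar used by a file is known. If a file's BC years have already had the offset removed, applying the conversion again shifts every BC date by a further year, which is why the calendar in use should be recorded in the TRiDaS metadata.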

Table 2 – Legacy file formats: assessment of their suitability for storing dendrochronological data in combination with a companion TRiDaS file containing metadata.
| Format | Suitable for TRiDaS co-archival | Potential issues |
| --- | --- | --- |
| Belfast Apple | With caveats | Format cannot store missing ring information |
| Belfast Archive | With caveats | Format cannot store missing ring information |
| Besancon | With caveats | Strictly, the format should only contain data values in 1/100th mm; however, some users store micron resolution data instead. There is no standard way to discern the difference |
| CATRAS | No | Closed source proprietary binary format. There is some support for reading/writing CATRAS files in the TRiCYCLE library, but the closed nature of the format means this is not comprehensive |
| Cracow Binary format | No | Simple binary format storing just ring-widths and sapwood markers. Although this is a very simple format, the fact that it is binary means accessing the data is not trivial |
| Corina | No | File format with very limited software support |
| DendroDB | No | An export format used by the discontinued DendroDB database. No known software implementations generate this format |
| FHX2 | No | Dendro format for recording fire history event data. Does not support ring-width data |
| Generic spreadsheet formats | No | Various spreadsheet-style formats (including CSV, MS Excel, ODF etc.) are used by some to store and transfer ring-width data. The flexible nature of spreadsheets means that data can be stored in a wide variety of ways, which makes them unsuitable for long term storage of data |
| Heidelberg | Yes | Native format for the widely used TSAP software. Has extensive support for metadata fields, but the contents of these fields are not standardised |
| KINSYS-KS | With caveats | Format specific to the software of the same name produced by the Finnish Forest Research Institute and not widely used elsewhere. Stores ring-width data effectively, but the lack of support in software means it is not ideal for long term storage |
| Nottingham | No | A format with no known documentation or extant reference implementation |
| Oxford | With caveats | Cannot store data prior to 1 AD or ring-widths >= 1 mm |
| PAST4 | No | Files can be in either the astronomical or the Gregorian calendar, which can introduce ambiguities |
| Sheffield | With caveats | Format cannot store missing ring information |
| SYLPHE | (see Besancon) | |
| Topham | With caveats | Very simplistic format which should only be used for raw ring-width data |
| TRIMS | With caveats | Format cannot store missing ring information |
| Tucson | With caveats | There are many undocumented alterations to this format. Check files with TRiCYCLE or COFECHA to ensure readability before archiving |
| Tucson compact | No | Rather complicated format that is not widely supported |
| VFormat | No | Highly encoded metadata makes this format not very accessible |
| WinDENDRO | No | Proprietary delimited text format |

[1] TRiDaBASE – http://www.tridas.org/tridabase/