Archiving file types
As mentioned in File types (whilst creating, working with, and processing data), a great many data formats are in use within the dendrochronological community. However, with the exception of the Tree Ring Data Standard (TRiDaS), none provides a mechanism for recording rich, standardised metadata. Wherever possible we strongly recommend using TRiDaS format files to store both your data and metadata.
If the tools being used for the dendrochronological analyses do not support TRiDaS, then one option is to maintain your data within the legacy format supported by the software, and the metadata in a TRiDaS-enabled database like TRiDaBASE[1] (Jansma et al. 2012b). The DCCD repository supports the submission of TRiDaS plus a variety of legacy formats, whereas the ITRDB supports the submission of Tucson and TRiDaS files.
Managing the legacy data files is the weakest link in a hybrid TRiDaS/legacy system. With no software to provide error checking, it is all too easy to mislabel, rename, move or delete files, especially in a multi-user environment. File handling must therefore be as strict as possible at all times. Write access to the files should be limited to essential personnel only. Folders containing site data files should be made read-only as soon as work on a site has been completed.
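The read-only step can be scripted so that it is applied consistently once a site is finished. The following is a minimal sketch, assuming a POSIX file system and a hypothetical layout with one folder per site; the function name `make_read_only` is illustrative and not part of any dendrochronological tool.

```python
import os
import stat
from pathlib import Path


def make_read_only(folder):
    """Remove write permission from every file and directory under folder.

    Returns the number of entries whose permissions were updated.
    Assumes POSIX permission bits; behaviour on Windows is more limited.
    """
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    changed = 0
    # Walk bottom-up so each directory is locked after its contents.
    for dirpath, _dirnames, filenames in os.walk(folder, topdown=False):
        for name in filenames:
            path = Path(dirpath) / name
            path.chmod(path.stat().st_mode & ~write_bits)
            changed += 1
        directory = Path(dirpath)
        directory.chmod(directory.stat().st_mode & ~write_bits)
        changed += 1
    return changed
```

Calling `make_read_only("sites/completed_site")` (a hypothetical path) clears the write bits for owner, group and others. It does not replace proper access control on a shared server, which should still restrict write access to essential personnel.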
If one decides to follow this method it is important to ensure that the legacy data format that is used stores the ring-width data in an unambiguous manner. Where ambiguities exist, these should be clarified in the TRiDaS metadata. A list of formats, along with some of the potential issues they may have, is available in table 2. Further details are available in Brewer et al. (2011).
One issue shared by many of the legacy data formats is ambiguity about the calendar used for dates. To reduce programming complexity, many formats use what is known as the astronomical rather than the Gregorian calendar. The astronomical calendar includes a year zero and denotes BC dates as zero or negative integers. This means that for dates before the BC/AD transition the year numbers are offset by one (e.g. 1 AD = 1; 1 BC = 0; 2 BC = -1; etc.). This can cause complications for researchers both outside and within the dendrochronological community. It is not uncommon to find legacy data files with BC data where the offset has been removed.
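The one-year offset can be made explicit with a pair of conversion helpers. This is a minimal sketch of the arithmetic described above; the function names are illustrative and do not come from any existing library.

```python
def astronomical_to_era(year):
    """Convert an astronomical year (which has a year zero) to (year, era)."""
    if year > 0:
        return year, "AD"
    # Year 0 is 1 BC, year -1 is 2 BC, and so on.
    return 1 - year, "BC"


def era_to_astronomical(year, era):
    """Convert a BC/AD year (which has no year zero) to an astronomical year."""
    if year < 1:
        raise ValueError("BC/AD years start at 1; there is no year zero")
    if era == "AD":
        return year
    if era == "BC":
        return 1 - year
    raise ValueError("era must be 'AD' or 'BC'")
```

For example, `astronomical_to_era(-1)` gives `(2, "BC")` and `era_to_astronomical(1, "BC")` gives `0`. A file in which the offset has been removed would instead store 2 BC as -2, which is why the calendar in use should be recorded in the metadata.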
Format | Suitable for TRiDaS co-archival | Potential issues |
---|---|---|
Belfast Apple | With caveats | Format cannot store missing ring information |
Belfast Archive | With caveats | Format cannot store missing ring information |
Besancon | With caveats | Strictly, the format should only contain data values in 1/100th mm; however, some users store micron-resolution data instead. There is no standard way to discern the difference |
CATRAS | No | Closed source proprietary binary format. There is some support for reading/writing CATRAS files in the TRiCYCLE library but the closed nature of the format means this is not comprehensive |
Cracow Binary format | No | Simple binary format storing just ring-widths and sapwood markers. Although this is a very simple format, the fact it is binary means accessing the data is not trivial |
Corina | No | File format with very limited software support |
DendroDB | No | An export format used by the discontinued DendroDB database. No known software implementations generate this format. |
FHX2 | No | Dendro format for recording fire history event data. Does not support ring-width data |
Generic spreadsheet formats | No | Various spreadsheet-style formats (including CSV, MS Excel, ODF etc.) are used by some to store and transfer ring-width data. The flexible nature of spreadsheets means that data can be stored in a wide variety of ways, which makes them unsuitable for long-term storage of data |
Heidelberg | Yes | Native format for the widely used TSAP software. Has extensive support for metadata fields, but the contents of these fields are not standardised |
KINSYS-KS | With caveats | Format specific to the software of the same name, produced by the Finnish Forest Research Institute and not widely used elsewhere. Stores ring-width data effectively, but the lack of software support means it is not ideal for long-term storage |
Nottingham | No | A format with no known documentation or extant reference implementation |
Oxford | With caveats | Cannot store data prior to 1 AD or ring-widths >= 1 mm |
PAST4 | No | Files can be in either astronomical or Gregorian calendar which can introduce ambiguities. |
Sheffield | With caveats | Format cannot store missing ring information |
SYLPHE | (see Besancon) | |
Topham | With caveats | Very simplistic format which should only be used for raw ring-width data |
TRIMS | With caveats | Format cannot store missing ring information |
Tucson | With caveats | There are many undocumented alterations to this format. Check files with TRiCYCLE or COFECHA to ensure readability before archiving |
Tucson compact | No | Rather complicated format that is not widely supported |
VFormat | No | Highly encoded metadata makes this format not very accessible |
WinDENDRO | No | Proprietary delimited text format |
[1] TRiDaBASE – http://www.tridas.org/tridabase/