Data documentation and metadata
For archived measurement data to be usable, they must be accompanied by sufficient contextual information. Only then can new data processing methods be applied, old measurements integrated into a GIS and new archaeological interpretations derived. Contextual information and documentation for such data are often provided through so-called ‘metadata’, which can be seen as ‘data about data’. They range from the most basic (e.g. whether spatial coordinates are in metres or feet; that magnetic susceptibility values are quoted in 10⁻⁵ (SI)) to more advanced information (e.g. the size of the data grids, what data processing was undertaken). These metadata relate either directly to the geophysical data or to the project overall. Some geophysics metadata are stored as part of proprietary data formats (e.g. the size of data grids is included in Geoplot’s composite files) but are usually lost when data are exported into preservation formats (e.g. to XYZ text files). Since software for reading the proprietary data may be unavailable or already obsolete, data documentation must be provided explicitly. For example, to undertake data improvement (see above), information about the size of data grids is essential; simply reading an XYZ text file into a GIS package does not supply it. The explicit provision of geophysics metadata must hence be an integral part of exporting measurement data into preservation file formats.
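As an illustration of why grid size and units need to travel with an exported XYZ file, the following minimal Python sketch stores a small metadata record alongside the data and then uses it to rebuild a single data grid. The field names (`coordinate_units`, `grid_size`, `sample_interval_m` and so on), the use of JSON for the accompanying record and the sample values are assumptions made for this example only; they are not a published metadata standard or the layout used by any particular software.

```python
import json

# Hypothetical metadata record; field names are illustrative only.
metadata = {
    "coordinate_units": "metres",
    "measurement_units": "nT",
    "grid_size": {"rows": 2, "cols": 3},  # readings per data grid
    "sample_interval_m": 1.0,             # spacing along a traverse (x)
    "traverse_interval_m": 1.0,           # spacing between traverses (y)
    "processing": ["zero mean traverse"],
}

# Invented XYZ export: one "x y value" triple per line.
xyz_text = """0 0 1.5
1 0 2.0
2 0 -0.5
0 1 0.0
1 1 3.0
2 1 1.0"""


def parse_xyz(text):
    """Parse whitespace-separated 'x y z' lines into float triples."""
    points = []
    for line in text.strip().splitlines():
        x, y, z = (float(v) for v in line.split())
        points.append((x, y, z))
    return points


def to_grid(points, meta):
    """Rebuild one 2D data grid using the documented size and intervals."""
    n_rows = meta["grid_size"]["rows"]
    n_cols = meta["grid_size"]["cols"]
    if len(points) != n_rows * n_cols:
        raise ValueError("point count does not match documented grid size")
    x0 = min(p[0] for p in points)
    y0 = min(p[1] for p in points)
    grid = [[None] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        col = round((x - x0) / meta["sample_interval_m"])
        row = round((y - y0) / meta["traverse_interval_m"])
        grid[row][col] = z
    return grid


# Serialise the metadata as it might be stored next to the XYZ file,
# then read it back and use it to restore the grid structure.
sidecar = json.dumps(metadata, indent=2)
meta = json.loads(sidecar)
grid = to_grid(parse_xyz(xyz_text), meta)
```

Without the documented grid size and intervals, the XYZ triples are only a list of points; with them, the readings can be arranged into rows and columns again so that grid-based processing can be applied. A real survey would normally comprise several grids and would need additional fields, such as grid origins and traverse direction.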
Another important function of metadata is their use for resource discovery. When data documentation is stored as metadata in the archival database of an Archiving Body (e.g. as in ArchSearch for the ADS, or the Digital Archaeological Record for Digital Antiquity), database search tools, including spatial searches, can be used to locate the underlying data.
Creating the geophysics metadata can be time-consuming, and at the moment no generally available tools exist to extract them automatically from proprietary data formats, although some teams have worked on their own solutions (Sparrow et al. 2009). Survey reports that follow the guidelines released by English Heritage (David et al. 2008) will contain the most essential documentation as part of the report text, so compiling it in a table is fairly simple. It could in fact be argued that a table of metadata could replace certain sections of a professional archaeological geophysics report, thereby avoiding duplication of effort. However, at the time of writing, a textual description of all parameters is still the form of documentation preferred by many clients.
Maintaining comprehensive documentation for all their geophysical survey projects will be of value to all archaeological geophysics practitioners, whether in a research or commercial environment. Such information is best held in a database as metadata, and a common layout might be a desirable goal for the archaeological geophysics community. This comprehensive documentation should form part of the Archive of archaeological geophysics data, stored as a text document, spreadsheet or XML file.
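As a rough sketch of the XML option, the Python example below writes a flat metadata record to XML with the standard library and reads it back. The root element `geophysics_metadata`, the child element names and the sample values are hypothetical and chosen for this example; an Archiving Body may prescribe its own schema.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(record):
    """Serialise a flat metadata record as a simple XML document."""
    root = ET.Element("geophysics_metadata")  # hypothetical root element
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Invented example values; element names are illustrative only.
record = {
    "survey_type": "magnetometer",
    "coordinate_units": "metres",
    "grid_rows": 20,
    "grid_cols": 20,
}

xml_text = metadata_to_xml(record)

# Reading the file back recovers each documented parameter by name.
parsed = ET.fromstring(xml_text)
grid_rows = int(parsed.find("grid_rows").text)
```

The same record could equally be kept as one row of a spreadsheet or as a plain text table; the point of a common layout is that each parameter has an agreed name under which it can be found.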
Sometimes a limited subset of metadata may be acceptable to, or even prescribed by, an Archiving Body. Three examples are discussed in Subsets of documentation: the information used in English Heritage’s Geophysical Survey Database (EH GSdb), the OASIS project information and the Core Metadata fields for the ADS online catalogue, defined according to Dublin Core. Other subsets include the information used by tDAR in the USA. Although Archiving Bodies might store only such subsets of metadata in their databases for resource discovery (i.e. database searches), it is advisable to include the full metadata and documentation as part of an Archive.