ADVICE

Guidelines for Depositors (Version 3.0 September 2015)

Contents

Introduction to the Guidelines
Why Deposit Data?
Depositing with the ADS
What to Deposit
How to Deposit
Costs
Preparing Collections for Deposit
Data Management Plans
File Management (Formats, Structure, Naming, Versioning)
Metadata
Selection and Retention
File-level Metadata Requirements
Documents
Databases, Spreadsheets and Statistics
Raster Images
Geophysics and Remote Sensing
CAD and Vector Images
Geographical Information Systems
Video and Audio
Virtual Reality
Photogrammetry
Collection-level Metadata Requirements
Deposit Check List
Downloads
Acknowledgements


File-level Metadata Requirements

File-level Metadata

File-level metadata is the information ADS requires to archive and disseminate files within a larger dataset. It commonly includes notes on the software used, lists of file names, and contextual information. For example, a spreadsheet recording small finds from a project may include detailed information about the finds, but it is useless unless you know what site the small finds are from, what the columns in the spreadsheet record, and what any codes or context numbers refer to.

It is most efficient to plan for file-level metadata collection at the beginning of a project and to set up rigorous systems for collecting metadata to the required standards while the project is under way. Compiling file-level metadata at the end of a project can be time-consuming, costly and, in some cases, impossible.

Required file-level metadata will differ depending on data type. This page of the Guidelines for Depositors provides advice and metadata templates in a variety of formats for the common data types used in archaeology. Annotated PDFs of completed templates are also available. If your data type is not listed here, please contact ADS to discuss whether we can archive your data.
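As a rough illustration of collecting a list of file names as a project proceeds, the sketch below walks a project folder and writes one row per file to a CSV file. The column names (file_name, format, size_bytes, sha256, description) are assumptions made for this example and are not the columns of the ADS metadata templates; the checksum column is likewise an optional extra rather than a stated ADS requirement.

```python
import csv
import hashlib
from pathlib import Path

# Hypothetical column names for illustration; not the ADS template columns.
FIELDS = ["file_name", "format", "size_bytes", "sha256", "description"]


def list_files(root):
    """Return one record per file under root, sorted by path."""
    root = Path(root)
    records = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        records.append({
            "file_name": path.relative_to(root).as_posix(),
            "format": path.suffix.lower().lstrip("."),
            "size_bytes": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "description": "",  # left blank for the data creator to complete
        })
    return records


def write_file_list(records, out_path):
    """Write the records to a CSV file with a header row."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)
```

Running list_files on the project folder at regular intervals and filling in the description column at the time is one way to avoid reconstructing this information at the end of the project.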


Return to Contents

Documents

Documents and text files are arguably the most common file type created as the result of archaeological research. The significant properties that need to be preserved in text documents are the words and their order, the hierarchical structure of the document, and its formatting and page numbering. Page numbering is particularly important for citation purposes, so ADS ensures that the same page numbering is maintained through each migration of the document.

Formats_Texts.png

When depositing texts for archiving they should be submitted in one of the accepted file formats that can be seen in the table on the right. Documents often contain embedded content such as images and spreadsheets. It is recommended that in addition to embedding this content within a document, such content is deposited and archived separately, thereby retaining the original qualities of the content (e.g. image resolution) and allowing it to follow a separate archival strategy to the textual content.

In addition to documents created within word processing software, a significant proportion of text documents can be created as the result of a digitisation process. This process generally starts with a digitised image of a hard copy page which is then processed using optical character recognition (OCR) in order to transform the image into 'real' (editable and searchable) text. The final text, which may also include images and figures, is predominantly stored as a PDF. For advice on digitising journal articles and grey literature reports see ADS's Quick Guide to Digitisation.

ADS will accept PDF for documents where this is the only format available, as in the case of digitisation projects. ADS will accept all PDF types, but would prefer that they are deposited as PDF/A, in either PDF/A-1a or PDF/A-1b. However, when an original text format is available, ADS would prefer that format to be submitted. This is because content is often downsampled when a PDF is created, leading to loss in the original data streams. In addition, once data is embedded within a PDF it is difficult to extract it again. Other formats allow for greater flexibility in terms of preservation and future reuse.

The text metadata template must be completed for all text files. However, if the collection you are depositing is a large collection of text documents only, such as the results of a journal digitisation project, please contact ADS directly as we may also require additional documentation.

Text Metadata Template: Microsoft Excel, Open Office Spreadsheet, csv

Text Metadata Example: PDF



Databases, Spreadsheets and Statistics

Databases and spreadsheets are a very common archaeology data type, used to record anything from simple photography lists to complex multi-relational context and finds information. Although, strictly speaking, databases and spreadsheets have very different functions, it can be argued that in many archaeological applications both are used to collect and store data in a similar way (defined in terms of records/rows and fields/columns). From an archival perspective this similarity becomes more apparent when the significant properties are taken into account. When preserving data in these formats the key significant properties of both databases and spreadsheets are: the data values themselves and the structure (tables or sheets) in which this data is held. From this perspective both types of object can be treated (and archived) in a similar way. The Guides to Good Practice have further detail on managing and archiving databases and spreadsheets.

Databases

Formats_Database.png

ADS views the data tables within a database, together with the relationships between them, as the core of a database. It is these elements that we ask to be documented and that we seek to preserve. Forms, reports, queries and macros are not seen as significant properties of a database and are therefore generally not preserved. Databases deposited for archiving should be submitted in one of the accepted file formats shown in the table on the right, and the database metadata template must be completed and accompany the database.

Database Metadata Template: Microsoft Excel, Open Office Spreadsheet, csv

Database Metadata Example: PDF
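To illustrate what documenting the tables and the relationships between them can involve, the sketch below lists the tables, columns and foreign keys of a database. It assumes an SQLite database purely because Python can read one without extra software; this is not a statement that SQLite is among the accepted formats, and the structure of the returned summary is hypothetical rather than the layout of the database metadata template.

```python
import sqlite3


def describe_database(conn):
    """Summarise the tables, columns and foreign keys of an SQLite database."""
    summary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        quoted = '"' + table.replace('"', '""') + '"'
        # table_info rows: (cid, name, type, notnull, default, pk)
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        keys = conn.execute(f"PRAGMA foreign_key_list({quoted})").fetchall()
        summary[table] = {
            "columns": [(col[1], col[2]) for col in columns],
            # (column in this table, referenced table, referenced column)
            "relationships": [(fk[3], fk[2], fk[4]) for fk in keys],
        }
    return summary
```

Forms, reports, queries and macros do not appear in this summary, which matches the focus on data tables and relationships described above. Relationships that exist only by convention, without a declared foreign key, would still need to be documented by hand.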

Spreadsheets

Formats_spreadsheets.png

Spreadsheets are dealt with in a similar way to databases in that we aim to preserve the data within the spreadsheet rather than specific elements associated with its presentation (cell colour, font formatting, etc). Such elements, if deemed significant, should be documented in the accompanying metadata. ADS accepted and preferred file formats for spreadsheets can be seen in the table on the right. The spreadsheet metadata template must be completed and accompany any spreadsheets deposited with ADS.

Spreadsheet Metadata Template: Microsoft Excel, Open Office Spreadsheet, csv

Spreadsheet Metadata Example: PDF
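Because a spreadsheet is of little use unless each column is explained, it can help to start the documentation from the data itself. The sketch below reads a sheet that has been exported to CSV and returns one row per column with an example value and an empty description to be filled in. The field names (column, example_value, description) are assumptions for this example and do not reproduce the spreadsheet metadata template.

```python
import csv


def column_skeleton(csv_path):
    """Return one row per column: its name, an example value, a blank description."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)          # first row is assumed to hold column names
        first_row = next(reader, [])   # empty list if the sheet has no data rows
    rows = []
    for index, name in enumerate(header):
        example = first_row[index] if index < len(first_row) else ""
        rows.append({"column": name, "example_value": example, "description": ""})
    return rows
```

Presentation elements such as cell colour are not captured by a CSV export, so any that are significant would still need to be described in the accompanying metadata.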

Statistics

ADS will accept statistics. The preferred format for statistics is a delimited text file (.txt) or one of the spreadsheet formats above. However, ADS will accept the specialised statistics formats listed in the table on the right if a delimited text file is not suitable. ADS does not have a predefined metadata template for statistics, and it is recommended that you contact ADS for further advice if you intend to deposit statistics.



Raster Images

Formats_Raster.png

Raster images are, like documents, one of the most common archaeological data types, as they cover site photography and drawings. When preserving raster imagery it is important that no data is lost through compression. The ADS preferred format for deposit is therefore TIFF. Data creators should be aware, however, that there are various types of TIFF available, including compressed versions, which are not suitable as preservation formats. The uncompressed baseline v.6 is the only suitable TIFF for preservation purposes. Also note that LZW, the compression commonly used in TIFF files, is based on proprietary software and is unsuitable for long-term preservation. To ensure the image is an uncompressed baseline v.6 TIFF file, please make sure no compression is selected when the file is saved. The Guides to Good Practice contain more detailed information about raster imagery. If TIFF is not available then ADS will accept a number of other raster image formats indicated in the table on the right.
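One way to confirm that no compression was selected when a TIFF was saved is to read the Compression field from the file itself. The sketch below is a minimal check written for this example, not an ADS tool: it reads only the first image directory of a classic (non-BigTIFF) file and does not test any other aspect of baseline conformance. The tag number (259) and the values 1 (no compression) and 5 (LZW) come from the TIFF 6.0 specification; the function names are hypothetical.

```python
import struct

COMPRESSION_TAG = 259   # TIFF 6.0 "Compression" field
NO_COMPRESSION = 1      # 1 = uncompressed, 5 = LZW


def tiff_compression(path):
    """Return the Compression value from the first image directory of a TIFF."""
    with open(path, "rb") as handle:
        header = handle.read(8)
        if len(header) < 8 or header[:2] not in (b"II", b"MM"):
            raise ValueError("not a TIFF file")
        # "II" marks little-endian byte order, "MM" big-endian.
        order = "<" if header[:2] == b"II" else ">"
        magic, ifd_offset = struct.unpack(order + "HI", header[2:8])
        if magic != 42:
            raise ValueError("not a classic TIFF file")
        handle.seek(ifd_offset)
        (entry_count,) = struct.unpack(order + "H", handle.read(2))
        for _ in range(entry_count):
            entry = handle.read(12)  # tag, type, count, value or offset
            tag, _field_type, _count = struct.unpack(order + "HHI", entry[:8])
            if tag == COMPRESSION_TAG:
                # A single SHORT sits in the first two bytes of the value field.
                (value,) = struct.unpack(order + "H", entry[8:10])
                return value
    return NO_COMPRESSION  # the field defaults to 1 when it is absent


def is_uncompressed(path):
    """True when the first image in the file is stored without compression."""
    return tiff_compression(path) == NO_COMPRESSION
```

A file for which is_uncompressed returns False would need to be saved again with compression switched off before deposit.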

If EXIF or other metadata has been embedded within files, data creators should say so and indicate whether this metadata requires preservation (ideally such metadata should be exported and deposited as separate files).

All raster images deposited with ADS should be accompanied by raster image metadata. Raster images, in particular photographs, are a good example of why metadata should be collected gradually during your project rather than compiled at the end. By then the person preparing the archive may not be the person who took the photograph, and may not have enough information about the image to complete the metadata fully.

Raster Image Metadata Template: Microsoft Excel, Open Office Spreadsheet, csv

Raster Image Metadata Example: PDF



Geophysics and Remote Sensing

Formats_Geophysics.png

Geophysics and other remote sensing data types are sometimes the only record of buried archaeological features that are destroyed during commercial developments, and may hence be the only record of our cultural heritage in years to come. It is therefore very important that they are correctly preserved. The Guides to Good Practice contain a full and detailed Guide which covers the data management of geophysical data throughout its lifecycle.

When depositing survey data with the ADS it should be in one of the accepted formats indicated in the table on the right. For UK contracting units, in the absence of specific guidance (i.e. within a project brief) the ADS recommendation is to deposit a geo-rectified TIFF of high quality and a pre-processed composite file(s) of raw data. All data should be accompanied by the following metadata:

Geophysics Metadata Template: Microsoft Excel, Open Office Spreadsheet, csv

Geophysics Metadata Example: PDF

Please note that the information above is limited to geophysical methods common to archaeological investigation in the UK.



CAD and Vector Images

Formats_CAD.png

Many large archaeological projects make use of CAD and during their life span literally thousands of CAD models may be created and saved as part of a project's archive. During the process of post-excavation these files are often agglomerated into sets of group, sub-group and phase plans, and CAD models may be saved at each stage in the process. All of these later models are essentially composites of the earlier context plans. The appropriate use of a layer-naming system can help to reduce the need for large numbers of separate files. Rather than generating new files each time, new layers can easily be added to existing models to reflect changes in interpretation and, in the process, maintain a close relationship between the underlying data and the interpretations derived from them.

At some point, all project managers need to consider the question of whether it is really necessary to archive and, of course, to document every model. There are obvious cost implications associated with these decisions. During the final stages of a project there should be a process of data selection, where the overall archive is worked through and individual files are either selected for retention in the archive or discarded. This process is a standard part of the preparation of the non-digital project archive for deposition, and should also be part of how large digital archives are dealt with. For example, there are arguments to be made for the inclusion of every set of group, sub-group and phase plans in the archive, despite the fact that they are often simply agglomerations of the individual context plans. Essentially this will lead to a lot of duplication of data in the archive. Nevertheless such composite plans represent the cumulative results of interpretative decisions made by the archaeologists and as such are important building blocks towards the overall understanding of the site. Consequently it is important that these files are archived as they have a high reuse potential. It is also important that the individual context plans are archived alongside the group, sub-group and phase plans as they can be used to question the original archaeologists' phasing of the site and as such can be reused to attempt radical re-interpretations of the archaeology.

Nevertheless, there will be CAD files that are appropriate to discard and omit from the final archival deposit. Such models include test and unfinished versions of later plans or earlier versions of phasing, which have been superseded by later interpretations and would consequently lead to false impressions of the archaeology of the site. More information on working with CAD can be found in the CAD Guide to Good Practice.

Accepted formats for depositing CAD and other vector images with ADS can be found in the table on the right. The following metadata template should also be completed and deposited alongside the CAD and vector image files.

Vector Metadata Template: Microsoft Excel, Open Office Spreadsheet, csv

Vector Metadata Example: PDF



Geographical Information Systems

Formats_GIS1.png

A GIS is usually a collection of data files brought together to form a layered view of an area being studied. As such the archived object is often just the data files. ADS accepted file formats are indicated in the table on the right. For more information on managing GIS data files see the GIS Guide to Good Practice which is designed specifically to provide guidance for individuals and organisations involved in the creation, maintenance, use and long-term preservation of GIS-based digital resources.

In order to facilitate reuse at a later date the GIS requires some documentation about the whole. In particular it is important to record what the purpose of the GIS is, what each layer/file represents, and what coordinate system or arbitrary site grid was used and how the data relates to the chosen grid. When depositing GIS files with ADS the following metadata template should be completed.

GIS Metadata Template: Microsoft Excel, Open Office Spreadsheet, csv

GIS Metadata Example: PDF



Video and Audio

Video and audio are becoming more and more popular as a means of recording both archaeology and events related to archaeological investigation such as surveys, procedures and interviews. Data creators should be aware of the various rights involved in recording people in audio or video and should ensure that the relevant rights and clearances are obtained.

Video

Formats_Movies.png

Video is often used as a tool to accompany, document and supplement other data collection techniques. In particular, video is a common component of survey projects, especially amongst maritime archaeologists, where sites are less easily accessible than those on dry land. Video can also form a simple output from projects employing a wide variety of data collection and analysis techniques, such as 3D modelling or virtual reality, in which a video 'fly-through' is produced as an easy way to engage with modelled data.

A common issue with video is that the data itself in a 'raw' high quality format can be very large. The case for archiving digital video created from archaeology projects depends largely on the intended purpose of the video. Video should be preserved when it contains unique original data (i.e. not recorded in another format such as photographs) or provides valuable support and documentation to other datasets.

Video files can also consist of a mix of container formats and codecs, emphasising the importance of detailed technical metadata in successfully identifying and working with video files.

The ADS accepts video in the formats indicated in the table on the right but ADS does not have a standard metadata template for video. The Guides to Good Practice provide metadata advice for video and it is recommended that if you are planning on depositing video files you contact ADS at the beginning of your project to discuss specific requirements.

Audio

Formats_Audio.png

Although video has perhaps found more applications in archaeology, audio files are often created as a component of projects looking to record oral histories or to recreate 'archaeological sounds', either through the modern reconstruction of archaeological musical instruments or through the recording of sounds within archaeological contexts such as churches, henges or theatres that have been reconstructed physically or virtually.

Audio files may be large when created or stored in uncompressed formats, and informed decisions need to be made about when and how lower quality files are created. Another issue, again similar to video, is that audio files can consist of a mix of container formats and codecs. Metadata also plays a key role with audio files in documenting the file's creation process and contents (e.g. names and dates of interviews, locations, etc.), which may not be as apparent as in similar video files.

The ADS accepts audio in the formats indicated in the table on the right, but ADS does not have a standard metadata template for audio files. The Guides to Good Practice provide metadata advice for audio files and it is recommended that if you are planning on depositing audio files you contact ADS at the beginning of your project to discuss specific requirements.



Virtual Reality

Formats_VR.png

Virtual reality (VR) is the label given to a range of computer-based approaches to the visualisation of concepts, objects or spaces in three or more dimensions. Although the distinction is becoming increasingly blurred, these approaches tend to differ from other three-dimensional visualisations, such as the output of Computer Aided Design (CAD) packages and Geographic Information Systems, in that the experience is interactive.

The ADS accepts virtual reality in the formats indicated in the table on the right. If your file format is not on this list and you cannot convert to one of the accepted file formats, please contact ADS for advice. ADS does not have a standard metadata template for virtual reality data types, as the metadata required can differ depending on the virtual reality model and its purpose.

It is recommended that if you intend to deposit virtual reality data types you contact ADS as early in your project as possible to discuss metadata requirements. The VR Guide to Good Practice also has lots of useful information on working with Virtual Reality data types.



Photogrammetry

It is recommended that if you intend to deposit photogrammetry data types you contact ADS as early in your project as possible to discuss requirements. The CRP Guide to Good Practice also has lots of useful information on working with these data types.

Photogrammetry Metadata Template: Microsoft Excel, Open Office Spreadsheet

