About these guidelines
The importance of access and preservation
Much of the information produced by archaeological research over the past century exists in lengthy, technical, limited-distribution reports, tables, and charts scattered in offices across the nation. The data contained in these resources are often encoded on computer cards, magnetic tapes, and floppy disks that are degrading in archives, museums, bookshelves, filing cabinets, or desk drawers; all the while, the technology to retrieve these data, and the human knowledge required to make them meaningful, rapidly disappear (Eiteljorg 2004; Michener et al. 1997). Museums and other repositories have typically treated the physical media on which archaeological data are recorded as artifacts and stored them as such: in boxes, on shelves. By far the most common preservation treatment for digital data reported by repositories is to conserve the media on which the digital files are stored. Unfortunately, this method presents two serious problems. First, it renders the data on such media inaccessible unless one can physically retrieve the media, extract the data using the appropriate computer drive, and access the data with compatible software. In many cases, the hardware and software needed for these tasks are rapidly disappearing. Second, the physical curation of the actual media (and associated hardware and software environments) is an inadequate long-term preservation approach, because both software and hardware constantly evolve and bits on magnetic and optical media gradually, but inevitably, “rot.”
Current preservation practices for digital data must be improved, or a significant portion of the available information about the archaeological record will be lost. This information, collected at substantial financial, intellectual, and physical cost, must be preserved and made accessible to future generations. Present data collection practices and preservation methods, if unaltered, will render these digital data inaccessible to future investigations.
A significant portion of the archaeological investigations undertaken in the United Kingdom and the United States involve public funding, land, permitting, planning, and regulations. Typically, public agencies are responsible for overseeing these investigations and also for ensuring the preservation of important archaeological sites, collections, and associated records. In addition to being accountable for preservation, agencies must make archaeological records and data available to the public, with appropriate controls in place to protect archaeological assets. Legal frameworks requiring these actions exist in both the UK (e.g. PPS5) and the US (e.g. 36 CFR 79), and in many other nations.
Within existing legal frameworks, archaeological organizations, practitioners and repositories must institute good practices to ensure digital data access and long-term preservation, alongside the proper curation of physical collections and associated paper records.
About these guides
Modern archaeological projects have the capacity to create large quantities of digital information, whether from the on-site recording of excavation and survey data, specialist datasets developed during pre- and post-excavation analysis, or publication of interpretative maps and plans. Digital information is created at every stage of a project, from fieldwork to assessment, analysis, and finally reporting and dissemination. Within the discipline, there has been an increasing awareness that data, not just artefacts and paper records, are part of the primary collection and should be preserved and archived as such. The current trend of moving away from physical and toward digital recording of information has made the lack of thoughtful preservation of digital data an even larger issue. In fact, in many cases, a digital dataset may be the only product of a project, and without thoughtful preservation, all context for any archaeological resources will be lost. The primary aim of these Guides to Good Practice is to provide information on the best way to create, manage, and document digital material produced during the course of an archaeological project. The ultimate aim of the Guides is to improve the practice of depositing and preserving digital information safely within an archive for future use.
A fundamental principle of these Guides is that any digital data produced from archaeological investigation should be managed and archived in a digital format. This approach precludes costly re-digitisation in the future while ensuring maximum accessibility and reusability of the data. Digital archiving also preserves the functionality of complex datasets such as GIS, CAD, and relational databases that simply could not exist outside of a digital medium.
This new set of guidelines aims to revise and expand upon an original series of Guides to Good Practice created by the Archaeology Data Service (ADS) as part of the UK’s Arts and Humanities Data Service (AHDS). Between 1998 and 2002, six Guides were published covering the creation and archiving of data within a number of key project types including aerial photography, excavation and geophysical survey, GIS, CAD and virtual reality. These original Guides individually drew together a number of key authors and contributors, all active in their respective fields, in order to produce widely relevant material that was subsequently reviewed and approved in both academic and non-academic circles. The new Guides also incorporate a fuller US perspective on the topic, having been reviewed and updated in collaboration with Digital Antiquity—which oversees the Digital Archaeological Record (tDAR)—as well as other US partners.
The aim of the original Guides—and one that continues with this new series—was to identify and explore key considerations in digital archiving, and in particular, to explore specific issues such as metadata, documentation, file formats and data migration. These issues, in keeping with the “Good Practice” sentiment, have been largely examined within the context of ‘a project’ and many of the Guides illustrate key issues with datasets from actual archaeological projects.
Although much of the content in the original ADS Guides remains applicable, in many areas the relevant technologies have developed considerably over the subsequent years. These changes have spawned new approaches and formats, as well as requiring different or further documentation to order, understand and reuse the data. In addition, many existing areas of data production have now fallen into the digital realm with “born-digital” reports, images and video files becoming commonplace components of many archaeological projects. These new Guides aim to address these developments, and provide concise guidance on managing and archiving the wide array of digital objects that are now core components of modern archaeological research.
The structure of these guides
The broadening and updating of the Guides to Good Practice has been carried out in collaboration with Digital Antiquity, a US-based organization devoted to enhancing preservation of and access to digital records of archaeological investigations. A major aim of the Guides is to provide the basis for archaeological project workflows to create digital data that can be archived effectively both by Digital Antiquity’s tDAR in the US and by the Archaeology Data Service in the UK. The development of the Guides involved close collaboration between ADS and teams in the US at the University of Arkansas and at Arizona State University.
The guidelines in this series aim to provide, within the context of archaeological research and data, both an overview of current digital archiving practice and application-specific guidance. The initial sections of these Guides cover the fundamentals of digital preservation and examine general archival strategies: concepts such as significant properties, the processes of selection and retention strategies and the implications of very large datasets (“Big Data”). In addition, the opening sections of the Guides cover general, broad themes that should be considered at the outset of a project, such as potential data sources, metadata and documentation formats, specifications and copyright considerations.
The Guides then proceed to cover “basic components,” i.e. common file types that are frequently present in archaeological archives, irrespective of a project’s primary technique or focus. The chapters in this section cover a range of file types, including documents and texts, databases, spreadsheets, raster and vector images and digital audio and video files. Although these file types have been isolated here, in many cases such files can feed into or be the product of other techniques or applications that are discussed elsewhere in the Guides. In such cases, links with the relevant chapters are included so that the way in which these “basic components” fit in and relate to other data types is made clear.
In addition to the basic components chapters discussed above, the core chapters of these new Guides address the preservation of data resulting from common data collection, processing and analysis techniques such as aerial and geophysical survey, laser scanning, GIS and CAD. As with the basic components, these chapters largely deal with each technique as a workflow in isolation but link, where relevant, to other chapters where certain data can be seen as a discrete input or output.
It is also important to recognise that, within the final chapters, these Guides provide information concerning how to prepare and deposit material in a digital archive. Although they provide an overview and brief discussion of the facilities and procedures required to create and maintain a digital archive, these topics are covered more fully in other guidelines, which are referenced where appropriate. While the original Guides to Good Practice were written primarily to address archaeological practice in the United Kingdom, the new broadened and updated Guides aim to provide guidance that is more widely relevant. The scope of these new Guides therefore includes not only the United Kingdom and Europe, but also the United States, North and South America, and other parts of the world.
Background to digital data in archaeology
Excavation, survey, and other fieldwork archives are the physical results of archaeological interventions. Archaeological archives consist of artefacts and records of the work undertaken on site as well as during post-fieldwork recording and analysis. Many of today’s recording and analytical procedures result in digital material, such as databases, images, CAD, GIS, spreadsheet and word-processed files. Traditionally the entire archive (artefacts, paper and digital records) would have been transferred to a museum at the end of a project’s life. However, a survey into the state of museum archaeological archives in England (Swain 1998:47) noted that “most museums do not have the correct technology to store, access and curate in the long term those archives for which computer files form an important part.” The Swain report also highlighted that little digital material was being transferred to museums for archival purposes. This finding was echoed in Strategies for Digital Data (Condron et al. 1999:29-32 and Figure 6.6), which showed that the majority of digital material from archaeological projects is either retained by creators or is transferred into the hands of local government organisations. Strategies for Digital Data also reported that these organisations generally have inadequate policies for digital archives (Condron et al. 1999:33-39). A similar situation seems to exist in the US, where two national surveys of archaeological repositories (Childs and Kagan 2008; Watts 2011; see also discussion in McManamon and Kintigh 2010:37-38) suggest that digital documents and files from archaeological investigations are curated as objects rather than being made accessible. Digital archaeological material, much of which is central to description of the archaeological record, is in danger of being lost.
The intertwined problems of data access, preservation, and synthesis are not new to archaeology in the United States. In the late 1990s, a series of meetings and panels was sponsored by the Society for American Archaeology, the Society of Professional Archaeologists (now the Register of Professional Archaeologists), and the National Park Service on the general topic of “Renewing Our National Archaeological Program.” Improving the management of archaeological information through greater data access and synthesis was one of the major topics covered in this effort (Lipe 1997; McManamon 2000).
Nor are the challenges of data access and preservation unique to archaeology. In 2009, the scientific journal Nature editorialized on the need for broader sharing and long-term preservation of data. The same issue of the journal included related reports on data access and preservation challenges (Nature 2009a, b; Nelson 2009; Schofield et al. 2009). The editorial cited particular successes: “Pioneering archives such as GenBank have demonstrated just how powerful such legacy data sets can be for generating new discoveries – especially when data are combined from many laboratories and analysed in ways that the original researchers could not have anticipated” (Nature 2009a:145). However, the editorial also emphasized that most scientific disciplines “…still lack the technical, institutional, and cultural frameworks required to support such open data access – leading to a scandalous shortfall in the sharing of data by researchers. This deficiency urgently needs to be addressed by funders, universities, and researchers themselves… Furthermore funding agencies need to recognize that preservation of and access to digital data are central to their mission, and need to be supported accordingly” (Nature 2009a:145).
Also in 2009 the United States’ National Academies released a book-length report on efforts to ensure the integrity, accessibility, and stewardship of digital research data (National Academies 2009). More recently, Science (2011) devoted a large special section to the challenges of “Dealing with Data.” Reports by experts in scientific disciplines from climatology to signal visualization reflected on how the deluge of data in their fields can be managed and used to advance knowledge.
While legacy data are important, there must also be a focus on the future. A substantial amount of public archaeological work is carried out annually. US federal agencies report approximately 50,000 annual field projects involving archaeological resources conducted in the United States, mostly by cultural resource management firms or agency staff (Departmental Consulting Archeologist 2009, 2010). Given the volume of data and reports produced each year, even archaeologists working in the same area are often unaware of important results already reported by their colleagues. At present, archaeological studies are accumulating a large volume of data, but these cannot be used efficiently and effectively to advance knowledge of the past due to inadequate preservation practice.
The difficulty of sharing information about and from existing research is exacerbated by the demographic transition underway in the ranks of professional archaeologists. Large numbers of archaeologists entered the profession in the 1960s and 1970s. These individuals are now retiring or passing away (Departmental Consulting Archeologist 2010:76-81). Now is the time to capture, for long-term preservation and access, the digital data associated with the work carried out by this cohort of archaeologists. Accessing the information by relying on the memories of individuals, no matter how prodigious these memories might be, will be impossible once these individuals are no longer available. Increasingly, these concerns about legacy data and preservation issues have led to an emphasis on the access to and preservation of archaeological data as an important facet of the responsible conduct of research in the discipline.
Today, a great deal of background research effort is expended searching for and acquiring relevant reports. Once found, more time is required to hunt for key data in volume after volume of hard copy reports that sometimes extend to more than a thousand pages. Yet, the ability to reanalyze existing data can make present-day investigations more productive and has the potential to identify and reduce costly redundant projects.
Resource discovery and reuse
The Swain report (1998:43-45) looked into the usage of archaeological archives and concluded that they were a grossly under-utilised resource. Reasons for this under-use may include difficulty in locating information about the contents of archives as well as the dispersal of finds and documentary material in different archival repositories. The development of appropriate resource discovery tools is fundamentally important both in helping potential users to find the material they require, which enables reuse of digital resources, and in directing them toward repositories containing the material of interest.
At present, potential user communities are largely unaware of the digital resources available. Making basic information about archaeological archives available at the earliest possible opportunity through a resource discovery tool, such as ArchSearch (the online catalogue of the Archaeology Data Service) or tDAR (the digital repository of Digital Antiquity), increases awareness and improves reuse potential. ArchSearch contains, in the first instance, site-level metadata, for example relating to a fieldwork project archive and to SMR or NMR records. The level of detail contained in resource discovery tools such as ArchSearch and tDAR can be increased as needs dictate and resources allow. For example, ArchSearch provides an index to the National Monuments Record of Scotland (NMRS). Once researchers have located sites of interest in the ADS catalogue, they are able to jump (or ‘drill down’) from the index-level records held in ArchSearch to more detailed records held in Canmore-WEB, the online catalogue of the NMRS, for information regarding its collections. Such a system enables researchers to find the resources of interest and helps them to target visits to an archive so that they spend their time there productively. tDAR, by contrast, is resource- or project-specific rather than site-centric. Searches within tDAR will discover projects, documents, or other resources that match the search terms. Users may search by map, keyword, culture, time period, or even the text within a document. tDAR includes citation, geographic location, and brief descriptive data about over 350,000 archaeological reports in the US. A preliminary search of tDAR can provide an initial understanding of the archaeological investigations already done in any particular area. With this basic information, researchers can begin to locate reports of these investigations and other studies in their area of interest.
Access is not the only reason that existing archaeological archives are under-utilised. Too often, archives are seen as the final resting place for archaeological information, rather than as one stage in a cycle of information-gathering and reuse. As Swain (1998:14) notes, “it has also been accepted for many years that archaeological techniques and technology will improve through time. A preserved archive will therefore allow future generations to extract new information from material that may not be possible at present.” Archives form a vital and living part of the archaeological resource and should be queried during later research projects. Current government planning guidance in the UK and archaeological resource management policies in the US emphasise a preference for preservation as opposed to excavation, so increasingly, the role of the archaeological archive is moving to centre stage (see also Childs 2010 for a similar perspective and examples in the US). In order to define appropriate management and investigation strategies, the archive of ex situ material must be consulted.
Archaeological publication of fieldwork projects is moving away from traditional large monographs towards the “slim volume” or a synthetic summary of the fieldwork. Such a development increases the importance of the digital archive, which may well become the only source for primary data. This form of publication has been chosen for the Fyfield and Overton Downs project (Fowler 2000), which is an integrated monograph and Internet publication. Using the World Wide Web, readers are able to move from the high-level interpretations contained in this monograph to the minutiae of the data held in the digital archive. The project digital archive has been deposited with the Archaeology Data Service and can be remotely accessed via its catalogue, ArchSearch.
There is also new international use of archaeological data, fed by growing access to the Internet and increasing curiosity about family research and local history. New inquiries and perspectives come with this widening public scrutiny, and bring with them new stimuli for archaeologists to review the way that data and associated interpretations are recorded and disseminated. We no longer operate in a world where archaeological data are created by archaeologists for archaeologists. As digitisation and computer literacy increase, the archaeological record will become more accessible and public. Archaeologists in Britain, for example, now have the opportunity to take advantage of new developments in integrated information systems, such as the National Grid For Learning, the People’s Network Online and Cornucopia, to ensure that archaeology has a voice in the interdisciplinary partnerships envisaged between national and local government, libraries, HEIs, schools and public organisations (Condron et al. 1999, 4 Recommendation 3).
Finally, ease of access and an improved ability to share digital archaeological data provide new opportunities for scholarly and scientific research across national boundaries. Tools like ARENA2 and TAG represent the future potential of these repositories and tools, enabling centralized discovery of resources from around the world.
Bibliography
Childs, S. Terry, and Seth Kagan (2008) A Decade of Study into Repository Fees for Archeological Collections. Studies in Archeology Program, National Park Service, Washington, DC. http://www.nps.gov/archeology/PUBS/studies/STUDY06A.htm
Childs, S. Terry, editor (2010) Special Issue: The Dollars and Sense of Managing Archaeological Collections. Heritage Management 3(2):155-289.
Condron, F., J. Richards, D. Robinson and A. Wise (1999) Strategies for Digital Data – Findings and Recommendations from Digital Data in Archaeology: a Survey of User Needs. Archaeology Data Service, York.
Departmental Consulting Archeologist (2009) The Secretary of the Interior’s Report to Congress on the Federal Archeological Program, 1998-2003. Archeology Program, National Park Service, Washington, D.C. http://www.nps.gov/archeology/SRC/src.htm
Departmental Consulting Archeologist (2010) The Secretary of the Interior’s Report to Congress on the Federal Archeological Program, 2004-2007. Archeology Program, National Park Service, Washington, D.C. http://www.nps.gov/archeology/SRC/reportPdfs/2004-07.pdf
Eiteljorg, H. (2004) ‘Computing for Archaeologists’ in Schreibman, S., Siemens, R. and Unsworth, J. (eds) A Companion to Digital Humanities. Blackwell, London: 20-30.
Ferguson, L.M. and D.M. Murray (1997) Archaeological documentary archives: preparation, curation and storage. Institute of Field Archaeologists Paper 1.
Fowler, P. (2000) Landscape Plotted and Pieced: Landscape History and Local Archaeology in Fyfield and Overton, Wiltshire. The Society of Antiquaries of London, Oxbow Books.
Lipe, William D. (1997) Report on the Second Conference on Renewing Our National Archaeological Program, February 9-11, 1997. http://www.saa.org/AbouttheSociety/GovernmentAffairs/NationalArchaeologicalProgram/tabid/240/Default.aspx
McManamon, Francis P. (2000) Renewing the National Archaeological Program: Final Report of Accomplishments. A Report to the Board of the Society for American Archaeology from the Task Force Chair. Society for American Archaeology, Washington, D.C.
McManamon, Francis P. and Keith Kintigh (2010) ‘Digital Antiquity: Transforming Archaeological Data into Knowledge’. SAA Archaeological Record 10(2):37-40.
Michener, W.K., J.W. Brunt, J.J. Helly, T.B. Kirchner, and S.G. Stafford (1997) ‘Nongeospatial Metadata for the Ecological Sciences’. Ecological Applications 7(1):330-342.
Museums and Galleries Commission (1992) Standards in the Museum Care of Archaeological Collections.
Museum of London (2009) General Standards for The Preparation of Archaeological Archives Deposited with the Museum of London.
National Academies (2009) Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age. The National Academies Press, Washington, D.C.
Nature (2009a) ‘Editorial: Data’s Shameful Neglect’. Nature 461(7261):145.
Nature (2009b) ‘Opinion: Prepublication data sharing’. Nature 461(7261):168-170.
Nelson, Bryn (2009) ‘Data Sharing: Empty Archives’. Nature 461(7261):160-163.
Richards, J.D. and D. Robinson (2000) Digital Archives from Excavation and Fieldwork: Guide to Good Practice (Second Edition). AHDS Guides to Good Practice.
Schofield, Paul N., Tania Bubela, Thomas Weaver, Stephen D. Brown, John M. Hancock, David Einhorn, Glauco Tocchini-Valentini, Martin Hrabe de Angelis, and Nadia Rosenthal (2009) ‘Opinion: Post-publication Sharing of Data and Tools’. Nature 461(7261):171-173.
Science (2011) ‘Dealing with Data: Special Section’. Science 331 (11 February 2011):692-728.
Swain, H. (1998) A Survey of Archaeological Archives in England. English Heritage and Museums & Galleries Commission, London.
Watts, J. (2011) Policies, Preservation, and Access to Digital Resources: The Digital Antiquity 2010 National Repositories Survey. Publications in Digital Antiquity No.
‘About these Guidelines’. Edited by Kieron Niven and Francis Pierce-McManamon.
Archaeology Data Service / Digital Antiquity (2011) Guides to Good Practice