8. Datasets in Digital and Other Media
The use of computers is increasing in all areas of archaeology. Digital datasets are becoming an integral part of project work, and new demands are being made of archive services that have the responsibility to protect information for the future. The previous chapter explored the information needs of archaeologists; this chapter examines whether those needs have the potential to be met.
The response rates for this part of the survey were not high. Some respondents without access to a computer left blank those parts of the questionnaire on general data creation. Consequently this area was the focus of follow-on telephone interviews (Appendix 3).
8.2 Data creation - the diversity and importance of digital datasets compared with those in other media
Section 5 of the questionnaire for organisations and section 6 of the questionnaire for individuals asked for information on the range of data being created, and whether this was held in digital form. Respondents were also asked to indicate which programs they used to create digital datasets.
Figures 8.1, 8.2, 8.3 and 8.4 show the variation in the kinds of datasets being created. Separate plots are provided for responses from organisations and individuals, as there are some differences in responses. The left-hand portion of each bar on the graphs shows the percentage of each kind of dataset that is created, either digitally or in other media, where respondents did not indicate if computers were used or not. The rest of the bar represents the percentage created digitally. Where respondents filled in more than one option, the one relating to digital datasets was selected. Consequently the smaller the left-hand portion of each bar, the greater the percentage of information that is created in digital form. The numbers on each column represent the number of respondents in the category. Results are represented as a summary of all responses, though comments are made where there is variation in the production of digital datasets in different areas of archaeology. As respondents did not always answer all of the options of these particular questions, it cannot be assumed that the failure to specify a particular dataset as digital identifies that dataset as non-digital. The results are indicative of general trends only.
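The way each bar is split, as described above, can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used to produce the figures: the option labels `created` and `digital`, the function names and the example responses are all assumptions made for the sketch. It applies the two rules stated in the text: where a respondent ticked more than one option the digital one is selected, and a blank answer is left out rather than counted as non-digital.

```python
from collections import Counter


def classify(ticked):
    """Classify one respondent's ticks for a single dataset category.

    `ticked` is a set of hypothetical option labels. Where both options
    are ticked, the digital one takes precedence, as stated in the text.
    """
    if "digital" in ticked:
        return "digital"
    if "created" in ticked:
        return "unspecified"  # created, but medium not indicated
    return None  # left blank: excluded, not treated as non-digital


def bar_split(responses):
    """Return (n, % medium unspecified, % digital) for one category."""
    counts = Counter(c for c in map(classify, responses) if c is not None)
    n = sum(counts.values())
    if n == 0:
        return 0, 0.0, 0.0
    return n, 100 * counts["unspecified"] / n, 100 * counts["digital"] / n


# Invented responses for one category, for illustration only.
example = [{"created"}, {"created", "digital"}, {"digital"}, set(), {"digital"}]
n, pct_unspecified, pct_digital = bar_split(example)
```

Here `n` corresponds to the number printed on each column, `pct_unspecified` to the left-hand portion of the bar and `pct_digital` to the remainder.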
Typical products of projects
Figure 8.1 illustrates that most of the respondents to these questions are based in contracting field units, local government archaeology departments, consultancies and HEIs. Those categories of information with the smallest digital component (around 50% among individuals, higher among organisations) are photographs, plans, and context descriptions. Among respondents, HE staff show the greatest use of computers in creating this type of information. Some independent archaeologists are also creating digital datasets, in particular when producing plans and images. Good proportions of consultants use computers in their project work. Museums are poorly represented, particularly regarding digital photographs; presumably few museums have the facilities to incorporate these into their work.
Figure 8.1 Creating 'raw datasets'
As survey datasets are invariably created in digital form, it was not necessary to produce a plot. Most of those undertaking surveying work are based in field archaeology units, local government departments and museums (presumably those with field units), HEIs, national bodies, and also some independent archaeologists. There is more widespread involvement in GIS, particularly within local government departments.
Project syntheses and reports
Report writing is common throughout archaeology, and the use of computers is evenly spread across all groups (Figure 8.2). A similar picture is seen with syntheses, although with a smaller confirmed digital component. Individual returns indicate that computer-generated pictures are quite extensively created in HE, local government, national bodies and, to a lesser extent, amongst consultants and contracting field archaeologists. The responses from organisations, however, show that field units have the greatest involvement in digital imaging. The response for information on teaching material was low, although not confined solely to the HE sector, as field archaeologists, consultants and museum archaeologists also carry out some teaching and some use computers to prepare materials. Museums offer the greatest amount of material in digital form. Overall, computers may not be widely used for illustration and preparation of teaching material.
Figure 8.2 Creating project syntheses and reports
Figure 8.3 Creating museum catalogues and indices
Museums and archives
Responses to the questions relating to creating museum catalogues and indices were not high, and it was decided to collate information from both questionnaires (Figure 8.3). There was a strong response from the museums sector, and in general these are the areas where museums tend to be creating and using digital datasets. Those respondents for collections management information were not confined to museums; field archaeologists and consultants also claim to be creating and holding this kind of information. Collections management information is important for ongoing projects, although how much and in what form this information is passed to the final project holder must vary greatly. There were few responses regarding exhibition catalogues, and the majority of these are presumably created in paper form.
Sites and Monuments Records
Although a sizeable portion of archaeological project archives is held in digital form, the response from museums suggests that few can provide information on digital holdings (Figure 8.4). It is mostly local government archaeologists and those in national bodies that have digital records on project archives (NB: see the results in chapter 6 on digital archives). Additionally, archaeological project archives are created in digital form, particularly by consultants. Most responses for monument indices and detailed records came from local government archaeologists. The survey reflects the current state of SMRs in Britain, the majority of which include a computerised index. A large minority, however, do not claim to be creating a digital sites and monuments index. Most of the respondents for detailed monument information are local government departments, although some consultants and national bodies also hold this information.
Figure 8.4 Creating sites and monuments management information
There is an important and extensive body of digital information available and growing in archaeology. Those categories of information with the smallest confirmed digital component (around 50%) are photographs, plans, context descriptions, illustrations, teaching datasets and exhibition catalogues. Those with more than 75% available in digital format are geospatially referenced site lists, specialists' catalogues, geophysical survey data, GIS, reports, basic monument indices (on average) and detailed monument records (on average). This represents the potential digital archive for the future.
In addition to the datasets listed in the questionnaire, organisations were also asked to identify any other information they held in digital form that could be made available for outside use. This ranged from very subject-specific catalogues to mapping and documentary information:
Specialist information: records of ancient ship construction, tree ring data, pollen data, tephra data, archaeometry.
Databases: UADs, maritime archaeology indices, metrical database for animal bones, ceramics databases, Celtic Coin Index.
Mapping: scanned and rectified 1st edition OS maps.
Images: VR and 3D images, manuscript images.
Surveying: building surveys, photogrammetric surveys.
Other: bibliographies, abstracts, indices.
8.3 Software used to create digital datasets
The potential for archaeologists to re-use information is partly reliant on the programs used to create digital datasets. The use of specialist programs, or ageing packages, can reduce the potential of other users to access datasets. The questionnaires asked for details of the programs used to create digital datasets. Table 8.1 lists the range used, showing the five most popular packages for each category of information. The table collates the returns from individuals and organisations, as there was little difference between the two.
Table 8.1 Programs most commonly used by archaeologists
Rank | Software used for text reports | ... for catalogues | ... for plans and images | ... for mapping contour and geophysical surveys
1st | Microsoft Word (327) | Microsoft Access (146) | AutoCAD (106) | AutoCAD (29)
2nd | Word Perfect (105) | Excel (81) | Photoshop (62) | Surfer (27)
3rd | Aldus Pagemaker (15) | Microsoft Word (76) | Corel Draw (59) | Geoplot (20)
4th | AmiPro (12) | DBase (28) | Paintshop (25) | Insite (11)
5th | Wordstar (10) | Fox Pro (18) | ArcView (23) | Arcinfo (10)
Others | 21 other programs | 43 other programs | 45 other programs | 33 other programs
Clearly Microsoft programs are most widely used for text reports and to a lesser extent for compiling catalogues and databases. The variety of programs used to create images and for work with surveying data shows that archaeologists are employing the more extensive range of software commercially available. The greatest variation in software is seen amongst local government, field archaeologists and museum archaeologists, where software deals on behalf of local government may be negotiated by those with little experience or knowledge of the needs of archaeologists employed within their organisation.
This diversity in data creation results in difficulties with the re-use of digital data from completed projects. Section 6.4 illustrated that documentation containing details of the software used to create files, and compatible programs, may not be available in just under 50% of the organisations sampled.
Computers are being used to build up an extensive body of data, covering all areas of archaeology. When Table 8.1 is compared with the information needs identified in Tables 7.3-7.4, we see that archaeologists will be able to obtain records in digital form in the future. This is, however, dependent on datasets being archived, made available for re-use, provision of the relevant support documentation, and the user having access to software and support to run these datasets. Strategies for Digital Data shows that current practice often fails to meet these requirements.
8.4 Use of standards and thesauri in archaeology - what hope for cataloguing and information retrieval?
An important aspect of the usability of information is the extent to which common standards in terminology and practice have been applied in its creation. Unless standards are used in at least some parts of data creation, basic and essential tasks like cataloguing and locating information resources become chaotic and complex. Commonly agreed terms for objects, places, time-periods, materials and so on can be built into cataloguing systems (often the case for museum indices), and facilitate information retrieval. The task of cataloguing, however, can be slow if the projects being archived do not conform to international, national or even local standards.
Organisations were only asked for details of the standards they used in the creation of digital data (question 5.7 on the questionnaire for organisations). In total, 108 responded to these questions, and details are shown in Table 8.2.
Table 8.2 Standards used in digital data creation
Standard/guideline | No. using this | Comment
RCHME thesaurus of monument types | 33 | Used in all areas, particularly local government and national bodies
In-house standards | 25 | In all areas, particularly museums and local government
MDA object name | 11 | Mainly consultants, museum archaeologists and national bodies
MDA Spectrum | 11 | Mainly museums
MIDAS | 9 | Used by a few in all areas
EH Guidelines | 6 | -
None | 14 | Mainly field archaeology units
Table 8.3 List of the ways organisations locate spatial data
Method of locating spatial data | No. using this | %
OS (Britain, Northern Ireland, Republic of Ireland), including letters | 132 | 80%
OS (Britain, Northern Ireland, Republic of Ireland), letters converted to 100km ref. | 52 | 32%
GPS | 13 | 8%
Latitude/Longitude | 4 | 2%
Postal address | 37 | 22%
As well as those listed in Table 8.2, other thesauri and guides that were mentioned included: ADS guidelines, BM materials thesaurus, Cadw guidelines, IFA guidelines, MODES (in-built thesaurus), MPRG, Society of Museum Archaeologists guidelines, and use of European and international standards (e.g. The Getty Information Institute's Art and Architecture Thesaurus). Although many organisations have developed their own standards, they often stated that these were based on guides produced by the RCHME and MDA. Many organisations, however, stated that they did not use thesauri in their work.
Another area where standards are important is geospatial referencing, and organisations were also asked to indicate the ways in which they locate spatial data. A list of six options was provided in the questionnaire and space for additional free text (see 5.7 in Appendix 2). In addition to those listed, a handful of organisations also use parish name, local authority area, site name and their own map numbers to locate places and finds. In total, 165 responded to this part of the questionnaire. Table 8.3 provides the details.
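The percentages in Table 8.3 appear to be calculated against the 165 organisations that responded, not against the total number of ticks. The short check below, using only the counts given in the table, reproduces them under that assumption; the variable names are illustrative.

```python
respondents = 165  # organisations answering this part of the questionnaire

# Counts as given in Table 8.3.
counts = {
    "OS grid, including letters": 132,
    "OS grid, letters converted to 100km ref.": 52,
    "GPS": 13,
    "Latitude/Longitude": 4,
    "Postal address": 37,
}

# Percentage of responding organisations using each method.
percentages = {
    method: round(100 * n / respondents) for method, n in counts.items()
}

# Total ticks exceed the number of respondents, so the percentages
# cannot be expected to sum to 100.
total_selections = sum(counts.values())
```

Because `total_selections` is larger than `respondents`, many organisations must be using more than one method of spatial referencing.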
Figure 8.5 Are standard thesauri of archaeological terms for the British Isles of no relevance to your work?
Panels: Individual response (298); Organisational response (169)
The great majority of organisations use the National Grids set up by the Ordnance Surveys of Britain and Ireland, maintaining the letters relating to 100km grid squares. Although some convert these grid square references to numbers, this is always in addition to the more traditional style. Other referencing systems are used (postal address etc.), and these are almost without fail implemented alongside the OS grids. Very few use latitude/longitude, although it is applied in marine archaeology.
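The conversion of grid letters to numbers mentioned above can be illustrated for the two-letter National Grid of Great Britain. The sketch below is an assumption-laden example, not a description of any respondent's practice: the function name is hypothetical, it does not handle the single-letter Irish Grid, and it does not check that the resulting square falls within the area actually covered by the grid. It relies on the standard arrangement of the grid letters (A to Z omitting I, laid out in 5 x 5 blocks).

```python
def os_grid_to_numeric(ref):
    """Convert a British National Grid reference such as 'TQ 300 800'
    to full numeric (easting, northing) in metres."""
    ref = ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or any(
        not ("A" <= ch <= "Z") or ch == "I" for ch in letters
    ):
        raise ValueError("expected two grid letters (A-Z, excluding I)")
    if len(digits) % 2 != 0 or len(digits) > 10:
        raise ValueError("expected an even number of digits, at most ten")
    if digits and not digits.isdecimal():
        raise ValueError("grid reference digits must be numeric")

    def index(ch):
        # Position in the 25-letter grid alphabet, which skips 'I'.
        i = ord(ch) - ord("A")
        return i - 1 if ch > "I" else i

    l1, l2 = index(letters[0]), index(letters[1])
    # The first letter selects a 500 km square, the second a 100 km
    # square within it; rows of letters run from north to south.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = 19 - (l1 // 5) * 5 - (l2 // 5)

    half = len(digits) // 2
    # Pad each half to five digits to give metres within the 100 km square.
    easting = int(digits[:half].ljust(5, "0")) if half else 0
    northing = int(digits[half:].ljust(5, "0")) if half else 0
    return e100km * 100_000 + easting, n100km * 100_000 + northing
```

For example, `os_grid_to_numeric("TQ 300 800")` gives the south-west corner of the 100 m square as `(530000, 180000)`. Keeping the lettered form alongside the numeric one, as respondents report doing, means a record can still be read by people used to either style.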
Information was also sought from both organisations and individuals about the use of archaeological thesauri. Although Table 8.2 implies that the use of standards is not widespread, Figure 8.5 shows general support for the use of archaeological thesauri.
There is support for the use of standard thesauri, particularly among archaeologists based in national bodies, museums, local government departments, library/archives and, at an organisational level, field archaeologists and consultants. The lower levels of support for thesauri amongst those based in HEIs may be explained if their research interests lie beyond Britain. The returns from individual field archaeologists and consultants, however, cannot easily be explained. Society members may not have the time or interest in using standards for any archaeological work.
Although there is common support for the use of standards in archaeology, the extent to which these are being implemented is unclear. Without standardised terminology giving at least some indication of the time period, subject and scope of projects, and of any spatial location, it can be difficult to enter basic details of projects into indices (be they abstracts, SMRs etc.). Without detailed catalogues, locating information resources is very time-consuming and rarely effective. Attempts to link indices held by different organisations, or those created at different times within the same organisation, rely on common terminology. Standards are a starting point and assist in the communication of ideas, and while research can only proceed by pushing at boundaries, there is still the need for a common language within archaeology, as indicated by a couple of comments from our questionnaire:
There are no guidelines issued in the Republic of Ireland on creating a digital archive, standardising of data etc. I, for one, would welcome such guidelines.
My recent experiences of using archive material have suggested that they are only useful if standardised, and if cleaned. In our field (faunal remains) this has rarely been done.
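The dependence of linked indices on common terminology can be shown with a small sketch. Everything in it is invented for illustration: the two-entry thesaurus, the record identifiers and the function names are assumptions, and the terms are not taken from the RCHME or MDA thesauri. It simply maps each locally used term to a preferred term and groups records from two indices under the terms they share.

```python
# Hypothetical mini-thesaurus: preferred term -> accepted variants.
THESAURUS = {
    "ROUND BARROW": ["round barrow", "tumulus", "barrow (round)"],
    "HILLFORT": ["hillfort", "hill fort", "hill-fort"],
}

# Reverse lookup from each variant to its preferred term.
LOOKUP = {
    variant.lower(): preferred
    for preferred, variants in THESAURUS.items()
    for variant in variants
}


def normalise(term):
    """Return the preferred term, or None if the term is not recognised."""
    key = " ".join(term.lower().split())  # ignore case and extra spaces
    return LOOKUP.get(key)


def link_indices(index_a, index_b):
    """Group record ids from two indices under shared preferred terms."""
    linked = {}
    for source, index in (("a", index_a), ("b", index_b)):
        for record_id, term in index.items():
            preferred = normalise(term)
            if preferred is not None:
                linked.setdefault(preferred, []).append((source, record_id))
    return linked


# Invented records from two organisations using different wording.
index_a = {"SMR-001": "Hill fort", "SMR-002": "Tumulus"}
index_b = {"MUS-17": "hillfort", "MUS-18": "cairn"}
linked = link_indices(index_a, index_b)
```

In this example the two hillfort records are brought together despite different spellings, while the record described as "cairn" is left unlinked because the term is absent from the thesaurus, which is the situation that arises when in-house terms have no agreed equivalent.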
The survey population has representation from all areas of archaeology, and this is reflected in the diversity and range of information being created. The following points arise from this section:
The use of computers varies, but digital datasets form an element of all categories of data (as listed in the questionnaire).
There is a growing body of important information being created solely in digital format.
Those categories of information with the smallest confirmed digital component (around 50%) are photographs, plans, context descriptions, illustrations, teaching datasets and exhibition catalogues.
Those with more than 75% available in digital format are geospatially referenced site lists, specialists' catalogues, geophysical survey data, GIS, reports, basic monument indices (on average) and detailed monument records (on average).
While the majority of project work is being created by professional archaeologists, some independent archaeologists also create digital datasets.
Digital datasets are created in a wide range of programs. Although Microsoft products dominate, our survey identified 162 other programs in use.
It is unclear whether common, agreed standards are being implemented, despite obvious support for this.