GIS emerged from three principal roots: the need for data analysis and display tools, the automation of map production, and landscape architecture and environmentally sensitive planning. Although GIS have been available since the 1960s, it is only in recent years that hardware and software have become sufficiently powerful and inexpensive for their use to become widespread.
The scales used by HERs will vary but the basic mapping is usually 1:10,000 with more complex areas such as historic towns mapped at 1:2,500 or even 1:1,250. Some HERs have mapped directly on to paper or film copies of the OS maps, whereas others use overlays. The advantage of the latter approach is that the overlay is independent of the OS map base, which changes over time.
In practice, most HERs will have a variety of types of maps that have been developed for different purposes and projects. For example, crop-mark sites may be plotted on separate overlays, which can be placed over the main HER maps to enable the user to see the crop-mark features within each HER monument.
Maps form a fundamental tool without which HERs would be unable to function. However, paper maps have their limitations. It can be difficult to keep the map base itself up-to-date. Maps can be time-consuming to use and can be viewed by only a few people at any one time. Only a limited amount of information can be shown on one set of maps or overlays, making it harder to carry out assessment and analysis. For these reasons, most HERs have adopted or are exploring the use of GIS.
As a result, many local authorities have established or are establishing corporate GIS-based databases, helping to avoid duplication of effort, make best use of resources and bring together datasets which were becoming fragmented. Linking an HER dataset into a corporate GIS means that the HER data can be displayed and related to other datasets held in the authority. These may be topographical, such as contours and rivers, or other planning constraints such as conservation areas and SSSIs. This opens up new possibilities for taking a more integrated approach to planning and conservation.
GIS also opens up avenues for analysis and research into the historic environment. As desktop GIS software develops and its power continues to grow, there is increased potential for analysis and visualisation of datasets, for example in three dimensions or in virtual-reality models. Recent development of web browsers incorporating GIS is enhancing the potential for sharing and display of information through corporate intranets and the internet.
There are now many books on the uses of GIS in archaeology, including edited volumes illustrating the uses of GIS for research and management of projects (see for example Gillings et al 1999, Lock 2000, Westcott and Brandon 2000) and also more general sources (Wheatley and Gillings 2002, Connolly and Lake 2006). The ADS's ''GIS Guide to Good Practice'' (Gillings and Wise 1998) provides practical guidance for individuals and organisations involved in the creation, maintenance, use and long-term preservation of GIS-based digital resources and also provides specific advice for HERs.
The aim of this section of the manual is to provide a primer on some of the issues to consider before embarking on system development. It does not set out to review general functionality of GIS in any detail or to review the current GIS market place. Rather, it sets out some of the considerations to be taken into account in establishing a GIS for an HER, and some of the benefits that can be gained through successful implementation.
Many local authorities will have either departmental or corporate policies governing GIS. These may include standards for hardware and software, data standards and policies for access. GIS is well suited to a corporate approach to data management, since it can bring together information from different sources, and even different data types into a single, spatial view. For example, GIS allows users to select a location (for example a property address) and to display text information from a database of planning proposals, a listed buildings database, an HER database or other digital information such as a scanned property deed from the record office or photographs from the engineer's department.
As with most computing, the continuing emphasis on communications and IT in higher education ensures that there is a growing awareness of GIS amongst recent graduates. For existing HER staff, training in the corporate GIS is generally available either from the local authority or from commercial training providers. There are also courses offered by university continuing education departments and others on the use of GIS in archaeology and for conservation.
Vector data is therefore similar to data in a CAD package. Each element in the layer is represented by some geometric entity such as a point, line, or polygon. The process of creating vector data is similar to drawing, either with a digitising tablet or by drawing objects on the computer screen, and can be time consuming and expensive. Vector representation has the advantages of providing a compact data-storage format and allowing scalable presentation. Because it is based on geometric objects, it is straightforward to link these to text-based records. Vector representation also permits easy quantification of areas and some analytical methods such as network analysis. Ordnance Survey Landline mapping, captured at base scales of 1:1,250, 1:2,500 and 1:10,000 for urban, rural and upland/moorland areas respectively, is an example of vector data (in this case containing many different layers). Increasingly, though, OS Landline mapping is being replaced by the topography layer of OS MasterMap®.
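The 'easy quantification of areas' mentioned above can be illustrated with a minimal sketch. It assumes a simple (non-self-intersecting) polygon whose vertices are National Grid eastings and northings in metres; the function name and the example boundary are hypothetical and are not taken from any particular GIS package.

```python
def polygon_area(vertices):
    """Return the area (in square metres) of a simple polygon.

    vertices: list of (easting, northing) pairs in metres, in order
    around the boundary. Uses the 'shoelace' formula.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap round to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical monument boundary: a 40m x 25m rectangle
boundary = [
    (412300, 356800),
    (412340, 356800),
    (412340, 356825),
    (412300, 356825),
]
area_m2 = polygon_area(boundary)  # 1000 square metres
```

Most GIS packages provide this calculation as a built-in function; the sketch only shows why it follows directly from the stored geometry.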
Raster layers are more similar to digital images, as they are made up of a grid of cells, each of which contains a value at a particular location. Raster data is usually generated automatically by scanning paper originals, or obtained from digital sensors (in cameras, or satellites) and is therefore often rapid and cheap to generate. However, the quality of the raster dataset is dependent both on the resolution at which the image is captured and on the quality of the locational data fixing the position of the image within the GIS. Raster data is particularly suitable for applications requiring display of fine detail (for example aerial photography or historical mapping) but also facilitates many forms of terrain analysis and simulation modelling. Raster data may be dichotomous (that is, cell values are either 1 or 0, to provide a black and white picture) or continuous, where each cell may be assigned a range of values. Examples of widely-used raster data sources include the OS Siteplan® data (for scales between 1:500 and 1:2,500) and Ordnance Survey 1:10,000 and 1:25,000 scale raster products. Raster data is almost always supplied in pre-defined areal units or 'tiles' based on the OS National Grid.
Typical vector applications include spatially referenced database applications – for example, location maps, sites, monuments, artefacts – mapping applications, managing networks (such as roads and utilities) and terrain analysis using TIN elevation models. Raster themes are often employed to analyse continuously varying layers such as slope, elevation or resistivity and remote-sensed data such as satellite imagery. Analyses that employ raster data typically include neighbourhood analysis and overlay operations (for example reclassifying two separate maps of land use and height to obtain an intersected model of land use at height), simulation modelling, predictive modelling, decision support, cost surface and optimum route analysis and visibility analysis.
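The overlay operation described above (reclassifying land use and height and intersecting the two) can be sketched in a few lines. This is an illustration only: the two 3 x 3 grids, the class names and the 100m threshold are invented for the example, and a real GIS would operate on much larger georeferenced rasters.

```python
# Two hypothetical rasters covering the same area, cell for cell
land_use = [
    ["arable", "arable", "pasture"],
    ["arable", "pasture", "woodland"],
    ["pasture", "woodland", "woodland"],
]
elevation = [  # metres above datum
    [40, 55, 90],
    [60, 110, 130],
    [95, 140, 180],
]


def reclassify(grid, threshold):
    """Reduce a continuous elevation raster to two height classes."""
    return [
        ["upland" if value >= threshold else "lowland" for value in row]
        for row in grid
    ]


def overlay(grid_a, grid_b):
    """Combine two aligned rasters into one raster of (a, b) pairs."""
    return [
        [(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(grid_a, grid_b)
    ]


height_class = reclassify(elevation, 100)
combined = overlay(land_use, height_class)
# combined[1][1] is ("pasture", "upland")
```

The same cell-by-cell logic underlies many raster analyses; it only works if both layers share the same cell size, extent and georeferencing.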
Many themes could be represented by either vector or raster data models. Terrain, for example, can be represented either by a vector model, using a network of triangles (referred to as a Triangulated Irregular Network or TIN), or by a raster altitude matrix in which each cell contains the elevation at that location. The choice of representation depends on a range of factors, including the capabilities of the software, availability of source data and the intended uses of the data.
Fortunately, most major GIS now work with both types of information, and can use them effectively together. Many forms of analysis (such as visibility analysis or hydrological modelling) can be undertaken using either raster or vector layers, and the two can also be employed together as, for example, when a satellite image is 'draped' over a vector terrain model, creating a 'digital landscape' which a user can explore (rather like a virtual reality 'fly through' (see Figure 39)). It is also possible to automatically convert data from vector to raster and – with some limitations – from raster to vector when needed.
It is important to note that users of third-party data should be aware of how the data was created, if good control of spatial accuracy is to be obtained. Ideally, spatial data should be supplied with metadata that records how and when the data was captured, and how it was georeferenced to the National Grid. This is particularly relevant with third-party surveys, where it is vital to fully record and understand the precision and accuracy of the survey and the methods used to georeference it. The widespread use of Global Positioning Systems (GPS) to undertake new surveys has recently made this even more pertinent (see below).
Most large GIS systems will provide mechanisms to store and manage all three of these either 'in-house' or by allowing links to be made with external data sources, as where a polygon on a GIS layer is linked to an event record in a conventional database.
Georeferencing usually comprises two steps. Initially, the data will be digitised in whichever system of coordinates is used in the original map. For vector data, this will involve calibrating the digitising device to the coordinate system on the map (it is also important to try to estimate and record the level of precision to which the map is digitised, as this contributes to the accuracy of the produced data). For scanned data, the initial stage of georeferencing involves at a minimum locating the corners of the raster grid in geographic coordinate space. In many cases, such as aerial photographs, this is not sufficient to accurately georeference all the raster cells in the theme and it is necessary to set up a more complex coordinate transformation between the raster and the coordinate system, usually by identifying control points on the raster and entering their known coordinate positions. In the UK the coordinate system used for georeferencing is usually the OS National Grid (more properly called OSGB36, see below) and if all data is recorded within this system, it may not be necessary to delve further into the complexities of georeferencing.
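As an illustration of the control-point method, the sketch below fits the simplest useful transformation, an affine one, from exactly three control points. The function names and coordinates are hypothetical. Real georeferencing tools normally use more control points with a least-squares fit, and aerial photographs generally need more complex transformations than this.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def solve3(m, v):
    """Solve the 3 x 3 linear system m * x = v by Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("control points are collinear")
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = v[i]
        solution.append(det3(mk) / d)
    return solution


def fit_affine(control_points):
    """Fit easting = a*col + b*row + c and northing = d*col + e*row + f.

    control_points: three ((col, row), (easting, northing)) pairs.
    """
    m = [[float(col), float(row), 1.0] for (col, row), _grid in control_points]
    eastings = [grid[0] for _pixel, grid in control_points]
    northings = [grid[1] for _pixel, grid in control_points]
    return solve3(m, eastings), solve3(m, northings)


def pixel_to_grid(coeffs, col, row):
    """Apply the fitted transformation to one raster cell position."""
    (a, b, c), (d, e, f) = coeffs
    return (a * col + b * row + c, d * col + e * row + f)


# Hypothetical 1000 x 1000 pixel scan at 0.5m per pixel, north at the top
control_points = [
    ((0, 0), (450000.0, 250500.0)),     # top-left corner
    ((1000, 0), (450500.0, 250500.0)),  # top-right corner
    ((0, 1000), (450000.0, 250000.0)),  # bottom-left corner
]
coeffs = fit_affine(control_points)
centre = pixel_to_grid(coeffs, 500, 500)
```

With more than three control points the residuals at each point give a useful estimate of georeferencing accuracy, which is worth recording in the metadata.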
To integrate OSGB36 data themes with sources of data referenced to other coordinate systems, however, it is necessary for the spatial database to also have a full description of both coordinate systems. This is required, for example, for the integration of digitised map data with independently recorded GPS readings. In this case, features on the map will 'know' their OS National Grid locations, but the GPS readings may be in a different GPS coordinate system (for example WGS84). Unless the spatial database contains information about how each of these coordinate systems relate to a common reference system, it will not be possible to visualise or analyse these two data themes together. This usually involves recording at least the projection and the horizontal datum of the coordinate system that is used. This should be printed on the maps, or can be found in publications of the agency who defined or maintain the coordinate system.
It is usually this second part of georeferencing which leads to confusion, although fortunately many contemporary GIS are now supplied with a wide range of pre-configured coordinate systems, projections and datums which can make things much easier. In order to fully understand georeferencing, however, it is advisable to have some basic understanding of geodesy: the study of the shape of the earth and the determination of the exact position of geographical points. This is increasingly important because of the growth of surveys undertaken with GPS receivers and is particularly problematic when it comes to integrating data about topographic height.
The widespread availability of accurate, globally-referenced survey coordinates has recently rendered traditional triangulation networks effectively obsolete, and the National Grid is no longer defined in those terms. Ordnance Survey have therefore established a new national positioning infrastructure based on the European Terrestrial Reference System (ETRS89). This is maintained by a group of permanently installed GPS receivers around Great Britain, referred to as Active Stations. These coordinates are now used to define OSGB36 through specifying how to convert between ETRS89 coordinates and OSGB36. The OS provides transformations for both plan position (National Grid Transformation, or OSTN02) and for height values (National Geoid Model, the OSGM02).
Collecting accurate GPS data in the UK that can be accurately positioned on existing data usually now involves (i) establishing the receiver's position in the ETRS89 system and then (ii) converting those surveyed coordinates using OSTN02 and OSGM02 to their equivalent OSGB36 coordinates. These stages may be undertaken either at the time of survey or later, and may be done within the GPS receiver itself, or in post-processing software.
Older or simpler GPS data may provide coordinate values in the World Geodetic System (WGS84). This differs from ETRS89 in that WGS84 is not tied to any point on the earth's surface. As such, it can be problematic to accurately convert coordinates in WGS84 and similar global systems to map coordinates because the surface of the earth is not stationary (plate tectonics can move two positions on the earth's surface by as much as 2cm in one year).
An excellent introduction to the issues arising from use of GPS-derived coordinates, from an archaeological perspective, can be found in Ainsworth and Thomason (2003).
The National Grid also allows for absolute references expressed in a fully numeric format. For example the reference 345678987654 refers to a location 345678m east 987654m north of the origin. This numerical format for references is convenient as most GIS packages do not recognise the letters associated with map sheets, using co-ordinate systems that depend entirely upon numeric fields.
Ideally, a location will be identified to the nearest metre within the National Grid. However, it is not uncommon to use less precise references – to the nearest 10 or 100m – by omitting the least significant digits. For example, a reference with eight figures after the letter code (for example SK12345678) or with only 10 digits (3456898765) refers to a location with a 10m precision, while SK123568 or 34579877 have 100m precision. Clearly it is vital to treat locations supplied with 10 or 100m precision references with care: they frequently need to be interpreted to mean 'the location is somewhere in the square whose origin is specified by the grid reference' as opposed to 'the location is within 10 or 100m of the reference'. For this reason the original (source) formats of coordinates should be stored in HERs, as well as any 'GIS friendly' derived coordinate values (see also section B.6.8).
Note that, although most absolute grid references will be expressed in twelve figures, all site locations in OS 100km map squares commencing with H (all of Shetland and much of Orkney) will have seven-figure northings, whilst the map squares NA, NF and NL (all covering the Western Isles) and SV (covering the Isles of Scilly) have only five-figure eastings.
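The conversion from a lettered grid reference to fully numeric coordinates can be sketched as follows. This is a minimal illustration rather than a validated tool: the function name is hypothetical, it does not check that the 100km square actually falls within Great Britain, and it returns the south-west corner of the square that the reference identifies together with the precision implied by the number of figures.

```python
def parse_grid_ref(ref):
    """Convert e.g. 'TQ 724 876' to (easting, northing, precision_m).

    The coordinates are those of the south-west corner of the square
    identified by the reference; precision_m is the side of that square.
    """
    ref = ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not letters.isalpha() or "I" in letters:
        raise ValueError("expected two grid letters (the letter I is not used)")
    if len(digits) % 2 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError("expected an even number of figures, at most ten")

    # Letter positions in a 5 x 5 alphabet that omits I
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # Index of the 100km square east and north of the grid origin
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)

    half = len(digits) // 2
    precision_m = 10 ** (5 - half)
    east_part = int(digits[:half]) * precision_m if half else 0
    north_part = int(digits[half:]) * precision_m if half else 0
    return (e100km * 100000 + east_part, n100km * 100000 + north_part, precision_m)


four_figure = parse_grid_ref("TQ 77 89")   # (577000, 189000, 1000)
six_figure = parse_grid_ref("TQ 724 876")  # (572400, 187600, 100)
shetland = parse_grid_ref("HU 47 41")      # (447000, 1141000, 1000)
scilly = parse_grid_ref("SV 90 10")        # (90000, 10000, 1000)
```

The Shetland and Scilly examples show the seven-figure northing and five-figure easting cases noted above. Storing the returned precision alongside the derived coordinates preserves the information that would otherwise be lost when a short reference is padded with zeros.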
An online tutorial on the National Grid can be found in the education section of the OS website http://www.ordnancesurvey.co.uk/resources/maps-and-geographic-resources/the-national-grid.html
One element of the user requirement is likely to be a list of the functions that the GIS is intended to perform. A useful source of advice is the Functional Requirement Specification for GIS (LGMB 1991), available from the Improvement and Development Agency, formerly the Local Government Management Board (LGMB). This includes a catalogue of GIS functions, which can be used as a 'checklist' to compare different software products and to assess if any customisation might be required and what skills would be needed to achieve the desired outcome. Target response times for operations that are important to users can provide a useful benchmark and can be used to make sure that the users' expectations and the developer's system performance targets are aligned. For example, if the identification of all records falling within an administrative boundary will be a frequent enquiry what would be the maximum acceptable time for this to take?
Scanning can be undertaken in-house, although scanners large enough to process whole maps are expensive (approximately £3,000 - £15,000 depending on features) and time consuming to set up and use. More commonly, the scanning of a document archive will be placed in the hands of an outside agency and it is therefore vital that a clear job specification is established in advance. This should cover the resolution of the scans, whether they will be monochrome, greyscale or colour, the file format and compression to be used and should also make clear what quality checks will be undertaken and whether the job includes user-intervention to clean the data after scanning. Agencies will probably supply scanned data on CD-ROM or DVD-ROM and it is advisable to make archive copies of these original materials quickly and store these under archive conditions.
Digitising paper sources is far more time consuming, as it involves attaching the map to a digitising tablet and tracing over the different data elements to, effectively, 'draw' the required data manually. To some extent, digitising in this way has been overtaken by the use of 'heads up digitising' (see below) and by automated raster-to-vector tools but it remains true that the creation of accurate, well-structured, topologically correct vector data requires a considerable level of human intervention and is an order of magnitude more time consuming (and hence expensive) than scanning. The pay-off for this effort, however, is that the resulting data is a great deal more structured and useful for analytical purposes than any raster product. As with scanning, it is possible to undertake digitising 'in-house' using digitising tablets. These are available in sizes from A5 up to A0 although larger tablets are expensive to purchase and difficult to support. The vast majority of HERs are unlikely to be able to justify purchase of this kind of equipment unless it is as part of a wider institutional project supporting other areas as well as archaeology. As such, digitisation projects are likely to be undertaken by outside contractors as with scanning and it is even more vital that a thorough agreement is made in advance covering the accuracy, precision, quality and format of supplied data.
'Heads up' digitising refers to the two-stage process whereby maps are initially scanned, and then vector data is traced from the scans using the computer screen rather than the digitiser. Increases in computer power and screen resolutions in recent years have made this an attractive method of digitising data, which can often be undertaken by users themselves on an 'as needed' basis (although it is difficult to establish good quality standards in this way). This also makes it more attractive for HERs to scan paper archives because, if collected with the possibility of heads-up digitising in mind, then this is now an open-ended strategy that does not preclude generation of vectors in the future.
Moving between vector and raster data is possible in both directions, although it is important to understand the limitations of this. It is relatively straightforward to generate raster data themes from vector data (for display or for analysis in raster-based GIS) and most commercial GIS will provide functions for this. Generating vector data from raster (scanned) themes is, however, far more difficult as the computer needs to make 'intelligent' choices about how the pixels in the raster relate to geometric entities (lines, areas and so forth) in a vector theme. Nevertheless, software is available which will take scanned maps, such as contour maps, and generate vector outputs in the same way that Optical Character Recognition (OCR) software can generate text from scanned documents. These can be a valuable first step in generating vector data, reducing some of the tedious line-following procedures, but (as with OCR software) it is important to realise that none of these are foolproof, and that automatically generated vector themes will still require considerable user intervention to produce topologically-complete, clean vector data. There is a strong argument for HERs to refrain from becoming involved in decisions about how vector data is generated, but rather to concentrate on setting down the specification (in terms of quality, accuracy, precision and so forth) of the data that is required, and then leaving it to agencies to decide if that is best delivered by full automation, partial automation or human intervention.
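The 'relatively straightforward' direction, vector to raster, can be sketched as below. The approach shown (marking a cell as 1 when its centre falls inside the polygon, using a ray-casting test) is one common convention and is an assumption of this example; commercial GIS may use other rules. All names and coordinates are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterise(polygon, west, north, cell_size, ncols, nrows):
    """Return a dichotomous raster (1 inside, 0 outside).

    (west, north) is the north-west corner of the grid; row 0 is the
    northernmost row, as in most image formats.
    """
    grid = []
    for row in range(nrows):
        centre_y = north - (row + 0.5) * cell_size
        grid.append([
            1 if point_in_polygon(west + (col + 0.5) * cell_size, centre_y, polygon) else 0
            for col in range(ncols)
        ])
    return grid


# A 20m x 20m square rasterised onto a 4 x 4 grid of 10m cells
square = [(10, 10), (30, 10), (30, 30), (10, 30)]
cells = rasterise(square, west=0, north=40, cell_size=10, ncols=4, nrows=4)
```

The reverse conversion has no comparably simple rule, which is why automatically generated vectors need the manual checking described above.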
Ordnance Survey provide both raster products and also vector data sets. Raster products are essentially scanned versions of OS maps available at a variety of scales and useful as backdrops for creating, for example, constraint maps. Raster data is available at 1:250,000, 1:50,000 and 1:25,000 scales for different mapping purposes.
Ordnance Survey vector datasets are also available at a variety of spatial scales, and with a wide variety of thematic information. Among the more useful of these are the Landform PROFILE® dataset, which represents contours derived from 1:10,000 scale mapping, and the LandLine® dataset, which contains layers representing a wide range of features of the man-made environment including houses, factories, roads, and administrative boundaries as well as heritage features. LandLine® data is scaled according to the region with 1:1,250 scale data for urban areas; 1:2,500 scale in rural areas; and 1:10,000 scale for remote areas such as mountains and moorland.
The Ordnance Survey's digital datasets (particularly LandLine) are currently being replaced with a new delivery mechanism for GIS data called OS MasterMap® (see below).
Other commercial digital map sources are available, although none can compete for completeness or up-to-date survey with the Ordnance Survey. However, for some tasks where OS data is not available, it may be possible to turn to providers such as Bartholomew (http://www.bartholomewmaps.com/) for particular digital map data supplies.
One of the most widely used sources for digital data is aerial photography, which may be held by larger HER maintainers in the form of negatives or prints. These can be scanned and georectified to provide not only data relating to crop and soil mark sites, but also additional map detail that may not be available in commercial mapping. Aerial photographic coverage can also be purchased commercially or licensed from the Ordnance Survey or from a range of commercial resellers such as Getmapping® (http://www1.getmapping.com/).
OS MasterMap® comprises topography, imagery, address and ITN (integrated transport network) layers. Based on the National Grid, the topographic layer contains information on every landscape feature – including buildings, roads, archaeological features – and represents a significant evolution from traditional cartography. MasterMap® depicts the real world digitally and presents this information as themes in a series of layers, each layer carrying millions of features. Each feature has its own unique identifier or TOID® – a 16-digit reference number that can be shared with other users across different applications and systems. This allows easier data association and greater accuracy, focusing on real-world objects on the map. In addition, the Ordnance Survey have released a high quality Imagery Layer whose images have been fully orthorectified to represent truly and accurately what is on the ground. The Imagery Layer is available at 25 cm resolution and 24 bit colour. The Address Layer of OS MasterMap® provides precise coordinates for more than 26 million residential and commercial properties in Great Britain. The ITN layer, which supports business needs from navigation to asset management and from traffic analysis to accessibility studies, is probably of less use to the HER officer.
Users of MasterMap® data should also be aware that, in addition to the data format itself, the delivery of data with improved positional accuracy (see Positional Accuracy Improvement programme below) differs between MasterMap® and older formats and digital products such as Landline®, Profile® and Panorama®.
Further details can be obtained from the OS website http://www.ordnancesurvey.co.uk/business-and-government/products/mastermap-products.html.
In simple terms, data that has been through PAI will have better absolute and relative accuracy than previously supplied data, but at the expense that it may no longer match existing (legacy) data products. Where HERs contain data that has been created by reference to OS data products, there is now a possibility that this data will appear to be in error because the underlying OS mapping has 'moved' slightly. This movement should not be excessive, but may in some cases result in changes of up to a few metres.
It would therefore represent good practice for HER officers to undertake an audit of the data they are responsible for maintaining to assess how that data was originally created. The spatial elements of most databases were created against 1:10,000 paper maps with grid references expressed to the nearest 6 or 8 figures (that is to nearest 100m or 10m) and should not be affected by PAI. However, if the boundaries of an HER record are, for example, delineated along the representation of field boundaries on a 1:2,500 tile, then the shape of that land parcel may be altered through the transformation processes resulting in errors in the HER data. Those errors may include spatial (position) errors, but may also potentially result in topological errors (such as 'slivers' and 'gaps') where data is processed against new OS datasets.
Where migration to post-PAI Ordnance Survey data is necessary, then a variety of assistance and tools are available from OS and from third-party GIS manufacturers to help update HER data to match OS mapping. These include 'link files' of coordinate corrections and processing tools which, in combination, can provide for automatic (or semi-automatic) updating of spatial data themes according to the known changes in spatial position.
Ordnance Survey have produced a series of documents including a “PAI Companion” that explains the workings and implications of the PAI in more detail. This can be downloaded in PDF format from the OS website, which also contains the most up-to-date information about the progress of PAI (see http://www.ordnancesurvey.co.uk/business-and-government/help-and-support/navigation-technology/pai.html).
Precision and accuracy become significant because GIS makes it possible to compare disparate datasets. These issues come to the fore in GIS because vector displays can give a spurious impression of highly precise and accurate mapping. Scale, the ratio between distances on the map and in real space, can be manipulated almost infinitely in a vector GIS, as areas are 'zoomed' or 'panned' to suit. However, simply because it is possible to zoom does not mean that the data thus displayed will be accurate at the new scale. OS contour data, for example, may be digitised from 1:50,000 map sheets, at which scale the smallest distance that can be distinguished is 0.5mm on the map, or 25m on the ground. Although this data can be reproduced in the GIS at 1:10,000, five times the original scale, it does not become more accurate. Thus a point captured on screen against a 1:50,000 map base will be accurate at that scale and not progressively more accurate at any larger scale to which the map has been zoomed.
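The relationship between source scale and the smallest ground distance that can be distinguished is simple arithmetic, sketched below. The function name is hypothetical, and the 0.5mm figure is taken from the text above as the assumed smallest distinguishable distance on the map.

```python
def ground_distance_m(map_distance_mm, scale_denominator):
    """Ground distance in metres for a distance measured on the map.

    scale_denominator is the N in a 1:N map scale.
    """
    return map_distance_mm * scale_denominator / 1000.0


# Smallest distinguishable ground distance at common HER mapping scales
at_10000 = ground_distance_m(0.5, 10000)  # 5.0 metres
at_2500 = ground_distance_m(0.5, 2500)    # 1.25 metres
at_1250 = ground_distance_m(0.5, 1250)    # 0.625 metres
```

The value depends only on the scale at which the data was captured, not on the scale at which it is later displayed, so it is worth recording the capture scale with each dataset.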
There are two approaches to representing the precision with which heritage objects are located within the GIS. The first approach (Figure 40), adopted by the former Archaeology Division of the OS, places the object in the south-west corner of a virtual square somewhere in which the object is located. For example, an object recorded as a four-figure reference (such as TQ 77 89) could lie anywhere within that 1 kilometre square. Similarly, a six-figure reference (such as TQ 724 876) could lie anywhere within a 100 metre square. In both cases, the object would be represented as a point marked on a map in the south-west corner of the appropriate square; that is, the point marking TQ 77 89 would be marked at TQ 7700 8900. Variations in this approach include placing the point in the centre of a square rather than the south-west corner (that is, the point marking TQ 77 89 would be marked at TQ 7750 8950) or in the centre of a virtual circle.
This approach has the advantage that, since most four-figure references are for stray finds, representation as a point bears some relation to the object depicted. The approach has the disadvantage that the object will only be retrieved by a spatial search that includes the point (whether located in the south-west corner or centre) although the implied imprecision means that the object could derive from a wider area.
The second approach (Figure 41) attempts to overcome the spatial retrieval problem by representing the object as the 'physical space' within which the object might lie, so that a square or a circle is depicted in the GIS. These would normally be transparent (that is, only the outline of the object would be visible) to avoid obscuring other heritage objects lying 'beneath'. In this way, the known boundaries of monuments and buildings would be visible at the same time as the fuzzy boundary represented for a stray find or other imprecisely located heritage object.
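The difference between the two approaches for spatial retrieval can be shown with a minimal sketch. It assumes coordinates in metres on the National Grid and represents squares as (min_e, min_n, max_e, max_n) tuples; the function names and the search area are hypothetical.

```python
def uncertainty_square(easting, northing, precision_m):
    """Square within which an imprecisely located object may lie.

    (easting, northing) is the south-west corner given by the grid
    reference; precision_m is the side of the square in metres.
    """
    return (easting, northing, easting + precision_m, northing + precision_m)


def centre_point(easting, northing, precision_m):
    """Centre of the square, for the 'point in the centre' variation."""
    half = precision_m / 2
    return (easting + half, northing + half)


def point_in_box(point, box):
    return box[0] <= point[0] <= box[2] and box[1] <= point[1] <= box[3]


def boxes_intersect(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


# Stray find recorded as TQ 77 89: south-west corner 577000, 189000; 1km precision
find = (577000, 189000, 1000)
search_area = (577600, 189600, 577900, 189900)  # hypothetical development site

found_as_corner_point = point_in_box(find[:2], search_area)                # False
found_as_centre_point = point_in_box(centre_point(*find), search_area)     # False
found_as_square = boxes_intersect(uncertainty_square(*find), search_area)  # True
```

Only the square representation retrieves the find, even though the find could in fact lie within the search area.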
If 'area' features are also depicted as circular boundaries of approximate diameter (for example an artefact scatter) it is also worth adopting different conventions for the symbols used, such as a broken line, or a semi-transparent fill.
Whichever approach is adopted, recording the actual precision of the object is essential. Both methods will incur the problem of 'stacked' objects, where more than one object has been located in the same space. This is a common GIS problem and is not confined to archaeological representation: it also occurs, for example, when representing the individual property units in a block of flats in two dimensions. Many GIS are able to indicate a stack of objects when the cursor is hovered over the objects or the stack is selected. However, a plot will give the appearance of a single object at the location unless ID numbers are included in the plot, and these may also 'overprint'. A possible approach to solve this might be to offset each of the objects slightly, so that they will be visually distinguishable.
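One way to offset stacked objects for display is sketched below. It is an illustration under stated assumptions: the function name, record identifiers and the 2m display radius are hypothetical, and the offsets are intended for plotting only, leaving the recorded coordinates unchanged.

```python
import math
from collections import defaultdict


def offset_stacked(points, radius_m=2.0):
    """Return display positions keyed by record ID.

    points: list of (record_id, easting, northing). Records sharing
    identical coordinates are spread evenly around a small circle
    centred on the recorded location; others are left where they are.
    """
    groups = defaultdict(list)
    for record_id, easting, northing in points:
        groups[(easting, northing)].append(record_id)

    display = {}
    for (easting, northing), record_ids in groups.items():
        if len(record_ids) == 1:
            display[record_ids[0]] = (easting, northing)
            continue
        for k, record_id in enumerate(record_ids):
            angle = 2 * math.pi * k / len(record_ids)
            display[record_id] = (
                easting + radius_m * math.cos(angle),
                northing + radius_m * math.sin(angle),
            )
    return display


records = [
    ("HER1", 577000, 189000),
    ("HER2", 577000, 189000),  # stacked on HER1
    ("HER3", 577500, 189500),
]
positions = offset_stacked(records)
```

The display radius would need to be chosen to suit the plotting scale, since a 2m offset is invisible on a 1:50,000 plot.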
Connolly, J. and Lake, M. 2006 Geographical Information Systems in Archaeology, Cambridge: Cambridge University Press (ISBN 0521797446)
Gillings, M., Mattingly, D. and van Dalen, J. (eds) 1999 Geographical Information Systems and Landscape Archaeology. The Archaeology of Mediterranean Landscapes 3. Oxford: Oxbow Books. (ISBN 1900188643)
Lock, G. (ed) 2000 Beyond the Map: Archaeology and spatial technologies. Nato Science Series, Series A: Life Sciences – Vol. 321. Oxford: IOS Press. (ISSN 1387-6686)
Richards, J.D. and Ryan, N.S. 1985 Data Processing in Archaeology. Cambridge: Cambridge University Press (ISBN 0521257697)
Westcott, K.L. and Brandon, R.J. 2000 Practical Applications of GIS for Archaeologists: a Predictive Modeling Kit. Philadelphia PA: Taylor and Francis. (ISBN 0748408304)
Wheatley, D. and Gillings, M. 2002 Spatial Technology and Archaeology: the Archaeological Applications of GIS. London: Taylor and Francis (ISBN 0415246393)
Australian and New Zealand Land Information Council: http://www.anzlic.org.au/
Bartholomew maps: http://www.bartholomewmaps.com/
Getmapping® (digital data reseller): http://www1.getmapping.com/