Appendix 2: georeferencing geophysical data
Locating geophysical results within the wider landscape
To use geophysical results in other archaeological investigations, the location of a survey within the landscape or site must be known. Depending on the intended use, the required accuracy of such information may vary. For example, in order to direct excavations to the areas indicated by geophysical findings, a precise reference to the archaeological site grid is required. On the other hand, if only a landscape assessment is required, including aerial photographic interpretations, accuracy is less critical. However, since future demands cannot be predicted, it is essential to record survey positions with the highest possible accuracy from the outset. If existing survey data are to be re-used but spatial information is insufficient, the only solution is to undertake several well-referenced keyhole surveys to relate previous results to accurate site coordinates. This is time consuming and can be avoided through careful documentation at the time of the original survey.
This appendix introduces basic concepts of coordinate systems, their coregistration, georeferencing and issues relating to information accuracy. This discussion is intended to provide a foundation for the recommendations on geophysics georeferencing presented in Georeferencing and data combination and Geophysics georeferencing.
In the context of archaeological geophysics, a coordinate system is based on a model of the physical world that allows specification of a spatial position by providing its coordinates. In mapping science, this is usually referred to as a ‘datum’ and this expression will be used throughout this Guide. The choice of a coordinate system depends on the intended use of this positional information, as illustrated below. In archaeological geophysics the distinction is mostly between map coordinate systems and geophysics coordinate systems (see Section A2.2.4 below).
Latitude and longitude
Locations are often specified by providing information on latitude and longitude, with the implicit assumption that the earth can be described as a globe (this is the model used) with a coordinate system where 0° longitude is normally assigned to the meridian through Greenwich and 0° latitude to the equator (the datum used). The numerical values for longitude and latitude are the coordinates. Had the same model been used, but with a different origin for the coordinate system (e.g. the 0-meridian running through Rio de Janeiro, i.e. defining a different datum), different coordinates would need to be used to specify the same point. This illustrates how coordinates are only useful if specified together with their respective underlying model and datum. The most common datum is now WGS84, which is used with GPS/GNSS equipment (see Section A2.5 below).
Great Britain National Grid
Another common map coordinate system in Britain is the Great Britain National Grid. For this, the Airy Spheroid is used to approximate the earth’s surface for the British Isles and a Modified Transverse Mercator projection allows specification of each position in a metrical Cartesian (i.e. rectangular) coordinate system. To avoid negative coordinates a ‘false origin’ (i.e. one that is SW of the mathematical Modified Transverse Mercator origin) has been chosen to quote the coordinates. This numerical origin (i.e. the coordinates 0E and 0N) lies south-west of the Isles of Scilly. Where a position is specified in Great Britain National Grid coordinates, this coordinate system is implicitly used. The Ordnance Survey of Great Britain (OS) has developed a system to quote such coordinates conveniently, specifying the relevant 100km x 100km grid square with a two-letter abbreviation, followed by a numerical expression (e.g. a six-figure grid reference for a 100m x 100m grid square) to denote the location in more detail. However, it has to be borne in mind that the copyright for the use of this notation is held by the Ordnance Survey. It is recommended that the ‘eastings’ and ‘northings’ (or ‘x’ and ‘y’ coordinates) are stated explicitly for a location, measured from the false origin. Information in such a format is also simpler to handle by computerised databases and GIS. A coordinate in the Great Britain National Grid coordinate system, using a generalised notation for the coordinates, could consist of the eastings, followed by the northings and separated with a comma, or both measures provided as separate items (e.g. “423201, 339236”).
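The relationship between the letter-based notation and explicit eastings and northings can be illustrated with a short Python sketch. It is not an Ordnance Survey algorithm: the function name is hypothetical, the reference "SK 232 392" is chosen here only because its square contains the example coordinates quoted above, and the letter arithmetic follows the commonly published layout of the 500km and 100km squares (a 5 x 5 lettering that omits the letter I), which should be checked against Ordnance Survey documentation before any serious use.

```python
def grid_ref_to_easting_northing(ref):
    """Return (easting, northing, cell_size), all in metres.

    Easting and northing are those of the south-west corner of the square
    named by the reference, measured from the false origin.
    """
    ref = ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if (len(letters) != 2 or not letters.isascii()
            or not letters.isalpha() or "I" in letters):
        raise ValueError("expected two grid letters (the letter I is not used)")
    if len(digits) % 2 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError("expected an even number of digits, at most ten")

    # Letter indices in a 25-letter alphabet that skips 'I'.
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # Column and row of the 100 km square, counted from the false origin.
    e100 = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100 = (19 - (l1 // 5) * 5) - (l2 // 5)
    if not (0 <= e100 <= 6 and 0 <= n100 <= 12):
        raise ValueError("letters fall outside the National Grid")

    half = len(digits) // 2
    cell = 10 ** (5 - half)  # a six-figure reference gives 100 m squares
    easting = e100 * 100_000 + (int(digits[:half]) * cell if half else 0)
    northing = n100 * 100_000 + (int(digits[half:]) * cell if half else 0)
    return easting, northing, cell


print(grid_ref_to_easting_northing("SK 232 392"))
```

The returned corner identifies only the 100m x 100m square; the full eastings and northings (e.g. “423201, 339236”) carry more information than the six-figure reference, which is one reason for stating them explicitly.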
Archaeological site grid
For archaeological field projects site grids are often established. These reflect the mapping of the actual positions on the ground to their projection on a virtual horizontal plane. Neglecting the curvature of the earth (a reasonable assumption given the size of most archaeological sites), the spatial model used is a flat horizontal projection with a Cartesian coordinate system. Hence the coordinates are specified as ‘eastings’ and ‘northings’ within this datum’s coordinate system.
Geophysics coordinate system
The most commonly used geophysics coordinate system consists of Cartesian coordinates with a flat horizontal model. The relationship between this geophysics coordinate system and the real world, however, is more complex than for the site grid. While the latter is simply a projection of the real space onto a horizontal plane, the geophysics coordinate system is determined by the physical layout of data grids, sometimes with tapes and strings along the ground. It is thus warped and stretched over the undulating topographical surface. Despite the resulting distortions in physical space, the geophysics coordinate system itself, as a mathematical model, is perfectly flat and rectangular. By using a geophysics coordinate system this model is implicitly assumed. The data as displayed in a geophysics processing package show this flat view of the data.
The choice of a coordinate system and datum is often governed by the intended use of the spatial information. A geophysics coordinate system, for example, is best suited to represent the geophysical data recorded in a survey. In contrast, the Great Britain National Grid may be the best choice for the mapping of aerial photographic evidence onto Ordnance Survey maps. In many cases, however, the data recorded in one system have to be usable in others. The geophysical data may, for example, have to be compared to excavation results recorded on a site grid or with aerial photographs located on the Great Britain National Grid. The task of tying the coordinate systems together is called coregistration.
The easiest way of defining a coregistration is to provide a list of points with their respective coordinates in all coordinate systems concerned. These points are sometimes referred to as control points, reference points, tie points or tics. Often these points are features that can easily be identified in the coordinates of the different datums and coordinate systems used. A good example is the rectification of aerial photographs where prominent features (e.g. road crossings or corners of houses) are identified on the photograph and on the map.
To coregister geophysics coordinates and a site grid, for example, one can provide coordinates for the corners of geophysics 20m x 20m data grids (e.g. 0/0; 20/0; 60/100 in the geophysics coordinate system) and their respective coordinates in the site grid (e.g. 0E0N; 18E2N; 56E104N) as measured with, for example, a Total Station. For an undulating topography, a considerable number of control points are required to establish the relationship between the coordinate systems. Due to the stretching of the geophysics grid over the undulating topography, the geophysics coordinates will not normally be the same as the coordinates of the site grid. Equipped with a list of control points, the mathematical process of transforming a data set from one coordinate system and datum into another can be undertaken. The algorithms used depend on the number of control points provided and whether they should be matched exactly or with a minimisation of errors. Details on various algorithms (e.g. linear, polynomial, spline) may be found, for example, in Scollar et al. (1990: 210).
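As a minimal sketch of how a list of control points defines a transformation, the following Python code fits a six-parameter linear (affine) mapping exactly through the three example control points quoted above. The function names are hypothetical, and an exact three-point fit is only one of the options mentioned; with more control points, a fit that minimises errors would normally be preferred, particularly over undulating ground.

```python
def _det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(m, v):
    """Solve m * p = v for three unknowns using Cramer's rule."""
    d = _det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = v[i]
        solution.append(_det3(mk) / d)
    return solution


def fit_affine(src, dst):
    """Fit E = a*x + b*y + c and N = d*x + e*y + f through three point pairs."""
    m = [[x, y, 1.0] for x, y in src]
    a, b, c = _solve3(m, [east for east, _ in dst])
    d, e, f = _solve3(m, [north for _, north in dst])
    return a, b, c, d, e, f


def apply_affine(params, x, y):
    """Transform one geophysics coordinate into the site grid."""
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


# Control points from the example: geophysics grid -> site grid.
geophysics = [(0.0, 0.0), (20.0, 0.0), (60.0, 100.0)]
site_grid = [(0.0, 0.0), (18.0, 2.0), (56.0, 104.0)]

params = fit_affine(geophysics, site_grid)
print(apply_affine(params, 20.0, 20.0))
```

With these three points, the geophysics coordinate 20/20 maps to roughly 18.4E 21.6N on the site grid. Because the fit is exact, it gives no indication of measurement error; additional control points would be needed for that.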
The explicit specification of the equations for the mathematical transformation between coordinates from different datums provides another means of describing a coregistration. The simplest mathematical approach is an affine transformation. It consists of a stretch (in the X and Y directions), a rotation (by an angle) and a translation (along the X and Y axes). It can therefore be described with 5 parameters (sometimes 6 parameters are provided to allow for a simpler calculation). If it is assumed that there is no stretching in the X and Y directions (e.g. on a flat surface where all dimensions have already been measured correctly with tapes), the transformation can be determined from the coordinates of two points alone; otherwise three or more points are required to find the best matching parameter set. In archaeological geophysics it is unusual to specify more complex transformations (e.g. polynomial, spline, rational) explicitly.
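The five-parameter form and the two-point case without stretching can be sketched in Python as follows. The function names, the example coordinates and the order of operations (stretch, then rotate anticlockwise, then translate) are assumptions made for illustration; other conventions are equally valid provided they are documented with the parameters.

```python
import math


def affine5(x, y, sx, sy, theta, tx, ty):
    """Stretch by (sx, sy), rotate anticlockwise by theta (radians), then translate."""
    xs, ys = sx * x, sy * y
    c, s = math.cos(theta), math.sin(theta)
    return c * xs - s * ys + tx, s * xs + c * ys + ty


def rotation_translation_from_two_points(p1, q1, p2, q2):
    """Derive (theta, tx, ty) from two points, assuming no stretching.

    p1, p2 are coordinates in the source system; q1, q2 are the same
    points in the target system.
    """
    # The rotation is the change in bearing of the line joining the points.
    theta = (math.atan2(q2[1] - q1[1], q2[0] - q1[0])
             - math.atan2(p2[1] - p1[1], p2[0] - p1[0]))
    c, s = math.cos(theta), math.sin(theta)
    # The translation moves the rotated first point onto its target.
    tx = q1[0] - (c * p1[0] - s * p1[1])
    ty = q1[1] - (s * p1[0] + c * p1[1])
    return theta, tx, ty


# Hypothetical example: a 20 m baseline that is rotated by a quarter turn
# and shifted when expressed in the target system.
theta, tx, ty = rotation_translation_from_two_points(
    (0.0, 0.0), (100.0, 200.0), (20.0, 0.0), (100.0, 220.0))
print(math.degrees(theta), tx, ty)
print(affine5(0.0, 20.0, 1.0, 1.0, theta, tx, ty))
```

Comparing the distance between the two points in both systems provides a simple check on the no-stretching assumption: if the distances differ noticeably, a transformation with stretch parameters and at least three points is needed.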
Tying coordinate systems together (coregistration) is sufficient for some applications (e.g. map production) but in many cases a correlation between spatial data and a position on the ground must be established. For example, in order to use geophysical evidence to direct the location of an excavation trench, the actual position of the geophysical anomaly on the ground has to be known. Relating spatial data to actual ground positions is termed georeferencing (note that the term is used by some authors to include coregistration and by others to mean finding a site based on its name).
Georeferencing is best undertaken by relating the coordinate system and datum concerned to features on the ground (Ground Control Points, GCPs) that are reasonably permanent (e.g. the concrete bases of a pylon). It is also advantageous to select features that can be identified on maps in order to provide information for later coregistration (see Appendix A2.3). If no such features exist, the only solution may be to introduce artificial markers that can, hopefully, be found again in the future. Such markers may be wooden or plastic pegs of reasonable dimensions that are either driven deeply into the ground or set in concrete. Clearly, if such markers are used, permission has to be sought and the design must minimise any risk to people or animals.
Once GCPs have been selected and very carefully documented (e.g. which corner of a pylon’s base, or which side of a wall) their relationship to the coordinate system has to be recorded. The method used will depend on the nature of the coordinate system and the methods available to establish it.
If, for example, a site grid was set out with a Total Station and if it will most likely be re-established with such an instrument, georeferencing information may simply consist of the site grid coordinates for two GCPs together with details on the accuracy of the recording. Later measurements of the two GCPs with a Total Station from any point on the site will allow determination of the position and orientation of the Total Station (‘resection’) and hence the site grid. In order to secure a uniform level of accuracy, it is advantageous if the two GCPs are located on opposite sides of the area to be surveyed.
In contrast, if a coordinate system has been laid out on the ground from baselines (e.g. a geophysics grid system), their most relevant points (e.g. ends and intersections) should be measured from the GCPs, for example with tapes. For each selected point, measurements from at least two GCPs must be taken. The record of these measurements, together with estimated accuracies, will allow practitioners to re-establish the baselines.
Absolute coordinates using GNSS/GPS
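Re-establishing a point from two taped distances amounts to intersecting two circles centred on the GCPs. The Python sketch below illustrates the geometry on a flat plane; the function name and example values are hypothetical, and horizontal distances are assumed (tapes laid over sloping ground would need correction).

```python
import math


def intersect_tapes(gcp1, dist1, gcp2, dist2):
    """Return the two candidate positions at the given distances from two GCPs."""
    (x1, y1), (x2, y2) = gcp1, gcp2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > dist1 + dist2 or d < abs(dist1 - dist2):
        raise ValueError("tape distances are inconsistent with the GCP spacing")
    # Distance from gcp1 along the line to gcp2 at which the chord crosses.
    a = (dist1 ** 2 - dist2 ** 2 + d ** 2) / (2 * d)
    # Half-length of the chord joining the two intersection points.
    h = math.sqrt(max(dist1 ** 2 - a ** 2, 0.0))
    xm = x1 + a * (x2 - x1) / d
    ym = y1 + a * (y2 - y1) / d
    ox = h * (y2 - y1) / d
    oy = h * (x2 - x1) / d
    return (xm + ox, ym - oy), (xm - ox, ym + oy)


# Hypothetical example: two GCPs 10 m apart, each about 7.07 m from the peg.
print(intersect_tapes((0.0, 0.0), math.sqrt(50.0), (10.0, 0.0), math.sqrt(50.0)))
```

Two distances always leave a mirror-image ambiguity about the line joining the GCPs, so the record should also state on which side of that line the point lies, or include a measurement from a third GCP.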
All of the currently available global navigation satellite systems (GNSS, often simply referred to as GPS, the United States’ implementation of GNSS) consist of a network of satellites that continuously transmit information that a ground-based receiver can convert into its own absolute coordinates on the earth’s surface without reference to maps or GCPs. This concept is an ideal solution for many applications that require positional information. However, several factors influence its possible use for surveying.
Most importantly, the accuracy of positional information for a basic handheld device is only about 3-10m due to inherent features of the system. If, however, a Real Time Kinematic (RTK) system is employed, accuracies of better than 0.01m can be achieved. For RTK to work, a permanent base station is required at a fixed location that provides the mobile receiver with correction data, either in real time or with post-processing software. A convenient solution is the use of a commercial network of base stations that transmits such a correction signal, for example through a mobile-phone network. If an RTK system with a base station is used, the accuracy of the final readings is dependent on the accuracy of this base station’s coordinates, even if the precision of the system is very high (e.g. 0.01m). Post-processing of the data can greatly enhance the accuracy. While lateral accuracy can be improved by observing several satellites simultaneously, information on vertical positions can only be derived from satellites above the receiver. Therefore, the vertical accuracy of such satellite-based systems is always worse than the lateral accuracy.
GNSS/GPS systems measure the relative positions of receiver and satellites using an ‘Earth-Centred, Earth-Fixed’ Cartesian coordinate datum (ECEF). This datum, which is aligned with the World Geodetic System 1984 (WGS84) reference ellipsoid, has its origin close to the earth’s centre of mass, its z axis parallel to the axis of rotation of the earth and its x axis passing through the intersection of the equator and the Greenwich meridian. Fortunately, most receivers convert ECEF coordinates to WGS84 latitude, longitude and height for output, and some will also perform transformations to other coordinate datums, for example to the Great Britain National Grid. However, it has to be remembered that for historic reasons the Great Britain National Grid is not homogeneous and, despite complex conversion algorithms, differences between map-derived and satellite-derived National Grid coordinates can be up to 2m (Dodson and Haines-Young 1993).
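The link between ECEF coordinates and WGS84 latitude, longitude and height can be illustrated with the closed-form conversion from geodetic to ECEF coordinates, sketched below in Python. This is the reverse of the conversion a receiver performs for output (which is usually solved iteratively) and is shown only because it is compact. The function name is hypothetical; the ellipsoid constants are the published WGS84 semi-major axis and flattening.

```python
import math

WGS84_A = 6378137.0            # semi-major axis in metres
WGS84_F = 1 / 298.257223563    # flattening


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert WGS84 latitude, longitude (degrees) and ellipsoidal height (m) to ECEF."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = WGS84_F * (2 - WGS84_F)  # first eccentricity squared
    # Radius of curvature in the prime vertical at this latitude.
    n = WGS84_A / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + height_m) * math.sin(lat)
    return x, y, z


# A point on the equator at the Greenwich meridian lies on the x axis.
print(geodetic_to_ecef(0.0, 0.0, 0.0))
```

Conversion to the Great Britain National Grid involves a further change of ellipsoid and projection and is not attempted here.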
Precision, resolution and accuracy
It is worthwhile considering how coordinates are specified in different systems. In this respect the terms precision, resolution and accuracy need to be explained.
Precision of a coordinate (or indeed of any variable) is mainly determined by the numerical format used to represent it. If, for example, a metric coordinate is only represented by integer numbers (i.e. without decimal places), its precision is 1 m; if a map position is specified with a letters-cum-six-figure Great Britain National Grid reference (see above) the precision of this representation is 100 m x 100 m.
While the precision is only limited by the numerical format used, the resolution of a coordinate is determined by its physical limitation. For a geophysical survey the closest spacing between adjacent survey lines may, for example, be 0.5 m. In this case the spatial resolution of the survey is 0.5 m – even if it were recorded with two decimal places (i.e. a precision of 0.01 m). On a printed map the shortest distance between visually distinguishable objects may be 0.5 mm; this is the map’s resolution and using its scale this can be converted into the ground resolution. For example, a map with a scale of 1:25,000 and a resolution of 0.5 mm has a ground resolution of 12.5 m. In summary, the resolution is a measure of the smallest separation between distinguishable features.
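The conversion from map resolution to ground resolution is simple arithmetic, sketched here in Python with a hypothetical function name.

```python
def ground_resolution_m(map_resolution_mm, scale_denominator):
    """Ground distance (m) matching the smallest distinguishable map distance."""
    return map_resolution_mm / 1000.0 * scale_denominator


# The 1:25,000 example from the text.
print(f"{ground_resolution_m(0.5, 25_000):.1f} m")  # prints "12.5 m"
```

The same map resolution on a 1:2,500 plan would correspond to 1.25 m on the ground, which shows how strongly the choice of map scale limits the usable resolution.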
Precision and resolution determine the minimal numerical and physical separation of objects in a coordinate system. The accuracy of a coordinate describes the confidence one can have in finding the object at the given position. This is often referred to as the measurement error.
In order to understand the difference between accuracy and precision a few more comments are useful. A typical example for specifying the accuracy of a measurement could be “the distance to the object is about 20.4 m, plus or minus 0.1 m”. In this example, the measurement value is 20.4 m and its accuracy is 0.1 m, which means that the true distance could be between 20.3 m and 20.5 m (for the statistical definition of accuracy as the standard deviation of a statistical error, see, for example, Lyons (1991)). If one decided, after this initial measurement, to archive the data in a numerical format that only has a precision of 1 m (i.e. the measurement would be represented as 20 m) the accuracy would have been reduced to 1 m since all measurements between 19.5 m and 20.4 m would be recorded as the same value. On the other hand, if one used a precision of 0.01 m for recording (i.e. the measurement would be represented as 20.40 m) the accuracy of the measurement would still not improve beyond 0.1 m. In conclusion, the accuracy of a measurement can never be better than the precision of the data format, although it can be worse.
Accuracy and coregistration
It is important to consider what happens to the accuracy of measurements during coregistration. It is the combination of accuracies of the original measurements and of the coregistration process that will determine the accuracy of the final data set, as the following discussion shows.
A geophysical survey may be undertaken along a grid system laid out on the ground. With a careful operator the spatial accuracy for locating a measurement point can be 0.05 m. The geophysical data are recorded with this accuracy within their coordinate system. In order to tie them to an existing site grid some control points may be measured with a Total Station. The accuracy of such measurements can be 0.005 m. Coregistration of the geophysical data to the site grid will eliminate most of the spatial distortions but will not improve the initial accuracy of 0.05 m. For referencing on a smaller scale it may be necessary to relate the site grid to map coordinates in a known coordinate system. In most cases the coregistration between the site grid and the map coordinates will be based on points measured on a map (paper or digital), the accuracy of the latter being often limited and rarely better than 1 m. Hence, expressing all geophysical measurements in coordinates of the map (from geophysics, to site, to map) would result in an unnecessarily degraded overall accuracy of the geophysical data of about 1 m. It is worth remembering that the use of coordinates derived from maps is also covered by copyright.
This example demonstrates that the best approach to providing spatial data is to describe individual coordinate systems, their internal accuracy and how they relate to each other (providing coregistration information with accuracy details) rather than expressing all data in the most common coordinate system, which usually has the worst accuracy. There is one exception to this recommendation. If spatial measurements are made absolutely and with high accuracy (as can be achieved with modern RTK GNSS/GPS), no further conversion of coordinate systems may be required.
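The accuracy chain discussed in this section (geophysics grid, to site grid, to map) can be sketched numerically in Python. The text states only that the overall accuracy is governed by the least accurate step; combining the individual accuracies as a root sum of squares is an assumption made here, valid for independent random errors, and the function name is hypothetical.

```python
import math


def combined_accuracy(*accuracies_m):
    """Root-sum-square of independent accuracies (an assumed error model)."""
    return math.sqrt(sum(a * a for a in accuracies_m))


geophysics_m = 0.05      # locating a measurement point on the survey grid
total_station_m = 0.005  # control points tying the grid to the site grid
map_m = 1.0              # points measured on a map

# Geophysics data expressed in site grid coordinates.
print(round(combined_accuracy(geophysics_m, total_station_m), 3))
# Geophysics data expressed in map coordinates.
print(round(combined_accuracy(geophysics_m, total_station_m, map_m), 3))
```

Under this assumption the site grid result stays close to 0.05 m, while the map-referenced result is dominated by the 1 m map accuracy, in line with the recommendation to keep each coordinate system and its coregistration information separate.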
The need for information on accuracy
It has been explained that precision is related to the data format used (e.g. six-figure grid reference) but information on accuracy has to be specified explicitly. This is necessary to assess the usefulness of a data set for a specific purpose. For example, a trial excavation may be undertaken on the basis of a previous geophysical survey. In order to minimise the size of the excavation trench the location of the geophysical anomalies must be provided to a certain accuracy, say 0.5 m. Only if it is known that the accuracy of geophysical data within the site grid is of a comparable magnitude can such a ‘surgical’ excavation trench be established. Otherwise, a much larger area has to be opened ‘to be on the safe side’.