Scanning and digitising

Bob Bewley, Danny Donoghue, Vince Gaffney, Martijn van Leusen, Alicia Wise (1998). Revised by Bob Bewley and Kieron Niven, Archaeology Data Service / Digital Antiquity (2011), Guides to Good Practice

Scanning to raster formats

Where digital sources are not available, analog materials (film, paper, etc.) can be scanned with a flatbed or drum scanner to generate raster images. Scanning devices vary considerably in accuracy and resolution, with flatbed and drum scanners normally providing a resolution between 100 and 1200 dots per inch (dpi) and more expensive drum scanners claiming resolutions of between 3000 and 5000 dpi. In all cases care should be taken to distinguish between the true optical resolution of a given scanner and that obtained through interpolation procedures.

A scan normally results in a single raster data file. There is a very wide variety of image formats for holding raster data (e.g. TIFF, GIF, JPEG), the majority of which are designed for photographic images and not for spatially referenced data. Raster formats are discussed in detail in the Raster Images guide. Several GIS provide proprietary raster data structures that record spatial referencing information; they also provide tools for importing data from other common raster formats. In addition, the TIFF graphics standard has been extended to carry georeferencing and spatial data in a format called 'GeoTIFF'. Details of this standard, including the official GeoTIFF 1.0 specification, can be obtained from the GeoTIFF webpage[1]. Although GeoTIFF is currently supported by a limited number of proprietary GIS, many manufacturers have committed to supporting the standard, which should provide a platform-independent method for archiving and transferring spatially referenced raster products.
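To make the idea of 'spatial referencing information' concrete, the following is a minimal sketch of the kind of mapping a georeferenced raster records: a way of turning a pixel position into a map coordinate. It is not the GeoTIFF encoding itself. The class name, field names and coordinate values are hypothetical, and the sketch assumes a north-up image with square pixels and no rotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoReference:
    """Hypothetical, simplified georeferencing for a north-up raster."""

    origin_x: float    # map x of the outer top-left corner of the image
    origin_y: float    # map y of the outer top-left corner of the image
    pixel_size: float  # ground units per pixel (square pixels assumed)

    def pixel_to_map(self, col: int, row: int) -> tuple[float, float]:
        """Return the map coordinates of the centre of pixel (col, row)."""
        x = self.origin_x + (col + 0.5) * self.pixel_size
        # Row numbers increase downwards while map y increases northwards.
        y = self.origin_y - (row + 0.5) * self.pixel_size
        return x, y


if __name__ == "__main__":
    # Assumed example values: a 0.5 m pixel and an arbitrary grid origin.
    ref = GeoReference(origin_x=400000.0, origin_y=300000.0, pixel_size=0.5)
    print(ref.pixel_to_map(0, 0))
    print(ref.pixel_to_map(10, 20))
```

A plain TIFF, GIF or JPEG stores only the pixel grid; without something equivalent to the three values above, the image cannot be placed in map space.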

The scanning process can produce very large raster image files. The problem can be compounded by the software used to integrate and study the raster layers, which may require further increases in colour depth and therefore in file size.
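The effect of resolution and colour depth on file size can be estimated with simple arithmetic. The sketch below assumes an uncompressed image with no header or row padding, so real files will differ somewhat; the function name and the 9 x 9 inch example print are illustrative assumptions, not values from this guide.

```python
def uncompressed_size_bytes(width_in: float, height_in: float,
                            dpi: int, bits_per_pixel: int) -> int:
    """Estimate the uncompressed size of a scan in bytes.

    Ignores file headers, row padding and compression.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # Hypothetical 9 x 9 inch photographic print.
    for dpi in (300, 600, 1200):
        for bits in (8, 24):
            size_mb = uncompressed_size_bytes(9, 9, dpi, bits) / 1_000_000
            print(f"{dpi:>5} dpi, {bits:>2}-bit: {size_mb:8.1f} MB")
```

Because the pixel count grows with the square of the resolution, doubling the dpi quadruples the file size, and moving from 8-bit greyscale to 24-bit colour triples it again.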

Digitising to vector formats

AP and RS data (in either analog or digital form) may also be geometrically described using a digitising tablet, to provide vector data. Digitising tablets generally offer finite resolution in both x and y directions. This can be expressed as a quoted resolution, for example 0.02 inches or 0.001 inches, or as lines per inch (lpi), e.g. 200 lpi or 1000 lpi. This information can be found within the digitiser manual. Unlike the scanning process, where a scanned original generates a single raster image, digitising a single original may form the basis of a large number of discrete, thematic vector data layers. Various vector formats are discussed in detail in the Vector Images guide.
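The two ways of quoting tablet resolution are reciprocals of one another, and either can be translated into a distance on the ground once the scale of the source is known. The following sketch shows both conversions. The function names are hypothetical, and the 1:10,000 source scale in the example is an assumption chosen for illustration.

```python
INCH_IN_METRES = 0.0254


def inches_to_lpi(resolution_in: float) -> float:
    """Convert a quoted resolution in inches to lines per inch."""
    return 1.0 / resolution_in


def lpi_to_inches(lpi: float) -> float:
    """Convert lines per inch to the equivalent resolution in inches."""
    return 1.0 / lpi


def ground_resolution_m(resolution_in: float, scale_denominator: float) -> float:
    """Ground distance (metres) covered by one tablet step on a source
    drawn or printed at 1:scale_denominator."""
    return resolution_in * INCH_IN_METRES * scale_denominator


if __name__ == "__main__":
    for res in (0.02, 0.001):
        lpi = inches_to_lpi(res)
        ground = ground_resolution_m(res, 10_000)  # assumed 1:10,000 source
        print(f"{res} in = {lpi:.0f} lpi, about {ground:.3f} m on the ground")
```

This gives only the theoretical limit of the tablet; operator precision and the condition of the original will usually matter more in practice.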

A hybrid option is to scan the source document but then to use the scanned product as the basis for ‘on screen digitising’, using a graphics workstation and pointing device to create vector data themes. This is often referred to as ‘heads-up digitising’ and is an attractive option if a digitising tablet is not available, or if digital raster data can be obtained from a third party. There are a number of software tools available to assist in obtaining vector data from a scanned image of a map or plan. These include very sophisticated semi-automatic tracing tools which, for an ideal image, can vectorise perhaps 70-80% of the data without intervention, and which automatically request intervention when a problem cannot be resolved. Such software tends to be expensive, although it is sometimes available to non-profit research and educational institutions at discounted rates. There are also, at the other end of the price/sophistication range, a number of cheap/shareware tools available. These may have limitations in terms of the maximum scan resolution they can handle, or the maximum size or complexity of the image. Note that none of these tools can be guaranteed to vectorise 100% of a scanned map/plan without intervention. The degree of intervention required will always be a function of the sophistication of the vectorising/tracing tool, the quality of the scan, and the nature of the original.