An application of Dublin Core from the Archaeology Data Service

-- DRAFT --

Paul Miller
University Computing Service, University of Newcastle, UK
A.P.Miller@newcastle.ac.uk

Version 1.05, Last modified: 10 July 1996
The most recent version may be found on the WWW at http://ads.ahds.ac.uk/project/metadata/dublin.html


Introduction

A need has long been identified for methods by which online resources may be accurately but simply described. This description of the resource, or metadata, should be structured in such a way that both human readers and computerised search engines may utilise and understand the presented data with a minimum of effort on either part.

This paper discusses an extension to the Dublin Core (Weibel et al 1995) intended specifically for the basic description of archaeological resources as collected by the Arts & Humanities Data Service's Archaeology Data Service (ADS). It is also offered as a contribution of wider value to the evolving metadata debate. Metadata descriptions such as these are intended to provide a standard high-level description of data from disparate sources, and are not necessarily concerned with the detail of each dataset where localised factors and data requirements make inter-archive standardisation infeasible.

Dublin Core & the Warwick Framework

While many of the currently evolving metadata standards are aimed primarily at the description of textual records, there is a clear need for a framework within which data providers and archivers may describe both text- and non-text information. There is a need, for example, for describing maps, satellite imagery, large databases, and text documents within a single metadata dialect rather than in the current plethora of forms (eg FGDC, NSDI, DIF and TEI).

The deliberations of two workshops (Dublin and Warwick) and their aftermath have gone a long way towards forming a sensible framework within which holders of information may provide clear metadata.

The first workshop resulted in a proposal for the Dublin Core (Weibel et al 1995) and the second is continuing to advance this.

The Dublin Core

The Dublin Metadata Core Element Set (or Dublin Core) consists of thirteen core metadata elements;

Element name  Element description
Subject       The topic addressed by the object
Title         The name of the object
Author        The person(s) primarily responsible for the intellectual content of the object
Publisher     The agent or agency responsible for making the object available
OtherAgent    The person(s), such as editors and transcribers, who have made other significant intellectual contributions to the work
Date          The date of publication
ObjectType    The genre of the object, such as novel, poem, or dictionary
Form          The data representation of the object, such as Postscript file
Identifier    String or number used to uniquely identify the object
Relation      Relationship to other objects
Source        Objects, either print or electronic, from which this object is derived
Language      Language of the intellectual content
Coverage      The spatial locations and temporal duration characteristic of the object
Dublin Core fields, after Lagoze et al (1996), figure 1

The closest thing to a formal definition of these elements is available on the Web (Weibel & Miller 1996), and the proposed ADS interpretation is laid out below.

The Warwick Framework

The April 1996 meeting in Warwick (UK) looked at extension of the Dublin Core, and considered different possibilities for syntactically describing Dublin Core fields (Lagoze et al 1996). These deliberations, too, form an important part of the ADS' implementation, below.

Extending Dublin Core

The Dublin Core exists as a loose implementation model, rather than an established standard, and a number of people continue to work on refining the current model to increase its applicability.

Rather than wait for this evolution to slow, the ADS aims to suggest a structure of its own which obeys the Dublin Core and also incorporates much of the work emerging from the Warwick meeting.

This structure is intended to maintain the spirit of Dublin and Warwick, while at the same time specifying a framework within which the requirements of the ADS may be suitably met. It is hoped that an implementation such as this may, in itself, feed back into the ongoing debate on Dublin Core metadata and perhaps go some way towards furthering the whole.

The list of references at the end of this paper shows the huge contribution already made to this debate by those working on the Dublin Core, and shows how much has already been achieved in the short time since the first meeting in Dublin, Ohio, back in March 1995.

Concepts

Conceptually, Dublin Core represents a loose and extensible framework within which users with a wide variety of requirements may describe data of value to them. Current definitions of Dublin Core are extremely flexible and open to wide interpretation, allowing implementors of Dublin Core a great degree of latitude in their applications. Such flexibility is important in allowing the rapid take-up of a system such as this, but is less desirable within a single application such as the Archaeology Data Service. For this reason, some of the openness of Dublin Core will need to be curtailed in order to introduce a degree of ADS-wide standardisation.

Importantly, Dublin Core requires no modification to current browser and search engine technology in order to operate. This, again, facilitates rapid adoption of the Dublin Core system, but also provides scope for enhancing search engine technology to make the most of Dublin Core; searching only for the Dublin Core information contained within the <HEAD> </HEAD> tags, for example.

The Warwick Framework extends the conceptual model behind Dublin Core, and introduces the concept of storing application-specific metadata outwith the Dublin Core metadata proper, but within an associated package (Burnard et al 1996, Lagoze et al 1996).

Burnard et al discuss the relative merits of attaching metadata directly to computer files or storing the metadata separately. Given the forms of data likely to be available to the ADS, it seems most sensible to apply a mixture of both proposals. Documents such as this one, and the metadata records themselves, will have Dublin Core metadata about themselves stored within the <HEAD> </HEAD> area as shown;

  <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
  <HTML>
  <HEAD>
  
  <TITLE>An application of Dublin Core from the Archaeology Data Service</TITLE>
  
  <META NAME = "package"
        TYPE = "begin"
        CONTENT = "Dublin Core">
  
  <META NAME = "DC.title"
        CONTENT = "An application of Dublin Core from the Archaeology Data Service">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#title">
  
  <META NAME = "DC.author"
        TYPE = "name"
        CONTENT = "Paul Miller">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#author">
  
  <META NAME = "DC.author"
        TYPE = "e-mail"
        CONTENT = "a.p.miller@newcastle.ac.uk">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#author">
  
  <META NAME = "DC.date"
        TYPE = "creation"
        SCHEME = "ISO31"
        CONTENT = "1996-06-17">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#date">
  <LINK REL = SCHEMA.iso31 REFERENCE =
        "ISO 31-1:1992 Quantities & units -- Part 1: space & time">
  
  <META NAME = "DC.form"
        SCHEME = "IMT"
        CONTENT = "text/html">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#form">
  <LINK REL = SCHEMA.imt HREF =
        "http://sunsite.auc.dk/RFC/rfc/rfc1521.html">
  
  <META NAME = "DC.language"
        SCHEME = "ISO639"
        CONTENT = "en">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#language">
  <LINK REL = SCHEMA.iso639 REFERENCE =
        "ISO 639:1988 Code for the representation of names of languages">
  
  <META NAME = "package"
        TYPE = "end"
        CONTENT = "Dublin Core">
  
  </HEAD>
  

Metadata for the archaeological data files themselves will be stored in individual metadata record files separate from (but LINKed to) the actual data. See the worked examples for clarification of this.

The extensions applied to the Dublin Core for the Archaeology Data Service -- and notionally called the ahdsDescriptor -- conform to the author's interpretation of both Dublin Core and the enhancements continuing to emerge from the Warwick Framework. While the requirements of the ADS and the definitions of the Dublin Core continue to evolve, the implementation described in this paper may also alter accordingly. However, a final definition of the ADS implementation of Dublin Core will no doubt be required towards the end of 1996 as large-scale data collection and description begins in earnest. Until that time, comments on the ideas presented herein are most welcome, and should be addressed in the first instance to the author.

Specifics

The implementation of Dublin Core syntax, above, builds upon that used by several authors and, rather than introducing anything new, merely picks the 'best' bits from the suggestions of each paper.

For example, Miller proposes use of the PACKAGE concept to enclose <META> elements described by the Dublin Core. This suggestion has been adopted. However, he also suggests replacing the <META> SCHEME and TYPE attributes with '=' and ':' respectively. This suggestion has not been adopted as, although undeniably neat, it is felt to render the metadata description less clear to the human reader. The original use of SCHEME and TYPE (Weibel et al 1995) has been preserved instead, with a layout similar to that in Weibel (1996). Although this format takes more effort to create, it is felt to produce a record that is easier to understand.

The use of <LINK> suggested by Weibel (1996) is carried to a logical conclusion, with every metadata item linked to its reference description. That for Dublin Core is provided by OCLC and may be viewed at http://www.purl.org/metadata/dublin_core_elements.html. The reference descriptions for the ahdsDescriptor are defined by the Archaeology Data Service and may be viewed at http://ads.ahds.ac.uk/project/metadata/ahds_descriptor_elements.html.

Usage description for the ahdsDescriptor

As well as describing the syntax for ADS-specific metadata elements, this section outlines the recommended use of Dublin Core elements. In each case, the element name is given, followed by the associated attributes and an example of use. Associated attributes which are considered important are shown as <attribute> and those which are optional are shown as {attribute}.

In order to differentiate between elements from the Dublin Core, those from the ahdsDescriptor, and those from other sources, the convention discussed by Weibel (1996) should be used; DC is used to identify Dublin Core elements and AD to identify those from the ahdsDescriptor.

ADS recommended syntax for the Dublin Core element set

Element Name: Subject (DC.subject)
Usage: DC.subject {SCHEME} subject keywords
Example: DC.subject RCHME excavation
Notes: DC.subject is a repeatable field. Users requiring more than one SCHEME should use multiple occurrences of DC.subject accordingly. Keywords following any DC.subject should be drawn from only one SCHEME.
Acceptable schemes include:

Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#subject. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.
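
As a sketch only, the Subject example above might be written out in the <META> syntax used earlier in this paper as follows. The SCHEMA.rchme link name and its REFERENCE text are illustrative assumptions modelled on the SCHEMA.iso639 entry shown earlier, and are not values defined by the ADS;

  <META NAME = "DC.subject"
        SCHEME = "RCHME"
        CONTENT = "excavation">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#subject">
  <!-- Assumption: link name and description text are placeholders -->
  <LINK REL = SCHEMA.rchme REFERENCE =
        "(description of the RCHME scheme and its terminology)">

A second keyword drawn from a different scheme would, following the notes above, be given its own DC.subject entry rather than being added to this one.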

Element Name: Title (DC.title)
Usage: DC.title {type} Document title text
Example: DC.title main An application of Dublin Core from the Archaeology Data Service
Notes: This field simply records the title of the document.
Acceptable {type}s are;


Link: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#title

Element Name: Author (DC.author)
Usage: DC.author {type} author text
Example: DC.author e-mail A.P.Miller@newcastle.ac.uk
Notes: DC.author is a repeatable field, and should be used as many times as necessary to describe all the authors, or to provide extra information such as their e-mail addresses. Where multiple authors are cited, they should be listed in order with the primary author first.
Acceptable types include:

Link: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#author

Element Name: Publisher (DC.publisher)
Usage: DC.publisher publisher text
Example: DC.publisher Taylor & Francis
Link: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#publisher.

Element Name: OtherAgent (DC.otheragent)
Usage: DC.otheragent {scheme} {type} OtherAgent text
Example: DC.otheragent Editor Bloggs, Fred
Notes: DC.otheragent is a repeatable field, and should be used as many times as necessary to describe all relevant other agents. Valid uses of this field include describing editors, illustrators, those who transcribed non-digital records, etc.
Where no {scheme} is specified, {type} is assumed to be one of the following;

Where one of these defined {type}s is not suitable, a {scheme} such as TEI should be used to define a more appropriate {type}.
Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#other agent. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.
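
The OtherAgent example above could be sketched in <META> form roughly as follows, assuming that {type} maps onto the TYPE attribute in the same way as for DC.author earlier in this paper. No {scheme} is given, so the {type} is taken to be one of the defaults. The LINK target is copied as printed in the entry above;

  <!-- Assumption: {type} is carried by the TYPE attribute -->
  <META NAME = "DC.otheragent"
        TYPE = "Editor"
        CONTENT = "Bloggs, Fred">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#other agent">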

Element Name: Date (DC.date)
Usage: DC.date <scheme> {type} Date text
Example: DC.date ISO31 current 1996-06-17
Notes: DC.date is a repeatable field, and should be used as many times as necessary to describe any relevant temporal information. DC.date refers only to the history of the data, rather than to that which the data describe. For example, a description of Roman coins, compiled in 1763 and then transcribed onto a computer in 1996 would require DC.date creation 1763 and DC.date current 1996 entries. Any information on the Roman date ranges actually described would be covered in DC.coverage rather than here.
Valid <scheme>s are;

Valid {type}s are;

Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#date. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.
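
To illustrate the Roman coins case from the notes above, the two dates might be recorded as a pair of repeated DC.date entries. This is a sketch rather than a definitive record: the notes give the dates without a scheme, and the ISO31 scheme is added here on the assumption that it applies to year-only values as it does to the full date in the usage example;

  <!-- Date the description was originally compiled -->
  <META NAME = "DC.date"
        TYPE = "creation"
        SCHEME = "ISO31"
        CONTENT = "1763">
  <!-- Date the description was transcribed onto a computer -->
  <META NAME = "DC.date"
        TYPE = "current"
        SCHEME = "ISO31"
        CONTENT = "1996">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#date">
  <LINK REL = SCHEMA.iso31 REFERENCE =
        "ISO 31-1:1992 Quantities & units -- Part 1: space & time">

The Roman date range of the coins themselves does not appear here, as it belongs in DC.coverage.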

Element Name: ObjectType (DC.objecttype)
Usage: DC.objecttype {scheme} object type text
Example: DC.objecttype AACR2 computer file
Notes: DC.objecttype should only need to occur once in any metadata description, and it describes the category into which any resource falls.
As well as those offered by any suitable {scheme}, valid object types include;

Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#object type. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.
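
A possible <META> rendering of the ObjectType example above is sketched below. The SCHEMA.aacr2 link name and its REFERENCE text are placeholders assumed for illustration, following the pattern of the scheme links shown earlier in this paper;

  <META NAME = "DC.objecttype"
        SCHEME = "AACR2"
        CONTENT = "computer file">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#object type">
  <!-- Assumption: link name and description text are placeholders -->
  <LINK REL = SCHEMA.aacr2 REFERENCE =
        "(description of the AACR2 scheme and its terminology)">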

Element Name: Form (DC.form)
Usage: DC.form {scheme} form text
Example: DC.form IMT text/html
Notes: DC.form should only need to occur once in any metadata description of a resource, and defines the type of data available.
DC.form should, wherever possible, use a definition from the Internet Media Type (IMT) {scheme}.
Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#form. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.

Element Name: Identifier (DC.identifier)
Usage: DC.identifier {type} identifier text
Example: DC.identifier URL http://www.ncl.ac.uk/~ngraphic/
Notes: DC.identifier has two possible uses, depending upon context: it may identify the location of the file it is in, or the location of the file to which the metadata description refers. Where DC.identifier is found within the <HEAD> </HEAD> area, it is assumed to refer to the file within which it sits. Outside this area, it is assumed to refer to another file, presumably the one for which the metadata description has been compiled.
Acceptable {type}s for DC.identifier include;

Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#identifier. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.
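
As a sketch, the Identifier example above might appear within the <HEAD> </HEAD> area as shown below, in which position it would be read as giving the location of the file in which it sits. The mapping of {type} onto the TYPE attribute is assumed by analogy with the DC.author entries earlier in this paper;

  <!-- Inside <HEAD>, so taken to identify this file itself -->
  <META NAME = "DC.identifier"
        TYPE = "URL"
        CONTENT = "http://www.ncl.ac.uk/~ngraphic/">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#identifier">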

Element Name: Relation (DC.relation)
Usage: DC.relation {type} {identifier} Relation text
Example: DC.relation childof URL http://www.ncl.ac.uk/ucs/
Notes: DC.relation is a repeatable field, used to define the relationship between the resource under discussion and other closely related resources that may be of interest.
Acceptable {type}s include;

Link: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#relation.

Element Name: Source (DC.source)
Usage: DC.source {type} source text
Example: DC.source ISBN 0-201-63337-X
Notes: DC.source is a repeatable field used to point to the original source of the record under description. A digital representation of a 1st edition Ordnance Survey map, for example, would refer to the paper master using DC.source. Multiple instances of DC.source may be used where an online record amalgamates several originals.
Acceptable {type}s include;

Link: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#source.
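
The Source example above could be expressed along the following lines. This is a minimal sketch, again assuming that {type} is carried by the TYPE attribute; a record amalgamating several originals would simply repeat the <META> entry once per source;

  <!-- Assumption: {type} is carried by the TYPE attribute -->
  <META NAME = "DC.source"
        TYPE = "ISBN"
        CONTENT = "0-201-63337-X">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#source">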

Element Name: Language (DC.language)
Usage: DC.language <scheme> language text
Example: DC.language ISO639 en
Notes: DC.language defines the language in which the resource is presented. This language should be referred to in terms of ISO 639 terminology.
Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#language. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.

Element Name: Coverage (DC.coverage)
Usage: DC.coverage <type> {scheme} {extent} coverage text
Example: DC.coverage temporal York 512.1314
Notes: DC.coverage is a repeatable field, used to define the spatio-temporal coverage of the data being described. As such, the field is of great importance to the ADS.
Current schemes are poorly suited to describing the temporal complexities inherent in archaeological (and other) data. As such, the {scheme}s below are little more than stop-gap suggestions until a more suitable schema can either be found or developed.
Acceptable <type>s are;

Acceptable {scheme}s are;

Acceptable {extent}s are;

Links: Metadata descriptions including this field should contain a LINK to http://purl.org/metadata/dublin_core_elements#coverage. Where a {scheme} is used, a LINK should also be included to an on- or off-line description of the scheme and its acceptable terminology.
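
One way the Coverage example above might be written in <META> form is sketched here. It assumes that <type> and {scheme} map onto the TYPE and SCHEME attributes as they do for DC.date, and it omits the optional {extent}, as the example does. The SCHEMA.york link name and its REFERENCE text are placeholders, not values defined by the ADS;

  <META NAME = "DC.coverage"
        TYPE = "temporal"
        SCHEME = "York"
        CONTENT = "512.1314">
  <LINK REL = SCHEMA.dc HREF =
        "http://purl.org/metadata/dublin_core_elements#coverage">
  <!-- Assumption: link name and description text are placeholders -->
  <LINK REL = SCHEMA.york REFERENCE =
        "(description of the York scheme and its terminology)">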

ADS recommended syntax for the ahdsDescriptor elements

In extending Dublin Core, the ADS suggests adding the fields... {fields}.

All of these should be enclosed within

  <META NAME = "package"
        TYPE = "begin"
        CONTENT = "ahdsDescriptor">
  
  ...ahdsDescriptor elements in here...
  
  <META NAME = "package"
        TYPE = "end"
        CONTENT = "ahdsDescriptor">
  
or the equivalent syntax when outside the <HEAD> </HEAD> area.

Element Name: Precision (AD.precision)
Usage: AD.precision <type> {type2} precision text
Example: AD.precision spatial recorded 2
Notes: AD.precision is a repeatable field which defines the level of precision available within the data being described.
Acceptable <type>s include:

Acceptable {type2}s include:

A numeric code is used to describe precision. The list below shows the precision code number and its meaning under each {type} (in order: spatial; temporal; LOD);

Link: Metadata descriptions including this field should contain a LINK to http://ads.ahds.ac.uk/project/metadata/ahds_descriptor_elements#precision.

Element Name: Access Rights (AD.accessrights)
Usage: AD.accessrights access rights
Example: AD.accessrights free
Notes: AD.accessrights describes the level of access available to users of the ADS for this particular dataset. In several cases, data owners may wish to charge for data, degrade the locational information provided, or only allow certain groups to have access.
Acceptable definitions of Access Rights include;

Link: Metadata descriptions including this field should contain a LINK to http://ads.ahds.ac.uk/project/metadata/ahds_descriptor_elements#access rights.
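
Putting the package wrapper and the Access Rights example together, an ahdsDescriptor entry within the <HEAD> </HEAD> area might look something like the sketch below. The SCHEMA.ad link name is an assumption, formed by analogy with SCHEMA.dc and the AD prefix; the LINK target is copied as printed in the entry above;

  <META NAME = "package"
        TYPE = "begin"
        CONTENT = "ahdsDescriptor">

  <META NAME = "AD.accessrights"
        CONTENT = "free">
  <!-- Assumption: SCHEMA.ad mirrors the SCHEMA.dc convention -->
  <LINK REL = SCHEMA.ad HREF =
        "http://ads.ahds.ac.uk/project/metadata/ahds_descriptor_elements#access rights">

  <META NAME = "package"
        TYPE = "end"
        CONTENT = "ahdsDescriptor">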

Worked examples of the ahdsDescriptor

The easiest way to see how this scheme is meant to work is to look at a few worked examples.

In each case, the visible metadata description describes a particular resource which might conceivably be held by the Archaeology Data Service. Viewing the source of each Web page, below, will show the metadata entry relating to the metadata page itself (eg. if using Netscape, select Source... from the View menu).

Record numbers etc are merely illustrative rather than real, and the layout of the report form itself requires further thought. Developments in tools for the automatic generation of web pages, such as Clay Basket and Frontier, need to be monitored with regard to this.

Unsurprisingly, each example is from archaeology. The actual usage of Dublin Core should have a wider value, though.

Examples


References

Burnard, L., Miller, E., Quin, L. & Sperberg-McQueen, C.M., 1996, A syntax for Dublin Core metadata: Recommendations from the second metadata workshop, URL: http://info.ox.ac.uk/~lou/wip/metadata.syntax.html.

Chartrand, J.A.H. & Miller, A.P., 1994, Concordance in rural and urban database structure: the York experience, Archeologia e Calcolatori 5: pp. 203-217.

FGDC, 1994, Content standards for Digital Geospatial Metadata, Federal Geographic Data Committee, 8 June.

FGDC, 1996, Federal Geographic Data Committee WWW site, URL: http://fgdc.er.usgs.gov/.

Global Change Master Directory, 1996, Directory Interchange Format (DIF) Writer's Guide version 5, URL: http://gcmd.gsfc.nasa.gov/difguide/difman.html.

Knight, J., 1996, MIME implementation for the Warwick Framework, Draft document, dated 29 May 1996, URL: http://www.roads.lut.ac.uk/MIME-WF.html.

Lagoze, C., Lynch, C.A. & Daniel Jnr, R., 1996, The Warwick Framework: A container Architecture for aggregating sets of metadata, URL: http://cs-tr.cs.cornell.edu:80/Dienst/UI/2.0/Describe/ncstrl.cornell%2fTR96-1593?abstract=warwick.

Miller, E.J., 1996, An approach for packaging Dublin Core metadata in HTML 2.0, URL: http://www.oclc.org:5046/~emiller/publications/metadata/minimal.html.

Miller, E.J., n.d., Issues of document description in HTML, URL: http://www.oclc.org:5046/~emiller/publications/metadata/issues.html.

NSDI, 1996, National Spatial Data Infrastructure WWW homepage, URL: http://fgdc.er.usgs.gov/nsdi2.html.

Schwartz, M., 1996, Report of the W3C Distributed Indexing/ Searching Workshop, URL: http://www.w3.org/pub/WWW/Search/9605-Indexing-Workshop/.

Sperberg-McQueen, C.M., 1996, On information factoring in Dublin metadata records, URL: http://www.uic.edu/~cmsmcq/tech/metadata.factoring.html.

TEI, 1996, Text Encoding Initiative WWW homepage, URL: http://www.uic.edu/orgs/tei.

Weibel, S., 1995, Metadata: the foundations of resource description, D-Lib Magazine July 1995, URL: http://www.dlib.org/dlib/July95/07weibel.html.

Weibel, S., 1996, A proposed convention for embedding metadata in HTML, URL: http://www.oclc.org:5046/~weibel/html-meta.html.

Weibel, S., Godby, J., Miller, E. & Daniel, R., 1995, OCLC/NCSA Metadata Workshop Report, URL: http://www.oclc.org:5047/oclc/research/publications/weibel/metadata/dublin_core_report.html.

Weibel, S. & Miller, E.J., 1996, Dublin Core Metadata Element Set WWW homepage, URL: http://purl.org/metadata/dublin_core.


Modification history

1996-06-17
Document first made available

1996-06-18
Added remaining {type} and {scheme} qualifiers from Miller n.d., Appendix 2
Added extra {type} and {scheme} qualifiers
Enhanced list of references

1996-06-21
Added to ahdsDescriptor section
Increased internal consistency
Added first worked example
Created ahds_descriptor_elements page

1996-06-23
Began enhancement of temporal referencing schemes

1996-07-03
Tidied up, and increased external referencing

1996-07-10
Alterations made to reflect new location of document on ADS website
