FAIR data

The ADS is an advocate for the FAIR principles for data stewardship. As such, the ADS recognises that while preservation and dissemination of data remain of core importance, stewardship should also include demonstrable quantitative and qualitative evidence of data reuse. The ADS is actively investigating how the datasets it curates can be fully compliant with the FAIR principles and is working within SSHOC, ARIADNEplus and E-RIHS to promote this.

As a result, when you deposit your datasets with the ADS, you can be confident that your data becomes FAIR data.

What is FAIR Data?

The FAIR Principles provide an important framework to evaluate and publish data in order to facilitate discovery, provide sustainable access to resources, and encourage and enable better sharing and reuse of data. To achieve these goals the core principles emphasise:

Findability:

improving the discoverability of data through the use of appropriate documentation and metadata, and supporting the use of sustainable referencing of resources.

Accessibility:

ensuring the sustainable availability of digital assets.

Interoperability:

providing both syntactically parseable and semantically understandable datasets and metadata, and facilitating data exchange and reuse between researchers, organisations, and institutions across national and international boundaries.

Reusability:

sufficiently documenting and sharing data using the least restrictive licences possible, thereby facilitating data reuse and supporting the integration of other data sources.

In an environment where humans increasingly rely on computational systems and processes to find, access, interoperate, and reuse data, these principles emphasise machine-actionability with limited, or minimal, human intervention.

Find out more about the FAIR data principles via the Force11 community, GO FAIR, or OpenAIRE.

How is ADS data FAIR data?

Each of the FAIR Principles and sub-principles is described below, along with the specific ways in which the ADS ensures compliance with all aspects of FAIR.

Findable

F1. (Meta)data are assigned a globally unique and persistent identifier.

For a fuller discussion of the ADS metadata and the use of persistent identifiers see our Metadata policy and procedures pages.

The ADS uses Digital Object Identifiers (DOIs) as persistent identifiers for all collections.
The ADS supports the use of ORCID iDs.
The ADS supports the use of Wikidata Q codes.
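
Because a DOI can be written in several forms (bare, with a "doi:" prefix, or as a resolver URL), software that consumes identifiers usually normalises them first. The following is a minimal sketch of that step, not an ADS tool: the function names are hypothetical, the DOI shown is a placeholder, and the pattern is a deliberately loose check of the common "10.registrant/suffix" shape.

```python
import re

# Loose check for the common "10.<registrant>/<suffix>" DOI shape.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes under which a DOI is commonly written.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw: str) -> str:
    """Return the bare DOI, or raise ValueError if it does not look like one."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {raw!r}")
    return doi


def doi_to_url(raw: str) -> str:
    """Build the HTTPS resolver URL for a DOI given in any common form."""
    return "https://doi.org/" + normalise_doi(raw)


# Placeholder DOI, for illustration only.
print(doi_to_url("doi:10.1234/example-collection"))
```

The resolver URL is the form that should be cited and stored, since it remains valid even if the landing page it redirects to moves.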

F2. Data are described with rich metadata (defined by R1 below).

All ADS resources are documented using the Dublin Core Metadata Element Set (DCMES) plus Dublin Core Metadata Initiative (DCMI) recommended qualifiers.
The ADS also provides rich qualitative and technical metadata for all digital objects. These are repository-specific metadata requirements, derived from domain-specific community standards (i.e. Guides to Good Practice, see also R1.3 below).
All metadata is displayed alongside data, with technical metadata downloadable in open formats.

F3. Metadata clearly and explicitly include the identifier of the data they describe.

All persistent identifiers for ADS collections are clearly displayed, alongside data, within each archive interface.
The ADS supports the use of additional or supplemental identifiers relating to the dataset that link to external repositories, agencies or resources. This includes identifiers for physical, as well as digital, collections.

F4. (Meta)data are registered or indexed in a searchable resource.

ADS datasets are findable through the repository's own indexes and catalogues.

ADS collections are also available through external catalogues and resources.

ADS catalogues and indexes are searchable and harvestable through a series of OAI-PMH targets, and as linked open data using a SPARQL query web interface.
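
As an illustration of what harvesting and querying look like from the client side, the sketch below builds an OAI-PMH ListRecords request and a SPARQL GET request using only the standard library. The verb, the oai_dc metadata prefix and the query parameter come from the OAI-PMH and SPARQL protocol specifications; the endpoint URLs and helper names are placeholders and assumptions, not the actual ADS targets.

```python
from urllib.parse import urlencode

# Placeholder endpoints: substitute the real OAI-PMH target and SPARQL service.
OAI_ENDPOINT = "https://example.org/oai"
SPARQL_ENDPOINT = "https://example.org/sparql"


def oai_list_records_url(endpoint, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional selective harvesting of a single set.
        params["set"] = set_spec
    return f"{endpoint}?{urlencode(params)}"


def sparql_query_url(endpoint, query):
    """Build a SPARQL protocol GET request URL for a query string."""
    return f"{endpoint}?{urlencode({'query': query})}"


# Example query: the first five resources that have a Dublin Core title.
QUERY = (
    "SELECT ?s ?title WHERE { "
    "?s <http://purl.org/dc/terms/title> ?title } LIMIT 5"
)

print(oai_list_records_url(OAI_ENDPOINT))
print(sparql_query_url(SPARQL_ENDPOINT, QUERY))
```

Both URLs can then be fetched with any HTTP client. A full harvester would also need to follow the resumptionToken that OAI-PMH uses to page through large result sets.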

Accessible

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol.

All ADS datasets utilise the HTTPS protocol to ensure free and open access to resources and to facilitate data retrieval.
In rare instances, where discrete data objects are too large to support easy exchange using HTTPS, the ADS makes data available ‘on request’ using free and open exchange services (e.g. University of York DropOff Service).
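
To make retrieval by identifier concrete, the sketch below prepares (but does not send) an HTTPS request for a DOI. It assumes the DOI's registration agency supports content negotiation through doi.org, as DataCite does, so that an Accept header can ask for machine-readable metadata instead of the landing page. The DOI is a placeholder and the helper name is hypothetical.

```python
from urllib.request import Request

# Media type for DataCite JSON metadata; an assumption that the DOI
# in question is registered with an agency that serves this format.
DATACITE_JSON = "application/vnd.datacite.datacite+json"


def build_metadata_request(doi, media_type=DATACITE_JSON):
    """Prepare an HTTPS GET request that resolves a DOI via doi.org."""
    return Request(
        f"https://doi.org/{doi}",
        headers={"Accept": media_type},
        method="GET",
    )


# Placeholder DOI, for illustration only.
req = build_metadata_request("10.1234/example-collection")
print(req.full_url)
print(req.get_header("Accept"))

# To actually send the request:
#   from urllib.request import urlopen
#   with urlopen(req, timeout=30) as response:
#       body = response.read()
```

Omitting the Accept header gives the ordinary behaviour: the resolver redirects to the human-readable collection page.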

A1.1 The protocol is open, free, and universally implementable.

The ADS uses the HTTPS protocol for the sharing of resources and transfer of datasets. This is widely supported, open, and freely available.
The repository utilises open and free file-sharing services where files or datasets are too large for easy exchange using HTTPS. Typically the ADS utilises the open and free University of York DropOff Service to share data when this is necessary.

A1.2 The protocol allows for an authentication and authorisation procedure, where necessary.

The use of HTTPS provides authentication of the ADS website, and protects the privacy and integrity of disseminated data. The repository ensures that all server-side digital certificates are kept current.

A2. Metadata are accessible, even when the data are no longer available.

As an accredited digital repository, the ADS supports long-term preservation of, and access to, its holdings; consequently, all datasets and metadata are maintained in perpetuity.
The ADS maintains a clear Appraisal and Deaccession Policy which outlines current practice for datasets removed from the archive's holdings. In such instances the ADS is committed to supporting identifiers (i.e. DOIs), maintaining resource discovery metadata, and updating current information on resources.

Interoperable

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

All resource discovery metadata is made available as qualified Dublin Core in RDF/XML through the ADS Linked Data repository.
External services also consume and disseminate metadata.
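
The value of a shared representation is that generic tools can read it. As a rough sketch, the code below parses a small Dublin Core record in RDF/XML with the Python standard library. The sample record is invented for illustration and is not actual ADS output; the property names use the real DCMI Terms namespace, and the describe function is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
}

# Invented sample record, for illustration only.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="https://doi.org/10.1234/example-collection">
    <dcterms:title>Example excavation archive</dcterms:title>
    <dcterms:creator>Example Unit</dcterms:creator>
    <dcterms:license rdf:resource="https://creativecommons.org/licenses/by/4.0/"/>
  </rdf:Description>
</rdf:RDF>
"""


def describe(rdf_xml):
    """Extract identifier, title and licence from each rdf:Description."""
    root = ET.fromstring(rdf_xml.strip())
    records = []
    for desc in root.findall("rdf:Description", NS):
        licence = desc.find("dcterms:license", NS)
        records.append({
            # Attributes are addressed by their fully qualified name.
            "identifier": desc.get(f"{{{NS['rdf']}}}about"),
            "title": desc.findtext("dcterms:title", default=None, namespaces=NS),
            "licence": (
                licence.get(f"{{{NS['rdf']}}}resource")
                if licence is not None else None
            ),
        })
    return records


for record in describe(SAMPLE):
    print(record)
```

For anything beyond simple flat records, a dedicated RDF library is a better fit than an XML parser, because the same RDF graph can be serialised as XML in more than one way.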

I2. (Meta)data use vocabularies that follow FAIR principles.

For a wider discussion on the vocabularies used in ADS metadata see our Strategy and Standards page.

The ADS uses a variety of sustainable, open vocabularies to qualitatively classify and identify resources and datasets, including:

Heritage Data vocabularies, including those provided by the Forum on Information Standards in Heritage (FISH), Historic England (HE), Historic Environment Scotland (HES), and the Royal Commission on Ancient & Historical Monuments of Wales (RCAHMW)
Library of Congress Subject Headings (LCSH)
Marine Environmental Data and Information Network (MEDIN)
Getty Thesaurus of Geographic Names (TGN)

The ADS also utilises recognised technical vocabularies to denote and categorise preservation activities:

PREservation Metadata: Implementation Strategies (PREMIS)
Getty metadata types (Baca 2016)

I3. (Meta)data include qualified references to other (meta)data.

The ADS supports qualified referencing within and between publications, datasets and resources. Where available the repository uses sustainable referencing, e.g. DOIs.

Reusable

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes.

R1.1. (Meta)data are released with a clear and accessible data usage license.

All ADS resources have clearly defined terms of access and reuse within each collection interface, and within metadata records distributed by the ADS or externally. Typically, data is disseminated under the terms of Attribution 4.0 International (CC BY 4.0), but data may also be disseminated under other Creative Commons licences (see also the ADS Terms of Use and Access to Data).

R1.2. (Meta)data are associated with detailed provenance.

The ADS provides detailed provenance metadata for all data. At collection level this is expressed in the archive interface and discovery metadata; at file level it is recorded in the technical metadata disseminated alongside the data.

R1.3. (Meta)data meet domain-relevant community standards.

The ADS utilises a qualified Dublin Core metadata standard for all collection level metadata (noted above). The repository also uses standardised templates to ensure metadata consistency. All data must be accompanied by appropriate, file-specific ‘technical’ metadata, which is derived from recognised community standards (Guides to Good Practice) to ensure consistency. All (meta)data is accepted, preserved and disseminated in sustainable, open formats. These are expressed in the ADS Instructions for Depositors and the ADS Policy and Procedures. The repository employs appropriate vocabularies to qualitatively describe datasets (noted above) and document preservation actions.

Resources

Collins, S., Genova, F., Harrower, N., Hodson, S., Jones, S., Laaksonen, L. et al. (2018). Turning FAIR into reality. Final Report and Action Plan from the European Commission Expert Group on FAIR Data.

FAIRsFAIR (Fostering FAIR Data Practices in Europe) aims to supply practical solutions for the use of the FAIR data principles throughout the research data life cycle. Emphasis is on fostering FAIR data culture and the uptake of good practices in making data FAIR.

Force11 aims to improve research practices by supporting innovations in the ways knowledge is created and shared across research disciplines, communities, sectors and timeframes. This includes a group working on the FAIR principles.

GO FAIR Initiative is a bottom-up, stakeholder-driven and self-governed initiative that aims to implement the FAIR data principles. It offers an open and inclusive ecosystem for individuals, institutions and organisations working together.

OpenAIRE aims to shift scholarly communication towards openness and transparency and facilitate innovative ways to communicate and monitor research. An OpenAIRE Task Force on Research Data Management is active in creating materials for supporting FAIR.

References

Baca, M. (ed.) (2016). Introduction to Metadata. Getty Research Institute: Los Angeles. https://www.getty.edu/publications/intrometadata/, accessed 05/11/2020.
Bezjak, S., Clyburne-Sherin, A., Conzett, P., Fernandes, P., Görögh, E., Helbig, K. et al. (2018). Open Science Training Handbook (Version 1.0). Zenodo. http://doi.org/10.5281/zenodo.1212496, accessed 05/11/2020.
