In early 2018, as part of the ADS strategic plan to maintain and develop our world-leading position in digital preservation and Open Access publishing in Archaeology, the ADS management team commissioned a Business Analyst at the University of York (Jamie Holliday) to provide an external, critical, yet friendly review of the work of the ADS and Internet Archaeology. The aim was to identify opportunities to improve our service delivery, processes, management practices and staff development. The review took a mainly qualitative approach, using a balanced scorecard methodology, looking at ADS from the perspective of:
Learning & Growth
The review also commented on more general strategic issues that emerged, including succession planning, achieving clarity of vision and improving our financial position to allow for increased reinvestment. A follow-up review, conducted by the University’s Assistant Director of Information Services and Head of IT Infrastructure, Arthur Clune, focused on ADS Technical Systems. The reports, recommendations and ADS Action Plans were received by the ADS Management Committee in October 2018, although there is ongoing work on charging models.
The most immediate and visible impacts of the review have been some changes to ADS roles and staffing. In September 2018, with the departure of Louisa Matthews to undertake a PhD at the University of Newcastle, we took the opportunity to create a new post, held by Katie Green. Whilst it has the job title of Collections Development Manager, it actually combines aspects of this role with that of her former job as Communications and Access Manager. Other aspects of the former CDM role have been taken by Ray Moore, our new Archives Manager. Ray is now the first port of call for archive costings, and also oversees the day-to-day work of the archivists. The most recent change is that we have appointed a Deputy Director to oversee operations management: Tim Evans, who joined ADS in 2006 as ALSF Digital Archivist and is currently HERALD project manager, will take up this post from December. Tim will retain responsibility for oversight of HERALD, the OASIS redevelopment project, and will also begin to represent ADS in a broad range of external partnerships. Finally, we hope soon to be looking to appoint at least one Digital Archives Assistant, an entry-level trainee grade for budding archivists.
As it’s World Digital Preservation Day I thought I’d finish the following blog about our work with managing the digital objects within our collection. Like most of my blogs (including the much awaited sequel to Space is the Place) these often languish for a while awaiting a final burst of input. To celebrate WDPD 2018, here we go….
I half-heartedly apologise for the self-indulgent title to this blog, which most readers will know is taken from Rutger Hauer’s speech in the film Blade Runner (apparently he improvised). Aside from being an unashamed lover of the original film, like Roy Batty in the famous rooftop finale I’ve recently been prone to reflection on the events I’ve witnessed [at the ADS] over the last few years. In all honesty these aren’t quite on a par with “Attack ships on fire off the shoulder of Orion”, but perhaps as impressive in their own little way.
This reflection isn’t prompted by any impending doom – that I’m aware of – but rather that some of my recent work has been looking at the work the ADS has done in the context of the last two decades, for example looking at the history of OASIS as we move further into the redevelopment project, and revisiting the quantities and types of data we store as we find we’re rapidly filling up file servers. Along with this is a sudden realisation that after so long here I have become part of the furniture (I’ll let the reader decide which part!). However, as colleagues inevitably leave, and although we do take the utmost care to document decisions (meeting minutes, copies of old procedural documents etc), the institutional memory sometimes becomes somewhat blurred, even taking on mythical status: “We’ve always done it like that”, “That never worked because…”, “So-and-so was working on that when they left”, “A million pounds” and so on.
Short of uploading Julian’s consciousness to an AI, which even with our best efforts we’re still some way off perfecting, there’s a danger of much of this internal history becoming lost (like tears in the rain). Over the past few years I’ve quite enjoyed talking – mainly to peers within the wider Digital Preservation community – about issues and problems/successes at the ADS. And just recently I gave a talk at CAA (UK) in Edinburgh about the twenty year journey of the ADS, from one of the 5 AHDS centres to a self-sustaining accredited digital archive. The talk itself didn’t have a particularly large audience, perhaps a result of the previous night’s party (the conference as a whole was welcoming and well-organised) or the glittering papers in the parallel session, plus on this occasion I think I struggled to get 20 years of history into exactly 20 minutes!
The main thing I really wanted to communicate to people was quite how far the ADS have come technically and conceptually, from our beginnings in 1996, where we are now, and more importantly where we want to be. As a previous blog has covered in massive detail (WITH GRAPHS!) our holdings have grown considerably over the years, with associated problems in finding room to store things. Another issue as we surge past 2.5 million files is increasing the capacity for our users (and us!) to find things. As I showed an enraptured audience at CAA we’ve come a long way from 2006 (when I joined) when we were running 2 or 3 physical servers, to the present day where we have a dispersed system of nearly 40 virtual machines with a range of software, which in turn support a large array of tools, applications and services that underpin our website(s), and the flows of data we provide to third parties.
I always think this is an unseen part – to many outsiders – of what the ADS do, and along with the procedures we have for actually being an archive there’s a whole lot of work going on underneath what we make visible to our users. In the talk to CAA I used the common analogy of a swan, what you see is the website, what you don’t see are the feet paddling away underneath. This doesn’t detract from the website of course, a commitment to providing access to data has always been a fundamental part of what and who we are. It’s as frustrating to us as to a user when someone can’t find what they’re looking for, especially when they know it exists. Which is why it is interesting (and I really think it is) to look at how we manage our data, and to make the ‘ADS Swan’ as efficient as possible.
For example, back in the old days (2006) interfaces to data were effectively hard coded into web-pages using the ColdFusion platform (CFML) as an interface between the XHTML and underlying file server and database. This was ok in its way, although it still required someone to either code in links to files or generate file listings in the page (or via separate scripts or commands). A common source of many broken links of this era is simply human error in generating these lists and replicating them in the web-page.
Of course, even at the time my colleagues were aware that this was not the most efficient way we could work, and even the functions of ColdFusion (and its successors Open BlueDragon and Lucee) that generated listings directly in the code were still reliant on someone actually setting which directory was needed and how to handle the results directly in the page. Not great for when we had to update things… There was also an issue of the information displayed in the page: effectively, you came to an archive, scrolled through and were presented with descriptions that were often little more than the file-name. There was also the massive issue of a disconnect between the files and the interface: actual file-level metadata was only stored in the files (e.g. CSV) in the file store. Our Collections Management System (CMS) stored lots of information about the collection, and we knew it had files in it, but not the details. Any fixing, updating, migrating, querying all had to be done by hand, which was fine when we only had a small number of collections but presented problems when scaling up. Effectively, we had to get our files (or objects/datastreams) into some sort of Digital Asset Management System. Cue project SWORD-ARM…
This project is probably deserving of its own essay; suffice to say we investigated using Fedora (Commons, as it later became) as a DAM for storing all the lovely rich technical and thematic metadata we collect, and perhaps most importantly had already collected (we already had several hundred collections of nearly a million files at this point). In short, an implementation of Fedora to suit our needs was deemed too complicated, and with too high a level of subsequent software development and maintenance for us to sustain. At that point – and again to our understanding and needs – if even deleting a record required issuing a ticket for our systems team (the magnificent Michael and prodigious Paul at that point), then we were onto a loser. For our needs, perhaps all we needed was a database and a programming language…
The heroes of this story were undoubtedly Paul Young, Jenny Mitcham, Jo Gilham and Ray Moore who between them created an extension to our existing CMS: the Object Management System (OMS). The OMS is really too big to explore in too much detail, but the design of it was based on three overarching principles:
To manage our digital objects in-line with the PREMIS data model
To store accurate and consistent technical file-metadata
To store thematic metadata about the content (what does the file show/do?)
The ambition was, and still is, to have a situation where a user provides much of this information ready-formed courtesy of an application such as ADS-EASY or OASIS. But most important, I believe (and for this blog not to derail into masses of detail), was the move towards an implementation of the semantic units as defined in PREMIS. To explain, consider the shapefiles below.
In our traditional way of doing things we just had a bunch of files on a server. Here, we have the files in the database but also a way of classifying and grouping them to explain what they are. So for example, a Shapefile has commonly used the dBase IV format (.dbf) for storing attributes; we also get .dbf as stand-alone databases. We need to know that this .dbf is part of a larger entity, and should only be “handled” as part of that entity. In this case a Shapefile is normalized to GML (3.2) for preservation, and zipped up for easy dissemination. All of these things are part of the same representation object, we need to keep them together however dispersed they are across servers, associate them with the correct metadata, and plan their future migration accordingly.
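As a rough illustration of the grouping idea described above, the sketch below collects files that share a folder and base name into a single Shapefile object, while a stand-alone .dbf remains an object in its own right. This is not the OMS code: the function name, the dictionary layout, the list of component extensions and the example paths are all assumptions made purely for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Illustrative subset of the extensions that travel together as one Shapefile.
SHAPEFILE_PARTS = {".shp", ".shx", ".dbf", ".prj"}


def group_representations(paths):
    """Group file paths into objects; a .dbf only joins a Shapefile
    when a .shp with the same folder and base name is present."""
    by_stem = defaultdict(list)
    for p in map(PurePosixPath, paths):
        by_stem[(str(p.parent), p.stem)].append(p)

    groups = []
    for key in sorted(by_stem):
        members = by_stem[key]
        suffixes = {m.suffix.lower() for m in members}
        if ".shp" in suffixes:
            parts = [m for m in members if m.suffix.lower() in SHAPEFILE_PARTS]
            rest = [m for m in members if m.suffix.lower() not in SHAPEFILE_PARTS]
            groups.append({
                "type": "shapefile",
                "name": key[1],
                "files": sorted(str(m) for m in parts),
            })
        else:
            rest = members
        # Anything not absorbed into a Shapefile is handled on its own.
        for m in rest:
            groups.append({"type": "file", "name": m.name, "files": [str(m)]})
    return groups


if __name__ == "__main__":
    # Hypothetical paths, not taken from a real collection.
    example = [
        "gis/trenches.shp",
        "gis/trenches.shx",
        "gis/trenches.dbf",
        "db/finds.dbf",
    ]
    for group in group_representations(example):
        print(group["type"], group["name"], group["files"])
```

Here the .dbf beside trenches.shp is only ever handled as part of the Shapefile, whereas finds.dbf is treated as a database in its own right.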
And of course this is where we can store all our lovely technical and thematic metadata. For example I know for any object:
When it was created
What software created it
Who created it
Who holds copyright
Its subject (according to archaeological understanding)
The file type – according to international standards of classification
Its content type
If it’s part of a larger intellectual entity
And we’re close to also fully recording an object’s life-cycle within our system:
When it was accessioned
When it was normalized – and the details of this action
When it was migrated
If it was edited
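To make the two lists above a little more concrete, here is a minimal sketch of how a single object might carry both its descriptive/technical metadata and a dated record of life-cycle events. The class and field names are hypothetical; they are neither the actual OMS schema nor the official PREMIS semantic unit names, and the example values are invented.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Event:
    kind: str          # e.g. "accession", "normalization", "migration", "edit"
    when: date
    detail: str = ""


@dataclass
class DigitalObject:
    filename: str
    created: date
    creating_software: str
    creator: str
    copyright_holder: str
    subject: str
    file_format: str
    content_type: str
    part_of: Optional[str] = None            # larger intellectual entity, if any
    events: List[Event] = field(default_factory=list)

    def record(self, kind, when, detail=""):
        """Append a life-cycle event to this object."""
        self.events.append(Event(kind, when, detail))

    def history(self, kind=None):
        """Return events in date order, optionally filtered by kind."""
        selected = [e for e in self.events if kind is None or e.kind == kind]
        return sorted(selected, key=lambda e: e.when)


if __name__ == "__main__":
    # All values below are invented for illustration.
    obj = DigitalObject(
        filename="trenches.shp",
        created=date(2010, 5, 1),
        creating_software="example GIS package",
        creator="A. N. Archaeologist",
        copyright_holder="Example Unit",
        subject="excavation trenches",
        file_format="ESRI Shapefile",
        content_type="GIS",
        part_of="example collection",
    )
    obj.record("normalization", date(2012, 3, 2), "converted to GML 3.2")
    obj.record("accession", date(2012, 3, 1))
    for event in obj.history():
        print(event.when, event.kind, event.detail)
```

Keeping events as dated entries attached to the object, rather than as notes elsewhere, is what would allow questions such as "when was this normalized, and how?" to be answered by a query.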
I’ve deliberately over-simplified a very complicated process there as I’m running out of words. But suffice to say that the hard work many people (including current colleagues Jenny and Kieron) have put in on developing this system is nearing a stage where the benefits of all this are tantalizingly close.
Now, readers from a Digital Preservation background will understand how that’s essential for how we need to work. The lay reader may well be thinking of the benefit to them. Put simply, this offers the chance to explore our objects having an independence away from their parent collections. For example, when working on the British Institute in Eastern Africa Image Archive (https://doi.org/10.5284/1038987) Ray built this specialised interface for cross-searching all the images. In this case all the searching is done on the metadata for the object representation, so for example:
It’s not too much of a jump to see future versions of the ADS website look to incorporate cross-collection searching. Allowing people quick, intuitive access to the wealth of data we store and perhaps, a way to cite the object… Something to aim for in a sequel at least.
Anyway, as always if you’ve made it this far thanks for reading
Journey into the archive with our new online gallery.
The Wonders of the ADS is a digital exhibition dedicated to highlighting the outstanding digital data held in the ADS archive.
The Wonders of the ADS digital exhibition developed out of a collaborative project with Carlotta Cammelli, a Leeds University MA Art Gallery and Museum Studies student as part of her Masters dissertation. The project entitled Unearthing the Archive: Exploring new methods for disseminating archaeological digital data aimed to develop an innovative online approach to present specific digital objects (such as photographs, drawings, documents, videos and 3D data files) from the ADS collections in order to increase public engagement with the data in our archive.
Traditionally the ADS is used by researchers with specific interests in mind. The structure of the ADS into individual archives also means that sometimes interesting material can be buried within the vast quantity of data held by the ADS.
Following the closure of Birmingham Archaeology (BUFAU), a project was initiated to identify and secure important born-digital archival material, and latterly to arrange transfer to the ADS. I’ve had the pleasure of archiving this digital material, including images, CAD files, databases and GIS over the last few months. The archives and reports of Birmingham Archaeology can now be accessed from the overview page: http://archaeologydataservice.ac.uk/archives/view/1959/
A total of 68 BUFAU archives have been released. Below I will highlight some of my favourite archives that I have worked on over the last couple of months.
Ahead of the redevelopment of Derby Inner Ring Road, Birmingham Archaeology was commissioned to undertake archaeological fieldwork. This site consisted of several different archaeological investigations, including a watching brief, an evaluation, an excavation and an historic building recording. Stratified archaeological deposits spanned a period from the 11th to the 20th centuries. This archive includes an extensive image gallery, reports, CAD files and GIS.
The ADS are pleased to announce that the ADS Library will be moving out of its Beta phase and going live on Tuesday 16th January. Concurrently with this the ADS will also be launching a newly designed website. The main aim of the new website design is to make it easier for our users to access our searchable resources. With the launch of the ADS Library the ADS now provides three main heritage environment search tools:
Each of these tools should be used to search for different types of information held by the ADS. Archsearch is for searching metadata records about monuments and historic environment events in the UK. The ADS Archives is the place to search for historic environment research data (such as images, plans, databases) and contains international and UK data. The ADS Library is a bibliographic tool for searching for written records on the historic environment of Britain and Ireland. Where possible, the record will provide a direct link to the original publication or report.
In order to make the differences between these search tools clear to users, and to make all three tools easy to find from our main website, we will be introducing a new website menu with drop-down links that enable a user to go straight to each of our search resources. This new drop-down menu can be seen in the image on the right.
Users will also be given the option to access a main search page that will explain the differences between each of the available search options. This page will then allow you to choose which search facility to send your chosen keywords to.
The ADS has also taken this opportunity to redesign the layout of our website, creating a bold new home page, designed to better highlight our featured collections and news items, while providing links to our new search and deposit pages.
Our new Deposit page will also provide clearer links to the different types of data deposit options available to researchers wishing to archive data with the ADS.
Our new About page provides clear links to our operations policies and details of our governance.
The new design will include a help tab on our menu with links to frequently asked questions and our contact details, allowing users to troubleshoot problems faster and get the right help quicker.
The new design will reduce the number of main tabs in the menu. This means that some of our resources have moved location. For example our Teaching and Learning page will now be found under the Advice tab. However, despite the reduction in the number of main options on the menu, the introduction of the drop-down feature will mean that, in practice, more pages will be directly accessible from the menu than previously. Overall the new design will surface the most important pages of our website better and make our key resources accessible via fewer clicks.
Although the design and structure of the website has changed, and some things may now be found in a different location, very few URLs have changed. Only out-of-date pages have been removed so bookmarks to specific pages should still work, and Archsearch, the ADS Archives and the ADS Library are still navigated in exactly the same way. If you have any trouble finding resources please contact firstname.lastname@example.org .
On 30th November 2017 the first ever International Digital Preservation Day will draw together individuals and institutions from across the world to celebrate the collections preserved, the access maintained and the understanding fostered by preserving digital materials.
The aim of the day is to create greater awareness of digital preservation that will translate into a wider understanding which permeates all aspects of society – business, policy making and personal good practice.
To celebrate International Digital Preservation Day ADS staff members will be tweeting about what they are doing, as they do it, for one hour each before passing on to the next staff member. Each staff member will be focusing on a different aspect of our digital preservation work to give as wide an insight into our work as possible. So tune in live with the hashtags #ADSLive and #idpd17 on Twitter or follow our Facebook page for hourly updates. Here is a sneak preview of what to expect and when:
To mark the 2017 Open Access week, we thought it would be a good time to introduce the winner of our first Open Access Archaeology fund award (see our original announcement here), decided on after much deliberation and consideration by the panel of 3 independent judges. So…
Chris Whittaker carried out a survey at Breedon on the Hill, a multi-period hilltop site, as part of his undergraduate dissertation at Newcastle University, supervised by Dr Caron Newman. After graduating he worked outside archaeology in the technology sector. However conscious that his data was potentially at risk, he applied to the fund to help preserve the data and publish his findings. He has since started to study for a research master’s in settlement archaeology at Newcastle University.
The judges felt that Chris’ proposal – Breedon Hill, Leicestershire: an archaeological investigation at the multi-period hilltop site – was “an important site and methodically-collected dataset, which made good use of both Internet Archaeology and ADS, with the data having considerable potential for re-use to inform future fieldwork”.
About Breedon Hill
Breedon Hill, Leicestershire is a scheduled ancient monument. The hilltop was the site of a univallate hillfort present from the Early-Middle Iron Age. From the 7th century AD, a minster church was founded within the hillfort enclosure. Today, approximately two-thirds of the Iron Age rampart, and much of the hillfort interior, has been irretrievably lost due to quarrying (Figure 2). The investigation combined magnetometry and resistivity geophysical surveys, alongside digital terrain models (processed LIDAR data), to contribute to the understanding of the character and development of the hillfort interior and its immediate environment. Very little is known about the different phases of occupation at the hilltop, as previous excavations have primarily focussed on the ramparts, and so Chris’ investigation sought to address this issue.
The results of Chris’ geophysical survey reveal several phases of roundhouses and post-hole built structures, as well as several potential associated enclosures, in the south-eastern part of the hillfort interior. These will be published as part of a future open access article in Internet Archaeology and will link to a related digital archive deposited with the Archaeology Data Service. We are looking forward to working with Chris in the coming months.
Chris said “The work was undertaken while I was an undergraduate student, firstly as part of an independent summer research programme (processing the LIDAR data), and secondly as part of an undergraduate dissertation (undertaking the geophysical survey). Publisher or institutional paywalls are often barriers for local researchers to study the world around them. And I know from personal experience that projects such as the digitisation of volumes of the Derbyshire Archaeological Journal, preserved with the ADS, are of great benefit to local and school-level research alike. From a research perspective [open access] offers many opportunities for colleagues from different backgrounds to build on and potentially refine the resources preserved.”
And now, we start all over again…
As you know, the Open Access Archaeology fund is made up of donations, set aside to support the digital archiving and publication costs of those researchers for whom funding is simply not available despite research quality and whose digital data is potentially at greater risk.
Thank you to everyone for your support for our #OAFund which is now being used to support the open access dissemination of Chris’ work. Of course, in making the first award, we now need to start all over again to raise sufficient funds for the next round to help more early career and independent researchers like him. So please consider donating today and help to reduce the barriers to open archaeological research and advance knowledge of our shared human past.
Nine months ago, we launched our Open Access Archaeology Fund. We have sent our little USB trowels all over the globe by way of a ‘thank you’ and we have been thrilled with everyone’s generosity, not least in such austere times.
So, it makes us even happier to say that sufficient funds have now been accrued and we are in a position to make our first award to cover costs of an unfunded proposed archive or article. (Full details of eligibility can be found here)
So if you or someone you know, has already submitted an article proposal or approached ADS about an archive for which you have no funding, then you can apply to the fund today.
Have you donated yet?
The successful application will likely deplete the fund substantially but we did not want to delay making the first award – it is infinitely preferable that the benefits of the fund can be fast and tangible. However we need more donations to do it all again in 6 months’ time!
Every donation you make helps to ensure that more archaeological research is open and accessible.
In December of last year (2016), I completed the final stage of the digital archive and dissemination for The Rural Settlement of Roman Britain project. The first publication and (revised) online resource were launched at a meeting of the Society for the Promotion of Roman Studies at Senate House of the University of London.
I’ve written previous blogs on the project, so won’t repeat myself here too much. Suffice to say that the final phase publishes the complete settlement evidence from Roman England and Wales, together with the related finds, environmental and burial data. These are produced alongside a series of integrative studies on rural settlement, economy, and people and ritual, published by the Society for the Promotion of Roman Studies as Britannia Monographs. The first volume, on rural settlement, has now been published, while the two remaining volumes will be released in 2017 and 2018.
The existing online resource has been updated both in content and functionality: the project database is available to download in CSV format, and most key elements of the finds, environmental and burial evidence have been added into the search and map interface. Hopefully the dissemination of the data in these forms allows re-use of this fantastic dataset in a variety of ways and, I hope, by a variety of users.
As with previous posts on this project, I’d like to say how much I’ve enjoyed working with the team at Reading and Cotswold. Producing an online archive and formal publication in tandem and in such a short time is no mean undertaking. I’m particularly happy/impressed with the determination by the researchers to make their data openly available at the earliest opportunity. Hopefully this is a benchmark that others will aspire to reach. A debt of thanks is also due to all those organisations that assisted the project, particularly the HERs of England and Wales who provided exports from their systems and aided the team at Cotswold with access to fieldwork reports. Finally, I’d have been lost without the awesome Digital Atlas of the Roman Empire created by Johan Åhlfeldt. At an early stage it became clear that creating any kind of ‘baseline mapping’ of Roman archaeology (combining NMP + HER data for example) would be problematic – both in terms of technical overheads and copyright. To do something on the scale of the EngLaId project’s ArcGIS WebApp simply wasn’t in the scope of the project! Johan’s work was thus timely and extremely useful in providing a broad backdrop of Roman Britain in which to compare the project results.
The rationale behind much of the interface work was to act as data publication of an academic synthesis and not to get tied down in building something akin to a Roman portal. Throughout the project we’ve been at pains to point out that this is very much a synthesis and interpretation of the excavated evidence in relation to a research question. Not a complete inventory or atlas of every Roman site. Indeed, it became clear that as soon as the data collation had been completed (31st December 2014 for sites in England and March 2015 for sites in Wales), it was effectively missing all the discoveries made in the following years. Thus although providing broad context was necessary in this case, if someone wanted to know everything about the Roman period (including sites not excavated) from a particular area they’d be best off consulting the relevant HER.
This in turn leads onto the $64,000 Question which I was asked at every event around England and Wales (including the final one in London). “What plans are there to keep this database updated”? Without wishing to appear pessimistic, I would always answer “None”. Aside from the logistics and finances of keeping a large database such as this constantly updated, there’s also the fact that this is a very subjective synthesis of a much larger resource. To my mind, the key question is how do we make it easier for other researchers to build on this and have academic synthesis of a period or theme happen on a more regular basis. One of the answers to this is surely access to data, especially the published and non-published written sources. This isn’t really radical, and indeed increased access to data is being explored and recommended by the Historic England Heritage Information Access Strategy. The work of the Roman Rural Settlement project has many lessons to inform these strategies, some of which will form future papers by the project team. Out of curiosity I’ve undertaken my own analysis of the project database and ‘grey literature’ sources (a term I don’t like!) and the OASIS system but will save that for a separate blog post…
At the post-launch meal I did end up asking the team a rather cheesy question of “which is your favourite record”? The responses were often based around the level of finds, or in the relative level of information the site could add to a regional picture. My answer(s) were perhaps a little more prosaic, for example I really like records such as Swinford Wind Farm (Leicestershire) which has fieldwork reports disseminated via OASIS, and a Museum Accession ID. However my heart veers towards 42 London Road, Bagshot (Surrey): the site of my very first experience of archaeology as a somewhat geeky 16 year old. The site was never published, and thus it’s great to see it live on in this resource and with a link to the corresponding HER record to (hopefully) allow users to go and explore the wider area. Perhaps even to undertake their own research project. To my mind, to stimulate further work large and small that would be a great legacy of the project.
The following blog is simply a musing on our historic approaches to archiving formatted text files, prompted by a user enquiry into “best formats” for preservation of their reports, and my role at the ADS as keeping abreast of said formats and our internal policies.
Many years ago, in a meeting of the curatorial and technical team (CATTS), conversation veered towards our procedures for handling text documents. That is, files whose significant properties were formatted text/typeset reports, as opposed to plain text files (with ASCII or UTF-8 encoding) often used for exporting or importing of data. One colleague, half in jest, commented that as the Archaeology Data Service our focus should be on the literal data as understood in computer science – the individual pieces of information being generated from various instruments or collected in databases. Reports, it may be argued, are the interpretation of that data, but often not the raw data itself.