The Arch-I-Scan Project, 2019-2023

Penelope Allison, Daan van Helden, 2026. https://doi.org/10.5284/1138130.

Introduction

This data archive comprises photographs of ceramic remains from Roman Britain – sherds and complete/near complete vessels – taken by the Arch-I-Scan Project between 2019 and 2023 in five different collections, together with 3D models of terra sigillata forms. The archive also includes spreadsheets with information on these ceramic remains to accompany the photographs.

The photographs were taken by the AHRC-funded Arch-I-Scan Project which aimed to develop an extensive dataset of images of ceramic remains to train and test an AI system for the automatic identification and recording of pottery (Núñez Jareño et al. 2021; van Helden et al. 2022; Tyukin et al. 2024; Allison et al. n.d.). The main ceramic remains that were photographed by the project are those of imported, Gaulish, terra sigillata. Terra sigillata is not only ubiquitous across the Roman world but also has a limited number of relatively standardised forms, making this fabric appropriate for the purpose of the initial training and testing of an AI tool.

The five collections that provided ceramic remains for the Arch-I-Scan Project to photograph for this purpose are those of the project’s partner organisations – London Museum (formerly Museum of London) (MoL), MOLA, University of Leicester Archaeological Services (ULAS), Vindolanda Charitable Trust, and Colchester and Ipswich Museums (CIMS). MoL provided the complete/near complete vessels for photographing at the inception of the project and the other four partner organisations provided fragmentary remains for more extensive training and testing of the tool. While the project’s focus was on photographing remains of terra sigillata, remains in some other Roman fineware fabrics were also photographed in the MoL and MOLA collections and are included in this data archive.

This archive comprises two major datasets of photographs with their accompanying information spreadsheets, and a smaller one with the 3D models. The first major dataset consists of photographs of the 28,160 sherds photographed in the MOLA, ULAS, Vindolanda Trust and CIMS collections. To enhance automated form identification, six photographs were, in principle, taken of each sherd and a further magnified photograph was taken of the vessel fabric (see van Helden and Allison n.d. Section 3). The second dataset comprises photographs of 384 near complete vessels photographed in the MoL collection. These were photographed from numerous positions to capture as much information as possible, so there are many more photographs of each of these vessels in this dataset (for the photography protocol see van Helden and Allison n.d. Section 2). The third dataset consists of 64 3D models of terra sigillata forms generated from Peter Webster’s drawings of these forms (1996). These models were used to generate simulated sherds for the AI training process (for the code for this process see Mirkes et al. 2025).

These data – i.e., the photographs of ceramic remains, their form identifications, and the 3D models – have been used by Arch-I-Scan to develop and train an AI tool. The code developed for this purpose is also available at Mirkes et al. (2025). Together, these data and this code give researchers a resource with which to verify the Arch-I-Scan Project’s results. This resource also provides the material for building and expanding on Arch-I-Scan’s processes for developing AI tools for automated archaeological vessel identification from ceramic sherd remains. As the training of visual AI models requires huge corpora of photographs, this archive can contribute to future efforts in this direction. It can also potentially provide data and comparanda for further ceramic analyses, in particular inter-site analyses.

Acknowledgements

The Arch-I-Scan Project is grateful, first and foremost, to the organisations who provided the project with access to their collections and the facilities to photograph these collections – London Museum, MOLA, Vindolanda Trust, Colchester and Ipswich Museum Services, and University of Leicester Archaeological Services. Staff at these organisations whom we would especially like to thank are: Roy Stephenson and Nicola Fyfe (London Museum); David Bowsher, Fiona Seeley, Emma Dwyer and Richard Ward (MOLA); Andrew Birley, Barbara Birley, Marta Alberti, Colin Galloway and Cristina Crizbasan (Vindolanda Trust); Frank Hargrave, Glynn Davis and Pippa Pickles (CIMS); Richard Buckley, Nicholas Cooper and Elizabeth Johnson (ULAS). We are also grateful to Sonya Galloway, Penny Trichler and Helen Charlie Nellist for support for the Vindolanda volunteer programmes and open days; and to Liz Baliol-Key from the Leicester stores, who helped with access to ULAS’ Causeway Lane material. We would also like to thank Fiona Seeley (now freelance), Nicholas Cooper, Elizabeth Johnson and David Griffith, the ceramic specialists who assisted in our recording programmes and in identifying vessel forms to train the AI service. Special thanks are due to Danielle de Carle and Sarah Morriss from the School of Archaeology and Ancient History, University of Leicester, who provided valuable technical support throughout the project. We are very grateful to all the students and other volunteers who took part in our photographing sessions, who are too numerous to name here but whose assistance made this project viable. Last but by no means least, we thank Katie Green and Marco Brunello, Archaeology Data Service, for their assistance in putting this archive together.

References

Allison, P. M., Tyukin, I. Y., Mirkes, E. M., van Helden, D. P., Tyukina, T. A., and Zheng, Z. n.d. The Arch-I-Scan Project: Artificial Intelligence and other approaches to ceramic identification and analyses in the Greek and Roman worlds, Internet Archaeology special volume.

van Helden, D., Mirkes, E., Tyukin, I., and Allison, P. 2022. The Arch-I-Scan Project: Artificial Intelligence and 3D Simulation for Developing New Approaches to Roman Foodways. Journal of Computer Applications in Archaeology, 5(1), 78–95. https://doi.org/10.5334/jcaa.92

van Helden, D. P. and Allison, P. M. n.d. The Arch-I-Scan Project’s recording and image preparation processes: development and refinements, in Allison et al. n.d. The Arch-I-Scan Project: Artificial Intelligence and other approaches to ceramic identification and analyses in the Greek and Roman worlds, Internet Archaeology special volume.

Mirkes, E. M., van Helden, D. P., Zheng, Z., Tyukina, T. A., Tyukin, I. Y., Núñez Jareño, S. J., and Allison, P. 2025. The Arch-I-Scan Project repositories. https://github.com/ArchiScn/Access

Núñez Jareño, S. J., van Helden, D. P., Mirkes, E. M., Tyukin, I. Y., and Allison, P. M. 2021. Learning from Scarce Information: Using Synthetic Data to Classify Roman Fine Ware Pottery. Entropy, 23(9), 1140. https://doi.org/10.3390/e23091140

Tyukin, I. Y., Tyukina, T., van Helden, D., Zheng, Z., Mirkes, E. M., Sutton, O. J., Zhou, Q., and Gorban, A. N. 2024. Weakly supervised learners for correction of AI errors with provable guarantees, 2024 International Joint Conference on Neural Networks (IJCNN), Yokohama, Japan. https://doi.org/10.1109/IJCNN60899.2024.10650338

Webster, P. 1996. Roman Samian Pottery in Britain. York: Council for British Archaeology.