Re-purposing Excavation Database Content as Paradata

An Explorative Analysis of Paradata Identification Challenges and Opportunities




metadata, paradata, metadata extraction, data reuse, research data, unstructured data, archaeological data


Although data reusers request information about how research data was created and curated, this information is often non-existent or only briefly covered in data descriptions. The need for such contextual information is particularly critical in fields like archaeology, where legacy data created during different time periods and through varying methodological framings and fieldwork documentation practices retains its value as an important information source. This article explores the presence of contextual information in archaeological data with a specific focus on data provenance and processing information, i.e., paradata. The purpose of the article is to identify and explicate types of paradata in field observation documentation. The method used is an explorative close reading of field data from an archaeological excavation enriched with geographical metadata. The analysis covers technical and epistemological challenges and opportunities in paradata identification, and discusses the possibility of using identified paradata in data descriptions and for data reliability assessments. Results show that it is possible to identify both knowledge organisation paradata (KOP) relating to data structuring and knowledge-making paradata (KMP) relating to fieldwork methods and interpretative processes. However, while the data contains many traces of the research process, there is an uneven and, in some categories, low level of structure and systematicity that complicates automated metadata and paradata identification and extraction. The results show a need to broaden the understanding of how structure and systematicity are used and how they impact research data in archaeology and in comparable field sciences. The insights into how a dataset’s KOP and KMP can be read are also a methodological contribution to data literacy research and practice development.
On a repository level, the results underline the need to include paradata about dataset creation, purpose, terminology, the dataset's internal and external relations, and any data colloquialisms that require explanation to reusers.


Author Biography

Isto Huvila, Department of ALM, Uppsala University

Isto Huvila is Chair of Information Studies in the Department of ALM (Archival Studies, Library and Information Studies, and Museums and Cultural Heritage Studies) at Uppsala University and Adjunct Professor in Information Management in the Department of Information Studies at Åbo Akademi University. During the academic year 2019/2020 he worked as a Visiting Professor in Library, Archival, and Information Studies at UBC iSchool. His areas of research include information and knowledge management, information work, knowledge organisation, documentation, research data, and social and participatory information practices in the context of archaeology and cultural heritage, archives, libraries and museums as well as health information and e-health, social media, virtual worlds, and corporate and public organisations.


