Re-purposing Excavation Database Content as Paradata
An Explorative Analysis of Paradata Identification Challenges and Opportunities
Keywords: metadata, paradata, metadata extraction, data reuse, research data, unstructured data, archaeological data
Although data reusers request information about how research data was created and curated, this information is often non-existent or only briefly covered in data descriptions. The need for such contextual information is particularly critical in fields like archaeology, where legacy data created during different time periods and through varying methodological framings and fieldwork documentation practices retains its value as an important information source. This article explores the presence of contextual information in archaeological data with a specific focus on data provenance and processing information, i.e., paradata. The purpose of the article is to identify and explicate types of paradata in field observation documentation. The method used is an explorative close reading of field data from an archaeological excavation enriched with geographical metadata. The analysis covers technical and epistemological challenges and opportunities in paradata identification, and discusses the possibility of using identified paradata in data descriptions and for data reliability assessments. Results show that it is possible to identify both knowledge organisation paradata (KOP) relating to data structuring and knowledge-making paradata (KMP) relating to fieldwork methods and interpretative processes. However, while the data contains many traces of the research process, there is an uneven and, in some categories, low level of structure and systematicity that complicates automated metadata and paradata identification and extraction. The results show a need to broaden the understanding of how structure and systematicity are used and how they impact research data in archaeology and in comparable field sciences. The insights into how a dataset’s KOP and KMP can be read are also a methodological contribution to data literacy research and practice development.
At the repository level, the results underline the need to include paradata about dataset creation, purpose, terminology, internal and external dataset relations, and any data colloquialisms that require explanation for reusers.
Copyright (c) 2022 Lisa Börjesson, Olle Sköld, Zanna Friberg, Daniel Löwenborg, Gísli Pálsson, Isto Huvila
This work is licensed under a Creative Commons Attribution 4.0 International License.