RESEARCH ARTICLE

The Power to Structure: Making Meaning from Metadata Through Ontologies

Erin Canning
University of Guelph

Susan Brown
University of Guelph

Sarah Roger
University of Guelph

Kimberley Martin
University of Guelph

Information systems are developed by people with intent—they are designed to help creators and users tell specific stories with data. Within information systems, the often invisible structures of metadata profoundly impact the meaning that can be derived from that data. The Linked Infrastructure for Networked Cultural Scholarship project (LINCS) helps humanities researchers tell stories by using linked open data to convert humanities datasets into organized, interconnected, machine-processable resources. LINCS provides context for online cultural materials, interlinks them, and grounds them in sources to improve web resources for research. This article describes how the LINCS team is using the shared standards of linked data and especially ontologies—typically unseen yet powerful—to bring meaning mindfully to metadata through structure. The LINCS metadata—comprised of linked open data about cultural artifacts, people, and processes—and the structures that support it must represent multiple, diverse ways of knowing. It needs to enable various means of incorporating contextual data and of telling stories with nuance and context, situated and supported by data structures that reflect and make space for specificities and complexities. As it addresses specificity in each research dataset, LINCS is simultaneously working to balance interoperability, as achieved through a level of generalization, with contextual and domain-specific requirements. The LINCS team’s approach to ontology adoption and use centers on intersectionality, multiplicity, and difference. The question of what meaning the structures being used will bring to the data is as important as what meaning is introduced as a result of linking data together, and the project has built this premise into its decision-making and implementation processes. 
To convey an understanding of categories and classification as contextually embedded—culturally produced, intersecting, and discursive—the LINCS team frames them not as fixed but as grounds for investigation and starting points for understanding. Metadata structures are as important as vocabularies for producing such meaning.

Keywords: bias; digital humanities; linked open data; metadata; ontology; power

 

How to cite this article: Canning, Erin, Susan Brown, Sarah Roger, and Kimberley Martin. 2022. The Power to Structure: Making Meaning from Metadata Through Ontologies. KULA: Knowledge Creation, Dissemination, and Preservation Studies 6(3). https://doi.org/10.18357/kula.169

Submitted: 28 July 2021 Accepted: 6 March 2022 Published: 27 July 2022

Competing interests and funding: The authors have no competing interests to declare. The editors would like to note that Susan Brown is an editorial board member of KULA but that this article went through the same submission process, including anonymous peer review, as all other research articles.

Copyright: © 2022 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.

 

Introduction

Metadata does not speak for itself: often invisible structures—in the case of linked data, ontologies—profoundly impact the meaning that can be derived from that data. In fact, computational ontologies, which emerge from the broader field of knowledge representation (an interdisciplinary field of inquiry that spans philosophy, psychology, logic, linguistics, and information science as well as computer science [Levesque 1986]), are all about structuring data in meaningful ways. Specifically, computational ontologies trade in meaning in ways that have profound implications for how representations circulate on the web. As a vital component of the Semantic Web, the linked open data (LOD) technology stack has in both theory and practice been most concerned with metadata. Indeed, the very name of the Resource Description Framework (RDF) says it all: LOD is a means of describing—which is to say providing metadata for—data about resources on the web (Bray 1998). Formal ontologies, as a key component of LOD, are designed to build meaning into metadata structures. As such, ontologies are supplanting earlier forms of knowledge organization as a means of mobilizing metadata on the web for libraries and other memory institutions (Giunchiglia, Dutta, and Maltese 2014).

This paper takes up an important aspect of this challenge in a consideration of how the structuring of metadata through LOD ontologies embeds meaning. Like metadata and thesauri, ontological structures carry the biases of the people and institutions that have created them, but those biases are less easy to perceive and evaluate than are linguistic classifications. This paper describes the iterative approach taken by the team at Linked Infrastructure for Networked Cultural Scholarship (LINCS), a Canadian linked open data infrastructure development project on which the authors of this paper collaborate, to establish—through a combination of policy and consultation with researchers—data structures appropriate for the research datasets that it is mobilizing as LOD. It outlines the crucial initial ontological choices that have resulted from this process and how they have worked in implementation. LINCS starts from the premise that no decision is purely technical, and all decisions involve some degree of value or judgement. As Flanders articulates in “Building Otherwise” (2018, 290), “technical systems are meaning systems and ideological systems, as far down as we are willing to look.” The LINCS team is incorporating an analysis of the meaning that ontologies bring to data as a core part of the project’s LOD infrastructure development work. The question of what meaning the structures being used bring to the data is as important as what meaning is introduced as a result of linking data together, and the project has sought to build this insight into its ontology decision-making and implementation processes. This strategy responds to the growing understanding of digital humanities technologies and infrastructure as always already ideological and contributes theoretically and practically to feminist intersectional approaches to metadata for the humanities in support of representing multiple, diverse ways of knowing through LOD.

Context

Linked Infrastructure for Networked Cultural Scholarship (LINCS)

LOD has entered cultural heritage institutional and scholarly discourses thanks to initiatives like Linked Jazz, Europeana, and the Linked Data For (LD4) community. However, the technology and infrastructure required for LOD are so complex that it is a struggle for humanities and cultural heritage projects to encompass the full spectrum of possibilities it promises, although substantial advances are being made in various domains (Barbera 2013; Sanderson 2013; Hoekstra et al. 2016; Hyvönen 2020). Canadian institutions that work with humanities LOD have generally been limited to exploratory or partial projects (for instance, the Out of the Trenches project) due to a lack of infrastructure support and insufficient resources. As a result, LOD’s potential has not yet been fully evaluated, let alone realized, in the humanities and cultural heritage sectors in Canada. That said, wider progress is being made: excellent pioneering articulations (Cope, Kalantzis, and McGee 2011; Frontini, Brando, and Ganascia 2015; Hyvönen 2012), emerging standards (e.g., OWL, RDF, RDF-Schema, CIDOC CRM, IIIF), and components of LOD ecosystems for cultural heritage studies now exist, including tools and infrastructure from within and beyond academia (e.g., 3M, Arches, CWRC-Writer, D2ME, Isidore, Karma, LOV, OmekaS, OntoME, OpenRefine, ResearchSpace, Wikibase) and trailblazing academic LOD projects (e.g., Enslaved, Historic Places LA, Linked Jazz, O Say Can You See). LINCS is adopting LOD best practices in its aim to create LOD tools and data that are usable (Sanderson 2018) and compelling for humanities researchers.

LINCS, a three-year cyberinfrastructure project funded by the Canada Foundation for Innovation, is converting humanities datasets into an organized, interconnected, machine-processable set of resources for Canadian cultural research to make these resources usable and useful to researchers in Canada and across the world. LINCS aims to provide context for cultural materials published online, interlink them, and ground them in their sources, and in doing so, improve the trustworthiness of web resources. LINCS is drawing in datasets from a diverse range of fields in order to convert, connect, enhance, and make accessible previously heterogeneous and siloed datasets of digitized books, manuscripts, photographs, periodicals, postcards, music, and more. To accomplish the project’s goals, LINCS will store converted data in a triplestore accessible through a custom-built graphical user interface, a SPARQL endpoint, APIs, and a suite of tools for searching, browsing, and visualizing the data. Its core components are a conversion toolkit for creating LOD from structured, semi-structured, and natural language data; a national linked data storage infrastructure; and a system for data access, reuse, and tools for further conversion or enhancement (see Figure 1).

Figure 1: LINCS components.

The humanities need domain-specific infrastructure of the sort that LINCS is building. The primary means by which most humanities scholars interact with cultural and social materials, despite the wealth of content available on the web, is still via reading materials that have been located through quite imprecise searching. This is in part because computational processes have not been optimized to help cultural researchers—for instance, by filtering large amounts of material with precision or showing interrelationships between materials. Scholars who do work with humanities data (linked or otherwise) are rarely able to make full use of existing data or tools because of the extent to which digital materials are siloed on servers with bespoke interfaces and data structures. As a result, digital work on cultural materials does not live up to its full research potential, lacks interoperability, and is at risk of not being preserved (Brown 2010, 2011). With the right infrastructure—that is, infrastructure developed for cultural inquiry, with sufficient technical sophistication, standardization, and customizability—LOD has great potential to advance the ability of humanities scholars to work with cultural data in new ways.

Information Systems as Sites of Bias

Data Standards and Structures

Cultural heritage information systems are composed of three kinds of standards working together: data value standards, data content standards, and data structure standards (Coburn and Baca 2004). Linked data for cultural heritage is no different. Data content and value standards are more immediately visible than data structure standards because they refer to vocabularies, authorities, and thesauri—the descriptive terms that are used in information systems—and how those values should be formatted within specific data structure fields. Metadata terms and the controlled vocabularies that structure them have long been the focus of critical attention (Billey, Drabinski, and Roberto 2014; Turner 2020; Littletree and Metoyer 2015), but these critiques focus on the terms used by information systems and less so on the structures of those systems themselves—the data structure standards that create the fields to be filled by these terms and vocabularies in the first place. However, these structures help to give metadata their meaning, especially in relationship to the entity being described, and are as ripe for analysis as the language employed within them (Bowker and Star 2000; Giunchiglia, Dutta, and Maltese 2014; Duarte and Belarde-Lewis 2015; Hacıgüzeller, Taylor, and Perry 2021). Critiquing data value and content standards involves analyzing the words used to describe the world, while critiquing data structure standards involves questioning how the information system designers have declared the world to work and how its elements fit together. It is essential to attend to data structures as well as values when considering how information systems work to create and convey meaning, which in the context of LOD means a focus on the role of ontologies, the data structure standards underpinning LOD infrastructure.

Ideology and Infrastructure

Data structure standards demonstrate value judgements most visibly through classification, an important aspect of “the power to name” as powerfully articulated by Olson (2001). In LOD, this can come from the different classes declared by an ontology. For example, the Conceptual Reference Model of the International Council of Museums’ Committee for Documentation (CIDOC CRM), a foundational ontology for representing cultural heritage data, distinguishes between a physical thing and a conceptual object, and thus declares that there is a difference between the concept of a thing and the physical characteristics of something that embodies that concept. These are two major classes of “things” according to this ontology. Classification has long been an area of critique in terms of metadata and vocabularies as well as in terms of the lines that they draw between groups of entities. As Brown notes, “problematizing category boundaries is critical not only for the feminist project of bridging from legacy vocabularies to new ones but also for basic digital literacy, given that categories and the values they embed govern information systems” (2020, 168).

However, classification is not the only way that ontologies impose structure or power. Evaluations of ontologies should not be limited to classification and classes, but should also consider how those classes are related to each other and what systems of meaning they encode. Bowker and Star interrogate this topic in Sorting Things Out: Classification and Its Consequences, where they argue that “every standard and each category valorizes some point of view and silences another” (2000, 156). Historically, voices that are silenced include those of people of colour, Indigenous populations, the LGBTQ2S+ community, and non-English-speaking communities (particularly in systems constructed in the West).

Turner draws on Bowker and Star to point out how infrastructure is linked to power through its normalization of knowledge systems: “creating any knowledge organization scheme is a formative and world-building exercise, and in the world building of systems, other worlds are put aside” (2017, 473). The affordances of various ways of structuring knowledge in memory institutions and beyond, from card catalogues to databases, are often uncritically accepted and maintained because they are continuations of existing practices and ways of working, made invisible by their ubiquity. The invisibility of these systems is key to their pervasiveness; they become naturalized and thus unexamined, enacting power without calling attention to themselves or their effects and avoiding becoming the subject of questioning about why they are the way they are, how they got that way, and who has benefited—and continues to benefit—from them.

Furthermore, these systems neither exist nor operate in isolation: they reflect and reinforce categorizations that structure understandings of the world at large, and as such do not merely describe the world but also work to create it (Bowker and Star 2000) and to perpetuate the value systems that are encoded into them (D’Ignazio and Klein 2020). There is growing attention to these concerns, and many information scholars have argued that much needs to be done to break down these “norms” to establish better ethical guidelines and practices for knowledge work (Bowker and Star 2000; D’Ignazio and Klein 2020; Drabinski 2013; Duarte and Belarde-Lewis 2015; Flanders 2018; Hacıgüzeller, Taylor, and Perry 2021; Littletree and Metoyer 2015; McPherson 2012, 2014; Posner 2016).

This call to examine information system structures has also come from the broader digital humanities community. Posner’s (2016) influential article “What’s Next: The Radical, Unrealized Potential of Digital Humanities” offers a provocation for digital humanists to attend to the power wielded by data models—data structure standards—to create and impose meaning on data that academics work with and, more importantly, on the people who are often the subjects of that data. She traces how biases in information systems reflect the values of those who built the systems and produce inaccurate or stereotypical representations of the entities (often people) described in those systems, which then result in the perpetuation of those values—a cycle that will not be solved by adjusting the existing system but which, rather, requires “ripping apart and rebuilding the machinery of the archive and database so that it does not reproduce the logic that got us here in the first place” (Posner 2016, 35). The locus of critique should not be just the information system but the people around it: systems enact the power relations of those who originally created them, to the detriment of those upon whom power is enacted. Who designed and built the information system, and for whom, is important because the biases of these groups become encoded in its structure (D’Ignazio and Klein 2020; Silva 2007). Analyzing information systems is analyzing a symptom, and scholars such as D’Ignazio and Klein (2020), McPherson (2012, 2014), Posner (2016), Ruberg, Boyd, and Howe (2018), and others call on the digital humanities to go beyond analysis to attend to the root causes of biases in information systems.

In “Toward a Diversity Stack,” Liu advocates a platform “at once ideological and technical” to mobilize “the social and ethical commitment of the digital humanities to diversity” (2020, 130). He projects a “diversity stack” composed of modular layers; each layer, by analogy with the internet protocol stack, “is limited in its goals because it is about doing one thing well. Building capacity for diversity research is not the responsibility of any specific modular layer. It is the goal at the top where the stack as a whole might support fresh, applied ways of thinking about diverse identities” (Liu 2020, 135). LINCS offers a case study of an attempt to develop an operational platform or stack that aligns with this vision of “a virtuous circle in which research on diversity helps shape technical innovation and, in turn, technical innovation designs new ways to understand and act on diversity” (Liu 2020, 136). In LINCS, one can start to think through how the diversity stack interacts with the Stack, Benjamin Bratton’s broader socio-technical world system that informs Liu’s vision (Liu 2020, 133; Bratton 2016).

Making Meaning from (Meta)data at LINCS

LINCS Ontologies Adoption & Development Policy

At the start of the project, the LINCS team produced the first draft of the LINCS Ontologies Adoption & Development Policy (OADP), which governs the project’s approach to ontology selection and implementation (Canning et al. 2022). This policy was drafted before any decisions regarding ontologies were made because the team felt that it was important to first outline the values and criteria against which potential ontological solutions would be assessed. The policy is a living document, to be updated throughout the life of the project to reflect how the LINCS team approaches questions of data representation.

The purpose of the OADP is to articulate key project values and lay out how the LINCS team plans to operate within the linked data space. By design, LINCS includes researchers whose datasets embed contestations of mainstream or hegemonic understandings of history and how knowledge works, and which often involve boundary objects (Star and Griesemer 1989). This diversity of representation was a criterion at the initial project-selection stage, as grappling with multifaceted and marginalized ways of knowing through LOD was a major goal from the outset. To serve this data in a way that does both the data and the researchers using it justice, the LINCS team must ensure that selected ontologies can represent non-hegemonic epistemologies and that project data structures are capable of describing alternative knowledge representations.

The OADP states:

LINCS ontologies will be selected, adopted, and developed with an attention to intersectionality, multiplicity, and difference. Where these ontologies deal with description of people, it will seek to convey an understanding of identity categories and social classification as being contextually embedded: culturally produced, intersecting, and discursive. LINCS will aim to present these kinds of categories as not fixed classifications, but instead as grounds for investigation, starting points for understanding. Furthermore, LINCS will seek to foreground the contextual data of all of the data involved in this project, not just that dealing directly with the description of people, places, or groups, as well as the relationship between the data and its sources. (Canning et al. 2022, 2)

The OADP details key data representation requirements for the project and the positions held by LINCS regarding the nature of specific kinds of information, such as those related to identity and social classification. While the focus of the policy is on project-wide needs, space is also provided for more specific data concerns since these micro-level decisions are not divorced from the big picture: they represent the application of the policy values in response to specific cases. This living document plays a crucial role for LINCS as a record of key decisions and transparent guidelines for ontology decision-making over the course of the project, while still allowing room for flexibility and modification as unforeseen matters arise.

Ontology Decisions at LINCS

Making Decisions About Ontologies

The LINCS team has adopted a collaborative and iterative process of ontology selection and development, one that involves researchers and members of the LINCS community. This approach emphasizes that ontologies are linked to the communities whose data they seek to represent, and thus their adoption or development should involve those voices at all stages of the process.1 Guided by the aforementioned policy document, ontologies are evaluated and selected in collaboration with LINCS user communities. All ontology decisions go through the Ontology Working Group before being used in the project. This group is made up of researchers from across LINCS and bolstered by additional domain experts as required. This process ensures that decisions about ontologies that affect specific domains are made in consultation with researchers from those domains: for example, the meeting to decide on ontologies for representing bibliographic and library data was attended by librarians and researchers working directly with bibliographic data, in addition to regular Ontology Working Group participants.

However, requests for attendance are not sufficient to support involvement: the LINCS team aims to be transparent about the nuances of the choices to be made, as well as to truly engage researchers and team members in the evaluation process. To this end, before the Ontology Working Group meets to make a decision, the LINCS team provides attendees with clear and extensive documentation about the domain ontologies under consideration, along with an analysis of those options and a recommendation for review and debate. This brings to the fore not only the question of what ontology to use but the reasoning behind these decisions, especially the aspects of that consideration that are reflected by the values and positions described in the OADP.

If LINCS is to address non-hegemonic use cases, the knowledge of domain experts is necessary but not sufficient, as some perspectives are rarely (or never) captured by this expert knowledge. For example, ways of knowing related to specific Indigenous communities might well require dedicated resources, and ongoing relationships with these communities would need to be forged over time. For this kind of work, research funding would almost certainly be required to provide sufficient resources to engage in exploratory processes.

The LINCS team is incorporating initial datasets as a first step in probing the ways that the OADP and related policies can meaningfully elucidate points of difference, to the extent that an infrastructure project allows with respect to both time and resources. The ontology work required for the infrastructure will undoubtedly extend beyond the timeframe of the initial grant; as LINCS grows, the team will confront other research challenges related to non-hegemonic use cases requiring further investigations into diversity and difference and new modes of collaborative ontology development.

Deciding on an Ontology Approach

The LINCS Ontology Working Group made the crucial early decision that a multiple-ontology approach would be required to address the conceptual (and technical) complexities of the diversity of data and data sources that the project team would be dealing with. This is a different approach than that adopted by many existing data aggregation projects, which define a single data structure and vocabulary source and require that contributing projects map their data to it for ingestion (e.g., Europeana, ARC). LINCS is an infrastructure dedicated to representing the contents of and internal relationships within datasets, and thus has data representation needs that go beyond those of aggregation platforms that work with metadata alone. A single-ontology approach would not fit LINCS since it would not take into consideration the myriad and varying needs of each researcher dataset, especially given the diversity of data and domains that make up LINCS and the project team’s commitment to enabling the representation of situated knowledge (Haraway 1988). Instead, LINCS ontology work is evolving throughout the project to respond to the requirements of an increasingly wide range of researchers and their data.

Following this high-level decision, LINCS considered two major approaches: a) developing an ontology by selecting pieces from a number of existing ontologies and minting new classes and properties where an adequate solution did not exist, or b) adopting a modular ontology that could serve as a core structure and to which domain-specific extensions could be appended and grown. The former is the approach taken by the Canadian Writing Research Collaboratory (CWRC), a LINCS partner project, in extracting LOD from XML documents on literary and cultural history, working out from the detailed semantic markup of the Orlando Project (Simpson and Brown 2013; Brown et al. 2019, “Preamble”). While this methodology made sense given the CWRC domain focus and the team’s resistance to the overt positivism of the various upper-level ontologies available, such an approach risks becoming unscalable as additional datasets introduce further complexities and domain scope increases. For a project such as LINCS, which must balance the need to make space for diversity with a competing need for supporting interoperability and coexistence in a shared information infrastructure, a core ontology approach offers the greatest flexibility with the least compromise. This approach provides LINCS with two key benefits:

  1. A core ontology provides a base level of interoperability through which dataset connections can be traced while also making space for domain-specific extensions that will inevitably be required.
  2. By implementing a solution that deals with common and straightforward use cases quickly, a core ontology frees up attention for complex areas that need specific and detailed consideration. This allows LINCS to attend to the points of diversity in its datasets, and it also provides an opportunity to contribute back to the wider LOD community by proposing methods of working with such data.

Selecting a Core Ontology: CIDOC CRM

The Conceptual Reference Model of the International Council of Museums’ Committee for Documentation (CIDOC CRM) is an extensive, generic ontology for cultural heritage data. It is event-centric, meaning that it conceptualizes points of data as the occurrences or outcomes of events, such as interactions between (human) actors and physical and/or conceptual objects, in places, at or during time-spans (see Figure 2) (Doerr 2003). CIDOC CRM was also designed to make space for multiple alternative propositions about a given entity (Bekiari et al. 2021). It is a conceptual model that has been implemented as a formal ontology that can be used for linked data projects (Doerr, Light, and Hiebel 2020). As such, it is both a theoretical and a practical tool for cultural heritage data integration.

Figure 2: A high-level view of CIDOC CRM (adapted from Doerr 2003, 10).

CIDOC CRM was developed with the intention of being the “semantic glue” needed to mediate between different types and sources of data (CIDOC CRM n.d., “Homepage”). The CRM provides a semantic framework with which to articulate details about the wide variety of events that can be connected to the history of objects held by cultural heritage institutions. CIDOC CRM was developed to assist with interoperability: to support multiple data holders as they share and align their data through the use of generic classes and properties. The CRM can then be refined through the use of vocabularies to specify the terminology appropriate for each specific domain, without compromising on interoperability. By adopting this structure and keeping vocabularies independent of and external to the ontology itself, CIDOC CRM is highly extensible: new vocabularies can be introduced without requiring changes to the ontological structure, and domain-specific extensions can be developed to specify new classes and properties as subclasses and subproperties of the core ontology, also without requiring changes to that primary model. In addition to the core CRM, there are a number of compatible models (including official extensions, which have been approved by the ICOM CIDOC Special Interest Group, and unofficial extensions, which are independently published and maintained) that address specific domains, including areas such as argumentation, belief adoption, and social phenomena that complement the initial focus of the model on physical objects (CIDOC CRM n.d., “Compatible Models”; Moraitou et al. 2019).
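The extensibility described above can be illustrated with a minimal sketch, assuming triples are held as plain Python tuples rather than in a triplestore. The `EXT` and `VOCAB` namespaces, the `Column_Publication` class, and the two example events are hypothetical and are not drawn from LINCS data; only the CIDOC CRM identifiers (`E7_Activity`, `P2_has_type`) and the RDF and RDFS terms are real. The sketch shows that a query written against a core class still finds instances typed with an extension subclass, while vocabulary terms stay outside the ontology.

```python
# A sketch only: triples are (subject, predicate, object) tuples of IRIs.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/data/"      # hypothetical dataset namespace
EXT = "http://example.org/ext/"      # hypothetical domain extension
VOCAB = "http://example.org/vocab/"  # hypothetical external vocabulary

graph = {
    # The extension declares a new class as a subclass of a core class.
    (EXT + "Column_Publication", RDFS_SUBCLASS, CRM + "E7_Activity"),
    # One event uses the core class directly, the other uses the extension.
    (EX + "event_1", RDF_TYPE, CRM + "E7_Activity"),
    (EX + "event_2", RDF_TYPE, EXT + "Column_Publication"),
    # Domain terminology is attached through vocabulary terms, not new classes.
    (EX + "event_1", CRM + "P2_has_type", VOCAB + "exhibition"),
    (EX + "event_2", CRM + "P2_has_type", VOCAB + "weekly-column"),
}


def subclasses_of(graph, cls):
    """Return `cls` plus every class declared, directly or transitively, beneath it."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if p == RDFS_SUBCLASS and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(graph, cls):
    """Return a sorted list of subjects typed as `cls` or as any of its subclasses."""
    classes = subclasses_of(graph, cls)
    return sorted({s for s, p, o in graph if p == RDF_TYPE and o in classes})


core_level = instances_of(graph, CRM + "E7_Activity")
extension_level = instances_of(graph, EXT + "Column_Publication")
```

Querying at the core level returns both events, while querying at the extension level returns only the more specific one. Adding the extension class or a new vocabulary term required no change to the core model.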

Adopting CIDOC CRM allows LINCS to accommodate a wide variety of data domains and sources, which is an essential requirement for the project. Additionally, this kind of structure, which positions actions at the core of data creation, curation, and interpretation processes, highlights the contextual nature of information, as CIDOC CRM recognizes the dependence of data on human activities of meaning-making (Canning 2018). The event-centric, vocabulary-independent data structure also allows LINCS to represent general concepts that are shared across datasets and domains, even when they are called different things, while still allowing the description of each element involved in each dataset to be as detailed as researchers require. This allows LINCS to achieve a high level of interoperability without compromising on domain-specific terminology. Beyond that, the extensions from the wider CIDOC CRM ecosystem provide compatible support for a number of domain-specific needs without any requirement to compromise on the compatibility that the core CRM provides. Lastly, CIDOC CRM is a stable and long-established data standard, recognized as the official ISO standard 21127 since 2006 and cited here in its 2014 edition (ISO 2014), that is maintained by an international network and made available in multiple languages. The structure, size, and stability of the ontology, as well as the community that comes with adopting CIDOC CRM, are all beneficial to LINCS. Using CIDOC CRM introduces the opportunity for LINCS to participate in a large international linked cultural data community.

In addition to practical and technical affordances, adopting CIDOC CRM as a core ontology also means that LINCS is positioning data as fundamentally the outcome of activities (things that people have done and the results of their actions) as opposed to facets of an entity. This change in meaning can be seen when considering the creation of an object: as opposed to directly connecting a person to an object, CIDOC CRM requires an event to sit between the two. For example, the entry on Violet McNaughton in the Canada’s Early Women Writers (2020) project, hosted by CWRC (and which will push linked data to LINCS), includes bibliographic metadata for her contributions to periodical publications. In converting this metadata to linked data using CIDOC CRM, events are created to connect McNaughton to both the periodicals and the publications by representing McNaughton’s writing of her weekly column, “Jottings Way,” as a repeated activity and the regular publication of that column by the Western Producer as a second activity linked to the first. Structuring data this way makes space to discuss the acts of labour that took place in order for these publications to have come into being (as discussed by Dr. Michelle Meagher and Dr. Jana Smith Elford regarding AdArchive, detailed below in “Implementing the Decisions”). This space is not available, for instance, in a direct relationship such as “Violet McNaughton wrote ‘Jottings Way.’”
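The contrast between the two structures can be sketched in a few lines of Python, with triples held as plain tuples. This is a minimal illustration and not the LINCS mapping itself: the "ex:" identifiers are hypothetical, the core CRM class E65 Creation and the properties P14 carried out by and P94 has created are assumed here for simplicity, and the actual LINCS conversion may use other classes (for example from FRBRoo) as well as a further publication activity linked to the writing activity.

```python
# Triples are (subject, predicate, object) tuples. All "ex:" identifiers are
# hypothetical placeholders, not identifiers used by LINCS or CWRC.

# Direct model: the person is attached straight to the work.
direct = [
    ("ex:mcnaughton", "ex:wrote", "ex:jottings_column"),
]

# Event-centric model: a creation activity sits between person and work.
# Class and property names follow core CIDOC CRM (assumed for this sketch).
event_centric = [
    ("ex:mcnaughton", "rdf:type", "crm:E21_Person"),
    ("ex:jottings_column", "rdf:type", "crm:E33_Linguistic_Object"),
    ("ex:writing_activity", "rdf:type", "crm:E65_Creation"),
    ("ex:writing_activity", "crm:P14_carried_out_by", "ex:mcnaughton"),
    ("ex:writing_activity", "crm:P94_has_created", "ex:jottings_column"),
]


def events_about(triples, work):
    """Return the events recorded as having created `work`."""
    return {s for s, p, o in triples
            if p == "crm:P94_has_created" and o == work}


def creators_of(triples, work):
    """Find who created `work` by passing through its creation events."""
    events = events_about(triples, work)
    return {o for s, p, o in triples
            if p == "crm:P14_carried_out_by" and s in events}


print(creators_of(event_centric, "ex:jottings_column"))
print(events_about(direct, "ex:jottings_column"))  # empty: no event to describe
```

In the event-centric version, the activity node is the place where dates, places, collaborators, and descriptions of labour can be attached; the direct triple offers no such node.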

Furthermore, CIDOC CRM offers a way to model the assertion of any statement (or association of any other CRM entity) in relation to an entity in the dataset. In the ontology, the class that represents the activity of making an assertion is E13 Attribute Assignment. Properties that take this class as their domain then allow data modelers to connect the act of asserting to details such as the statement being asserted, the individual doing the asserting, and any additional contextual information in the source data that should be included. This design allows the representation of multiple, even conflicting, statements and, through the data structure, makes visible the connection between the statement and the act of its making. Data is never a neutral, indisputable aspect of something: it is made with intent and reason by people at moments in time and in particular places, and this context is essential for understanding the data. The event-centricity of CIDOC CRM, combined with this support for representing the making of statements, helps the LINCS team to work towards project goals of highlighting the relational and contextual nature of the data that project participants seek to represent.
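A small Python sketch can show how recording each statement as its own act of assertion keeps conflicting claims side by side. It assumes the CRM properties P140 assigned attribute to, P141 assigned, P14 carried out by, and P177 assigned property of type (the last is available in recent CRM versions); every "ex:" value is a hypothetical placeholder, and the structure is an illustration rather than the LINCS data model.

```python
from collections import defaultdict

P14 = "crm:P14_carried_out_by"              # who made the assertion
P140 = "crm:P140_assigned_attribute_to"     # what the assertion is about
P141 = "crm:P141_assigned"                  # the value being asserted
P177 = "crm:P177_assigned_property_of_type" # which kind of property is asserted

# Each dict stands for one E13 Attribute Assignment (hypothetical data).
assertions = [
    {"id": "ex:assertion_1", P14: "ex:source_a", P140: "ex:person_1",
     P177: "ex:birthplace", P141: "ex:place_x"},
    {"id": "ex:assertion_2", P14: "ex:source_b", P140: "ex:person_1",
     P177: "ex:birthplace", P141: "ex:place_y"},
    {"id": "ex:assertion_3", P14: "ex:source_a", P140: "ex:person_1",
     P177: "ex:occupation", P141: "ex:writer"},
]


def group_assertions(assertions):
    """Group (value, asserter) pairs by the (subject, property) they describe."""
    grouped = defaultdict(list)
    for a in assertions:
        grouped[(a[P140], a[P177])].append((a[P141], a[P14]))
    return dict(grouped)


def contested(assertions):
    """Keep only the subject/property pairs with more than one distinct value."""
    return {key: claims
            for key, claims in group_assertions(assertions).items()
            if len({value for value, _ in claims}) > 1}


for (subject, prop), claims in contested(assertions).items():
    for value, asserter in claims:
        print(f"{subject} {prop} {value} (asserted by {asserter})")
```

Neither claim overwrites the other: both remain in the data, each tied to whoever made it.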

For all that CIDOC CRM offers a project such as LINCS, the project team is nevertheless engaging critically with this ontology. It will serve as a core structure for the project’s modular ontology design, not as the anticipated solution to all of the project’s data representation needs. Like any other information infrastructure, the CRM represents the views of its makers and the source documentation that was analyzed as part of its creation. There will be data representation needs from LINCS researcher participants for which CIDOC CRM will not be an adequate solution; we seek to use the ontology as a way to bridge domains while exploring or developing ontology solutions for these areas. For example, Srinivasan (2013) has identified ways in which the CRM is often incompatible with Indigenous ways of knowing, and Hacıgüzeller, Taylor, and Perry (2021) have explored parts of archaeological data records that the CRM appears unable to reliably represent. There are also efforts currently being undertaken by the CIDOC CRM Special Interest Group to define the foundational philosophies on which the CRM is based as well as to examine issues of bias in the data structure, its documentation, and the practices of the Special Interest Group itself (CIDOC CRM 2021a, 2021b). The LINCS team closely follows and participates in these discussions. Finally, the LINCS team recognizes that the relatively complex structure of the CRM presents challenges when it comes to communicating data modeling decisions and their implications to users and that the CRM therefore requires greater time and expertise on the part of both LINCS staff and participating researchers.

Additional Ontology Decisions

In addition to CIDOC CRM, the LINCS team evaluated ontologies for bibliographic data; performance, music, and intangible cultural heritage data; prosopographical, relationship, and social role data; and annotation and statement provenance data. For the first three of these domains, the Ontology Working Group adopted CIDOC CRM extensions. For bibliographic data as well as performance data, LINCS uses FRBRoo (Functional Requirements for Bibliographic Records, object-oriented); where additional classes and relationships specific to music are required, LINCS uses DoReMus (Doing Reusable Musical Data), an unofficial further extension of FRBRoo. For prosopographical and relationship data, LINCS uses the CIDOC CRM Property Classes extension to show the roles that individuals play in relationships with each other. The decision to use CRM Property Classes brings a new layer of meaning to the representation of interpersonal relationships through the event-centric structure; rather than simply connecting two individuals, a relationship becomes an activity involving two people. This structure allows LINCS data to show details about these relationships—such as start and end dates and changes in the nature of the relationship (say from friendship through partnership to marriage and ending in divorce)—and positions individuals as agents in their relationships to one another.
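As a rough sketch of what a relationship modelled as an activity might look like, the following Python fragment uses the naming conventions of the CRM property-class RDF files (PC14_carried_out_by, P01_has_domain, P02_has_range, P14.1_in_the_role_of). The people, roles, and relationship are hypothetical placeholders, and the exact pattern LINCS applies may differ from this assumption.

```python
from collections import defaultdict

# Hypothetical data: a relationship as an E7 Activity, with each person's
# participation reified so that a role can be attached to it.
triples = [
    ("ex:relationship_1", "rdf:type", "crm:E7_Activity"),
    ("ex:relationship_1", "crm:P2_has_type", "ex:marriage"),
    ("ex:relationship_1", "crm:P4_has_time-span", "ex:timespan_1"),

    ("ex:participation_a", "rdf:type", "crm:PC14_carried_out_by"),
    ("ex:participation_a", "crm:P01_has_domain", "ex:relationship_1"),
    ("ex:participation_a", "crm:P02_has_range", "ex:person_a"),
    ("ex:participation_a", "crm:P14.1_in_the_role_of", "ex:spouse"),

    ("ex:participation_b", "rdf:type", "crm:PC14_carried_out_by"),
    ("ex:participation_b", "crm:P01_has_domain", "ex:relationship_1"),
    ("ex:participation_b", "crm:P02_has_range", "ex:person_b"),
    ("ex:participation_b", "crm:P14.1_in_the_role_of", "ex:spouse"),
]


def roles_in(triples, activity):
    """Map each participant in `activity` to the role they are recorded in."""
    # Collect properties per node; each property occurs once per node here.
    by_node = defaultdict(dict)
    for s, p, o in triples:
        by_node[s][p] = o
    return {
        props["crm:P02_has_range"]: props.get("crm:P14.1_in_the_role_of")
        for props in by_node.values()
        if props.get("rdf:type") == "crm:PC14_carried_out_by"
        and props.get("crm:P01_has_domain") == activity
    }


print(roles_in(triples, "ex:relationship_1"))
```

Start and end dates would attach to the time-span node, and a change in the nature of the relationship could be recorded as a further activity with its own type and time-span.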

In the final case, annotation and statement provenance data, LINCS uses the Web Annotation Data Model (WADM), an ontology from outside of the CIDOC CRM ecosystem (Sanderson, Ciccarese, and Young 2017). The WADM is a generic and flexible model for annotating resources: annotations are used to identify, tag, or comment on web resources or the entities they contain. Annotations are formed by linking one or more body resources to the target of the annotation, which can itself be multiple. The relationship between a body (the substance of the annotation) and target (the resource or portion of a resource being annotated) is flexible, as is the size of the target, which can be as large as an entire novel or as small as a single character.
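The body and target structure can be illustrated with two annotations written as Python dictionaries in the JSON-LD shape defined by the W3C Web Annotation Data Model. The keys, the context URL, the motivations, and the TextPositionSelector type come from that specification; the example.org resources and the annotation content are hypothetical.

```python
import json

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"

# Target is an entire resource (for instance, a whole novel).
whole_resource = {
    "@context": ANNO_CONTEXT,
    "id": "http://example.org/anno/1",
    "type": "Annotation",
    "motivation": "commenting",
    "body": {
        "type": "TextualBody",
        "value": "A note about the text as a whole.",
        "format": "text/plain",
    },
    "target": "http://example.org/texts/novel",
}

# Target is a single character inside the same resource, picked out by a selector.
single_character = {
    "@context": ANNO_CONTEXT,
    "id": "http://example.org/anno/2",
    "type": "Annotation",
    "motivation": "identifying",
    "body": "http://example.org/entities/place_1",
    "target": {
        "source": "http://example.org/texts/novel",
        "selector": {"type": "TextPositionSelector", "start": 0, "end": 1},
    },
}


def target_source(annotation):
    """Return the annotated resource, whether the target is a bare IRI or a
    specific resource with a selector."""
    target = annotation["target"]
    return target if isinstance(target, str) else target["source"]


print(json.dumps(single_character, indent=2))
print(target_source(whole_resource) == target_source(single_character))
```

Both annotations point at the same source text; only the selector changes how much of it is addressed.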

The LINCS partner CWRC was already using the WADM to address their annotation and statement provenance requirements. CWRC initially adopted the WADM for the CWRC-Writer browser-based XML editor’s generation of linked data from entity tagging. CWRC then extended this use to project-wide ontology work in order to contextualize and provide provenance for linked-data assertions derived from existing textual sources, such as Orlando: Women’s Writing in the British Isles from the Beginnings to the Present, the published textbase of the Orlando Project (Brown et al. 2019, “Preamble”). For LINCS, some of the complexity of statement provenance requirements has been obviated by the adoption of CIDOC CRM since, as discussed above, the CRM can represent the assertion of statements. However, there is still a need to describe the sources used in the dataset and to track the textual contexts of assertions; the WADM’s strengths in these areas make it a good complement to CIDOC CRM. The needs that led to CWRC’s adoption of the WADM for their ontology paralleled those of LINCS; as a result, the LINCS team, including the Ontology Working Group, was able to use CWRC as a case study through which to evaluate the suitability of the WADM for meeting its requirements. The subsequent decision to adopt the WADM then necessitated aligning the WADM with CIDOC CRM.

Aligning Ontologies: Connecting the Web Annotation Data Model and CIDOC CRM

Using the Web Annotation Data Model along with CIDOC CRM facilitates robust and straightforward connections between assertions and source records. While a solution using solely CIDOC CRM would have also been possible, incorporating the WADM offers sufficient complexity without additional overhead. The two ontologies can then be made to work together in the LINCS ontology ecosystem via alignment (i.e., declaring a point of similarity that allows the WADM to hook onto the CIDOC CRM data structure). LINCS has aligned the WADM with CIDOC CRM through shared classing: LINCS positions WADM Annotations as a type of CIDOC CRM class E33 Linguistic Object, therefore conceptualizing them as the annotations produced by the act of annotating.2
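One way to picture alignment through shared classing is as a subclass statement that lets annotation data be read as CRM data. The sketch below assumes the alignment is expressed as an rdfs:subClassOf axiom between oa:Annotation and crm:E33_Linguistic_Object; whether LINCS encodes it this way or by typing each annotation with both classes is not specified here, and the "ex:" identifier is hypothetical.

```python
# Schema-level statements. The first is the assumed alignment axiom; the second
# is part of the CIDOC CRM class hierarchy.
schema = [
    ("oa:Annotation", "rdfs:subClassOf", "crm:E33_Linguistic_Object"),
    ("crm:E33_Linguistic_Object", "rdfs:subClassOf", "crm:E73_Information_Object"),
]

# Instance data produced with the WADM alone (hypothetical).
data = [
    ("ex:anno_1", "rdf:type", "oa:Annotation"),
]


def superclasses(cls, schema):
    """Return every class reachable from `cls` through rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in schema:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(node, data, schema):
    """Return the asserted types of `node` plus those implied by the schema."""
    asserted = {o for s, p, o in data if s == node and p == "rdf:type"}
    inferred = set()
    for cls in asserted:
        inferred |= superclasses(cls, schema)
    return asserted | inferred


print(sorted(types_of("ex:anno_1", data, schema)))
```

Under this assumption an annotation is never inferred to be an activity such as crm:E7_Activity, which mirrors the decision to treat annotations as things rather than actions.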

Aligned in this way, WADM and CIDOC CRM representations play complementary roles in relation to textual data housed on the web. For instance, CWRC contains a number of projects with prosopographical data about cultural producers that are being converted to LOD on the basis of the semantic relationships embedded in their XML. WADM Annotations of specific entities will promote the findability of the source texts by exposing them in a highly generalized and widely adopted form of LOD, while CIDOC CRM will express more granular relationships of particular concern to researchers. So, for instance, the Violet McNaughton entry in Canada’s Early Women Writers mentions Harris, Saskatchewan, which will be flagged by an identifying WADM Annotation. This basic identification will help make the connection findable by anyone researching the history of Harris and will point the researcher to the mention of Harris in the entry on McNaughton. On the other hand, researchers investigating broader topics (for example, the influence of English suffragism on the suffrage struggle in Canada, the increasing geospatial mobility of women or writers over time, or migration patterns in former British colonies in the early twentieth century) will benefit from CIDOC CRM’s ability to represent McNaughton as having lived in both Borden, Kent, England, and Harris, Saskatchewan, to link those events to dates, and to indicate her occupation as a writer and her connection to the suffrage cause.

The alignment of the WADM and CIDOC CRM also has implications for what an annotation is or means: by classing a WADM Annotation as a CIDOC CRM Linguistic Object, LINCS asserts that a WADM Annotation is the linguistic object produced by the act of annotating and not the annotation activity itself. This interpretation comes from analyzing the WADM documentation, consulting experts, and ultimately determining a) that the definition of annotation from the WADM more closely resembles that of a CIDOC CRM object than that of an activity, and b) that this method of alignment would provide LINCS with what it needed from the WADM without introducing an additional level of complexity. However, other potential users of the WADM may argue for defining the annotation as the act of annotating and find use for the additional level of detail that would be introduced by this framing. This decision regarding alignment is therefore an example of how the decisions about ontology adoption at LINCS introduce new meaning to the ontologies as they are used with the project’s data, and thus to the data represented by them. By asserting that a WADM Annotation is an object, the LINCS team has clarified this question of activity or object that exists in the WADM and has firmly tied WADM Annotation data in the LINCS information ecosystem to the notion of being a thing and not an action. For data created outside of the LINCS context using the WADM that is now being imported, such as that produced by the CWRC project, LINCS is introducing an aspect of meaning to that data that did not exist previously.

Implementing the Decisions

LINCS ontology decisions related to specific datasets are implemented in collaboration with project researchers through an iterative process of meetings, mappings, and feedback. The researchers are made aware of LINCS’s policy guiding ontologies and provide feedback on the appropriateness of the ontologies and vocabularies used to represent and describe their research data. In this way, LINCS undertakes continual testing and review of the decisions made in the context of each new dataset and researcher need (see Figure 3). Solutions developed for existing projects are proposed for new projects, where appropriate, and the feedback allows LINCS to reconsider the suitability of the decisions for each new context throughout the project. At the end of the project’s development period, the LINCS team will review datasets ingested in the early phases of the project to allow researchers with early ingested datasets to benefit from the collaborative knowledge gained during the mapping of subsequent datasets.

Figure 3: Implementing ontology decisions with researcher datasets.

The first step in this implementation process is to conduct a questionnaire and interview with the researchers. In order to map the data, it is essential to understand the data, not only in how it is provided but also in how the researcher thinks about it and works with it. These interviews are an entry point for larger conversations about the data and projects; a series of questions provides a framework for the interview, but the conversation is semi-structured so as to explore facets of interest as they come up. Question prompts in these interviews get researchers thinking about their data in new ways, talking about their dataset in detail, and situating their data with respect to other datasets both within and beyond LINCS (Martin et al. 2022).

Through the questionnaire and interview, the LINCS team gains a clear understanding of the dataset as viewed through the eyes of the researcher and access to a sample of data or to the dataset as a whole. The LINCS ontology systems analyst then reviews the data and creates a preliminary analysis and mapping, which is then brought to the researchers for review. During the review meetings, the ontology systems analyst focuses on discussing not just the proposed changes to the data structure but also the changes to the meaning of the data that restructuring and alignment with LINCS ontologies entail. The review meeting is an opportunity to talk about the different meanings brought to the data as it is described in different ways and, ideally, to show researchers how linked data structures may be able to help them more clearly represent and articulate their data or the information required to get to the heart of their research questions. These three steps (analysis, mapping, and review) are repeated as many times as required to come to a mapping that the researchers feel represents their data.

Once an informal mapping is agreed upon, the data moves into a formal mapping and conversion process, and the decisions made throughout the mapping and review process are documented. This documentation then becomes a key asset for LINCS team members, researchers, and other users: it tells researchers how their data is structured, it tells users how to explore and query the data, and it tells LINCS team members how the concepts present in the dataset have been mapped and represented. Then, when a concept that is the same or similar shows up in a different dataset, LINCS can suggest and test an already-implemented solution for the new context. Using the same solutions for similar use cases and requirements increases interoperability between datasets, even though the vocabularies referenced by the different projects may not be the same. In this way, the LINCS team is able to find paths to the greatest possible level of semantic interoperability between datasets while maintaining space for representing differences. Additionally, although an existing solution may be proposed for a use case that appears the same, this too is a discussion: if the solution that fits best for one dataset does not in fact work for a second one, the new dataset will not be forced to use the existing pattern and a fitter solution will be developed by the LINCS ontology systems analyst and the researcher.

In addition to ontologies, patterns—specific ways of describing concepts using an ontology’s classes and properties—are tested across datasets. A notable example of this is a pattern to represent identities such as gender, religion, and other social classifications that came from the CWRC Orlando ontology work (Brown et al. 2019, “Preamble”). This pattern, referred to as “cultural forms” (Brown et al. 2017), positions identity categories not as fixed classifications but instead as contextually embedded, culturally produced, intersecting, and discursive—in other words, as starting points for investigation and understanding, not immutable points of data about a person. After mapping this pattern from the original CWRC ontology into CIDOC CRM, the LINCS team is testing out this pattern with additional incoming datasets to see if this representation of identity matches how researchers conceive of this concept, even if it is not how it is (yet) represented in their data.

As is evident from this final case, it is essential to attend to the changes in meaning brought to the data through the conversion process; this is not a solely technical process, but an interactive human and ideological one as well. The data is not only changing in format but, at times, also in meaning as it moves to a new data structure. This has proved beneficial for researchers such as Dr. Michelle Meagher and Dr. Jana Smith Elford, whose project—AdArchive, which details late twentieth-century feminist periodicals and the advertisements found in them—is grounded in feminist values such as attention to labour. Meagher and Smith Elford found that the event-based structure introduced by LINCS through the application of CIDOC CRM echoes these core values; the data model is therefore able to represent AdArchive data in a way that is closely aligned with how the project researchers think and talk about their data and related research. Insights such as this are valuable outcomes of the researcher review process and demonstrate how the close relationship between researchers and the LINCS technical team is beneficial for both LINCS and the researcher community.

Meaningful (Infra)structure

The present article engages with provocations by McPherson (2014), Posner (2016), Flanders (2018), and Liu (2020) to address questions pertaining to diversity and difference in the structuring of data in conjunction with developing information infrastructure for the humanities. It builds on the work of Brown and other members of the LINCS team and its precursor projects, Orlando and CWRC, to “intervene, with experimental models, in the knowledge structures of our time” (Brown 2020, 173). Through this policy and process work, the LINCS team is making a start at designing LOD infrastructure developed with “the concerns of cultural theory and, in particular . . . a feminist concern for difference” (McPherson 2014, 178) in mind. Through the use of a core event-centric ontology, the incorporation of an annotation ontology to support robust and contextualizing statement provenance data, and the development of ontology patterns such as “cultural forms” to represent the highly subjective and contextual nature of systems of classification (particularly those related to the classification of people), LINCS is implementing strategies for dealing with categories that “are not binary or one-dimensional or stable” (Posner 2016, 34). The LINCS team hopes that these strategies will enable the kind of research proposed by Posner and others, where identity categories can be seen “as they have been experienced, not as they have been captured and advanced by businesses and governments” (Posner 2016, 34). In the cases where the data reflects how businesses and governments have structured it, we hope that these strategies will make visible how this data is the result of information systems enacting systems of power.

The LINCS team aims to embed intersectional feminist principles (Crenshaw 1989; Collins and Bilge 2020) in the project’s LOD infrastructure—not just in the final product, but also in the processes of decision-making and development that result in the user-facing tools, platforms, and data. This tactic takes up Flanders’s (2018, 289) provocation to ask “whether in [our] current projects [we] are able and willing to take different approaches” throughout the development process. LINCS hopes to manifest the kind of “alternative practice” that Flanders imagines and thereby to “build something entirely different and weirder and more ambitious” of the sort that Posner speaks of (Posner 2016, 36). Like any other information infrastructure, LINCS is an ideological as well as a technical system, and as such we aim to lay bare the values and processes that led to the development of the infrastructure itself.

LINCS is an ambitious project, considering its three-year span: the team is working to establish a robust infrastructure to support working with LOD from creation or conversion to publication and use, while also tackling the conversion of a large and heterogeneous collection of datasets. Investing serious effort in ontology policies and processes is key to producing LOD of a quality adequate to supporting humanities inquiry rather than “information jukeboxes” (Oldman 2012). The project as a whole—along with each research dataset—is working to balance interoperability, as achieved through a level of generalization, with contextual and domain-specific requirements. The ontology processes are designed to connect heterogeneous datasets through areas of subject similarity or structural alignment while ensuring space for nuance, specificity, and difference. Now that ontology mapping is well underway, researchers can begin to explore the implications of the ways in which individual linked datasets are structured in relation to each other through affinity groups on Canadian publishing, literary and performance history, London and the British Empire, prosopography, material and textual cultures, knowledge systems, Indigenous knowledges, resistant epistemologies, and geohumanities. The ultimate test of whether the correct balance has been struck will happen as the number of converted and connected datasets grows to the point where researchers are able to use LINCS to generate new insights and interpretations.

The policies and processes outlined here address the core goal of LINCS to support the production of contextualized, situated knowledge in the form of LOD and provide a foundation for representing non-hegemonic epistemologies. To date, work with such epistemologies has been most intense in connection with the work on the “cultural forms” pattern discussed above, which allows for nuanced knowledge representations of social identities that pose an alternative to traditional binary categories, historical flattening, and simplistic notions of identity not only with respect to gender but also other components of social selves. There are other challenges to come: for instance, as mentioned above, the project has not yet addressed Indigenous datasets, which will undoubtedly require flexibility, revision, and broad expert and community consultation. However, having piloted the ontology selection and implementation process with several datasets has provided a strong foundation from which to do so.

Ultimately, developing information infrastructure means creating structures that give data meaning: no systems are purely technical, and no organization of knowledge exists without embedded biases and worldviews. Information systems are developed by people with intent—they are designed to help their creators and users to tell specific stories with their data—and the LINCS team is likewise building infrastructure to help humanities researchers tell the stories they are interested in. To do so with linked data means devising ways to tell stories with nuance and context, situated by and supported by data structures that reflect and make space for precisions and complexities. By foregrounding the powerful yet invisible structures of ontologies via project policies and processes, the LINCS team has established—and will continue to refine—infrastructure for humanities researchers that supports the kinds of representations and interventions demanded by the multiple, diverse ways of knowing associated with linked open cultural data.

Acknowledgements

LINCS and the authors would like to acknowledge the support of the Canada Foundation for Innovation’s Cyberinfrastructure Initiative as the primary funder of the LINCS project along with provincial, university, and organizational partners. See lincsproject.ca. The activities described here have also been supported by the Social Sciences and Humanities Research Council of Canada and the Canada Research Chairs Program.

The authors would also like to thank the editors of this special issue for their support, and the anonymous reviewers of this paper for their valuable feedback.

References

Barbera, Michele. 2013. “Linked (Open) Data at Web Scale: Research, Social and Engineering Challenges in the Digital Humanities.” Italian Journal of Library, Archives and Information Science 4 (1): 91. http://dx.doi.org/10.4403/jlis.it-6333.

Bekiari, Chryssoula, George Bruseker, Martin Doerr, Christian-Emil Ore, Stephen Stead, and Athanasios Velios, eds. 2021. Volume A: Definition of the CIDOC Conceptual Reference Model. ICOM CIDOC. https://doi.org/10.26225/FDZH-X261.

Billey, Amber, Emily Drabinski, and K. R. Roberto. 2014. “What’s Gender Got to Do with It? A Critique of RDA 9.7.” Cataloging & Classification Quarterly 52 (4): 412–21. https://doi.org/10.1080/01639374.2014.882465.

Bowker, Geoffrey C., and Susan Leigh Star. 2000. Sorting Things Out: Classification and Its Consequences. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/6352.001.0001.

Bratton, Benjamin H. 2016. The Stack: On Software and Sovereignty. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/9780262029575.001.0001.

Bray, Tim. 1998. “RDF and Metadata.” XML.com. Accessed July 2, 2021. https://www.xml.com/pub/a/98/06/rdf.html.

The British Museum. 2021. ResearchSpace. http://researchspace.org/.

Brown, Susan. 2010. “Socialized Scholarship: It Starts with Us.” ESC: English Studies in Canada 36 (4): 10–13. http://doi.org/10.1353/esc.2010.0036.

Brown, Susan. 2011. “Don’t Mind the Gap: Evolving Digital Modes of Scholarly Production Across the Digital-Humanities Divide.” In Retooling the Humanities: The Culture of Research in Canadian Universities. Edited by Daniel Coleman and Smaro Kamboureli, 203–231. Edmonton: University of Alberta Press. https://doi.org/10.7939/R3W08WH5V.

Brown, Susan. 2020. “Categorically Provisional.” PMLA 135 (1): 165–74. https://doi.org/10.1632/pmla.2020.135.1.165.

Brown, Susan, Joel Cummings, Jasmine Drudge-Willson, Colin Faulkner, Abigel Lemak, Kim Martin, Alliyya Mo, Jade Penancier, John Simpson, Thomas Smith, Gurjap Singh, Deborah Stacey, Robert Warren, and Constance Crompton. 2019. “The CWRC Ontology Specification 0.99.80.” Scholars Portal Dataverse, V1. https://doi.org/10.5683/SP2/HXMS24.

Brown, Susan, Abigel Lemak, Colin Faulkner, Kim Martin, and Rob Warren. 2017. “Cultural (Re-)formations: Structuring a Linked Data Ontology for Intersectional Identities.” Paper presented at Digital Humanities 2017, Montreal, QC, August 2017. https://dh2017.adho.org/abstracts/580/580.pdf.

Canada’s Early Women Writers Project. 2020. “Violet McNaughton.” https://cwrc.ca/islandora/object/ceww%3Ad9ebf526-9254-421f-ac11-2d869a090192.

Canadian Writing Research Collaboratory. n.d. Accessed July 14, 2021. https://cwrc.ca/.

Canadian Writing Research Collaboratory. 2019. “CWRC Ontology Preamble.” http://sparql.cwrc.ca/ontologies/cwrc-preamble-EN.html.

Canadian Writing Research Collaboratory. n.d. CWRC-Writer. Accessed July 14, 2021. https://cwrc-writer.cwrc.ca/.

Canning, Erin. 2018. “Affective Metadata for Object Experiences in the Art Museum.” MMSt diss., University of Toronto. https://tspace.library.utoronto.ca/bitstream/1807/91417/4/Canning_Erin_201811_MMSt_thesis.pdf.

Canning, Erin, Susan Brown, Kim Martin, Alliyya Mo, and Sarah Roger. 2022. LINCS Ontologies Adoption & Development Policy. LINCS Project. https://doi.org/10.5281/zenodo.6047748.

Center of Digital Humanities Research at Texas A&M University. n.d. The Advanced Research Consortium (ARC). Accessed July 14, 2021. http://arc.dh.tamu.edu/.

Coburn, Erin, and Murtha Baca. 2004. “Beyond the Gallery Walls: Tools and Methods for Leading End-Users to Collections Information.” Bulletin of the American Society for Information Science and Technology 30 (5): 14–19. https://doi.org/10.1002/bult.323.

CIDOC CRM. n.d. International Council for Documentation. Accessed July 14, 2021. http://www.cidoc-crm.org/.

CIDOC CRM. 2021a. “Issue 504: Formulate the philosophical underpinnings of crm and its relation to reality and the objectivity of observations.” Accessed July 14, 2021. https://cidoc-crm.org/Issue/ID-504-formulate-the-philosophical-underpinnings-of-crm-and-its-relation-to-reality-and-the.

CIDOC CRM. 2021b. “Issue 530: Bias in data structure.” Accessed July 14, 2021. https://cidoc-crm.org/Issue/ID-530-bias-in-data-structure.

Collins, Patricia Hill, and Sirma Bilge. 2020. Intersectionality. 2nd edition. New York: Wiley.

Cope, Bill, Mary Kalantzis, and Liam Magee. 2011. Towards a Semantic Web: Connecting Knowledge in Academic Research. Cambridge, MA: Chandos.

Crenshaw, Kimberlé. 1989. “Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics.” University of Chicago Legal Forum 1989 (1): 139–67.

D’Ignazio, Catherine, and Lauren F. Klein. 2020. Data Feminism. Cambridge, MA: MIT Press.

Digitised Manuscripts to Europeana (DM2E). n.d. Accessed July 14, 2021. http://dm2e.eu/.

Doerr, Martin. 2003. “The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata.” AI Magazine 24 (3): 75–92. https://doi.org/10.1609/aimag.v24i3.1720.

Doerr, Martin, Richard Light, and Gerald Hiebel. 2020. “Implementing the CIDOC Conceptual Reference Model in RDF.” https://docs.google.com/document/d/1NdrWpzo7EFChryh4Qg-Ue8WLvnwejHx20eiwdJuZEck.

Drabinski, Emily. 2013. “Queering the Catalog: Queer Theory and the Politics of Correction.” The Library Quarterly 83 (2): 94–111. https://doi.org/10.1086/669547.

Duarte, Marisa Elena, and Miranda Belarde-Lewis. 2015. “Imagining: Creating Spaces for Indigenous Ontologies.” Cataloging & Classification Quarterly 53 (5–6): 677–702. https://doi.org/10.1080/01639374.2015.1018396.

Enslaved: Peoples of the Historical Slave Trade. n.d. Accessed July 14, 2021. https://enslaved.org/.

Europeana. n.d. Accessed July 15, 2021. https://www.europeana.eu/en.

Flanders, Julia. 2018. “Building Otherwise.” In Bodies of Information: Intersectional Feminism and the Digital Humanities. Edited by Elizabeth Losh and Jacqueline Wernimont, 289–304. Minneapolis: University of Minnesota Press.

Frontini, Francesca, Carmen Brando, and Jean-Gabriel Ganascia. 2015. “Semantic Web Based Named Entity Linking for Digital Humanities and Heritage Texts.” In Proceedings of the First International Workshop Semantic Web for Scientific Heritage at the 12th ESWC 2015 Conference, edited by Arnaud Zucker, Isabelle Draelants, Catherine Faron-Zucker, and Alexandre Monnin, 77–88. http://ceur-ws.org/Vol-1364/paper9.pdf.

Giunchiglia, Fausto, Biswanath Dutta, and Vincenzo Maltese. 2014. “From Knowledge Organization to Knowledge Representation.” Knowledge Organization 41 (1): 44–56. https://doi.org/10.5771/0943-7444-2014-1-44.

Hacıgüzeller, Piraye, James Stuart Taylor, and Sara Perry. 2021. “On the Emerging Supremacy of Structured Digital Data in Archaeology: A Preliminary Assessment of Information, Knowledge and Wisdom Left Behind.” Open Archaeology 7 (1): 1709–30. https://doi.org/10.1515/opar-2020-0220.

Haraway, Donna. 1988. “Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective.” Feminist Studies 14 (3): 575–99. https://doi.org/10.2307/3178066.

Historic Places LA. n.d. Los Angeles Historic Resources Inventory. Accessed July 14, 2021. http://www.historicplacesla.org/.

Hoekstra, Rinke, Albert Meroño-Peñuela, Kathrin Dentler, Auke Rijpma, Richard Zijdeman, and Ivo Zandhuis. 2016. “An Ecosystem for Linked Humanities Data.” In ESWC 2016: The Semantic Web, edited by Harald Sack, Giuseppe Rizzo, Nadine Steinmetz, Dunja Mladenic, Sören Auer, and Christoph Lange, 425–40. Cham: Springer.

Huma-Num. n.d. Isidore. Accessed July 14, 2021. https://isidore.science/.

Hypothes.is. n.d. Accessed July 14, 2021. https://web.hypothes.is/.

Hyvönen, Eero. 2012. Publishing and Using Cultural Heritage Linked Data on the Semantic Web. Morgan & Claypool. https://doi.org/10.2200/S00452ED1V01Y201210WBE003.

Hyvönen, Eero. 2020. “Linked Open Data Infrastructure for Digital Humanities in Finland.” In Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, edited by Sanita Reinsone, Inguna Skadina, Anda Baklāne, and Jānis Daugavietis, 254–59. Riga, Latvia.

ICS Forth. n.d. 3M Mapping Memory Manager. Accessed July 14, 2021. http://139.91.183.3/3M/.

IIIF Community. n.d. International Image Interoperability Framework (IIIF). Accessed July 14, 2021. https://iiif.io/.

International Organization for Standardization. 2014. “ISO 21127:2014 Information and Documentation—A Reference Ontology for the Interchange of Cultural Heritage Information.” https://www.iso.org/standard/57832.html.

Knoblock, Craig. n.d. Karma. Accessed July 14, 2021. https://usc-isi-i2.github.io/karma/.

Levesque, Hector. 1986. “Knowledge Representation and Reasoning.” Annual Review of Computer Science 1 (1): 255–87.

Linked Data for Production: Pathway to Implementation (LD4P2). n.d. Accessed July 14, 2021. https://wiki.lyrasis.org/display/LD4P2. Archived at: https://perma.cc/BEZ4-F6QN.

Linked Open Vocabularies (LOV). n.d. Accessed July 14, 2021. https://lov.linkeddata.es.

Littletree, Sandra, and Cheryl A. Metoyer. 2015. “Knowledge Organization from an Indigenous Perspective: The Mashantucket Pequot Thesaurus of American Indian Terminology Project.” Cataloging & Classification Quarterly 53 (5–6): 640–57. https://doi.org/10.1080/01639374.2015.1010113.

Liu, Alan. 2020. “Toward a Diversity Stack: Digital Humanities and Diversity as Technical Problem.” PMLA 135 (1): 130–51. https://doi.org/10.1632/pmla.2020.135.1.130.

Martin, Kim, Sarah Roger, Erin Canning, and Alliyya Mo. 2022. LINCS Research Dataset Intake Questionnaire. LINCS Project. https://doi.org/10.5281/zenodo.6048520.

McPherson, Tara. 2012. “Why Are the Digital Humanities So White? Or Thinking the Histories of Race and Computation.” In Debates in the Digital Humanities, edited by Matthew K. Gold. https://doi.org/10.5749/9781452963754.

McPherson, Tara. 2014. “Designing for Difference.” differences 25 (1): 177–88. https://doi.org/10.1215/10407391-2420039.

Moraitou, Efthymia, John Aliprantis, Yannis Christodoulou, Alexandros Teneketzis, and George Caridakis. 2019. “Semantic Bridging of Cultural Heritage Disciplines and Tasks.” Heritage 2 (1): 611–30. https://doi.org/10.3390/heritage2010040.

Oldman, Dominic. 2012. “The British Museum, CIDOC CRM and the Shaping of Knowledge.” Dominic Oldman (blog), September 4, 2012. https://web.archive.org/web/20160820105411/www.oldman.me.uk/blog/the-british-museum-cidoc-crm-and-the-shaping-of-knowledge/.

Olson, Hope A. 2001. “The Power to Name: Representation in Library Catalogs.” Signs 26 (3): 639–68. https://www.jstor.org/stable/3175535.

Omeka S. n.d. Accessed July 14, 2021. https://omeka.org/s/.

OntoME: Ontology Management Environment. n.d. Accessed July 14, 2021. https://ontome.net/.

OpenRefine. n.d. Accessed July 14, 2021. http://openrefine.org/.

Pan-Canadian Documentary Heritage Network (PCDHN). 2018. “Out of the Trenches: A Linked Open Data Project.” UAL Dataverse, V2. https://doi.org/10.7939/DVN/URXSGC.

Pattuelli, Cristina, dir. n.d. Linked Jazz. Accessed July 14, 2021. https://linkedjazz.org/.

Paul Getty Trust. 2021. Arches Project. https://www.archesproject.org/.

Posner, Miriam. 2016. “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” In Debates in the Digital Humanities 2016, edited by Matthew K. Gold and Lauren F. Klein, 32–41. https://doi.org/10.5749/9781452963761.

Quintman, Andrew, and Kurtis R. Schaeffer. 2019. The Life of the Buddha. https://lotb.iath.virginia.edu/project.

Ruberg, Bonnie, Jason Boyd, and James Howe. 2018. “Toward a Queer Digital Humanities.” In Bodies of Information: Intersectional Feminism and the Digital Humanities, edited by Elizabeth Losh and Jacqueline Wernimont, 108–27. Minneapolis: University of Minnesota Press.

Sanderson, Robert. 2013. “RDF: Resource Description Failures and Linked Data Letdowns.” Journal of Digital Humanities 2 (3): 33–34. http://journalofdigitalhumanities.org/files/jdh_2_3.pdf.

Sanderson, Robert. 2018. “Shout it Out: LOUD.” Keynote address presented at EuropeanaTech conference, Rotterdam, Netherlands, May 2018. YouTube video, 41:34. https://www.youtube.com/watch?v=r4afi8mGVAY.

Sanderson, Robert, Paolo Ciccarese, and Benjamin Young, eds. 2017. “Web Annotation Data Model.” W3C. https://www.w3.org/TR/annotation-model/. Archived at: https://perma.cc/34T7-G3HB.

The Shelley-Godwin Archive. n.d. Accessed July 14, 2021. http://shelleygodwinarchive.org/.

Silva, Leiser. 2007. “Epistemological and Theoretical Challenges for Studying Power and Politics in Information Systems.” Information Systems Journal 17 (2): 165–83. https://doi.org/10.1111/j.1365-2575.2007.00232.x.

Simpson, John, and Susan Brown. 2013. “From XML to RDF in the Orlando Project.” Paper presented at International Conference on Culture and Computing, Kyoto, Japan, September 2013. https://doi.org/10.1109/CultureComputing.2013.61.

Srinivasan, Ramesh. 2013. “Re-thinking the Cultural Codes of New Media: The Question Concerning Ontology.” New Media & Society 15 (2): 203–23. https://doi.org/10.1177/1461444812450686.

Star, Susan Leigh, and James R. Griesemer. 1989. “Institutional Ecology, ‘Translations’ and Boundary Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–39.” Social Studies of Science 19 (3): 387–420. https://doi.org/10.1177/030631289019003001.

Thomas III, William G., Kaci Nash, Laura Weakley, Karin Dalziel, and Jessica Dussault. n.d. O Say Can You See: Early Washington, D.C., Law & Family. University of Nebraska-Lincoln. Accessed July 14, 2021. https://earlywashingtondc.org/.

Turner, Hannah. 2017. “Organizing Knowledge in Museums: A Review of Concepts and Concerns.” Knowledge Organization 44 (7): 472–84. https://doi.org/10.5771/0943-7444-2017-7-472.

Turner, Hannah. 2020. Cataloguing Culture: Legacies of Colonialism in Museum Documentation. Vancouver: UBC Press.

Wikibase. n.d. Accessed July 14, 2021. https://wikiba.se/.

Footnotes

1 The LINCS team recognizes two layers of community involvement: the researcher community and the community that is the subject of the data. The researchers actively working with LINCS are not always members of the communities from which their data is drawn, and we acknowledge that this raises specific ethical concerns (for instance, with regard to data sovereignty or data connected to living persons) when datasets are mobilized as LOD. While these concerns are outside the scope of this paper, the LINCS team plans to address them in a forthcoming article.

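2 Any other entity (that is, an instance of any subclass of CIDOC CRM’s top-level class E1 Entity) can be the subject of an annotation. The entity is linked to the annotation through the CIDOC CRM property P67i is referred to by, and the annotation itself is represented by the Web Annotation Data Model class Annotation and the CIDOC CRM class E33 Linguistic Object.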
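The pattern in footnote 2 can be illustrated with a minimal sketch, which is not the LINCS implementation. It builds the three statements as plain tuples and prints them in N-Triples form. The CIDOC CRM and Web Annotation namespace URIs are the published ones; the `example.org` identifiers and the helper names `annotate` and `to_ntriples` are hypothetical and exist only for illustration.

```python
# Published namespaces for CIDOC CRM, the Web Annotation vocabulary, and RDF.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI for the example entity and annotation (an assumption).
EX = "http://example.org/"


def annotate(entity, annotation):
    """Return the triples that make `entity` the subject of `annotation`."""
    return [
        # The annotation is typed with both classes named in the footnote.
        (annotation, RDF_TYPE, OA + "Annotation"),
        (annotation, RDF_TYPE, CRM + "E33_Linguistic_Object"),
        # The entity points to the annotation through the inverse property P67i.
        (entity, CRM + "P67i_is_referred_to_by", annotation),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = annotate(EX + "person/1", EX + "annotation/1")
print(to_ntriples(triples))
```

Any entity URI could take the place of `person/1` here, since the footnote states that any subclass of E1 Entity can be annotated in this way.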