Abstract:
The burgeoning field of Digital Humanities has shown strong interest in methodologies that support the exploration, cross-pollination, and programmatic analysis of heritage collections across the web of data. Although the heritage community broadly agrees that these data should be semantically enriched using the CIDOC Conceptual Reference Model and published as Linked Open Data, a lack of agreement at both the data and infrastructure levels has hindered advances that would allow for greater data integration and computational exploration. This project provides an institutional roadmap for publishing such data in a Semantic Web research environment and proposes a set of best practices for the community. Using a collection of 230,000 images and index metadata, it presents methodologies and tools for data cleaning, reconciliation, enrichment, and transformation for publication in a native Resource Description Framework (RDF) system. A semantic framework for integrating computer vision services supports subsequent enrichment and visual analysis, facilitating the mass digitization of heritage collections with minimal burden on institutions while ensuring the long-term preservation and interoperability of these data at a global scale.