CSDM: Census semantic data models

Since 2021, George Bruseker and Denitsa Nenova of Takin.solutions have been coordinating a project to make the data of The Census of Antique Works of Art and Architecture Known in the Renaissance available within a semantic, linked open data framework, using the CIDOC CRM, the ISO-standard formal ontology for cultural heritage. More specifically, they have been aligning the Census data with the CIDOC CRM, using as a baseline the Semantic Reference Data Models (SRDM) created by the Swiss Art Research Infrastructure (SARI) in Zurich. These models aim to develop a widely replicable, well-known pattern for encoding art historical knowledge regarding objects, events and actors as understood in art history. By adopting this strategy, the Census aims to join sister projects in supporting widespread data integration and the generation of scholarship in a born-digital environment, one that takes advantage of the richness and accuracy of semantic data representation to enable scholars to pose and answer questions and to record new knowledge in an online semantic graph of data. The endpoint of this project will be the creation of semantically rich, linked open Census data which will support new possibilities for using, searching and sharing the knowledge encoded in this resource.

Below is the intro­duc­tion to Bruseker and Nenova’s docu­men­ta­tion of Census Semantic Data Models (CSDM), first published on GitHub on 14 April, 2022. Please click here for the full CSDM documentation.

Intro­duc­tion

The Census Semantic Data Models (CSDM) repre­sent a coherent set of seman­ti­cally-encoded data models for the use of art histo­rians in the (re-)representation of art histo­rical data related to the history of monu­ments, the docu­ments that respond to or bear witness to them, the people or places that relate to these objects of study through histo­rical events, and the images and biblio­graphy which refe­rence them.

The CSDM has been built from three major sources:

1) The Census Data Model

The immediate scope and representational direction of the CSDM have been guided by the original data and data model of The Census of Antique Works of Art and Architecture Known in the Renaissance (Institut für Kunst- und Bildgeschichte, Humboldt-Universität zu Berlin). The Census project, which began in 1946, developed over decades as a system of index cards and photographs, and was first computerised in the early 1980s, provides a time-tested foundation from which to build a set of semantic data models of use to art historical research. The Census was created with the goal of replacing what was then a vague notion of “classical influence” in the Renaissance with a specific knowledge of which antique monuments were known in the Renaissance, in which setting, and in which condition. Over its more than seventy-five-year history the Census has traced Renaissance artists’ and humanists’ response to antiquities, and has had at its core the documentation of events which connect artists and authors with particular antique monuments. The Census dataset has become a useful tool for archaeologists who are interested, for example, in the condition and state of preservation of antiquities in the Renaissance, and for historians and art historians concerned with the creative reception of antiquities during the Renaissance period.

It is a core objective of the CSDM to provide a faithful re-representation of the Census data in a semantic format, allowing researchers to query it according to a common logic.

2) The CIDOC CRM

The CIDOC CRM (CIDOC Conceptual Reference Model) is an appropriate framework for this purpose: it is a formal ontology which allows cultural heritage data in general to be represented in a common form. The CRM is an event-based model whose form enables the rich representation and connection of data from heterogeneous data models in a common system. The passage to Linked Open Data is richest when the data is represented in a common format that is well documented and openly accessible to a wider community of scholars. The CRM enables precisely this transition. The event-centred modelling strategy of the ontology is, furthermore, highly compatible with the existing abstract data model of the Census, which makes it the most appropriate choice.
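To illustrate what event-centred modelling means in practice, the following minimal Python sketch re-expresses a flat catalogue record as triples in which a production event links the object, its maker and its date. The record, the `ex:` identifiers and the helper `to_event_triples` are hypothetical and serve only as illustration; the property labels follow the CIDOC CRM naming (P108i was produced by, P14 carried out by, P4 has time-span), but the exact patterns used in the CSDM may differ.

```python
# Hypothetical flat record, as it might appear in a legacy database.
record = {"id": "drawing-1", "maker": "artist-1", "date": "1550"}


def to_event_triples(rec):
    """Re-express a flat record as event-centred (subject, predicate, object) triples."""
    obj = f"ex:{rec['id']}"
    event = f"{obj}/production"  # the event becomes a node of its own
    return [
        (obj, "rdf:type", "crm:E22_Human-Made_Object"),
        (obj, "crm:P108i_was_produced_by", event),
        (event, "rdf:type", "crm:E12_Production"),
        # Maker and date hang off the event, not off the object itself.
        (event, "crm:P14_carried_out_by", f"ex:{rec['maker']}"),
        (event, "crm:P4_has_time-span", f"ex:timespan/{rec['date']}"),
    ]


triples = to_event_triples(record)
for triple in triples:
    print(triple)
```

Because maker and date are attached to the event rather than to the object, records from differently shaped source databases can be brought into the same pattern and connected through shared event, actor and place nodes.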

3) The SARI Refe­rence Data Models

While the CRM provides a general language and frame­work for the semantic repre­sen­ta­tion of cultural heri­tage data, it expli­citly refrains from speci­fying how it should be imple­mented in any parti­cular context, leaving the stan­dard open to adop­tion and evolu­tion accor­ding to new needs and requi­re­ments. At the same time, the CRM enables the crea­tion of a common set of well-unders­tood methods for repre­sen­ting data that can be read and adopted both by end users of the data (rese­ar­chers) and by those who create and support the systems which allow access to the data (deve­lo­pers).

SARI (Swiss Art Research Infrastructure, Universität Zürich) is an infrastructure which has adopted the CIDOC CRM in order to enable widely interoperable cultural heritage (CH) data about art history, inter alia. To support this strategy, SARI has used the CIDOC CRM to develop the SRDM (Semantic Reference Data Models), a set of well-documented, fundamental semantic modelling patterns suited to research on the visual arts. The SRDM was created to represent basic targets of cultural heritage documentation and to provide well-documented patterns that can be adopted in representing entities as well as the typical information collected in relation to them. The CSDM thus takes up the CIDOC CRM through the SRDM modelling, adopting its basic patterns and representation in order to follow existing best practice and to join and add to a growing community of like-minded users wishing to share semantic data.

The immediate use of the CSDM documentation is to serve as an explicit specification of the semantic data structures used to represent the Census data as Linked Open Data. Users can refer to it as a guide to the overall semantic modelling decisions that were taken during this project, as well as for a detailed description of each model that was created and the information that it enables the researcher to encode or query.

The broader purpose of this docu­men­ta­tion is both to empower rese­ar­chers to query the exis­ting Census data that has been repre­sented seman­ti­cally and to enable the possi­bi­lity of inte­gra­ting this data by using and/or exten­ding the CSDM. The Census data is offered in a semantic format as LOD in order to open Census data to a broader uptake by scho­lars and, moreover, to expand the data in unfo­re­seen ways that extend beyond the scope of corpus-buil­ding projects such as the Census.

Metho­do­logy

The CSDM reuses the SRDM, selecting a subset of its models; any additions are documented with their own notation in a compatible format.

Contents of CSDM

The CSDM is, then, a series of semantic data models, which follows the SRDM patterns and enables the docu­men­ta­tion of key objects (monu­ments, docu­ments) and supporting data regar­ding people, places and biblio­graphy. It enables a rich repre­sen­ta­tion and querying of this data by its use of the CIDOC CRM, which joins the models through a robust docu­men­ta­tion of events connec­ting the various entities.

The Corpus of CSDM includes:

| Name | Description | CRM Entity | SARI Equivalent |
| --- | --- | --- | --- |
| Monument | This model is intended to enable the representation and sharing of data relevant to artistic, architectural and historical artefacts extant in the past or present. In the Census, Monuments are antique inscriptions, coins, paintings, sculptures, architectural monuments and other ancient artefacts and works of art. In some cases, Monuments can also be works of art and architecture which are not antique, but were believed in the Renaissance to be antique. | E22 Human Made Object | Builtwork |
| Document | This model is intended to enable the representation and sharing of data relevant to physical items which carry content that provides evidence of a historical witnessing of, response to or reference to a particular Monument. In the Census, this category can include drawings, prints, paintings, medals, and statuettes, or written documents such as printed guidebooks, manuscripts, letters and inventories. | E22 Human Made Object | Archival Item |
| Person | This model is intended to enable the representation and sharing of data relevant to real-world, physical persons. | E21 Person | Person |
| Location | This model is intended to enable the representation and sharing of data relevant to geographic places used to identify the locations of items and events over time. | E53 Place | Place |
| Bibliography | This model is intended to enable the representation and sharing of data relevant to bibliographic sources that have been used in the course of researching and documenting. | E33 Linguistic Object | Bibliographic Entity |
| Image | This model is intended to enable the representation and sharing of data relevant to images representing objects within the field of research. | E36 Visual Item | Image |
| Period | This model is intended to enable the documentation and correct referencing of historical periods for use in datation. | E4 Period | N/A |
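For readers who want to work with this overview programmatically, the table can be transcribed into a simple lookup structure. The sketch below is illustrative only: the names `CSDM_MODELS` and `models_for_crm_entity` are hypothetical, and the values are copied from the table rather than taken from any published CSDM file.

```python
# Model name -> (CRM entity, SARI equivalent), transcribed from the table above.
CSDM_MODELS = {
    "Monument": ("E22 Human Made Object", "Builtwork"),
    "Document": ("E22 Human Made Object", "Archival Item"),
    "Person": ("E21 Person", "Person"),
    "Location": ("E53 Place", "Place"),
    "Bibliography": ("E33 Linguistic Object", "Bibliographic Entity"),
    "Image": ("E36 Visual Item", "Image"),
    "Period": ("E4 Period", None),  # the table lists no SARI equivalent
}


def models_for_crm_entity(entity):
    """Return the CSDM model names that map to the given CRM entity."""
    return sorted(name for name, (crm, _) in CSDM_MODELS.items() if crm == entity)


print(models_for_crm_entity("E22 Human Made Object"))
```

The lookup shows that Monument and Document share the same CRM entity in this overview, so the CRM class alone does not tell the two models apart.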

Each of these models supports a separate unit of documentation adopted in the Census. An instance of such a model stands for one real-world entity that is documented by that instance. The models are interrelated, joining on key relations in order to create a web of supporting interrelations that represent the real-world historical trajectory of objects through time and space, and map specific instances of the reception of antique monuments by post-classical observers. While event-centric modelling is used, the events are recorded within the context of the objects to which they relate (e.g. birth relative to the individual, production relative to the monument or document). For the purpose of documentation, this allows separate objects to be documented in separate models. Within the context of an open graph network of facts, the event nodes can be used to traverse a complex historical graph of observations of and responses to antique objects through time.
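The idea of traversing the graph through event nodes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the triples, the `ex:` identifiers and the functions `build_index` and `people_responding_to` are hypothetical, and the choice of P67 refers to for linking a Document to a Monument is an assumption made for the example, not necessarily the relation used in the CSDM.

```python
from collections import defaultdict

# Hypothetical triples: documents refer to monuments, and each document's
# production event is carried out by a person.
TRIPLES = [
    ("ex:drawing-1", "crm:P67_refers_to", "ex:statue-1"),
    ("ex:drawing-1", "crm:P108i_was_produced_by", "ex:drawing-1/production"),
    ("ex:drawing-1/production", "crm:P14_carried_out_by", "ex:artist-1"),
    ("ex:letter-1", "crm:P67_refers_to", "ex:statue-1"),
    ("ex:letter-1", "crm:P108i_was_produced_by", "ex:letter-1/production"),
    ("ex:letter-1/production", "crm:P14_carried_out_by", "ex:humanist-1"),
    ("ex:drawing-2", "crm:P67_refers_to", "ex:statue-2"),
]


def build_index(triples):
    """Index objects by (subject, predicate) for forward traversal."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[(s, p)].append(o)
    return index


def people_responding_to(monument, triples):
    """Walk Monument <- Document -> production event -> Person."""
    index = build_index(triples)
    documents = [s for s, p, o in triples
                 if p == "crm:P67_refers_to" and o == monument]
    people = set()
    for doc in documents:
        for event in index[(doc, "crm:P108i_was_produced_by")]:
            people.update(index[(event, "crm:P14_carried_out_by")])
    return sorted(people)


print(people_responding_to("ex:statue-1", TRIPLES))
```

In this sketch each production event is stored alongside the document it belongs to, yet the same event node is what connects the document to a person, which is how separately documented entities become traversable as one graph.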