Identifiers and contextual data

Persistent identifiers, the links between them, and contextual data are essential components of the DiSSCo architectural design. This brief example shows why these identifiers matter, why they need to be linked, and why enough contextual information is needed to perform operations on digital specimen objects. It also highlights some of the challenges we face regarding standards mapping and interoperability.

At Naturalis, we are currently adding images to some of our bird’s nest specimen data. Here’s one nest for the bird Turdus merula merula (common blackbird, merel in Dutch).

Nest of Turdus merula merula from Naturalis Biodiversity Center. source: https://bioportal.naturalis.nl/specimen/ZMA.AVES.64793

We have data about this specimen in the museum portal and also in GBIF. However, challenges appear as soon as we try to find these specimens and link them. The museum system describes the nest using the ABCD schema with two terms: "recordBasis": "PreservedSpecimen" and "kindOfUnit": "nest". recordBasis maps to the Darwin Core term basisOfRecord, but kindOfUnit does not map to any term (there is a mapping schema, and the community is aware of these issues). As a result, searching for nests of Turdus merula merula can only be done from the Naturalis bioportal. And although the GBIF record points back to the museum record, we cannot get from the museum record to GBIF.

Other museums use the field dynamicProperties. Here’s an example of a nest from NHMUK (a good example because it shows bi-directional links). Because the standards and systems involved in the data management and publishing pipeline are not fully interoperable, we lose the context. Again, we are well aware of these problems, and various initiatives are currently addressing them from different angles.
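The mapping gap described above can be illustrated with a minimal Python sketch. It is not the actual ABCD to Darwin Core mapping schema: the ABCD_TO_DWC table and the map_abcd_to_dwc helper are hypothetical names, and the only mapping they contain is the recordBasis to basisOfRecord pair mentioned in the text.

```python
# Hypothetical, heavily simplified term mapping from ABCD to Darwin Core.
# Only the recordBasis -> basisOfRecord pair comes from the text; the real
# mapping schema is much larger.
ABCD_TO_DWC = {"recordBasis": "basisOfRecord"}


def map_abcd_to_dwc(abcd_record):
    """Return the mapped record and the list of ABCD terms with no mapping."""
    mapped = {}
    dropped = []
    for term, value in abcd_record.items():
        if term in ABCD_TO_DWC:
            mapped[ABCD_TO_DWC[term]] = value
        else:
            # No target term: the value is lost downstream.
            dropped.append(term)
    return mapped, dropped


# The two ABCD terms the museum system uses to describe the nest.
abcd_record = {"recordBasis": "PreservedSpecimen", "kindOfUnit": "nest"}

dwc_record, dropped_terms = map_abcd_to_dwc(abcd_record)
print(dwc_record)     # the record as it would look after mapping
print(dropped_terms)  # terms that did not survive the mapping
```

In this sketch the mapped record still says the object is a preserved specimen, but the fact that it is a nest sits in a term that has nowhere to go.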

For this example, I created a simple Digital Specimen. It has a persistent identifier and enough contextual information to tell us that this is a nest, not a bird specimen. It also shows that the collector is not the famous sociologist Max Weber but Max Wilhelm Carl Weber, a German-Dutch zoologist.

Screenshot from Bionomia, which uses Wikidata and GBIF identifiers to connect specimens to collectors. source: https://bionomia.net/Q63149

A snippet from our test digital specimen:

"ods:authoritative": {
    "ods:midsLevel": 1,
    "ods:curatedObjectID": "https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.64793",
    "ods:institution": "https://ror.org/0566bfb96",
    "ods:institutionCode": "Naturalis",
    "ods:objectType": "Bird's nest",
    "ods:name": "Turdus merula merula"
},

We can also add more information about this digital object to provide more context:

"ods:supplementary": {
    "gbifId": "https://www.gbif.org/occurrence/2434245775",
    "dwc:recordedBy": "Weber, Max",
    "dwc:recordedByID": "https://www.wikidata.org/wiki/Q63149"
}
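As a rough illustration of how these two sections could be used together, here is a minimal Python sketch. It is not the code from the notebook mentioned below: wrapping both sections in one top-level object is an assumption, and the helper functions is_nest and linked_identifiers are hypothetical names introduced only for this example.

```python
import json

# The two sections from the snippets above, combined into one object.
# Placing them side by side at the top level is an assumption of this sketch.
digital_specimen = {
    "ods:authoritative": {
        "ods:midsLevel": 1,
        "ods:curatedObjectID": "https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.64793",
        "ods:institution": "https://ror.org/0566bfb96",
        "ods:institutionCode": "Naturalis",
        "ods:objectType": "Bird's nest",
        "ods:name": "Turdus merula merula",
    },
    "ods:supplementary": {
        "gbifId": "https://www.gbif.org/occurrence/2434245775",
        "dwc:recordedBy": "Weber, Max",
        "dwc:recordedByID": "https://www.wikidata.org/wiki/Q63149",
    },
}


def is_nest(specimen):
    """Hypothetical check: does the object type mention a nest?"""
    object_type = specimen["ods:authoritative"]["ods:objectType"]
    return "nest" in object_type.lower()


def linked_identifiers(specimen):
    """Hypothetical helper: collect every value that is an HTTP(S) identifier."""
    links = {}
    for section in specimen.values():
        for key, value in section.items():
            if isinstance(value, str) and value.startswith(("http://", "https://")):
                links[key] = value
    return links


print(json.dumps(digital_specimen, indent=2))
print(is_nest(digital_specimen))
for key, url in linked_identifiers(digital_specimen).items():
    print(key, "->", url)
```

With the object type and the identifiers stored in one object, both the "is this a nest?" question and the links to the museum record, GBIF, ROR and Wikidata can be answered without going back to the source systems.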

To learn more about the elements and structure of a Digital Specimen, please check out the openDS repo. The Python code for this example is available in this notebook.

This simple example shows the value of identifiers (such as Handle, ROR and Wikidata), which help link contextual information and make these specimens FAIR. In the coming years, DiSSCo, together with others, will tackle some of these challenges.

Published by Sharif Islam

Data Architect, DiSSCo
