With projected lifespans of many decades, infrastructure initiatives such as Europe’s Distributed Systems of Scientific Collections (DiSSCo), USA’s Integrated Digitized Biocollections (iDigBio), National Specimen Information Infrastructure (NSII) of China and Australia’s digitisation of national research collections (NRCA Digital, available through the Atlas of Living Australia) aim at transforming today’s slow, inefficient and limited practices of working with natural science collections. The need to borrow specimens (plants, animals, fossils or rocks) or physically visit collections, and the absence of linkages to other relevant information represent significant impediments to answering today’s important scientific and societal questions.
As a logical extension of the Internet, Digital Object Architecture (DOA) (Kahn and Wilensky, 2006) offers a way of grouping, managing and processing fragments of information relating to a natural science specimen. A ‘digital specimen’ acts as a surrogate in cyberspace for a specific physical specimen, identifying its actual location and authoritatively saying something about its collection event (who, when, where) and taxonomy (what), as well as providing links to high-resolution images. A digital specimen exposes supplementary information about related literature, traits, tissue samples and DNA sequences, chemical analyses, environmental information, and much more, stored elsewhere than in the natural science collection itself.
A simple Digital Specimen example
The figure below illustrates a simple example of an instance of a Digital Specimen (DS). It has a unique identifier string, “nsid: 20.5000.1025/486a7e883f14f88bba37”, that identifies it persistently and globally. Resolving this identifier will always return the digital representation of a specific physical specimen, in this case “BMNH:2006.12.6.40-41”, a parasitic worm (of tropical fish) held in the zoology collection of the Natural History Museum, London, together with links to any associated data or information known about or derived from the physical specimen itself.
The large, green-shaded area surrounded by a thick dashed line in the top-centre of the figure illustrates an instance of the Digital Specimen (DS). The DS contains data about the physical specimen, as well as links to other data and related artifacts (typically, entries in other databases), represented by the smaller blue-shaded boxes surrounding the DS: a literature article in Zootaxa, a DNA sequence in the European Nucleotide Archive, images, and a taxonomic treatment entry in Plazi’s TreatmentBank. The small DS on the right of the diagram, “MNHN JNC 1848-D1”, represents another physical specimen to which the main example is also related, as illustrated by the hasHolotype/isParatypeOf relations.
At the bottom middle of the diagram, the rectangular box illustrates how the DS content could be serialized as JSON for transfer between systems, as well as for application processing and entry into databases. Other representations are also possible.
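As a rough illustration of the serialization described above, the following Python sketch models a Digital Specimen as a small record with typed links and round-trips it through JSON. The field names (nsid, physical_specimen_id, institution, links), the Link type and the helper methods are hypothetical and are not the schema shown in the figure. The direction of the hasHolotype relation is also an assumption. Only the identifier, the specimen number, the holding institution and the related specimen come from the example above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class Link:
    """A typed pointer from the Digital Specimen to a related object."""
    relation: str  # e.g. "hasHolotype" (relation name taken from the figure)
    target: str    # identifier of the related object


@dataclass
class DigitalSpecimen:
    """Hypothetical, minimal Digital Specimen record (field names assumed)."""
    nsid: str                  # persistent identifier in prefix/suffix form
    physical_specimen_id: str  # catalogue number of the physical specimen
    institution: str           # where the physical specimen is held
    links: List[Link] = field(default_factory=list)

    def identifier_parts(self) -> Tuple[str, str]:
        """Split the identifier at the first '/' into (prefix, suffix)."""
        prefix, _, suffix = self.nsid.partition("/")
        return prefix, suffix

    def to_json(self) -> str:
        """Serialize the record, including nested links, as JSON text."""
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)

    @classmethod
    def from_json(cls, text: str) -> "DigitalSpecimen":
        """Rebuild a record from JSON text produced by to_json."""
        raw = json.loads(text)
        links = [Link(**item) for item in raw.pop("links", [])]
        return cls(links=links, **raw)


# Values below are taken from the example in the text; the link direction
# (main specimen hasHolotype the MNHN specimen) is an assumption.
example = DigitalSpecimen(
    nsid="20.5000.1025/486a7e883f14f88bba37",
    physical_specimen_id="BMNH:2006.12.6.40-41",
    institution="Natural History Museum, London",
    links=[Link(relation="hasHolotype", target="MNHN JNC 1848-D1")],
)

if __name__ == "__main__":
    print(example.to_json())
```

Keeping the links as a list of (relation, target) pairs means new kinds of related data, such as literature, sequences or images, can be attached without changing the shape of the record. A real implementation would follow the DiSSCo data model rather than these illustrative fields.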
Anchoring and extending what is accessible about a specimen
Digital Specimens are the transformative mechanism that links physical specimens with other artifacts and data about them. They are the means by which information about specimens can be found, processed and used, by which specimens can be unambiguously attributed, and by which usage and discoveries associated with specimens can be tracked, for example to repatriate the benefits of specific discoveries to the specimen’s country of origin. Digital Specimens implement the ‘extended specimen’ concept (Webster et al., 2017; Lendemer et al., 2019) and provide the means by which ‘Next Generation Collections’ (Schindel and Cook, 2018) can be managed.
Implicitly findable, accessible, interoperable and reusable (FAIR)
More than just digital representations, Digital Specimens are a specific kind of digital object that can lead to new working practices and a digital transformation in collections-based science. Philosophically, digital objects represent a new category of industrial object sitting alongside natural objects (plants, animals, fossils, rocks, minerals, etc.) and tools (hammer, wrench, drill) (Kallinikos et al., 2010).
In an article in the Data Intelligence Special Issue on Emergent FAIR Practices, Larry Lannom (CNRI), Alex Hardisty (Cardiff University) and Dimitris Koureas (Naturalis Biodiversity Center) explain (Lannom et al., 2020) how Digital Specimens aggregate widely distributed and heterogeneous data about biological and geological specimens. They describe how Digital Specimens, together with the Digital Object Architecture (DOA) data model and components, offer an approach to making adherence to the FAIR principles (findable, accessible, interoperable, reusable) an integral characteristic of data for the biodiversity and geodiversity sciences.
On extended specimen and next generation collection concepts:
- Lendemer et al., 2019. doi: 10.1093/biosci/biz140.
- Schindel and Cook, 2018. doi: 10.1371/journal.pbio.2006125.
- Webster et al., 2017. isbn: 978-1-4987-2915-4.
On Digital Object Architecture (DOA), digital objects, FAIR and FAIR Digital Objects (FDO):
- Kahn and Wilensky, 2006. doi: 10.1007/s00799-005-0128-x.
- Kallinikos et al., 2010. url: http://ear.accc.uic.edu/ojs/index.h/fm/article/view/3033/2564.
- Wittenburg et al., 2019. doi: 10.23728/b2share.b605d85809ca45679b110719b6c6cb11.
- Wittenburg, 2019. doi: 10.1162/dint_a_00004.
- Mons et al., 2017. doi: 10.3233/ISU-170824.
And their use by DiSSCo: