Roundup of technical updates

Dear Readers,

In this post, we share some of our recent updates. We have been super busy with various technical developments. Here are a few highlights:

Upcoming conference presentations 📺

We will be at Biodiversity Information Standards (TDWG) 2022! This year the annual conference is a hybrid meeting, hosted in Sofia, Bulgaria.

https://doi.org/10.3897/biss.6.90987 “Human and Machine Working Together towards High Quality Specimen Data: Annotation and Curation of the Digital Specimen”. We are excited to share our proof of concept for a community annotation and curation service with a focus on data quality checks. This talk is part of SYM05: “Standardizing Biodiversity Data Quality”. SYM05 will start Tuesday at 09:00 EEST.
https://doi.org/10.3897/biss.6.91168 “Zen and the Art of Persistent Identifier Service Development for Digital Specimen”. No, we won’t be talking about Zen philosophy here. The title is a nod to the 1974 book Zen and the Art of Motorcycle Maintenance. This talk (part of the LTD14 session entitled “Ensuring FAIR Principles and Open Science through Integration of Biodiversity Data”) will feature our current local Handle server setup and our work on persistent identifiers for Digital Specimens and related metadata. LTD14 will be held on Tuesday, 18 Oct, between 14:00-16:00 EEST.
https://doi.org/10.3897/biss.6.91428 “Connecting the Dots: Joint development of best practices between infrastructures in support of bidirectional data linking”. This talk will focus on our work in the BiCIKL project, highlighting best practices for reliably linking specimen collection data with other data classes. All three talks above have FAIR Digital Objects as their key focus. More on that below. Connecting the Dots is part of SYM09: “A Global Collections Network: building capacity and developing community” (Thu, Oct 20, 09:00-10:30 EEST).
https://doi.org/10.3897/biss.6.94350 “DiSSCo Flanders: A regional natural science collections management infrastructure in an international context” (part of SYM08: Monday 17 Oct 11:30-12:30 EEST) and https://doi.org/10.3897/biss.6.91391 “DiSSCo UK: A new partnership to unlock the potential of 137 million UK-based specimens” (part of SYM03: Thu 20 Oct 14:00-16:00 EEST). Both talks will highlight national-level initiatives that are doing some amazing work on scaling up digitisation, mobilising data, and implementing the FAIR principles.

After TDWG, we are back in Leiden for the 1st International Conference on FAIR Digital Objects (26-28 Oct 2022 at the Naturalis Biodiversity Center). The following presentation will focus on DiSSCo and related collaborations such as the Biodiversity Digital Twin.

https://doi.org/10.3897/rio.8.e93816 “From data pipelines to FAIR data infrastructures: A vision for the new horizons of bio- and geodiversity data for scientific research”. In this presentation, we will touch upon how we can move from various data pipelines and data aggregations to the next step of machine actionability: a FAIR (Findable, Accessible, Interoperable, and Reusable) and Fully AI Ready data infrastructure that can support pressing research questions. Check out the full program for exciting keynotes and panels.

This is Bob Kahn. He's from Brooklyn and has a degree in electrical engineering. Oh, yes: he also CO-INVENTED THE INTERNET. And he's one of the keynote speakers at #FDO2022. You do need to join us! https://t.co/ZYU1O7sErz #FAIR #FAIRdata #OpenScience #DataScience #OpenSource pic.twitter.com/M6buiyh839
— FAIRDOForum (@FAIRDOForum) October 7, 2022

openDS data modeling work 🏃

Within the openDS working group, we are focusing on what the different digital objects (Digital Specimen, Annotation, and Media objects) and their relationships should look like. All of these are FAIR Digital Objects with their own persistent identifiers (Handle), PID Kernel, and structured serialisation (JSON and JSON-LD). To ensure that existing efforts can be reused, we are paying close attention to the Darwin Core and ABCD elements already in use and to the ongoing work on MIDS. We also had several conversations that provided feedback on the new GBIF data model, and we are exploring various pilots to see how DiSSCo and GBIF can help each other.
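
To make the idea of linked digital objects a little more concrete, here is a minimal Python sketch of a Digital Specimen and an Annotation that points to it, each serialised as JSON-LD. It is an illustration only, not the openDS data model: the `DigitalObject` class, the placeholder Handle prefix `00.0000`, and the choice of Darwin Core and W3C Web Annotation terms for the attributes are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, field

# Hypothetical JSON-LD context for this sketch. The two namespaces are the
# published Darwin Core and W3C Web Annotation vocabularies; using them here
# is an assumption, not a statement about the openDS specification.
CONTEXT = {
    "dwc": "http://rs.tdwg.org/dwc/terms/",
    "oa": "http://www.w3.org/ns/oa#",
}


@dataclass
class DigitalObject:
    """Hypothetical stand-in for a FAIR Digital Object."""

    pid: str          # persistent identifier (a Handle); placeholder values below
    object_type: str  # e.g. "DigitalSpecimen", "Annotation", "MediaObject"
    attributes: dict = field(default_factory=dict)

    @property
    def iri(self) -> str:
        # Express the Handle as a resolvable IRI via the global Handle proxy.
        return f"https://hdl.handle.net/{self.pid}"

    def to_jsonld(self) -> dict:
        doc = {"@context": CONTEXT, "@id": self.iri, "@type": self.object_type}
        doc.update(self.attributes)
        return doc


# Placeholder identifiers and values, invented for illustration.
specimen = DigitalObject(
    pid="00.0000/specimen-001",
    object_type="DigitalSpecimen",
    attributes={"dwc:scientificName": "Panthera leo"},
)

# The annotation refers to the specimen by its identifier, not by copying it.
annotation = DigitalObject(
    pid="00.0000/annotation-001",
    object_type="Annotation",
    attributes={
        "oa:hasTarget": {"@id": specimen.iri},
        "oa:bodyValue": "Please check the spelling of the collector name.",
    },
)

print(json.dumps(annotation.to_jsonld(), indent=2))
```

The point of the sketch is the relationship: each object carries its own identifier, and the annotation links to the specimen through that identifier, so either object can be resolved and updated independently.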

The data modelling work is a community effort, and reaching a consensus often takes time. So we are taking a two-pronged approach. First, we are continuing our regular working group meetings within the DiSSCo Prepare project and keeping outside stakeholders informed. We need this focused effort to fine-tune the details. Second, as the data model evolves, we are taking an agile, DevOps approach to testing and deploying the infrastructure. Over the past few months, we have deployed a robust test implementation of the DiSSCo Digital Specimen architecture, following modern data and software architecture principles with FAIR and FAIR Digital Objects in mind. We will share some of these developments during TDWG and in this space as well.

CMS Roundtable 🦜

On Oct 10, we organised a virtual roundtable, inviting several Collection Management System (CMS) vendors and developers to think about how local data systems can interact with, integrate with, or make use of the envisioned DiSSCo infrastructure. For technical background, please check out DiSSCo Prepare report D6.1 (“Harmonization and migration plan for the integration of CMSs into the coherent DiSSCo Research Infrastructure”), where we discussed API integration and Event Driven Design. The report summarises a previous workshop we held on Event Storming. The CMS roundtable is a follow-up to that workshop, intended to gather more feedback from the community. We had a lively discussion about future possibilities and challenges. A report and future undertakings based on this roundtable will be available soon.
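
For readers new to Event Driven Design, the following Python sketch shows the basic pattern in miniature: a local system publishes an event when a record changes, and any interested component subscribes and reacts without the publisher knowing about it. This is not taken from report D6.1 or from the DiSSCo infrastructure; the `EventBus` class, the `SpecimenUpdated` event, and the catalogue number are hypothetical names chosen for illustration, and a real deployment would more likely use a message broker than an in-process bus.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SpecimenUpdated:
    """Hypothetical event emitted by a CMS when a specimen record changes."""

    catalog_number: str
    changed_fields: tuple


class EventBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = {}

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event: object) -> None:
        # Deliver the event to every handler registered for its type.
        for handler in self._handlers.get(type(event), []):
            handler(event)


# A hypothetical subscriber: collect records that need to be synchronised
# with a downstream infrastructure.
sync_queue: list[str] = []


def queue_for_sync(event: SpecimenUpdated) -> None:
    sync_queue.append(event.catalog_number)


bus = EventBus()
bus.subscribe(SpecimenUpdated, queue_for_sync)

# The CMS side only publishes; it does not call the subscriber directly.
bus.publish(SpecimenUpdated("EXAMPLE-0001", ("scientificName",)))

print(sync_queue)
```

The appeal of this design for CMS integration is the loose coupling: new consumers can be added by subscribing to existing events, without changing the system that emits them.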

Right in the middle of our discussion during @DiSSCoPrepare's Roundtable 2 virtual meeting. Our focus is in how to connect #CMS with DiSSCo's #digitalspecimen infrastructure. pic.twitter.com/aahS2Y7eIQ
— DiSSCo (@DiSSCoEU) October 10, 2022

Geo-diversity data 🌋

DiSSCo will be working with both bio- and geodiversity data. Several European and global efforts (such as GeoCASe and Mindat) already use existing data pipelines from museums, via BioCASe and the ABCD Extension for Geosciences, to mobilise data on minerals, rocks, meteorites and fossils. We are working closely with several different stakeholders (software developers, data managers, and collection managers) to understand current strengths and gaps. More on this soon.

Published by Sharif Islam
