Why it matters more in the age of AI and Machine Learning
One of the key concepts outlined in the 2016 FAIR Principles paper is “machine-actionability”:
…the idea of being machine-actionable applies in two contexts—first, when referring to the contextual metadata surrounding a digital object (‘what is it?’), and second, when referring to the content of the digital object itself (‘how do I process it/integrate it?’). Either, or both of these may be machine-actionable, and each forms its own continuum of actionability.
Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
Since then, different interpretations and applications of machine-actionability have appeared. In recent years, with the growth of data repositories and data silos, new architecture principles (such as the Lakehouse and Data Mesh), and critiques of large-scale AI models, the continuum of machine-actionability keeps expanding. Yet even at larger scale and higher volume, the questions remain the same. We still need to know “what is it?” and “how do I process it/integrate it?”. We still need to understand and process each data element (each distinct digital object) with “minimum viable metadata” and perform operations on it, whether that is an image recognition program distinguishing a husky from a wolf or one diagnosing cancerous cells. The attributes and context of the individual artefacts matter. As we expand the scale and usage of AI and Machine Learning, this matters even more.
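As a loose illustration of the two questions, consider what “minimum viable metadata” might look like in practice. The sketch below is hypothetical Python, not a published FAIR profile; the identifiers, field names, and URLs are placeholders invented for this example:

```python
# A hypothetical "minimum viable metadata" record for a digital object.
# Field names and URLs are illustrative, not a published FAIR profile.
image_object = {
    # Contextual metadata: "what is it?"
    "id": "https://example.org/objects/husky-042",  # placeholder persistent identifier
    "type": "Image",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    # Content metadata: "how do I process it/integrate it?"
    "media_type": "image/jpeg",
    "access_url": "https://example.org/objects/husky-042/content",
}

def dispatch(obj):
    """Route a digital object to a processing step based on its metadata alone."""
    if obj["type"] == "Image" and obj["media_type"] == "image/jpeg":
        return f"send {obj['access_url']} to the image-classification pipeline"
    return "no known processing route; the metadata is insufficient"

print(dispatch(image_object))
```

The point of the sketch is that a machine can decide what to do with the object without a human inspecting it, but only if both kinds of metadata are present and precise.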
Furthermore, even though machine-actionability might imply minimal human intervention, the operations and results of these actions have real-world implications. Along with precise definitions and semantics, context and provenance will become more and more relevant. The husky-vs-wolf example is often used to illustrate bias in model training; the original research was designed to study how human users react to such errors and how to create confidence and trust in AI models. To move towards such trustworthy systems, we need to understand the implications and implementation of machine-actionability.
FAIR and FAIR Digital Objects can play a significant role in creating such confidence and trust, in particular when it comes to open science and data-intensive research. To begin with, precise definitions and formal semantics are essential. Alongside these, capturing context and provenance can tell us why, where, who and when. All of these are building blocks that can make data and information “Fully AI-Ready” (another interpretation of the FAIR acronym). This readiness needs a modular approach instead of a one-size-fits-all solution, while still providing an open and standard framework for better interoperability. A recent paper entitled “FAIR Digital Twins for Data-Intensive Research” by Schultes et al. proposes such a modular approach to building systems based on FAIR Digital Objects. DiSSCo is working on similar efforts, and some of these ideas will also be explored in the newly launched Biodiversity Digital Twin project.
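To make the modular idea concrete, here is a small hypothetical Python sketch of a digital object carrying a separable provenance block that answers why, where, who and when. The structure is loosely inspired by the questions above, not taken from the Schultes et al. paper or any FAIR Digital Object specification; all names and URLs are invented:

```python
# A hypothetical provenance block answering who, when, where and why.
# Loosely inspired by provenance questions in general; not a standard schema.
provenance = {
    "who":   "https://orcid.org/0000-0000-0000-0000",  # placeholder agent identifier
    "when":  "2022-05-01T12:00:00Z",                   # timestamp of the activity
    "where": "https://example.org/instruments/camera-7",
    "why":   "training-set curation for an image classifier",
}

# The object itself stays small; provenance is a modular block that can be
# attached, swapped, or extended without changing the rest of the record.
fair_digital_object = {
    "id": "https://example.org/objects/husky-042",
    "type": "https://example.org/types/Image",  # machine-resolvable type definition
    "provenance": provenance,
}

# A consumer (human or machine) can ask the trust-relevant questions directly.
for question in ("who", "when", "where", "why"):
    print(question, "->", fair_digital_object["provenance"][question])
```

Keeping provenance as its own module is one way to let communities adopt the building blocks they need without a one-size-fits-all schema, while the shared structure keeps the objects interoperable.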
We welcome you to engage with us and think through these topics at the first international conference on FAIR Digital Objects. The event takes place on 26-28 October 2022 in Leiden, the 2022 European City of Science. More information about the conference is here. Hope to see you in Leiden!