#1 - Be FAIR (Enough)
Originally published in 2016, the FAIR principles were conceived for research data management(1). At a general level, they provide the right paradigm for the development of SAs(2) because they directly address long-standing challenges. They overcome barriers to discovery and reuse by replacing imprecise metadata and fragmented practices with a structured approach that serves both humans and machines. By making SAs easier to locate and apply, FAIR guidelines reduce development costs and effort while enhancing the quality of semantic resources. They also bring consistency and clarity to metadata descriptions across platforms and communities, addressing the problem of scattered and inconsistently described artefacts. Beyond technical efficiency, FAIR principles strengthen scientific progress by enabling greater interoperability and data reuse across disciplines, aligning with initiatives such as the Research Data Alliance that emphasize barrier-free data sharing. In doing so, the FAIR principles help ensure that SAs remain useful, interoperable, and sustainable over time.
- Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for Scientific Data Management and Stewardship. Scientific Data, 3, 1–9. https://doi.org/10.1038/sdata.2016.18.
- Semantic artefacts (SAs) comprise ontologies, thesauri, metadata standards, and any resources that sit at the crossroads of these categories. The term Heritage Semantic Artefact (HSA) is used here to denote those SAs that are specifically developed for the Heritage domain.
Read on below for a deeper look at FAIR data and the tools that support it. RADIOSA encourages the adoption of FAIR (Enough) practices while also highlighting the limitations and potential pitfalls that arise when applying FAIR guidelines in the Heritage sector.
What Are FAIR Principles?
The FAIR guidelines or principles define four key characteristics for digital research data:
- Findability – Data should be easy to discover, with persistent identifiers and rich metadata.
- Accessibility – Data must be retrievable via standardized protocols and remain available over time.
- Interoperability – Data should use shared vocabularies and formats, enabling seamless integration.
- Reusability – Data must carry clear licenses, provenance, and metadata that support reuse.
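In practice, much of this reduces to a handful of metadata statements attached to the artefact itself. The following Turtle sketch illustrates the idea; the ontology URI, ORCID, and version values are invented placeholders, not a prescribed template:

```turtle
@prefix dct: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

<https://w3id.org/example/heritage-ontology>
    a owl:Ontology ;
    dct:title      "Example Heritage Ontology"@en ;                    # Findable: rich metadata
    dct:identifier "https://w3id.org/example/heritage-ontology" ;      # Findable: persistent ID
    dct:license    <https://creativecommons.org/licenses/by/4.0/> ;    # Reusable: clear licence
    dct:creator    <https://orcid.org/0000-0000-0000-0000> ;           # Reusable: provenance
    owl:versionIRI <https://w3id.org/example/heritage-ontology/1.0.0> . # versioned identifier
```

Even this minimal header already addresses the gaps most often observed in practice: a resolvable identifier, a licence, and attribution.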
Since the FAIR principles were introduced in 2016, the concept of FAIRness has grown beyond data management to encompass a wide range of research outputs—including software and tools. This expansion has driven the creation of new guidelines and the assessment of methods that enable FAIR compliance even before data is produced. In this context, the outcomes of the FAIRsFAIR project are highly relevant for developers of Heritage Semantic Artefacts (HSAs), as the project has made significant strides in establishing best practices for the development of SAs.
The H-SeTIS survey revealed that HSAs often fail to meet basic FAIR requirements(1). Dissemination is frequently limited to ad hoc methods, such as single RDF/Turtle files or static web pages, and minimal or missing metadata hinders both evaluation and reuse. Licensing practices are often inconsistent, further complicating accessibility and compliance. As a result, even when HSAs exist online, they are difficult to discover, assess, or reuse effectively.
- On the results of our survey and the description of H-SeTIS, see Scarpa and Valente (2024), “A Resource Hub for Interoperability and Data Integration in Heritage Research: The H-SeTIS Database”, Archeologia e Calcolatori 35, 543–62 (10.19282/ac.35.1.2024.32), and Scarpa, Valente, and Rossi (2025), “Modeling an Ontology for Heritage Science: Challenges and Key Strategies”, in Proceedings del XIV Convegno Annuale AIUCD2025 (10.6092/unibo/amsacta/8380).
Skeptics might argue that by privileging standardization, reuse, and discoverability, the FAIR principles risk flattening the richness, ambiguity, and difference in interpretations that characterize Heritage data. FAIR is inherently machine-oriented, which can conflict with the interpretive and narrative-driven nature of Heritage. Moreover, FAIR implementation is not cost-free: ensuring persistent identifiers, maintaining repositories, providing rich metadata, and securing long-term governance demand stable funding and technical expertise. In practice, maintaining persistent URIs and ensuring resource findability remain particularly challenging in the Heritage domain, as highlighted by the H-SeTIS results. FAIRness, however, is better understood as a spectrum rather than a binary condition, with different levels of compliance achievable depending on context, resources, and community priorities.
Assessing FAIRness
Regardless of whether a SA was created before or after the introduction of the FAIR guidelines, its level of compliance can be evaluated using a range of tools and initiatives that support FAIR assessment and implementation:
- FOOPS! (Ontology Pitfall Scanner for FAIR) – evaluates FAIR compliance of ontologies.
- O'FAIRe – an open-source FAIRness assessment method and tool for SAs.
- FAIR-IMPACT project – provides methodologies for integrating FAIR solutions into the European Open Science Cloud (EOSC) as well as implementation & adoption stories illustrating good practices to support the implementation of the FAIR principles.
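The kind of automated check these tools perform can be illustrated with a deliberately simplified sketch in Python. The facet-to-field mapping and all metadata values below are invented for illustration and do not reproduce the actual rubric of FOOPS!, O'FAIRe, or any other assessment tool:

```python
# Toy FAIRness checklist, loosely inspired by the checks that tools
# like FOOPS! and O'FAIRe automate. The mapping of FAIR facets to
# required metadata fields is an illustrative assumption only.

REQUIRED_METADATA = {
    "findable": ["identifier", "title"],            # persistent ID + rich metadata
    "accessible": ["access_url"],                   # retrievable via a standard protocol
    "interoperable": ["format"],                    # shared, open serialization
    "reusable": ["license", "creator", "version"],  # licence and provenance
}

def assess(metadata):
    """Return, for each FAIR facet, the list of missing or empty fields."""
    return {
        facet: [field for field in fields if not metadata.get(field)]
        for facet, fields in REQUIRED_METADATA.items()
    }

# A hypothetical artefact description (all URIs are placeholders).
example = {
    "identifier": "https://w3id.org/example/heritage-ontology",
    "title": "Example Heritage Ontology",
    "access_url": "https://w3id.org/example/heritage-ontology.ttl",
    "format": "text/turtle",
    "license": None,  # an unset licence is flagged under Reusability
    "creator": "RADIOSA",
    "version": "1.0.0",
}

report = assess(example)
```

For this example, the report flags a gap only under Reusability (the missing licence), mirroring the inconsistent licensing practices that the H-SeTIS survey found to be among the most common FAIR shortfalls.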
Key Recommendations
Use a FAIR-by-design methodology
A central FAIRsFAIR recommendation is to adopt FAIR‑by‑design methodologies when developing SAs(1). When building new HSAs, the best practice is to embed FAIR principles from the outset rather than retrofitting them later. One such methodology is Linked Open Terms (LOT)(2).
- Jonquet, C., & Grau, N. (2025). D4.2—FAIR Semantic Artefact Lifecycle from Engineering to Sharing. Zenodo. zenodo.org/records/14643279.
- https://lot.linkeddata.es/. See also Poveda-Villalón, M., Fernández-Izquierdo, A., Fernández-López, M., & García-Castro, R. (2022). LOT: An Industrial Oriented Ontology Engineering Framework. Engineering Applications of Artificial Intelligence, 111, 1–22. 10.1016/j.engappai.2022.104755
The LOT methodology proposes a lightweight, iterative workflow that includes Ontological Requirements Specification, Ontology Implementation, Ontology Publication, and Ontology Maintenance. Central to LOT is the reuse of existing terms from published vocabularies, consistent with Linked Data principles, and the publication of ontologies in both human- and machine-readable forms accessible via their URIs. Its iterative, sprint-based workflow integrates smoothly with agile development, enabling rapid progress and frequent releases. By addressing shortcomings in traditional methodologies—such as limited attention to publication and web-based reuse—LOT provides a modern framework that ensures ontologies are not only developed efficiently but also openly available and interoperable from the start.
In the FAIRsFAIR project, a FAIR‑by‑design approach based on LOT was developed: LOT4FAIR. LOT4FAIR is designed to make SAs FAIR by construction, to promote cross‑disciplinary semantic interoperability, and to enable the construction of FAIR Scientific Knowledge Graphs. Like LOT, it follows an iterative, sprint‑based workflow aligned with agile software development and emphasizes both reusing existing terms and publishing new ontologies according to Linked Data principles(1). LOT4FAIR additionally incorporates developer experience, best practices, guidelines, and existing ontology‑creation tools, identifying gaps in current guidance and proposing recommendations to ensure FAIRness at every stage of development.
- Linked Data outlines principles for publishing structured, machine-readable data on the Web in a way that enables it to be interconnected. If the data is compliant with open standards, then you have Linked Open Data (LOD).
Implement Identifiers Efficiently
A Globally Unique, Persistent, Resolvable Identifier (GUPRI) is an identifier that guarantees uniqueness (being distinct across all systems), persistence (remaining valid over time), and resolvability (being accessible via standard protocols such as HTTP). The use of GUPRIs is recommended for SAs, their contents (including terms, concepts, classes, and relations), as well as their versions. Nevertheless, as illustrated below, the adoption of GUPRIs is not without cost: for smaller heritage institutions, this practice can be administratively burdensome, often requiring the minting of hundreds of URIs for only marginal benefit.
The use of URI-based identifiers (e.g., in accordance with OBO Foundry conventions) and versioned URIs can be supported through services such as purl.org or w3id. These services enable the creation of persistent URIs that redirect to current resource locations, with w3id additionally offering more granular control over redirection, including the management of different serializations for versioning purposes(1).
- It is important to note that purl.org—currently maintained by the Internet Archive—experienced significant service disruptions in 2024 as a result of two major DDoS attacks (1, 2). During these incidents, a substantial number of SAs identified through purl.org became temporarily inaccessible.
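For w3id, redirection rules are contributed as Apache .htaccess files via pull request to the w3id.org GitHub repository. The sketch below shows what such a rule set might look like, with simple content negotiation between human-readable documentation and a Turtle serialization; the paths and target hosts are invented placeholders, not an actual w3id entry:

```apache
# Illustrative w3id-style redirect rules (all target URLs are placeholders).
RewriteEngine on

# Clients asking for Turtle get the machine-readable serialization,
# including version-specific requests such as /heritage-ontology/1.0.0.
RewriteCond %{HTTP_ACCEPT} text/turtle
RewriteRule ^heritage-ontology/([0-9.]+)$ https://example.org/ho/$1/heritage-ontology.ttl [R=303,L]
RewriteCond %{HTTP_ACCEPT} text/turtle
RewriteRule ^heritage-ontology/?$ https://example.org/ho/heritage-ontology.ttl [R=303,L]

# Everyone else is sent to the human-readable documentation.
RewriteRule ^heritage-ontology/?$ https://example.org/ho/index.html [R=303,L]
```

Because the persistent URI lives at w3id.org while the redirect targets can be updated at any time, the institution can move its hosting without breaking any published identifier.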
Digital Object Identifiers (DOIs) represent an additional option for ensuring persistence; however, unlike other solutions, they are not freely available. Whereas w3id relies on GitHub-based workflows, DOI and other PID systems require structured governance mechanisms. It must also be acknowledged that reliance on external PID providers entails certain risks, as policies, funding models, or even the providers themselves may change or disappear (see also below, subsection “Build TRUST to be FAIR”).
Overall, GUPRIs, PIDs, and URIs are fundamental for the creation, dissemination, and FAIRification of SAs. These identifiers are critical enablers of the FAIR principles—ensuring that SAs are Findable, Accessible, Interoperable, and Reusable. Nonetheless, identifiers alone are insufficient to guarantee FAIRness: while they support Findability and Accessibility, true Interoperability and Reusability depend on the availability of rich metadata, appropriate licensing, and adherence to community conventions.
"Be FAIR and CARE": The Case of Indigenous Data
Some scholars argue that the CARE Principles(1)—Collective benefit, Authority to control, Responsibility, and Ethics—offer a path "beyond FAIR" for Indigenous or Heritage contexts. Although a universally accepted definition of “Indigenous peoples”(2) remains elusive, they are generally understood to be communities that possess a long historical continuity with and a deep connection to the land and its natural features. Their socio‑economic and political structures, languages, and cultures are distinct from the majority society, and they often hold conservative cultural, social, and religious beliefs. A complementary objective for data stewards and other users of Indigenous data is to “Be FAIR and CARE.” This dual approach seeks to enhance machine actionability (FAIR) while simultaneously safeguarding the unique characteristics, purposes, and rights of specific peoples within data governance.
- Carroll, S. R., Garba, I., Figueroa-Rodríguez, O. L., Holbrook, J., Lovett, R., Materechera, S., Parsons, M., Raseroka, K., Rodriguez-Lonebear, D., Rowe, R., Sara, R., Walker, J. D., Anderson, J., & Hudson, M. (2020). The CARE Principles for Indigenous Data Governance. Data Science Journal, 19, 43. 10.5334/dsj-2020-043.
- See the UN definition of Indigenous People, the UNESCO definition, World Heritage Sustainable Development Policy, and the Indigenous and Tribal Peoples Convention of 1989.
The CARE Principles from the Global Indigenous Data Alliance (GIDA) introduce a people‑ and purpose‑oriented perspective that is essential for the development of HSAs, particularly when dealing with Indigenous data or Cultural Heritage. They address the tension that Indigenous communities experience between safeguarding their rights and interests in data—including traditional knowledge—and supporting open‑data initiatives. Consequently, applying the CARE Principles in addition to the FAIR Principles in HSA development means designing artefacts that do not exclude Indigenous knowledge systems, thereby moving beyond the worldviews of the societies that originally shaped the mainstream knowledge framework. This requires that the terms, concepts, relationships, and even constraints embedded within semantic artefacts be culturally appropriate and meaningful to the Indigenous communities they describe. Tools such as the Traditional Knowledge (TK) Labels demonstrate how extra‑legal digital mechanisms can restore Indigenous cultural authority and governance over Indigenous data and collections by enriching metadata with additional cultural context. This enhances provenance quality and promotes a more equitable approach to data reuse.
"Build TRUST to be FAIR": Hosting HSAs
The TRUST Principles for Digital Repositories—Transparency, Responsibility, User Focus, Sustainability, and Technology—outline the core, long‑term capabilities that trustworthy digital repositories (TDRs) must provide to enable the access, reuse, and preservation of data, including SAs(1). They complement the FAIR guidelines by underscoring that long‑term preservation is impossible unless a repository delivers these essential services. The FAIRsFAIR project recommends publishing SAs in repositories that are both FAIR and TRUST‑compliant(2), which can be identified through registries such as re3data and FAIRsharing. By the same token, it cautions against relying on code‑hosting platforms such as GitHub or GitLab for long‑term preservation; these services may be suitable only for short‑term archival.
- Lin, D., et al. (2020). The TRUST Principles for Digital Repositories. Scientific Data, 7, 144. https://doi.org/10.1038/s41597-020-0486-7.
- Best Practice 11 "Use TRUSTed and FAIR compliant repositories to persist Semantic Artefacts" in Hugo, W., Le Franc, Y., Coen, G., Parland-von Essen, J., & Bonino, L. (2020). D2.5 FAIR Semantics Recommendations Second Iteration (1.0). Zenodo. https://doi.org/10.5281/zenodo.5362010.
CC Signals: HSAs and AI
We are closely monitoring the CC Signals initiative(1), which aims to establish a standardized framework for articulating preferences regarding the reuse of works in AI training. The initiative, primarily targeted at content stewards responsible for large datasets, remains under development, with its launch anticipated in November 2025. While its implications for RADIOSA cannot yet be fully assessed, CC Signals may ultimately shape requirements concerning the attribution and treatment of SAs when subject to machine-mediated reuse.
- See the report From Human Content to Machine Data: Introducing CC Signals.