We’ve taken a pretty detailed tour of Arches’ capabilities to this point, especially with our recent foray into the building blocks of your Arches implementation. In our exploration, we’ve stressed the importance of building semantic metadata into your Arches implementation in order to build complete, accurate, and interoperable data.
At the heart of this effort lies the need for authoritative data definitions, terminologies, and ontologies. Arches is specifically designed to support interoperability using thesauri, implemented as concept hierarchies that follow an ISO international standard for defining the labels, definitions, and scope of concept-based controlled vocabularies. In practice, these concepts are the items that appear in the dropdown lists of your data entry forms.
The Reference Data Manager (RDM) is how Arches implements thesauri, allowing data modelers to create and manage the contents of their controlled vocabulary lists without hardcoding values or allowing free text input.
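To make the idea of a concept hierarchy concrete, here is a minimal Python sketch of how a small thesaurus could be flattened into the options shown in a dropdown list. The `Concept` class, the `dropdown_options` function, and the sample labels are all hypothetical and invented for illustration; this is not the RDM's data model or any Arches API.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Concept:
    """One thesaurus entry: a preferred label, an optional scope note,
    and any narrower (child) concepts.
    Hypothetical model for illustration, not the Arches schema."""
    pref_label: str
    scope_note: str = ""
    narrower: list["Concept"] = field(default_factory=list)


def dropdown_options(concept: Concept, depth: int = 0) -> Iterator[tuple[int, str]]:
    """Walk the hierarchy depth-first, yielding (depth, label) pairs.
    Siblings are sorted alphabetically by preferred label."""
    yield depth, concept.pref_label
    for child in sorted(concept.narrower, key=lambda c: c.pref_label):
        yield from dropdown_options(child, depth + 1)


# Illustrative vocabulary only; the labels are made up for this example.
building = Concept(
    "Building",
    scope_note="A roofed structure intended for permanent use.",
    narrower=[
        Concept("Religious building", narrower=[Concept("Chapel"), Concept("Abbey")]),
        Concept("Dwelling", narrower=[Concept("Farmhouse")]),
    ],
)

options = list(dropdown_options(building))

# Print the options indented by depth, as a form might render them.
for depth, label in options:
    print("  " * depth + label)
```

Because the choices come from the hierarchy instead of free text, every record that selects "Chapel" points at the same concept, and a form can still show where that concept sits under its broader terms.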
A Decade of Evolution
Longtime Arches users may remember that the RDM was introduced as part of Arches 3.0. It can handle both simple lists and complex hierarchical structures, providing context-sensitive options for users. It also supports importing and exporting SKOS files, enabling interoperability with external systems and standards. Over time, the RDM has simplified the management of intricate data structures, making them more accessible and user-friendly.
Fast forward nearly a decade, and Arches has evolved through four more versions, with version 8 set to launch this summer. Over these years, we’ve gathered invaluable feedback from the Arches community about the RDM, leading us to an exciting new approach for managing controlled vocabularies.
The RDM in Practice: Two Distinct User Groups
Users typically interact with the RDM in two primary ways:
- Defining Controlled Lists for Managing Resource Data – Most users rely on existing authoritative hierarchies and import thesauri from trusted sources (such as the Getty AAT and others) into the RDM to generate controlled lists for data entry.
- Creating Custom Thesauri – A smaller but significant group actively builds and maintains their own controlled vocabularies. For example, Historic England develops its own authoritative thesauri, which serve as the reference standard for its Historic Environment Records network.
The RDM was originally designed to accommodate both groups, but community feedback has revealed distinct functionality needs for each.
Introducing Lingo: A Purpose-Built Solution for Thesaurus Management
“This was the revelation: everybody uses the RDM but very few people actually need to manage thesauri,” said Rob Gaston, who is Farallon’s Project Leader for the development of Lingo, a business application that will replace the RDM for thesaurus managers.
Lingo will be a fully featured management tool for expert business users responsible for maintaining authoritatively defined, hierarchical concepts. The RDM stores data in a custom model that requires manual translation for semantic representation, whereas Lingo natively supports semantic data models. Ultimately, Lingo will make it easier for expert users to build vocabularies that can improve interoperability between datasets.
“Lingo uses Arches models, that’s the novel thing, and the big gain there is that now Lingo data is in and of itself semantic; whereas previously if you wanted any sort of semantic representation of a concept, you would have to figure out that translation yourself. Now, it’s built in… this represents a huge improvement for open-source thesaurus management.”
— Rob Gaston, Senior Farallon Developer
Introducing Arches Controlled Lists: A Streamlined Way to Manage Controlled Lists
The introduction of Lingo won’t just streamline thesaurus management for the small group of experts responsible for defining these relationships. Alongside Lingo, Farallon is actively developing Arches Controlled Lists, a feature that streamlines the creation of controlled lists and of references to pre-existing thesauri, so that data can be managed in line with international standards. Arches Controlled Lists will allow users to quickly integrate trusted vocabularies into their specific applications and start building datasets that take advantage of Lingo resources.
“Arches Controlled Lists is a flexible tool where if you need to get something running today, great, you’ve got a thesaurus up and running. But let’s say down the road, your needs change, your deployment of Arches is growing, you want to import metadata from someone else: there’s a pathway for that, it’s something that you can grow with.”
— Jacob Walls, Farallon Developer
The Future of Arches: A New Era of Data Access and Customization
Lingo and Arches Controlled Lists are the first step in the evolution of Arches through a new data access layer developed by Jacob Walls. The layer will be available to any implementation running Arches 8 and will bring an entirely new level of flexibility to Arches deployments.
With Arches 8, the Arches community will be able to customize how they access their data, deploy frontend experiences for end users, and develop business applications built to meet specific project needs.
As we look to the future, our goal is clear: to empower the Arches community with more intuitive, scalable, and interoperable tools—ensuring that no matter how their needs evolve, Arches evolves with them.