Celebrating the Getty Provenance Index And Glimpsing the Future of Arches

On May 2nd, the Farallon Geographics team proudly joined the J. Paul Getty Trust in celebrating the launch of the remodeled Getty Provenance Index (GPI). This project has been years in the making, and it represents a major step forward in how the art and cultural heritage world can explore the histories of artworks and their movements over time.

We’ve been excited to collaborate closely with Getty Digital on the technical side of this work. And while the Getty Provenance Index itself is a huge advancement in the world of provenance research, it’s also a major leap forward for Arches, the open-source platform that powers the system behind the scenes. So sit back and enjoy as we pop open the hood on some of the custom features built into this deployment of Arches. 

Behind the Scenes: Performance Challenges at Scale

Let’s step back for a moment. Provenance—tracking where an artwork has been and who has owned it—is complex. As Getty Research Institute Director Mary Miller explains, it helps us understand “the lives of objects and how objects have moved in this world.”

From a technical point of view, the Getty Provenance Index includes descriptions of over 12 million resources: distinct objects, activities (such as auctions and sales), auction lots, people, organizations, records of sale books, visual works, and places. In Arches, a “resource” refers to all database records related to a single one of the items listed above. This means that the GPI manages more than 140 million database records, all deeply connected. Each resource in the GPI is part of a network of people, places, transactions, and events, each with its own data and relationships that Arches needs to manage, search over, and retrieve. So when you load data into Arches, you’re not just loading a database; you’re building a web of semantically connected information.

As data began to be loaded into the GPI, traversing its data networks became a challenge due to the sheer size of the database. Initially, loading a single resource instance often meant waiting at least 30 seconds, if the report rendered at all. For a tool meant to empower researchers and curators, this was a critical bottleneck.

Anyone who has been reading the Farallon Blog or who has used Arches in the past is likely very familiar with these resource reports. For newcomers, let’s use an example. In a traditional relational database, generating a report requires calling a series of rows from the appropriate table(s) within a database schema. In Arches, a resource report is an in-depth representation of a resource instance and its network of relationships across multiple database schemas. Thus, the report for a single work of art would include not just all the information about the object itself, but also all its owners, each sale it participated in, all the documents associated with the object, visual depictions of the item, and all the places where the object has been located.

It is important to understand exactly how these resource reports function when a user calls the information. In Arches, a resource report typically renders all related data—including metadata, media, and interlinked records—in one go. This all-at-once approach is analytically rich but can be computationally expensive, especially with highly interconnected data like that in the GPI.

Traditionally, resource reports simply list data on a single page according to the hierarchy that administrators define in the graph designer. This load doesn’t cause performance issues in the pictured example, as the specified resource is relatively light on associated data. But things change when resources have rich and deep relationships.

As Farallon’s Director of Web Development, Alexei Peters, puts it:  “Reports gathered all of the data all at once in order to render the page, which leads to a poor user experience for Provenance users—just spinning, spinning, spinning.” 

Performance Enhancers: Custom Editable Reports

To solve the performance issue, the Farallon and Getty Digital teams reimagined how Arches loads and displays data. The result is Custom Editable Reports—a flexible, modular reporting system that dramatically boosts performance while giving administrators greater control over the presentation of information.

Instead of loading every piece of data before rendering, custom reports load content in sections and only when needed, which reduces the system load. For example, the Related Resources section (which shows how a resource connects to other resources) normally appears at the bottom of the resource report. Within the GPI, however, a dedicated tab within the report enables users to visualize the relationships between resources in the system. This separation is particularly important for handling the Provenance dataset, which is characterized by deeply nested resource models and a high degree of interconnectivity between instances.

Whereas load time previously grew with the amount of data a resource instance contained, this design isolates the heavy lifting of rendering complex relationships, improving load times without sacrificing detail. By decoupling performance from data volume, Arches now delivers consistent speed, whether a resource has a little data or a lot. It’s a fundamental architectural improvement that scales gracefully with even the most data-rich implementations.
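To make the idea of loading sections on demand concrete, here is a minimal Python sketch. It is not the Arches or GPI implementation: the class names, the section names, and the loader functions are all hypothetical, and the real system does this work in its web front end. The sketch only illustrates the principle that a section's data is fetched the first time a user opens it and is reused afterwards.

```python
from typing import Any, Callable, Dict, List


class ReportSection:
    """One tab or section of a report; its data is fetched on first access."""

    def __init__(self, name: str, loader: Callable[[], Any]):
        self.name = name
        self._loader = loader      # hypothetical fetch function
        self._loaded = False
        self._value: Any = None

    def data(self) -> Any:
        # Run the loader once, then serve the cached result.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


class LazyReport:
    """A report that knows its sections but loads none of them up front."""

    def __init__(self, sections: List[ReportSection]):
        self._sections: Dict[str, ReportSection] = {s.name: s for s in sections}

    def open_section(self, name: str) -> Any:
        return self._sections[name].data()


calls: List[str] = []  # records which loaders actually ran


def load_descriptive_data() -> dict:
    calls.append("data")
    return {"title": "Example artwork"}


def load_related_resources() -> list:
    calls.append("related")  # stands in for the expensive relationship query
    return ["owner-1", "sale-7", "place-3"]


report = LazyReport([
    ReportSection("Data", load_descriptive_data),
    ReportSection("Related Resources", load_related_resources),
])

report.open_section("Data")
report.open_section("Data")  # second visit is served from the cache
print(calls)  # ['data']
```

In this sketch the expensive related-resources loader never runs unless someone opens that tab, so the cost of the first page view no longer depends on how many relationships the resource has.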

Plus, this application design comes with a bonus.

A New Level of Flexibility


Alexei and the Farallon team not only turbocharged the performance of Arches’ reporting module; they also developed a way to configure the contents and layout of a resource report, which makes it much easier to deploy fast, customizable instance reports and to start editing data directly from a report. The modularity of these custom editable reports isn’t just linked to related resources; rather, Getty Digital administrators now have fine-grained control over how all information for a resource is displayed. Reports are now fully configurable using JSON within the Django admin panel, allowing users to define which data appears, how it’s grouped, and what components (like tables or charts) are used to display it.
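As a rough illustration of what a JSON-driven report configuration could look like, the Python sketch below parses a small config and resolves one section into the data it would display. The schema is an assumption made for this example: the keys ("tabs", "sections", "component", "cards"), the card names, and the helper functions are all hypothetical and are not taken from the GPI's actual configuration format.

```python
import json

# Hypothetical card tree for one resource model; names are illustrative only.
CARD_TREE = {
    "Identification": {
        "nodes": ["title", "object_type"],
        "children": {
            "Dimensions": {"nodes": ["height", "width"], "children": {}},
        },
    },
    "Ownership": {"nodes": ["owner", "acquisition_date"], "children": {}},
}

# Hypothetical report config, as an administrator might enter it as JSON.
REPORT_CONFIG_JSON = """
{
  "tabs": [
    {"name": "Data",
     "sections": [
       {"title": "At a glance", "component": "table",
        "cards": ["Dimensions", "Ownership"]}
     ]},
    {"name": "Related Resources", "sections": []}
  ]
}
"""


def find_card(tree, name):
    """Depth-first search for a card by name, at any nesting level."""
    for card_name, card in tree.items():
        if card_name == name:
            return card
        found = find_card(card["children"], name)
        if found is not None:
            return found
    return None


def resolve_section(section, tree):
    """Collect the nodes of every card a section names, in config order."""
    nodes = []
    for card_name in section["cards"]:
        card = find_card(tree, card_name)
        if card is None:
            raise KeyError(f"unknown card: {card_name}")
        nodes.extend(card["nodes"])
    return {
        "title": section["title"],
        "component": section["component"],
        "nodes": nodes,
    }


config = json.loads(REPORT_CONFIG_JSON)
at_a_glance = resolve_section(config["tabs"][0]["sections"][0], CARD_TREE)
print(at_a_glance["nodes"])  # ['height', 'width', 'owner', 'acquisition_date']
```

Note that the "At a glance" section pulls a nested card (Dimensions) and a top-level card (Ownership) into one group that does not exist anywhere in the card hierarchy itself; the grouping lives only in the config.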

The Getty Provenance Index gives administrators a greater level of control over how information is grouped and presented. Notice the separation between the resource’s Data and Related Resources tabs—this separation of large datasets helps reduce performance load.

Built with Vue.js, an open-source JavaScript framework, these custom components make it easy to craft tailored, insightful reports. With Vue, it’s now possible for Getty Digital to elevate any card, group unrelated nodes, and create entirely new analytical views, all through the configurable report system. Additionally, any components created in the future can be easily added to any of the configurations without the need to change the underlying code.

“The way things used to be, when it was hierarchical, a resource report wouldn’t give weight to one card over another, and cards are nested within each other top to bottom. Here Getty Digital can take any card—it could be deep in the nesting, it could be at the top, it doesn’t matter—and aggregate these cards together to create something new; something that isn’t found within the graph model, it’s not found in the arches system, it’s created by the customizable config.” – Alexei

Within the Django Administration side of things, users have free rein to create the Report config, whether that is from scratch or via predefined templates, and every model can have its own configuration. So as new models are created, report templates are initially generated with intelligent defaults, giving admins a head start.
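The post does not describe how the default templates are derived, so the following Python sketch shows only one plausible approach under stated assumptions: give each top-level card of a model its own section, in graph order, and reserve a separate tab for related resources. The function name, the config keys, and the choice of "table" as the default component are hypothetical.

```python
import json


def default_report_config(card_tree):
    """Build a starter config: one section per top-level card, in graph order."""
    sections = [
        {"title": name, "component": "table", "cards": [name]}
        for name in card_tree
    ]
    return {
        "tabs": [
            {"name": "Data", "sections": sections},
            {"name": "Related Resources", "sections": []},
        ]
    }


# Hypothetical top-level cards of a newly created resource model.
model_cards = {"Identification": {}, "Ownership": {}}

config = default_report_config(model_cards)
print(json.dumps(config, indent=2))
```

An administrator could then edit the generated JSON, for example by merging sections or swapping a component, rather than writing a configuration from a blank page.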

The Bigger Picture: What the Getty Provenance Index Means For the Arches Community

We expect that custom editable reports will catch the interest of a great many users, admins, and organizations that have implemented Arches. Just to get it out of the way: this powerful new functionality isn’t coming with Arches version 8, which is slated for release in the early summer. But that doesn’t mean this implementation of custom resource reports is just a one-off.

While this feature set was originally developed for the Getty Provenance Index, it won’t remain exclusive for long. The Getty Conservation Institute is collaborating with Getty Digital and Farallon Geographics to package this functionality as a reusable Arches application, expected to be released later this year.

Once released, Arches users around the world will be able to take advantage of these improvements—enhancing performance, customizing user experience, and unlocking new ways to present and interpret data.

A Glimpse Into the Future of Arches

The launch of the remodeled Getty Provenance Index is more than just a celebration of past work—it’s a look ahead at the future of Arches. With faster load times, customizable layouts, and an architecture built for scale, Arches is now more capable than ever of supporting complex, interconnected datasets.

Custom Editable Reports stand as a testament to what’s possible when innovative institutions and dedicated developers collaborate to solve real-world problems. And for the global Arches community, this is just the beginning.

At Farallon, we’re excited about where Arches is headed—and proud to have contributed to something that’s already making a difference.

Here’s to better tools, stronger stewardship, and continuing to build together.
