Turn hidden gems into strategic assets: how AI unlocks digital collections

Vienna University Library. Shutterstock.com/Gabor Kovacs Photography

New Clarivate whitepaper highlights Vienna University Library’s approach to metadata challenges

Across Europe, institutions have invested heavily in digitising their collections. Libraries are now using academic AI to enhance those collections in support of research, teaching and institutional identity. The latest Clarivate whitepaper, Turning Hidden Gems into Strategic Assets, highlights Vienna University Library’s approach to metadata challenges and shows how the library team simplifies processes from ingestion and cataloguing to exhibition building, making curation and outreach more efficient.

The ambition across the library community is clear: to open extraordinary collections to researchers who could never visit in person, and to preserve materials too fragile to handle or in danger of destruction. Online use of digitised collections, like those in The National Archives in the UK, increased exponentially during the pandemic and has remained high, with a marked rise in international access.

The potential locked inside digital collections is significant. At the University of Cambridge, the Casebooks Project spent a decade transforming 80,000 medical records, each around 400 years old, into a fully searchable portfolio. Once made discoverable, it became the foundation for new research, public exhibitions, and even a video game.

Across the sector, libraries that have invested in making collections discoverable have seen the same pattern: research opportunities, new audiences, and institutional recognition.

Digitisation is only half the challenge

But many institutions are now confronting an uncomfortable truth: digitising a collection does not automatically make it discoverable. Without good metadata, a digital file is simply an inaccessible file in a different format. 

Research libraries have long had a name for this problem: hidden collections. These collections are undescribed or under-described, making them undiscoverable. Even with huge progress in digitisation, there is a long way to go to deliver true discoverability; the collections exist, but the researchers who need them often do not find them.

Digital collections are no longer just about preservation. They are central to an institution’s mission, supporting scholarship, teaching, and community engagement. Yet as digital collections expand in both scale and complexity, traditional workflows often struggle to keep pace, and metadata quality is where they most frequently fall short.

The challenge is partly human capacity. Cataloguing, metadata enrichment, and exhibition building require significant expertise and time, resources that are increasingly stretched*. As Wolfgang Mayer, Head of Content and Data at Vienna University Library, explains: “It’s not only about metadata creation, but also metadata correction. There were thousands and thousands of hours spent by cataloguers during the last decades to either correct false data or to adopt rules that have changed.” 

Successfully integrating AI tools into library workflows also requires investing in staff training and change management. Librarians and metadata experts need access to professional development opportunities to build the skills required to work effectively with AI. Institutions must also foster a culture of collaboration between technology providers and library staff, ensuring that AI implementation is accompanied by clear communication, ongoing support, and a shared strategic vision.

The hardest collections to describe

The metadata challenge is most acute for non-text materials. Photographs, maps, drawings, and archival documents make up some of the most distinctive and valuable parts of academic special collections, but they are the hardest to describe well. As Mayer explains: “While cataloguers and metadata experts can skilfully describe books and scholarly articles without deep subject expertise, non-text resources — especially in the arts and humanities — demand intimate knowledge of the material. Employing enough librarians to meet this need is simply not feasible.”

The consequences are measurable. A study of a collection of over 600,000 historical postcards* found that manually classifying even one characteristic would have taken thousands of hours. By employing machine learning, the task was completed in 4.28 hours.
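
To give a sense of the scale involved, the short calculation below estimates the manual effort. It is a back-of-envelope sketch only: the figure of 30 seconds per postcard is an assumption made for illustration, not a number taken from the study, and the collection size is rounded down to 600,000.

```python
# Back-of-envelope estimate of manual classification effort.
# SECONDS_PER_ITEM is an illustrative assumption, not a figure from the study.
POSTCARDS = 600_000        # "over 600,000" in the study, rounded down here
SECONDS_PER_ITEM = 30      # assumed time to classify one characteristic by hand
ML_HOURS = 4.28            # machine learning run time reported in the article

manual_hours = POSTCARDS * SECONDS_PER_ITEM / 3600
speedup = manual_hours / ML_HOURS

print(f"Estimated manual effort: {manual_hours:,.0f} hours")
print(f"Approximate speed-up: {speedup:,.0f}x")
```

Under that assumption, the manual task would take about 5,000 hours, which is consistent with the "thousands of hours" reported. A different per-item time changes the total proportionally.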

Balancing technology and library expertise

The scale of the challenge has made one thing clear: technology must be part of the answer. The Association of Research Libraries 2024 survey of research library leaders* found that automated cataloguing and metadata generation were top priorities for using AI. The association has also published guiding principles for AI, positioning libraries as active partners in shaping how the technology is used.

The question is not simply whether to use AI – it is how. For managing special collections, the risk of fully automated approaches to metadata creation is well understood*: AI can generate descriptions at scale, but without professional oversight, accuracy and contextual nuance suffer. The approach that is gaining traction combines AI’s capacity for scale with the irreplaceable expertise of libraries*.

However, it is important to acknowledge the risks of algorithmic bias in AI-generated metadata. AI systems may perpetuate or even amplify historical biases present in training datasets, leading to metadata that marginalises certain collections or misrepresents cultural heritage materials. To combat this, libraries must actively evaluate and align AI tools for inclusivity and ensure that diverse perspectives are incorporated into metadata strategies.

Shaping the future of academic libraries with AI

Alma Specto, a new platform designed to address the unique challenges of managing digital special collections, was developed by Ex Libris (part of Clarivate) in close partnership with leading academic libraries, including Vienna University Library and Bocconi University Library. It uses AI to automate and enhance metadata creation, making collections more discoverable, accessible, and valuable to researchers, students, and the broader academic community. The platform covers the full lifecycle from ingestion and cataloguing through to exhibition building and public showcase, integrating with existing discovery systems and linked open data to expand reach and offer tailored access for different audiences.

Importantly, Alma Specto keeps libraries in control. Academic AI acts as a collaborative assistant, generating metadata suggestions while professionals retain oversight, validation, and strategic direction. This ensures that automation enhances, rather than replaces, the expertise of library staff. 
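
The human-in-the-loop pattern described above can be sketched in a few lines. This is a generic illustration, not Alma Specto's data model or API: the `Suggestion` class, the `review` and `publishable` functions, and the sample records are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Suggestion:
    """One AI-proposed metadata value awaiting a librarian's decision (hypothetical)."""
    record_id: str
    field_name: str
    value: str                       # text proposed by an assumed AI model
    status: str = "pending"          # "pending", "accepted" or "rejected"
    reviewer: Optional[str] = None


def review(s: Suggestion, reviewer: str, accept: bool,
           corrected_value: Optional[str] = None) -> Suggestion:
    """Record a librarian's decision; a correction overrides the AI text."""
    if corrected_value is not None:
        s.value = corrected_value
    s.status = "accepted" if accept else "rejected"
    s.reviewer = reviewer
    return s


def publishable(suggestions: List[Suggestion]) -> List[Suggestion]:
    """Only suggestions a person has accepted are released to the catalogue."""
    return [s for s in suggestions if s.status == "accepted"]


# Illustrative sample data, invented for this sketch.
queue = [
    Suggestion("img-001", "subject", "Street scene"),
    Suggestion("img-002", "subject", "Portrait"),
    Suggestion("img-003", "date", "circa 1900"),
]
review(queue[0], "cataloguer_a", accept=True)
review(queue[1], "cataloguer_a", accept=True, corrected_value="Portrait (corrected)")
# queue[2] is left unreviewed, so it stays pending and is not published.

published = publishable(queue)
print([s.record_id for s in published])
```

The design choice worth noting is that nothing is accepted automatically: the AI only fills the queue, and every published value carries the name of the person who approved or corrected it.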

As libraries embrace AI-driven solutions, it is critical to ensure that these solutions align with institutional values and ethical standards. This includes prioritising transparency in how AI tools generate metadata, addressing potential biases, and fostering inclusivity in metadata practices. By embedding these principles into their AI strategies, libraries can position themselves as leaders in the ethical use of technology while advancing their mission to serve diverse communities.

Real-world impact

As one academic librarian involved in developing the platform put it, “the application of AI to entity recognition and enrichment of bibliographic records gave us a glimpse of the possibility of making the cataloguing process smoother, faster and more complete.” At Vienna University Library, AI-powered metadata enrichment has improved efficiency, especially for image-heavy collections. Wolfgang Mayer explains: “The exhibition tool of Alma Specto provides an easy, smooth way — even for non-experts — to exhibit collections for special exhibitions, presentations of new collections or museum acquisitions. It’s an interesting way to say, ‘This is the thing we need some funding to create,’ and it’s easy to understand for the non-academic public, as well.”

As digital collections become vital for research, teaching, and institutional reputation, tools like Alma Specto serve as a key enabler, empowering libraries to connect with their communities, support academic missions, and fulfil a broader social purpose: expanding access to cultural and academic heritage. As one librarian involved in the platform’s development put it: “Digitising and promoting our heritage is a matter of reputation. [It is] an important way to strengthen the ties with our community and with prospective students and alumni. Making once exclusive knowledge heritage available to all means contributing to this right.”

While the immediate benefits of AI-powered platforms, such as improved efficiency and enhanced discoverability, are clear, the long-term impact of these technologies on research outcomes, user engagement, and access equity warrants careful evaluation. Are these tools enabling researchers to uncover new knowledge that was previously hidden? Are they helping to elevate underrepresented collections or voices? Libraries adopting AI solutions should consider developing metrics to measure these outcomes over time, ensuring that the technology delivers sustainable value for their communities.

Download the Clarivate whitepaper Turning Hidden Gems into Strategic Assets to tackle the hidden collections challenge. To find out how Alma Specto can support your academic community, get in touch to book a demo.

The following articles are cited above:

hangingtogether.org/filling-the-bench/
www.sciencedirect.com/science/article/abs/pii/S0099133323000757
journals.cilip.org.uk/catalogue-and-index/article/view/658
hangingtogether.org/striking-the-right-balance-opportunities-and-challenges-of-ai-in-metadata-workflows/
www.arl.org/resources/evolving-ai-strategies-in-libraries-insights-from-two-polls-of-arl-member-representatives-over-nine-months/

