Patron-driven library

Finbar Galligan imagines a future library where patrons drive all collection development

In the world of academic e-books, “patron-driven” is a particularly common buzz-term. It symbolises a fundamental shift in the way a library collection is developed, by transferring at least some of the responsibility for new content away from the collections librarian to the researcher.

Where content isn’t provided in advance, search and discovery becomes the most important part of the workflow. If the end-users can’t find particular content unless they already know it exists, then the system will automatically fail.

Where search tools act as gatekeepers for nearly all scholarly content, they will need to be refined not only in the extent of their indexes, but also in the underlying algorithms that allow them to harvest, index and connect the wealth of content available across the net. Hidden and open resources will be brought together in one search interface that combines the power of index and federated search technologies and stretches over the entire web.

Hidden meanings

Advanced semantic techniques can aid the discovery process, linking individual pieces of content and making suggestions and connections that are relevant to a single researcher’s profile and reading preferences. Such an approach to search and discovery could bridge the gap between adequate discovery platforms and the fast-expiring ‘just in case’ collection-building model used by a large number of libraries.

Library and information professionals have been deeply involved in determining what the semantic web means for organising their own information and complying with linked open data (LOD) standards, due to the volume of content they deal with in their collections. Organising this and making it available for end-users to search is a high priority.  

There are two ways in which we might be able to gain more value from online content. The first is text mining: employing more sophisticated and intelligent machines to apply natural language processing and other analytical methods to bodies of text to extract intended human meaning, context and statistical insights from that text. The more the machine ‘knows’ about the content, the better it can match human-driven search queries.

The second way is linked data: a structured approach to expressing the relationship between one data object and another, using RDF triples and URIs. Linked data as a concept is backed by the World Wide Web Consortium (W3C), and is generally seen as a sub-component of the wider concept of the semantic web.
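To make the idea of a triple concrete, here is a minimal sketch in plain Python rather than in a dedicated RDF library. Each statement is a (subject, predicate, object) tuple whose parts are URIs or literal values. The example.org identifiers and the article title are hypothetical placeholders; the predicates are borrowed from the Dublin Core and FOAF vocabularies.

```python
from collections import namedtuple

# One linked-data statement: subject --predicate--> object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Predicate URIs from the Dublin Core Terms and FOAF vocabularies.
DCTERMS_TITLE = "http://purl.org/dc/terms/title"
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical resource URIs, used purely for illustration.
ARTICLE = "http://example.org/article/42"
AUTHOR = "http://example.org/person/jane-doe"

triples = [
    Triple(ARTICLE, DCTERMS_TITLE, "Patron-driven acquisition in practice"),
    Triple(ARTICLE, DCTERMS_CREATOR, AUTHOR),
    Triple(AUTHOR, FOAF_NAME, "Jane Doe"),
]


def objects_of(subject, predicate, graph):
    """Return every object linked to `subject` by `predicate`."""
    return [t.object for t in graph
            if t.subject == subject and t.predicate == predicate]


def creator_names(article, graph):
    """Follow article -> creator -> name across two triples."""
    names = []
    for person in objects_of(article, DCTERMS_CREATOR, graph):
        names.extend(objects_of(person, FOAF_NAME, graph))
    return names


print(creator_names(ARTICLE, triples))
```

Because the author is identified by a URI rather than a text string, any other dataset that uses the same URI can be joined to this one without guesswork, which is the connection a search tool could exploit.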


If building a personal research collection is the future, knowing the likes, dislikes and research behaviour of each individual will be critical to a recommendation function. Suitable recommendations should drive more patron acquisitions. In addition, the suggestions that are not acquired will enable that particular researcher’s profile and preferences to be fine-tuned, enabling better recommendations in the future. Google Scholar, and its recommendation engine, does a fairly good job of this already, but it is limited by being based only on previous publications listed under the researcher’s Google Scholar profile, rather than on reading behaviour.

Critical mass of data

Data at the micro level of the single researcher is interesting but it starts to become really useful when you can aggregate it up over several layers of granularity. This could mean that content will not only be recommended based on the individual’s preferences, but on similar researchers and what they are using, giving an automatic recommendation engine that can be scaled all the way up to institutional level.
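One way such aggregation might work is sketched below, assuming each researcher's micro-collection is simply a set of item identifiers. It scores unseen items by the Jaccard similarity of the researchers who acquired them, then rolls the same data up into an institution-wide usage count. The researcher names, item identifiers and function names are all hypothetical, and a real system would draw on far richer behavioural signals.

```python
from collections import Counter


def jaccard(a, b):
    """Overlap between two sets of acquired items, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, acquisitions, top_n=3):
    """Suggest items acquired by researchers similar to `target`."""
    own = acquisitions[target]
    scores = {}
    for other, items in acquisitions.items():
        if other == target:
            continue
        similarity = jaccard(own, items)
        if similarity == 0:
            continue  # nothing in common, so no signal
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


def institution_usage(acquisitions):
    """Aggregate individual acquisitions up to institutional level."""
    return Counter(item for items in acquisitions.values() for item in items)


# Hypothetical micro-collections keyed by researcher.
acquisitions = {
    "alice": {"a1", "a2", "a3"},
    "bob": {"a2", "a3", "a4"},
    "carol": {"a3", "a5"},
    "dave": {"a9"},
}

print(recommend("alice", acquisitions))
print(institution_usage(acquisitions).most_common(1))
```

In this toy data, bob shares two items with alice and carol shares one, so bob's remaining item ranks above carol's, while dave has nothing in common with alice and contributes nothing.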

This high-level snapshot could then be used for a host of purposes, including: refining institutional objectives; building teaching programmes based on topics that are being used by the actual faculty; determining preset materials that are applicable to a particular course, based on actual usage or micro-acquisition data over time for the same course; and allowing the library to develop ancillary services around the core content offering, which would be largely automated based on the collective collection building of all library users.

Instead of library collections being driven by library staff, each researcher would have their own customised profile and resources. This is not dissimilar to the world of music, where end-users have been empowered by the internet and digital music formats to build and curate their own collections according to their own tastes, moods (which can equate to different research projects), and means (this would reflect the overall materials budget which would be available).

Ownership of the music in the digital curated collection is never solely with the end-user. The same can be said for much of digital academic licensing. In fact, a 2013 Pew Internet Study confirmed that in the public library context at least, patrons wanted Amazon-like recommendations based on their preferences and behaviour.

Does it fit a business model?

What would be the business model that drives this? Could the library budget be spent on pay-per-view (PPV) article tokens? PPV is often quite restrictive in how the content may be used. The model could be based on micro-licences for each piece of content, but how do you account for many thousands of micro-transactions every day?

Taking the lead from the e-book world, journal articles could be licensed, and used until they are no longer needed by the researcher. The more end-users who request the same piece of content, the lower the cost to the library of distributing that content to those who request it.

This kind of system would probably work best through a central gateway for all content, since accounting across multiple libraries or publishers would prove an administration nightmare unless it was wholly automated and transparent. Document delivery services are the closest we have to a model like this at the moment, but are typically used to ‘top up’ the library collection for content that is peripheral, has very low demand, and is not part of the main bulk of subscribed content.

Licensing is bad enough as it is

Licensing is always tricky around patron-driven acquisition (PDA) for e-books, and copyright restrictions on acquired content are also a minefield for the ill-informed. Exactly how demand-driven collection building would work with licensing could be a topic of near-endless debate between libraries and publishers.

However, very simply, each user could have their own licence (under the broader agreement between the library and the publisher) to read, and use any materials they have acquired for their own micro-collection. Theoretically a single library might have several separate individuals all with the same licence to the same content. On a level more comfortable for those familiar with PDA for e-books, perhaps a certain number of user demands for the same content would result in a licence that covers all future users for that same piece of content.
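The threshold idea can be sketched as a simple ledger, assuming a trigger of three distinct users per item. That figure, the class name and the licence labels are illustrative assumptions and are not drawn from any real PDA agreement.

```python
from collections import defaultdict


class DemandLedger:
    """Track which users have requested which items (hypothetical model)."""

    def __init__(self, threshold=3):
        # Number of distinct users needed to trigger a library-wide licence.
        self.threshold = threshold
        self.requests = defaultdict(set)

    def request(self, user, item):
        """Record a request and return the licence type that now applies."""
        self.requests[item].add(user)  # a set ignores repeat requests
        return self.licence_type(item)

    def licence_type(self, item):
        """'library-wide' once enough distinct users have asked, else 'individual'."""
        distinct_users = len(self.requests.get(item, ()))
        return "library-wide" if distinct_users >= self.threshold else "individual"


ledger = DemandLedger(threshold=3)
for user in ["u1", "u2", "u1", "u3"]:
    print(user, ledger.request(user, "article-1"))
```

Counting distinct users rather than raw requests means one heavy reader cannot trigger the broader licence alone. Whether that is the right rule would be part of the negotiation between library and publisher.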

What about publishers?

This kind of acquisition model is bound to impact publishers, whose bread and butter is still the raw journal (and book) content they offer through subscriptions and subject bundles. The big deal would cease to exist in most cases.

However, with new business models would come new opportunities. New support services would need to be developed to ensure budgets and licences could be managed effectively by the library staff. In addition, it might encourage publishers pushing out average content to raise their standards to be able to compete with journals with higher-quality research. In the end, the content is still the most important feature of the publisher, but they may have to find new ways to extract value from that resource.

Large and medium-size publishers are already developing new value-added services around information that probably already exists in its raw journal or data form. These include decision support tools, the acquisition and development of research management tools for institutions, and cloud-based software that facilitates a particular stage of the information workflow.

Lots of users, lots of content

If libraries in the future were to go to a total PDA model for all content, some interesting things might evolve from the resulting landscape:

There could be micro and macro filters of content. What a total PDA library collection might achieve on a much wider scale is to serve as a post peer-review user filter on the academic record; only the content deemed suitable or relevant will be acquired by each individual. These filters could then be recycled across institutions or research groups to provide insights into the best and most relevant content for a particular research specialty. We all know that as research output increases, so the likelihood we can filter everything that might be relevant ourselves decreases. These filters would be integrated into the whole demand-driven system.

It could socialise research from the library up. Just as Mendeley has managed to build an ecosystem that mirrors much of the research landscape, if all content were acquired by user demand, a similar phenomenon could occur. The user profiling and behavioural technology embedded in the ‘search, discovery and acquisition’ tool that allows total PDA would provide a great opportunity for networking and sharing of resources.

New data markets could also be possible. If libraries or intermediaries were generating vast amounts of user data, its value for publishers and other players in the industry would increase rapidly.

New products that offer value are being rolled out all the time to replace declining revenues from the traditional core content.

There could also be potential for refocusing around the user experience. Giving the user the content they want is great, but a competitive business could also arise around the provision of tools that help the users make the best use of the content that they are ‘acquiring’ in their own personal collections.

Maintaining total PDA programmes with multiple publishers would be unsustainable for the library, just as it would be unsustainable for a publisher to maintain separate total PDA accounts for all of its customers.

Intermediaries have a reasonably good position in the information supply chain with respect to a total PDA scenario. They are generally content-neutral, and by virtue of that are able to attract large amounts of content from many publishers, and offer that virtual hypermarket of content back to multiple libraries.

For example, Swets currently has over 30,000 publishers from which it can offer content, spanning the very large STM publishers, all the way down to niche organisations producing relatively small amounts of content.

Perhaps total PDA is a tool that intermediaries could offer libraries. Perhaps it is an amalgamation of various parts of the existing information ‘toolscape’ – a bit of search, a bit of discovery, a bit of online networking, and a bit of data management. Maybe someday someone will build it so we can all see.

Finbar Galligan is marketing communications team leader at Swets