Unstructured information presents new opportunities for libraries


Libraries house vast amounts of unstructured data, and the volume is increasing as user-generated content becomes more important. John Pomeroy reports from the recent MarkLogic UK summit on some of the challenges and opportunities presented by this type of information.

Unstructured information is all around us. It has taken the world by storm, and companies around the globe are now realising that effectively managing it offers a substantial opportunity for not only innovation, but business insight and value.

Unstructured information includes data that doesn’t fit neatly into relational databases. Examples include textual documents, pictures, video, research, social media posts, tweets and similar data. Analysts and industry experts have estimated that unstructured information accounts for 70-80 per cent of all data.

Recently, a group of thought leaders came together for a UK summit hosted by MarkLogic to discuss the explosion of unstructured data and what it means for organisations around the world. The event highlighted how companies can improve information sharing, add value to customers, reduce complexity of data management and increase transparency. Many companies are already starting to use unstructured information to add value and create revenue.

This information has a significant impact on libraries as well. A major goal of libraries, be they public, educational or corporate, is to discover and share information, and the technology that people are using for this information is evolving. There is a large amount of unstructured information housed in libraries.

According to MarkLogic UK Summit attendee Nick Patience, research director of information management at the technology industry analyst firm The 451 Group, one of the biggest hurdles associated with unstructured data is that it is nobody’s job to manage it. However, this is changing as organisations seek more control over their content and the value it represents.

Another challenge associated with unstructured information is growth. There has been an explosion of data over the last five years, especially in the amount of information that lies outside an organisation’s control. Organisations have been looking to traditional big database vendors to solve this problem. As people come to learn that older relational database technology was not designed for, and is not well suited to, the challenges presented by unstructured information, they are looking at new technologies and expanding their database strategy to include more than just a relational database.

In addition, changes in the ways in which people access and share information are affecting libraries and their customers. Library customers today expect to be able to search the library catalogue using a smartphone. The customer does not have to be in the library physically, waiting for a terminal, to get this information. It’s all available on their mobile device. The evolution of library mobile technology presents a myriad of opportunities and possibilities.

Another trend in unstructured data that is having a big impact on libraries and publishers is social media. A centralised body or library organising and categorising information may not be the ideal or most useful method for library customers. It may be more useful for them to see information in terms of how their peers are viewing and using it.

Some publishers see user-generated content as the way of the future. For example, a library may categorise periodicals by any criteria it feels are relevant to the subject. Rather than relying on a pre-defined way to organise information (author, title, etc.), publishers can use new database technology to allow customers to organise and share information in different ways. Crowdsourcing is another way to make information more useful based on how people are using it. A college student could see how other students have tagged certain information, perhaps tagging it as being relevant to a specific course, which makes it more useful to them.
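To make the tagging idea concrete, here is a minimal sketch in Python of how user-supplied tags might sit alongside a fixed catalogue scheme. It is an illustration only, not a description of MarkLogic Server or of any library system: the TagIndex class, its method names and the item identifiers are all hypothetical, and a real service would keep this data in a database rather than in memory.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index of user-supplied tags (illustration only)."""

    def __init__(self):
        # tag -> set of item ids that carry the tag
        self._items_by_tag = defaultdict(set)
        # item id -> {tag: number of times users applied it}
        self._tag_counts = defaultdict(lambda: defaultdict(int))

    def add_tag(self, item_id, tag):
        """Record that a user tagged an item; tags are case-insensitive."""
        tag = tag.strip().lower()
        self._items_by_tag[tag].add(item_id)
        self._tag_counts[item_id][tag] += 1

    def items_for(self, tag):
        """Return the ids of all items carrying the tag, in sorted order."""
        return sorted(self._items_by_tag.get(tag.strip().lower(), set()))

    def top_tags(self, item_id, n=3):
        """Return up to n tags for an item, most frequently applied first."""
        counts = self._tag_counts.get(item_id, {})
        return sorted(counts, key=lambda t: (-counts[t], t))[:n]


# Example: students tag material as relevant to a (made-up) course code.
index = TagIndex()
index.add_tag("journal-article-17", "HIST 101")
index.add_tag("journal-article-17", "hist 101")
index.add_tag("journal-article-17", "primary sources")
index.add_tag("ebook-4", "HIST 101")

print(index.items_for("hist 101"))
print(index.top_tags("journal-article-17"))
```

The point of the sketch is that the tags come from readers rather than from a centrally defined scheme, so a student looking up a course code sees whatever their peers found relevant, whichever catalogue fields the items were originally filed under.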

As information requirements evolve, libraries can use an unstructured database such as MarkLogic Server to modify existing capabilities – and even create new applications quickly – allowing further exploration of content assets. The challenge is to get inside the data, pull together the precise pieces of required information, and deliver only the most relevant information in the format the library customer finds most useful.

John Pomeroy is vice president of MarkLogic Europe