Studying social media can give detailed insight into human behaviour. Rob Procter describes a new project that aims to help researchers analyse this data.
May 20, 2013 was a black day for Oklahoma, USA: a category 4 tornado tore across large areas of the state. It wasn’t the first devastating tornado to hit the area and it won’t be the last, but it goes down in history for the way that social media was used to get life-saving alerts out, document the storm’s progress, share useful information and emotional reactions and – in the immediate aftermath – encourage practical help for the injured and homeless.
This is a powerful example of how social networks are transforming daily lives, and the analysis of more than 2m tweets sent during the first 48 hours by those who were there has yielded valuable insights into how people think, act and feel in times of crisis.
Similarly, work is being done to explore information being tweeted about the crisis in Syria, and in the UK the ‘Reading the Riots’ project has explored the way that social networks, especially Twitter, were used by participants and onlookers during the 2011 UK riots.
Analysis of the 2.6m riot-related tweets showed that Twitter didn’t deserve the brickbats thrown at it by commentators who claimed the platform was used to co-ordinate the mayhem. More importantly, it also uncovered evidence of the beliefs, grievances and attitudes that led people to become involved in the violence and looting, and provided insights that helped to explain some of the apparently odd double standards in behaviour that were widely reported in mainstream media. This kind of fine detail would have been lost to researchers before the advent of social networks or, at best, would have been reported partially, via anecdote, later.
And as these new sorts of data have become available, it has been necessary to develop ways to analyse them.
Grant funding from the UK’s Economic and Social Research Council (ESRC) is helping sociologists and computer scientists at Cardiff, Warwick, Edinburgh and St Andrews universities to work together on developing new ways to harvest and analyse data, and to create tools that can detect tensions and cohesion in online social networks.
Building on that work, the ESRC and Jisc have also funded the development of the Collaborative Online Social Media Observatory (COSMOS), an information collection and analysis engine that can harvest freely available, socially significant data from blogs, microblogs, RSS feeds and other social network platforms, as well as important data – such as crime statistics – that is now routinely made openly available.
Current projects supported by COSMOS include a study of the formation and spread of hate speech and antagonistic content in social media networks and an investigation of the value of social media data for building more accurate statistical models for predicting crime.
We are also exploring ways that these new technologies can be used to support innovative publishing strategies via, for example, crowdsourcing, and to facilitate ‘citizen social science’, where members of the public can get involved in research and record their beliefs and opinions at volume.
Developments like these could transform the way research is conducted, especially in social sciences where making the production of knowledge a more public and collaborative act seems particularly appropriate.
Over the past decade we’ve heard a lot of concerns being expressed about the fact that large businesses can harness the power of big data for their own commercial ends, and that they could close the door on social scientists and public scrutiny of any kind. But what we’ve seen from all these examples is that people of all ages and in all walks of life are now facebooking, tweeting and blogging enthusiastically, and using smartphones and tablets routinely to keep them connected to friends, colleagues and like-minded complete strangers even on the move.
That has enabled COSMOS and others to develop techniques that let academics level the playing field and create their own useful big datasets, which will help them answer some of social science’s upcoming big questions.
The pace of development is so rapid that we can’t yet predict the impact that the new techniques will have on research processes. They may promote the use of methods and data that researchers will choose in place of more traditional quantitative and qualitative research methods such as sample surveys and in-depth interviews. They may also influence thinking and re-orientate social research around new objects, populations and techniques, but it is probably most desirable that the new methods will be used in conjunction with existing ones, to make research richer and more nuanced.
The analysis of social processes as they happen is bound to give researchers insights and interesting avenues to explore that are absent from, or only imperfectly articulated in, the official construction of events that is available via traditional research instruments and curated datasets.
For those interested in pursuing the use of social media for research, our top tips are:
- The reliability of results produced by computational methods for social media analysis should not be taken for granted. It is important that results be validated, e.g., by using more traditional methods for content analysis. This requires that information on how tools work be openly available so their behaviour can be verified.
- Social researchers need to be trained in the underlying concepts of computational methods for social media analysis so they can understand how to select and apply them appropriately.
- Social media are part of a much larger and more complex media and information ecology, and their interrelationships need to be understood.
- Social media analytics tools need to evolve by making use of social scientific knowledge and input. Social relations and organisation are subject to a relentless process of change. ‘Closed’ analytical tools make for bad social scientific insight.
- Ethics: A concern for, and awareness of, research ethics should be a guiding principle throughout the process of developing tools and especially in their use in analysing data.
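As an illustration of the first tip, one common way to check an automated tool against traditional content analysis is to have a human coder label a sample of the same tweets and measure how far the two sets of labels agree beyond chance. The sketch below uses Cohen's kappa for this. It is only an example under stated assumptions: the function name, the label categories and the sample data are all hypothetical, and it does not describe how COSMOS or any other named tool works.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two lists of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Proportion of items on which the two coders gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each coder's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical sample: labels from an automated tool and from a human coder
# for the same eight tweets (invented data, for illustration only).
tool_labels = ["negative", "negative", "positive", "neutral",
               "positive", "negative", "neutral", "positive"]
human_labels = ["negative", "neutral", "positive", "neutral",
                "positive", "negative", "negative", "positive"]

kappa = cohens_kappa(tool_labels, human_labels)
print(f"Cohen's kappa: {kappa:.2f}")  # prints "Cohen's kappa: 0.62"
```

A kappa close to 1 suggests the tool reproduces the human coding closely, while a value near 0 suggests agreement no better than chance. What counts as acceptable depends on the research question, and a real validation would need a far larger and properly sampled set of tweets.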
There is a growing range of social media analytic tools now available, but not all are well-matched to the needs of academic researchers. Many, such as DiscoverText and DataSift, have been developed primarily for the commercial marketplace and they are ‘closed’ systems: information on how they work is not openly available and users cannot extend their capabilities by, for example, adding new tools.
Analytic tools of this type targeted at academics are relatively few in number. With COSMOS we aim to create an open and extensible suite of text and network analysis capabilities, including tools for extracting information from social media, for example, gender and location of tweeters, sentiment and tension scores for individual tweets and topic discovery for collections of tweets. The COSMOS platform will be available to academic users in 2014.
Rob Procter is professor of social informatics at the University of Warwick and part of the team for the project ‘COSMOS: Supporting Empirical Digital Social Research for the Social Sciences with a Virtual Research Environment’.