Study Uses Mobile Phone and Twitter Data to Estimate Crowd Sizes

David Murphy

[caption id="attachment_54289" align="alignleft" width="150"]The scientists based their study on people attending games at the San Siro football stadium in Milan.[/caption]

New findings by scientists from the Data Science Lab at Warwick Business School suggest that data generated simply through our use of mobile phones and Twitter might offer surprisingly accurate estimates of crowd sizes.

Federico Botta, Suzy Moat and Tobias Preis, of Warwick Business School, analysed Twitter and mobile phone data from Milan, Italy, and found that they could estimate attendance numbers for football matches at the San Siro stadium, as well as the number of people at Linate Airport at any given time.

Their research, published in Royal Society Open Science, could be of value in a range of emergency situations, such as evacuations and crowd disasters.

“Measuring crowd size is a difficult task, as the hugely varying estimates we see of the number of people at protests underline,” said Botta. “Given that most people now carry a mobile phone with them, we wondered if we could measure the number of people in a given location simply by analysing data on usage of these mobile phones. We found that this automatically-generated data provides an excellent basis for estimating the size of a crowd. Quick and accurate measurements of crowd size could be of vital use for police and other authorities charged with avoiding crowd disasters.”

In the paper, “Quantifying crowd size with mobile phone and Twitter data”, the scientists analysed two months of Twitter and mobile phone data from Milan, covering 1 November to 31 December 2013. The mobile phone activity dataset was provided by Telecom Italia and reflects the volume of outgoing and incoming calls and text messages, as well as the number of active internet connections. Both datasets make it possible for the scientists to determine not only when mobile phones were active, but also where their users were.

They found that the size of spikes in Twitter and mobile phone activity allowed them to estimate the number of attendees at football matches in the San Siro stadium, home of AC Milan and Internazionale.

Dr Preis, associate professor of Behavioural Science and Finance, said: “We plotted mobile phone calls, Twitter and SMS activity in the geographical area in which the San Siro is located and in all three we observed 10 distinct spikes. We found that the dates these spikes occurred coincided exactly with the dates on which the 10 football matches took place in the stadium.

“Furthermore, we noted that the relative sizes of the spikes strongly resembled the official attendance figure for each match. By drawing on historic internet activity in the San Siro, we were able to generate estimates of the number of attendees which fell within 13 per cent of the true value.”

Dr Moat, assistant professor of Behavioural Science, added: “One of the key challenges we faced was to identify situations for which we had a reliable measurement of the number of people present, against which we could calibrate our method. The football stadium at the San Siro was ideal, as football fans need to buy a ticket to attend a match. We found that data on nine football matches was sufficient for us to generate accurate estimates of the number of people attending a 10th match.

“The relationship between data on internet usage and match attendance was strongest of all – perhaps because smartphones automatically check services such as email, without the need for the user to actively intervene.”