Detection and Prediction of Epidemic Patterns: Comparison between Social Media and Google

18th March 2020

The following study was conducted by scientists from the Computer Science Department, Polytechnic Building, University of Alcalá, Ctra. De Barcelona km, Alcalá de Henares, Madrid, Spain. The study is published in the journal Scientific Reports, as detailed below.

Scientific Reports; Volume 10, Article Number: 4747; (2020)

Comparing Social Media and Google to Detect and Predict Severe Epidemics


Internet technologies have demonstrated their value for the early detection and prediction of epidemics. In many cases, electronic surveillance systems can be built by collecting and analyzing online data, complementing other existing monitoring resources. This paper reports on the feasibility of building such a system with search engine and social network data. Specifically, the study aims to gather evidence on which kind of data source leads to better results. Data were acquired from the Internet by a system that gathered real-time data for 23 weeks. Data on influenza in Greece were collected from Google and Twitter and compared with influenza data from the official authority of Europe. The data were analyzed using two models: an ARIMA model, which computed estimations based on weekly sums, and a customized approximate model, which uses daily sums. Results indicate that influenza was successfully monitored during the test period. Google data show a high Pearson correlation and a relatively low Mean Absolute Percentage Error (R = 0.933, MAPE = 21.358). Twitter results are slightly better (R = 0.943, MAPE = 18.742). The alternative model performs slightly worse than the ARIMA(X) (R = 0.863, MAPE = 22.614) and shows a higher mean deviation (abs. mean dev: 5.99% vs 4.74%).
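For readers unfamiliar with the two accuracy measures quoted in the abstract, the following is a minimal sketch of how a Pearson correlation (R) and a Mean Absolute Percentage Error (MAPE) can be computed between an official case series and a model's estimates. It is not the authors' code. The function names and the weekly figures are hypothetical and purely illustrative, and the sketch assumes MAPE is expressed in percent.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two root sums of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (assumed convention)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical weekly values, invented for illustration only.
official_cases = [120.0, 150.0, 210.0, 260.0, 230.0, 180.0]
model_estimates = [110.0, 160.0, 190.0, 280.0, 240.0, 170.0]

print(f"R = {pearson_r(official_cases, model_estimates):.3f}")
print(f"MAPE = {mape(official_cases, model_estimates):.3f}")
```

A high R indicates that the estimates rise and fall together with the official series, while MAPE measures how far off the estimates are on average relative to the official values, so the two measures can disagree and are usually reported together.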


Samaras, L., García-Barriocanal, E. & Sicilia, M. Comparing Social media and Google to detect and predict severe epidemics. Sci Rep 10, 4747 (2020).