Why Google Flu is a failure: the hubris of big data

Why Google Flu is a failure: the hubris of big data

People with the flu (the influenza virus, that is) will probably go online to find out how to treat it, or to search for other information about the flu. So Google decided to track such behavior, hoping it might be able to predict flu outbreaks even faster than traditional health authorities such as the Centers for Disease Control (CDC).

Instead, as the authors of a new article in Science explain, we got “big data hubris.” David Lazer and colleagues explain that:
“Big data hubris” is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.

The problem is that most people don’t know what “the flu” is, and relying on Google searches by people who may be utterly ignorant about the flu does not produce useful information. Or to put it another way, a huge collection of misinformation cannot produce a small gem of true information. Like it or not, a big pile of dreck can only produce more dreck. GIGO, as they say.

Google’s scientist first announced Google Flu in a Nature article in 2009. With what now seems to be a textbook definition of hubris, they wrote:
“…we can accurately estimate the current level of weekly influenza activity in each region of the United States, with a reporting lag of about one day.”

If you have any opinion, feel free to send us a messageor leave your comment below.