Tuesday, January 03, 2012
In May 2011 we launched Google Correlate on Google Labs. This system enables a correlation search between a user-provided time series and millions of time series of Google search traffic. Since our initial launch, Correlate has graduated to Google Trends, and we've seen a number of great applications of it in several domains, including economics (consumer spending, unemployment rate and housing inventory), sociology and meteorology. The correspondence between gas prices and search activity for fuel-efficient cars was even briefly discussed in a Fox News presidential debate, and NPR recently covered correlations related to political commentators.
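To make the idea of a correlation search concrete, here is a minimal sketch in Python. It assumes the similarity measure is the Pearson correlation coefficient and that every series is sampled at the same time points. The function names (`pearson`, `top_correlated`) and the tiny data set are made up for illustration; this is not Correlate's actual implementation, which has to search millions of series and cannot simply score each one in turn.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def top_correlated(target, candidates, k=3):
    """Return the k (term, r) pairs whose series correlate best with target."""
    scored = [(pearson(target, series), term) for term, series in candidates.items()]
    scored.sort(reverse=True)  # highest correlation first
    return [(term, r) for r, term in scored[:k]]


# Hypothetical weekly values, invented purely for this example.
target = [1, 2, 3, 4, 5]
candidates = {
    "rising term": [2, 4, 6, 8, 10],
    "falling term": [5, 4, 3, 2, 1],
    "flat term": [1, 1, 1, 1, 1],
}
best = top_correlated(target, candidates, k=1)
```

Scoring every candidate like this is only practical for small collections, but it shows what "most correlated" means for a user-provided series.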
Health has always been an area of particular interest to our team (Matt Mohebbi, Julia Kodysh, Rob Schonberger and Dan Vanderkam). Correlate was inspired by Google Flu Trends, and many of us worked on both systems. So we were very excited when the BioSense division at the CDC published a page that shows correlations between some of their national trends in patient diagnosis activity and Google search activity. Even with just three years of weekly data, relevant search terms are surfaced. For example, the time series for bloody nose surfaces "bloody snot" and "blood in snot".
While these terms shouldn't come as a surprise, others are more interesting, including searches related to static electricity, dry skin, and red cheeks. Of course, correlation is not causation, but we hope that researchers can use Correlate to generate new hypotheses from their data.
To help researchers outside the United States, we're pleased to announce support for 49 additional countries in Google Correlate. It's now possible to see correlations like "snorkeling" in Australia, "cherry blossoms" in Japan, and "beer garden" in Germany. We look forward to seeing what new correlations researchers can find with this data!