When Twitter recently unveiled a new grant program that will allow outside researchers to mine its stockpile of tweets, the social media site pointed to Johns Hopkins' flu tracking as one example of the useful data that may be buried in its short online posts. In recent years, Johns Hopkins researchers have shown that tweets can help trace nationwide trends in flu outbreaks.
Now, in a new study, a team from Johns Hopkins and George Washington universities has drilled even deeper, probing flu-related tweets from a single bustling metropolis: New York City. Twitter data, the team concluded, can accurately gauge the spread of flu at the local level, too.
The finding, published in the journal PLoS ONE, is important because key decisions on how to prepare for and treat a flurry of flu patients are made mostly in the cities and towns where the disease is spreading. For example, when flu cases are on the rise, hospital administrators must make sure they have enough beds and staff to cope with an influx of patients.
An early alert can also prompt local health officials to step up efforts to vaccinate healthy residents and help contain the virus.
Citing data from the 2012–13 U.S. flu season, the researchers reported on results they obtained by sifting through billions of tweets to identify flu infections—as opposed to people merely talking about the flu—and where these flu patients were located. "We found that we could do just as well in predicting flu trends in New York City as we did nationally," says Mark Dredze, an assistant research professor of computer science in Johns Hopkins' Whiting School of Engineering, who supervised the research. "That's critical because decisions about what to do during a flu epidemic are largely made at the local level."
The lead author on the PLoS ONE paper, David A. Broniatowski, worked on the project with Dredze and Michael J. Paul, a Johns Hopkins computer science doctoral student, when Broniatowski was a postdoctoral fellow in the Johns Hopkins Department of Emergency Medicine's Center for Advanced Modeling in the Social, Behavioral, and Health Sciences. Last August he joined George Washington University as an assistant professor of engineering management and systems engineering.
The team used software developed in Dredze's lab to scan through hundreds of millions of tweets, the messages of no more than 140 characters that users post on Twitter. Many Twitter users list the cities where they live or use a GPS-equipped cellphone to tweet. This information allows the researchers to focus on posts from particular geographic areas. The team's software is also designed to distinguish a tweet by someone who is likely ill with flu from a tweet by someone who is merely worried about catching it.
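To make the two filtering steps concrete, here is a minimal Python sketch that keeps only tweets placed in New York City and then separates reports of infection from expressions of concern. It is an illustration only and is not the software from Dredze's lab: the `Tweet` class, the bounding box, the keyword patterns, and every function name are assumptions invented for this example, and a real system would likely rely on a trained statistical classifier rather than hand-written patterns.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tweet:
    """Hypothetical tweet record; not Twitter's actual data format."""
    text: str
    profile_location: Optional[str] = None      # free-text city from the profile
    gps: Optional[Tuple[float, float]] = None   # (latitude, longitude)


# Rough, illustrative bounding box around New York City (assumed values):
# (min_lat, max_lat, min_lon, max_lon)
NYC_BOX = (40.4, 41.0, -74.3, -73.6)

# Toy keyword patterns, assumed for illustration only.
CONCERN_PATTERNS = [
    r"\b(hope|afraid|worried|scared)\b.*\bflu\b",
    r"\bflu shot\b",
]
INFECTION_PATTERNS = [
    r"\bi (have|got|caught) the flu\b",
    r"\bsick with (the )?flu\b",
]


def in_nyc(tweet: Tweet) -> bool:
    """Place a tweet by GPS if available, else by the profile's city text."""
    if tweet.gps is not None:
        lat, lon = tweet.gps
        min_lat, max_lat, min_lon, max_lon = NYC_BOX
        return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
    if tweet.profile_location:
        loc = tweet.profile_location.lower()
        # Crude match: "new york" would also match the rest of the state.
        return "new york" in loc or "nyc" in loc
    return False


def classify(text: str) -> str:
    """Label a tweet 'infection', 'concern', 'other', or 'unrelated'."""
    lowered = text.lower()
    if "flu" not in lowered:
        return "unrelated"
    # Check concern first so "worried I have the flu" is not counted as a case.
    if any(re.search(p, lowered) for p in CONCERN_PATTERNS):
        return "concern"
    if any(re.search(p, lowered) for p in INFECTION_PATTERNS):
        return "infection"
    return "other"


def count_local_infections(tweets: List[Tweet]) -> int:
    """Count tweets that are both located in NYC and labeled 'infection'."""
    return sum(1 for t in tweets if in_nyc(t) and classify(t.text) == "infection")
```

Counting such tweets week by week would give a local time series of the kind the researchers describe. The hard part, which this sketch glosses over, is the classification step, since short posts are often ambiguous.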
During last year's severe flu season, running from Sept. 30, 2012, through May 31, 2013, the team members compared their national Twitter flu findings with data that the Centers for Disease Control and Prevention collected from health care providers. For the first time, the researchers isolated flu patient tweets from a smaller geographic area—the five boroughs of New York City and some adjoining communities—and compared their results with flu cases compiled by the New York City Department of Health and Mental Hygiene.
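The article does not say which statistic the team used for these comparisons. One common way to check how closely two weekly series move together is the Pearson correlation coefficient, sketched below in Python. The function name and the sample numbers are assumptions made up for illustration and are not data from the study.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up weekly values, for illustration only (not figures from the study):
# share of flu-infection tweets vs. share of clinic visits for flu-like illness.
twitter_rate = [0.8, 1.1, 1.9, 3.2, 2.7, 1.5]
clinic_rate = [1.0, 1.3, 2.2, 3.9, 3.1, 1.6]

r = pearson(twitter_rate, clinic_rate)
print(f"correlation between the two weekly series: {r:.2f}")
```

A value near 1 would mean the Twitter-derived signal rises and falls in step with the official counts. A value near 0 would mean the two series are essentially unrelated.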
"Not only did our results track trends on the national level, but they also did so on the local level," says Broniatowski. "It gives our system validity. It shows that we're measuring what we say we're measuring, that we're tracking very useful information. And that localized data is valuable because the flu activity in, say, Boise, Idaho, may be quite different from the national flu trends."
Although Dredze's team collected its own Twitter data for this project, Twitter's recently announced Data Grants program will give scholars access to its public and historical data for use in gleaning helpful information on various topics. Broniatowski suggests that the techniques used to track flu trends might also be applied to the study of subjects such as crime, political developments, and response to natural disasters.
Paul, the graduate student on the team, adds, "The exciting results we've come up with so far bring up new questions that will require additional data that the Twitter grant program may enable us to work with. The more experiments we do with Twitter posts, the more proof I see that this is a great idea."
The flu trend research was funded in part by a National Institutes of Health Pioneer Award to Joshua Epstein of the Johns Hopkins Department of Emergency Medicine and a National Science Foundation Graduate Research Fellowship Grant. Publication of the journal article was funded in part by the Open Access Promotion Fund of the Johns Hopkins University Libraries.