Inferring Atmospheric Particulate Matter Concentrations from Chinese Social Media Data
Tao, Zhu; Kokas, Aynne; Zhang, Rui; Cohan, Daniel S.; Wallach, Dan
Although studies have increasingly linked air pollution to specific health outcomes, less well understood is how public perceptions of air quality respond to changing pollutant levels. The growing availability of air pollution measurements and the proliferation of social media provide an opportunity to gauge public discussion of air quality conditions. In this paper, we consider particulate matter (PM) measurements from four Chinese megacities (Beijing, Shanghai, Guangzhou, and Chengdu) together with 112 million posts on Weibo (a popular Chinese microblogging system) from corresponding days in 2011–2013 to identify terms whose frequency was most correlated with PM levels. These correlations are used to construct an Air Discussion Index (ADI) for estimating daily PM based on the content of Weibo posts. In Beijing, the Chinese city with the most PM as measured by U.S. Embassy monitor stations, we found a strong correlation (R = 0.88) between the ADI and measured PM. In other Chinese cities with lower pollution levels, the correlation was weaker. Nonetheless, our results show that social media may be a useful proxy measurement for pollution, particularly when traditional measurement stations are unavailable, censored or misreported.