11/15/2024 | Press release | Distributed by Public on 11/15/2024 10:14
Over 25 percent of adults in the U.S. suffer from seasonal allergies, but scientists have struggled to track allergy trends because cases don't always require medical care. Some allergy sufferers venture online to post about their itchy eyes, runny noses, and sneezing on social media or to search for remedies. Now, CU Boulder scientists are harnessing information from online activity to track allergy intensity across the U.S. The work, published recently in PNAS Nexus, reveals important regional patterns, including an allergy "hotspot" in the Southeastern U.S. and a winter allergy season in Colorado, Texas, and Florida.
"There isn't a good metric for measuring the intensity of seasonal allergies," said Elías Stallard-Olivera, a PhD student in Ecology and Evolutionary Biology at CU Boulder and lead author of the paper. "Traditional allergy prediction methods, like pollen counts, often fall short, making it essential to develop more reliable ways of identifying and tracking allergen sources."
Stallard-Olivera and Noah Fierer, CIRES fellow and CU Boulder professor of Ecology and Evolutionary Biology, decided to explore readily available data online. They used machine learning to identify and extract counts of allergy-related social media posts on X, formerly known as Twitter, between 2016 and 2022. They then grouped the data by county.
Using online activity to track health trends is not new, but in recent years, scientists have raised concerns about the accuracy of the approach, including in Google's flu tracker. To validate their dataset, the team added an extra step: they used a statistical technique called "cointegration" to compare the occurrence of allergy-related social media posts to Google search frequency and hospital records from California.
Economists use cointegration to track the relationship between two or more things or variables over time. Variables are "cointegrated" when they follow the same long-term patterns, so that they move together even if each one wanders on its own; for example, when the price of cacao beans rises, the price of chocolate also tends to increase. The study is one of the first to apply the statistical technique to online activity.
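The intuition behind cointegration can be sketched in a few lines of Python. This is an illustration only, not the study's procedure: the data are synthetic, the names `x`, `y`, and `ols_fit` are made up for the example, and it covers only the regression step that a formal test (which would go on to check the residuals for stationarity) starts from. One series is a random walk, the other tracks it with some noise, and the gap between them stays small even though both series drift.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the synthetic data are reproducible

# Synthetic data (an assumption for illustration, not the study's data):
# x is a random walk, and y follows x with bounded noise added.
n = 500
x = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + random.gauss(0, 1))
y = [2.0 * xi + random.gauss(0, 1) for xi in x]


def ols_fit(xs, ys):
    """Fit ys = intercept + slope * xs by ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = ols_fit(x, y)

# The residuals are the part of y that x does not explain. If the two
# series are cointegrated, the residuals stay close to zero instead of
# drifting the way x and y do.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

print(f"estimated slope: {slope:.2f}")
print(f"spread of y: {pstdev(y):.2f}")
print(f"spread of residuals: {pstdev(residuals):.2f}")
```

In this sketch the spread of the residuals is far smaller than the spread of `y` itself, which is the pattern a cointegration test looks for.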
The team found a strong cointegration between the allergy-related online data and hospital records, which meant they could use their social media dataset as a "proxy" for seasonal allergy intensity.
Stallard-Olivera and Fierer calculated Z-scores, a measure of how many standard deviations a data point lies from the average, for U.S. counties with populations greater than 500,000. Counties with higher, more positive Z-scores represent areas with more intense seasonal allergies, while counties with lower, more negative Z-scores represent areas with less intense seasonal allergies. They then used the Z-scores to build annual and monthly maps of seasonal allergy intensity.
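As a rough illustration of the Z-score calculation, the Python sketch below standardizes a handful of values. The county names and rates are invented, the helper `z_scores` is hypothetical, and the choice of the population standard deviation is an assumption; the study's exact normalization is not described here.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("all values are identical; Z-scores are undefined")
    return [(v - mu) / sigma for v in values]


# Hypothetical rates of allergy-related posts per county (made-up numbers).
county_rates = {"County A": 12.0, "County B": 30.0, "County C": 18.0}

scores = dict(zip(county_rates, z_scores(list(county_rates.values()))))

for county, z in scores.items():
    print(f"{county}: {z:+.2f}")
```

Here County B, with the highest rate, gets a positive Z-score, while County A, with the lowest, gets a negative one.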
Their results show that seasonal allergies are most severe in the Southeastern U.S. and least severe in Florida and Southern California. The team also discovered stark differences between regions within California; for example, allergies in the Central Valley are much more intense than elsewhere in the state.