Neighborhood Effects: A Social Media Survey to Validate Health Behavior Measures and Assess Bias

Presenter

Matthew Willis

Health Informatics Research Fellow @ UMSI

Time and location

Thursday 1:00-2:00 pm on Zoom: https://umich.zoom.us/j/93603896057?pwd=VklTSjZjSTBKVGJ5NkFrcVJGN0Nndz09

Abstract

The neighborhoods in which people live have an impact on their health and wellbeing, including their health behaviors. The health behavior-related effects of living in socioeconomically disadvantaged neighborhoods are well established, with significant associations found for physical activity, tobacco use, alcohol consumption, and dietary fat intake, as well as for related health outcomes such as diabetes. Some of these health effects may be explained by environmental variables such as air pollution, walkability, and lack of access to grocery stores (“food deserts”). Despite this knowledge, there are gaps in information concerning neighborhood-level health behaviors and related attitudes. Ongoing disease surveillance survey initiatives attempt to monitor health behaviors in specific geographic areas and to inform place-based responses to health behavior disparities, but survey response rates have continued to decline in recent years, accompanied by rising costs to administer large-scale surveys. At the same time, social media has become a de facto method for sharing personal information, rating amenities, and interacting with members of one’s local communities. Social media channels therefore present a unique, potentially more cost-effective modality for public health experts to investigate neighborhood-based health behavior disparities using these real-time data streams.

However, there are challenges in leveraging the volumes and variety of social media data, with a chief barrier being the biases and uncertainties that arise from social media as a source of public health data. We address these barriers to the use of social media for public health surveillance by developing new methods for modeling, and adjusting for, biases that can emerge from georeferenced social media data about health behaviors.

In this study, we aim to characterize biases that may emerge in social media data based on what is tweeted. We call this “reporting bias” and define it as the difference between the health behaviors people actually perform and those they mention in Twitter posts. Specifically, this study involves conducting a survey of Twitter users (n=750) who have previously posted about one of four health behaviors: eating food, physical activity, alcohol consumption, and/or smoking. We will construct our survey sample to be generalizable to the population of Twitter users who tweet about health behaviors at the county level in Wayne, Washtenaw, and Lenawee counties. We will test hypotheses that it is more socially acceptable to tweet about behaviors that are: (a) healthy (as opposed to unhealthy); (b) conducted with others (as opposed to alone); and (c) unusual (as opposed to usual).

The multi-purpose survey will include an embedded experiment in which respondents will be presented with health behavior-related tweets about food, physical activity, alcohol, and smoking, and asked to evaluate their social acceptability on Twitter. We manipulate the content of each of these health behavior tweets with a different “treatment condition” (healthy/unhealthy, collocated/alone, unusual/usual). Participants are shown six tweets for each health behavior and within each treatment condition. For example, one tweet might indicate that the tweeter is smoking alone, and another that they are smoking among a group of people. For each tweet, the participant answers four questions: (1) If they posted the shown tweet, would it make the people who follow them uncomfortable? (2) Would it be socially acceptable to their followers to post the tweet? (3) Would it be out of the ordinary for the participant to post the tweet? (4) Would it be out of the ordinary for the participant to see the tweet from people they follow?

To analyze the results of the experiments, we will construct one-way ANOVA models to assess mean differences between treatment conditions concerning: (1) the type of tweet displayed (eating unhealthy vs. healthy food; physical activity vs. sedentary behavior; smoking vs. not smoking; binge drinking vs. drinking in moderation); (2) behaviors conducted alone vs. collocated with others; and (3) unusual vs. usual behaviors. In these models, the dependent variable will be mean scores of four measures concerning the social acceptability of each type of tweet.
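
As a rough illustration of what a one-way ANOVA on this kind of data involves, the sketch below computes the F statistic for one treatment contrast (alone vs. collocated) in plain Python. It is not the study's analysis code. The function name `one_way_anova`, the condition labels, and the scores are all made up for illustration, and each score is assumed to be one respondent's mean across the four acceptability questions.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of condition -> scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of treatment conditions
    n = len(all_scores)    # total number of observations

    # Variation of the condition means around the grand mean.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Variation of individual scores around their own condition mean.
    ss_within = sum(
        (x - mean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented example scores (NOT study data): each value stands for one
# respondent's mean rating across the four acceptability questions.
ratings = {
    "alone": [2.0, 3.0, 4.0],
    "collocated": [4.0, 5.0, 6.0],
}

f_stat, df_b, df_w = one_way_anova(ratings)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A larger F indicates that the condition means differ more than the within-condition spread would suggest. Obtaining a p-value additionally requires the F distribution with the reported degrees of freedom, which a statistics package would normally supply.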