Covid-19 Vaccine Stance Detection and Social Media Exploration

Exploring Diverse Opinions on Covid-19 Vaccines through Social Media Tweets.

Created Using: Python

In the ever-evolving landscape of social media, every issue, even a pandemic and its vaccine, has its champions and critics. This project delves deep into the nuances of public sentiment surrounding the COVID-19 vaccine. By analyzing tweets from both Pro-Vaccine and Anti-Vaccine campaigns, we uncover intriguing insights about cluster formations, campaign connectivity, and the epicenters of these discussions.

The code for this project is available at sahilvora10/Covid-19-VaccineAnalysis. If you have any questions or inquiries, please feel free to contact me at sahilvora2021@gmail.com

Project Objective

Gather and analyze Twitter data focused on the COVID-19 Vaccine to gain valuable insights and conduct in-depth exploratory analysis.

How Data Was Scraped

Twitter data was collected using Twitter's APIs, which require the creation of a developer account and a request for elevated access. In this project, we utilized Twitter's version 2 APIs. With the help of the Tweepy library, data was obtained by searching for specific hashtags. Below is a step-by-step explanation of the data scraping process:

  1. Utilizing Tweepy Cursor Methods:
    • Tweepy provides cursor methods that allow us to retrieve data in bulk.
    • The crawler called Tweepy's search_tweets method through a cursor to find tweets containing the specified hashtags.
    • For the smaller dataset, approximately 30 tweets were fetched per hashtag; for the larger dataset, 100.
    • The following tweet attributes were collected: created_at, id, in_reply_to_screen_name, in_reply_to_status_id, in_reply_to_user_id, retweeted_id, retweeted_screen_name, user_mentions_screen_name, user_mentions_id, full_text, user_id, screen_name, and followers_count.
  2. Defining a Date Range:
    • To narrow down the search, a date range starting from April 1, 2021, was chosen. This timeframe coincided with the global introduction of COVID-19 vaccines.
  3. Managing Rate Limits:
    • Twitter's rate limits allowed up to 180 requests per 15-minute window.
    • Each response contained between 10 and 100 results.
    • When the maximum limit was reached, the crawler script paused for 15 minutes before resuming data retrieval.
  4. Selecting Relevant Hashtags:
    • Hashtags for both the anti-vaccine and pro-vaccine campaigns (for example, #StopVaccination on the anti-vaccine side) were chosen from an existing repository of campaign keywords.
    • During a dry run, some hashtags did not yield the desired number of tweets and were consequently excluded.
    • The selected hashtags are stored in AntiVaccineKeywords.txt and ProVaccineKeywords.txt and can be updated as needed.
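
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual crawler: the helper names (load_hashtags, flatten_tweet, in_date_range, collect) are hypothetical, the keyword files are assumed to hold one hashtag per line, and the date cut-off is applied locally after fetching, which may differ from how the original script restricted the range. Note that search_tweets is a method of tweepy.API, Tweepy's interface to the v1.1 standard search endpoint.

```python
from datetime import datetime, timezone
from pathlib import Path

# Column names taken from the attribute list above.
FIELDS = [
    "created_at", "id", "in_reply_to_screen_name", "in_reply_to_status_id",
    "in_reply_to_user_id", "retweeted_id", "retweeted_screen_name",
    "user_mentions_screen_name", "user_mentions_id", "full_text",
    "user_id", "screen_name", "followers_count",
]

# Start of the date range described above.
START = datetime(2021, 4, 1, tzinfo=timezone.utc)


def load_hashtags(path):
    """Read hashtags from a text file (assumed layout: one per line)."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def flatten_tweet(tweet):
    """Map a v1.1-style tweet dict (e.g. status._json) onto FIELDS."""
    user = tweet.get("user") or {}
    retweeted = tweet.get("retweeted_status") or {}
    mentions = (tweet.get("entities") or {}).get("user_mentions", [])
    return {
        "created_at": tweet.get("created_at"),
        "id": tweet.get("id"),
        "in_reply_to_screen_name": tweet.get("in_reply_to_screen_name"),
        "in_reply_to_status_id": tweet.get("in_reply_to_status_id"),
        "in_reply_to_user_id": tweet.get("in_reply_to_user_id"),
        "retweeted_id": retweeted.get("id"),
        "retweeted_screen_name": (retweeted.get("user") or {}).get("screen_name"),
        "user_mentions_screen_name": [m["screen_name"] for m in mentions],
        "user_mentions_id": [m["id"] for m in mentions],
        # Extended mode returns full_text; fall back to text otherwise.
        "full_text": tweet.get("full_text", tweet.get("text")),
        "user_id": user.get("id"),
        "screen_name": user.get("screen_name"),
        "followers_count": user.get("followers_count"),
    }


def in_date_range(row, start=START):
    """True if the tweet was created on or after `start` (v1.1 date format)."""
    created = datetime.strptime(row["created_at"], "%a %b %d %H:%M:%S %z %Y")
    return created >= start


def collect(api, hashtags, per_hashtag=30):
    """Fetch up to `per_hashtag` tweets per hashtag via tweepy.Cursor.

    `api` is an already-authenticated tweepy.API instance.
    """
    import tweepy  # imported lazily so the helpers above work without it

    rows = []
    for tag in hashtags:
        cursor = tweepy.Cursor(
            api.search_tweets, q=tag, count=100, tweet_mode="extended"
        )
        for status in cursor.items(per_hashtag):
            row = flatten_tweet(status._json)
            if in_date_range(row):
                rows.append(row)
    return rows
```

If the tweepy.API object is created with wait_on_rate_limit=True, Tweepy sleeps until the rate-limit window resets, which would give the same effect as the 15-minute pause described in step 3 without a hand-written timer.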

Results

Figure: Diffusion networks for Anti-Vaccine (left) and Pro-Vaccine (right) attitudes.
Figure: Largest connected subgraph of the diffusion network for Anti-Vaccine (left) and Pro-Vaccine (right) attitudes.
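
As a rough illustration of how such networks could be derived from the collected attributes, the sketch below builds directed edges from retweets and replies and then extracts the largest connected subgraph. The function names and the edge direction (from the original author to the account that retweeted or replied) are assumptions made for this example; the project's own construction may differ, for instance by also using mentions.

```python
from collections import defaultdict


def diffusion_edges(rows):
    """Return a set of (source, target) screen-name pairs.

    Assumed convention: content flows from the original author (source)
    to the account that retweeted or replied to it (target).
    """
    edges = set()
    for row in rows:
        user = row["screen_name"]
        for key in ("retweeted_screen_name", "in_reply_to_screen_name"):
            source = row.get(key)
            if source and source != user:  # skip missing values and self-loops
                edges.add((source, user))
    return edges


def largest_component(edges):
    """Return the node set of the largest connected subgraph.

    Edge direction is ignored here (weak connectivity).
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node] - component:
                component.add(neighbor)
                stack.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best
```

The edges whose endpoints both lie in the returned node set form the largest connected subgraph, which can then be drawn with any graph-plotting tool.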

Based on the graphs and network structures, notable differences emerge between the tweets of the two campaigns:

  • Anti-Vaccine Campaign:
    • The diffusion network for the Anti-Vaccine campaign exhibits a dense interconnectedness.
    • Certain tweets clearly serve as central hubs or origin points within this network.
    • Many tweets emanate from these central points, contributing to the overall density.
    • Additionally, there are smaller, connected components within this network.
  • Pro-Vaccine Campaign:
    • In contrast, the diffusion network for the Pro-Vaccine campaign is distinct.
    • It features a single, less densely populated central point or origin.
    • The interconnectedness among tweets in this network is less pronounced.
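
The comparison above is visual, but the same observations could be checked numerically. The sketch below is one possible way to do so, assuming the network is available as a collection of directed (source, target) pairs; the choice of directed graph density and out-degree as the measures, and the function names, are assumptions for illustration rather than the metrics used in the project.

```python
from collections import Counter


def density(edges):
    """Directed graph density: edges present / edges possible, n * (n - 1)."""
    unique_edges = set(edges)
    nodes = {node for edge in unique_edges for node in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(unique_edges) / (n * (n - 1))


def top_hubs(edges, k=3):
    """Return the k accounts with the most outgoing edges, as (name, count)."""
    out_degree = Counter(source for source, _ in set(edges))
    return out_degree.most_common(k)
```

A higher density and several accounts with large out-degree would be consistent with the Anti-Vaccine description, while a lower density and a single dominant account would match the Pro-Vaccine one.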

This analysis underscores the differing patterns of information diffusion and connectivity between the Anti-Vaccine and Pro-Vaccine campaigns.

Summary

The project provides insights into the creation of social networks using Twitter data, revealing patterns of closed connections and network formations among campaigns with shared objectives. The analysis of Anti-Vaccine campaign tweets highlights a strong and densely connected opinion network.