Investigating echo chambers, filter bubbles and polarization on YouTube

Team Members

Facilitator: Salvatore Romano, Davide Beraldo, Giovanni Rossetti, Leonardo Sanna.

Group members: Bruno Sotic, Nilton Da Rosa, Paul Grua, Armand Bazin, Maxime Bertaux, Youcef Taiati, Antonella Autuori, Andrea Elena Febres Medina, Wen Li, Inga Luchs, Annelien Smets, Lynge Asbjørn Møller, Alexandra Elliott, Matthieu Comoy, Ali El Amrani, Eirini Nikopoulou, Nicolas Pogeant, Yamina Boubekeur, Arthur Lezer, Mehdi Bessalah, Andrea Angulo Granda, Tcheutga Corine, Lisa Lan, Kaothar Zehar, Dong Pha Pham, Josue Charles, June Camille Ménard, Minhee Kyoung, Hangchen Liu, Yiran Zhao.

Winterschool presentation slides: link slides


0. Summary of Key Findings

This paper studies the construction of filter bubbles and political polarization under YouTube 's algorithmic personalization, in a time where the political division runs deep in the US and the 2020 election reaffirms the polarization. Using artificially generated personalized user accounts, we find that search results differ according to users' political affiliations, both in terms of the media type and political ideology of the channels suggested, showing some empirical evidence of filter bubbles' existence on YouTube, which possibly exacerbates an echo chamber behavior and enhancing political polarization in the US political debate.

1. Introduction

The consumption of news on online platforms is increasing year after year. While the video-sharing platform YouTube has become known as the place where users can find almost anything, it is also becoming an essential news source. About a quarter of all U.S. adults (26%) consume news from YouTube, and about 78% of them in some way or another rely on the YouTube algorithm's recommended news videos (Pew Research Center, September 2020). Thus, algorithmic mediation is a crucial issue for our democracy. This project investigates the emergence of filter bubbles, echo chambers, and polarization within YouTube 's recommender system with the U.S. post-electoral debate as a case study.

In the run-up to the 2020 U.S. elections, YouTube announced specific measures to curb the spread of fraud or errors that could influence the election's outcome and made efforts to connect people with authoritative information. Therefore, the U.S. post-electoral debate moment provides insight into the valid test on the intimacy between YouTube 's personalization algorithm and the possibility of polarization.

The project distinguishes between echo chambers and filter bubbles. Echo chambers describe loosely connected clusters of users with similar interests or political ideologies, formed by its members discovering and sharing only information within set interest or ideology (Zimmer et al., 2019). The users themselves thus create echo chambers. Filter bubbles, on the other hand, are a direct effect of algorithmic personalization. The filter bubble argument has dominated the narrative on personalization since the publication of Eli Pariser's influential book on the subject (2011) arguing that algorithmic personalisation only provides the user with more of the same, thus reducing the diversity of information and opinions that people are exposed to and subsequently increasing polarization. Polarization refers to the process of increased segregation into distinct social groups, separated along racial, economic, political, religious, or other lines (Castree et al., 2013). However, how the YouTube algorithm amplifies extremist voices and isolates users into "filter bubbles" pushes them into a polarized situation concerning in such a crucial period (Nikas, 2018).

Current empirical evidence supports a more nuanced view on filter bubbles, and many scholars have criticized the concept and cast doubt on the fear of filter bubbles (Bruns, 2019; Zuiderveen Borgesius et al., 2016). Some studies, however, indicate that YouTube might be an exception. Kaiser and Rauchfleisch (2020) found that YouTube 's algorithms foster highly homophilous communities, while O'Callaghan et al. (2013) found that the algorithms form specifically far-right ideological video bubbles. Yet, more evidence is needed. In general, the issue is difficult to approach because it entails studying many different user experiences, involving various variables challenging to handle and because it is difficult to produce evidence about the algorithmic personalization using the official APIs. Youtube's API only allows researchers to access limited amounts of data and does not provide any information about personalized suggestions, hindering studies of the recommender system itself.

This project aims to create a mixed methodology to investigate echo chambers, filter bubbles, and polarization all at once, highlighting the relation between those phenomena. Our approach involves artificially creating echo chambers on YouTube by controlling a group of users' watching behavior, thus personalizing the users. Using the Youtube Tracking Exposed Tool (YTTREX), we then assess whether the algorithm behind YouTube 's recommender system personalizes these users' search results and recommended videos to an extent where we can talk about the creation of a filter bubble. Finally, we approach the issue of polarization by analyzing the comments of the recommended videos.

3. Research Questions

RQ: Does YouTube ’s algorithm enforce a filter bubble and polarization pattern based on an (artificially generated) echo chamber?

  • Sub-RQ1: Are there differences in the videos suggested as search results across different user types?
  • Sub-RQ2: Are there differences in comments to the videos suggested as search results across different user types?

4. Methodology and initial datasets

The data collection was divided into two distinct phases. In the first phase –the personalization phase– we aimed to simulate echo chambers on YouTube with our watching behavior. To this end, all participants of the project were divided into two groups, each group representing a political orientation in US politics (Conservative/Republican and Progressive/Democrat).The categories of “progressive” or “conservative” of videos and channels were retrieved from the project. Using a clean browser, each user watched six videos from channels considered either progressive or conservative depending on the assigned group, thus personalizing the content that the user gets recommended on the platform. In the second phase – the data collection phase – all users performed three YouTube search queries; “American Elections,” “Coronavirus” and the control query “New Year.” The search results were scraped by the YouTube Tracking Exposed (YTTREX) browser extension, a tool that scrapes metadata from YouTube, such as recommended videos or search results. The tool enables collaborative research. It assembles data collected by all browsers with the extension installed, allowing for comparison by letting the user download a CSV file with all relevant information (Sanna et al., forthcoming).

The data was cleaned and manually coded to allow for thorough analysis. Due to some users' technical issues, the final data set ended up including approximately nine conservative and nine progressive users. The recommended channels were coded according to media type (mainstream/youtube-native/missing link) and political orientation (left/center/right). Different modes of analysis were then conducted on the data: data flow visualizations, statistical analysis, network analysis, and natural language processing (NLP).

To identify possible patterns in personalized information retrieval, we looked into the data flows between specific media channels present on YouTube and the two groups of users. We mapped those associations through data flow visualizations, quantitative and qualitative statistical analyses. We used the free, web-based software RawGraphs in order to visualise the data flows and the proprietary software Tableau in order to produce the charts.

A small-scale statistical, qualitative analysis was carried out to identify what type of content is offered to particular users in video content retrieval (Buckland and Grey, 1994) to infer the extent to which processes of a feedback loop (Adomavicius and Tuzhilin, 2005) can be thought to be emerging or in progress during the particular research. Also, to infer possible implications of such personalized content retrieval processes for a potential polarisation and / or radicalization of individual users within groups.

To visualize clusters of recommended videos according to user type, network diagrams were constructed using the open-source network analysis software Gephi. On all the network graphs, we used Gephi’s Force Atlas 2 algorithm that provides a layout structure for the arrangement of nodes, situating nodes in proximity to other nodes to which they are connected using measures of gravity and repulsion (Bastian et al., 2009).

We used a Python script to retrieve the comment threads (comments and their replies if existing) for each of the videos recommended as a result of our queries. Our intent was to be as close as possible to exhaustivity allowed by the Google API. We found that getting the top 100 threads by relevance, as defined by the YouTube platform, was sufficient. This yielded approximately 400 000 comments for each query result, which were split between videos suggested to the “progressive” browsers and “conservative” browsers. In order to highlight the difference between them, we excluded comments from videos recommended to both sides. For the “American election” query, this amounted to approximately 65 000 comments for the “progressive” suggestions and 130 000 for the “conservative” suggestions.

The unequal number of comments is due to variance in our original experiment. Conservative browsers scrolled further on the query result pages and thus collected more recommendations. The same was found for the “Coronavirus” query, which led us to collect 75 000 comments from the “progressive” suggestions and 150 000 for the “conservative” ones.

Our first analysis consisted of a simple “bag-of-words” approach, computing statistics such as term frequency normalized as a percentage of each text body. We did not correct the size discrepancy between samples during this step, which is to say the conservative benefited from a more precise analysis (larger sample).

Before analysis, comments were submitted to a cleaning process. The text was reduced to lowercase. Sentences and words were tokenized, that is, broken down into the smallest meaningful units or n-grams (the bigram “Donald Trump” would be kept as one token). This allowed us to remove stop-words (e.g. “and,” “as”…) and separate actual text from punctuation signs, emojis, URLS… that might be irrelevant to some analysis like bag-of-words, but relevant to sentiment analysis.

When deemed relevant, we further reduced the terms into their roots, as expressed by the stem or morphological root word (lemma). This was done using specific stemmers or lemmatizers such as spacy or nltk, trained in the English language, or the built-in tools of Python libraries like Vader.

We produced word clouds to illustrate the most common terms for each side and each query. Terms that occur in more than 95% of the comments (e.g., “people” for the American Election query) were removed because they don’t amount to significant insight.

We performed sentiment analysis using the open-source python library Vader, which allows us to gauge the positive, negative, and neutral ratio of comments.

As for the Word clouds that we performed, using all the methods previously explained. Performing the query "american election" we found that the most dominant words s-dominance established by the size of the word in the cloud- for Progressives browser were: Trump, like and Biden, and for the Conservatives: President ,Trump, like, Biden and God.

(Fig 0).

Fig 0: Progressives’ WordCloud on the left in blue, Conservatives’ WordCloud on the right in red.

We have noticed that Conservatives’ cloud fit perfectly the Conservatives narrative elaborated around religious value and Republicanism and the Progressives one reflects more a speech on unity and inclusivity which are again, the core content of their narrative.

As for the CoronaVirus queries, we can say that the narratives on both sides are somehow either polluted by the election queries or the election subject has been deeply entangled with it because the dominant words are mostly the same, at the exception of few minor words.

Fig 0.1: Progressives’ WordCloud on the left in blue, Conservatives’ WordCloud on the right in red.

We aimed to compare the topics that were discussed in the comments of each selection of videos. We modeled these topics using Latent Dirichlet Allocation (Blei et al., 2001). This generative algorithm finds k collections of words (topics) that would best recreate the original document if they were “mixed” in proportions according to a specific probability distribution. This being an unsupervised technique, we ran it several times with different values of k and computed a Topic Coherence measure (Röder et al.) to choose the optimal number of topics. We used the open-source library pyLDAvis to visualize these topics. After lemmatization, conservative and progressive corpora were reduced to a similar size. We modeled the topics twice for each query: we first looked at the comments from side-specific suggestions only, then included comments from videos that were offered to both sides.

We computed the Toxicity Index and the Automated Integrative Complexity - AutoIC (Houck, 2014) (Conway, 2014) of the comments across all queries using a metric defined by Gallacher & Heerdink (2019) and coded in an open-source Python library. Words associated with insults or profanity inside the comments are extremely likely to be classified as toxic. This was done on same-size random samples from the comment section of all exclusively ‘‘conservative’ or ‘‘progressive” suggestions of videos for each one of the queries.

5. Findings

5.1 Network Analysis and Statistical Analysis Results: Personalization/Polarization of Search Results .

Finding 1: The results for the search "american elections" according to media type indicate polarization, as progressive users are recommended mostly mainstream media, while conservative users are recommended a lot of YouTube -native news media.

Figure 1: Top 20 search results of "american elections" across conservative (Trump pseudo) and progressive (Biden pseudo) user groups in terms of media types.

Figure 1 illustrates clusters of recommended videos in the search results for "american elections" in terms of media types. It shows that a large cluster of mainstream media videos in the center of the graph, meaning that videos from mainstream media are suggested to most users. However, we can also see some clusters of videos recommended to the specific user groups. Indeed, progressive users are recommended mostly videos from mainstream media channels, while conservative users are recommended a lot of YouTube -native media channels. Nevertheless, most of YouTube -native videos come from the same channels that the conservative users watched in the personalization process, which means that YouTube is simply recommending conservative users more of the same channels that they have already watched. Overall, the results show personalized results for both progressive and conservative user groups.

This result is replicated in the quantitative statistical analysis of all the results (figure 2) appearing following the particular query.

Figure 2. Statistical Analysis. Proportion of media type per category of users for the “american elections” query.

In the graph for the “american elections” query (figure 3), which represents all results retrieved, we can observe a slim overlap of videos offered to both groups by mainstream media ranging from Sky News, CNN, CBC News Streamed, the Economist etc. Outlets such as “Now This News”, “Democracy Now” and channels by individuals such as Evan Edinger are offered to Biden supporters only, while niche digital media outlets such as “Thinkr”, “Motivation Madness”, “RoTenDO” are offered as results only to Trump supporters.

Figure 3. Visualisation of data flows for the “american elections” query.

Finding 2: The search results for "american elections" in terms of the political orientation of the YouTube channels similarly indicate polarization, as conservative users are recommended more right-leaning channels, while the progressive users are recommended more left-leaning channels.

Figure 4: Top 20 search results of "american elections" across conservative (Trump pseudo) and progressive (Biden pseudo) user groups in terms of political orientations.

Figure 4 shows the clusters of recommended videos in the search results for "american elections" in terms of political orientation of the Youtube channel. It is worth noting that the cluster of videos in the center of the graph is mostly left-leaning and center channels, meaning that many left-leaning and center channels are recommended to all users. Moreover, we can clearly see that conservative users are recommended more right-leaning channels, while the progressive users are recommended more left-leaning channels. Therefore, the results indicate that polarization is happening in the search results for "american elections" on YouTube.

This is also evident in the charts representing the quantitative statistical analysis (figure 5)

Figure 5. Channel orientation of recommended videos per group of users for the “american elections” query.

Finding 3: The search results for "Coronavirus" indicate less polarization in terms of the media type than the previous query for “American Elections”, as mostly mainstream media channels are recommended to both user groups.

Figure 6: Top 20 search results of "Coronavirus" across conservative (Trump pseudo) and progressive (Biden pseudo) user groups in terms of media types

Figure 6 illustrates clusters of recommended videos in the search results of "Coronavirus" coded in terms of media type. The graph shows that mainstream media channels dominate the results for both user groups, as both user groups are recommended almost only mainstream media and almost no youtube-native channels. The progressive user group share a number of same recommendations to mainstream videos among them - the concentrated cluster on the right of the figure - while the conservative user groups are provided with more fragmented search results beyond the ones recommended to all the users from the center cluster, indicating more personalization for this specific group. Based on these results, it can be assumed that YouTube has intentionally made sure that mostly verified information from mainstream channels are recommended to users searching for information on the coronavirus.