Russia, Twitter & Authenticity: Establishing Credibility Metrics

Team Members

Tim Groot - Sophie Minihold - Jessica Robinson - Manuel Schneider - Joanna Sleigh - Dydimus Zengenene

Summary of Key Findings

In this project we first established a systematic way of assessing the features of tweets, accounts, and account activity that enabled the creation of a credibility metric. Second, we found that when this metric is applied to 2017 tweets by accounts believed to be tied to the Russian IRA, high engagement correlates with credible-looking content. Thus, the tweets by Twitter accounts suspected of being from Russia’s IRA are a post-truth achievement in the sense that:

  • At first glance, most tweets look credible.

  • Credibility scores are related to the retweet count.

  • Most credible-looking tweets circulated the most.

1. Introduction

Twitter is an important social network site that users increasingly use as a news source (Kwak, Lee, Park & Moon, 2010). A major area of concern is that alongside information from traditional news sources, mis/dis-information is being spread on Twitter (Lazer, Baum, Benkler, Berinsky, Greenhill, Menczer et al., 2018). One recent example is the disinformation campaign conducted by Russia during the run-up to the election of U.S. President Donald Trump, a topic of significant public discourse (Allcott & Gentzkow, 2017; Knight Foundation, 2019). Concurrently, the term 'fake news' (used to describe a variety of materials, including both misinformation and deliberate hoaxes) has become mainstream (Phillips, 2018). In response, counter-campaigns that range from fact-checking to media literacy have been increasingly implemented at national and grassroots levels. Yet as Marwick (2018) notes, these approaches have not stemmed the flow of so-called 'fake news' and its ilk on Twitter. Given this context, several questions arise: what makes tweets credible, what is meant by credibility, and how can it be assessed?

1.1 Understanding credibility on Twitter

What is credibility? The Oxford English Dictionary (2019) defines 'credibility' as the objective and subjective components of the believability of a source or message. It is both objective, based on facts and evidence, and subjective, based on opinions and feelings. Credibility is also closely related to concepts of trust, quality, authority, and persuasion. The process of establishing credibility entails users making judgements. Some of these judgements are made consciously, after much consideration, while others are more intuitive and based on appearance (Lazar, Meiselwitz and Feng, 2007). Credibility is thus situation-specific and culturally bound.

In the land of social media, specifically in the Twittersphere, information credibility is difficult to judge. This is partly due to the absence of a filtering mechanism that ensures good quality of information (such as the peer review process academic journals use). It is also often impossible to trace information back to a reliable source, such as a newspaper. In addition, tweets are by nature social, with their social value signified by the number of a tweet's retweets and favorites and the user's friend to follower ratio. The credibility of a tweet can therefore be judged not just according to its content, but also by its popularity and the grooming / influence of its author. Some scholars have thus conflated credibility with engagement metrics (Menchen‐Trevino and Hargittai, 2010). In doing so, they acknowledge that the amplification process is not simply organic, but is dictated by Twitter’s algorithm, which facilitates the propagation of heavily engaged-with tweets (Lee, 2014; Patel, 2014; Stein, 2015). Marres (2018) argues that fake-news mitigation strategies to verify ‘The Truth’, such as fact-checking sites, are insufficient in part because they do not address this algorithmic selection process. They therefore do not address a key component of the post-truth climate on a social network site such as Twitter: a claim to authenticity via profile quality and consistency.

1.2 The game of credibility in disinformation campaigns

What does credibility look like in the context of a disinformation campaign? In this project, we were concerned with developing a credibility metric specific to our data set of Twitter accounts associated with Russia’s Internet Research Agency (IRA), accused of propagating 'fake news' and disinformation internationally (Twitter, 2018). We assumed that because successful disinformation agents must also play by Twitter’s rules, they may even 'game' the algorithm in a way that technically makes them ‘real’ or ‘authentic’ and, arguably, credible. We identified features and signals, at both the account and tweet level, that successful disinformation efforts may have in common. Our hope was to come one step closer to a potential near real-time tool to scan for disinformation campaigns more reliably on Twitter in our heightened post-truth climate.

Our approach drew upon the work of Gupta, Kumaraguru, Castillo & Meier (2014), who developed TweetCred, a tool that assigns credibility scores to Twitter content based on various characteristics. We took a similar approach, but built a credibility metric from a specific set of misinformation data.

2. Initial Data Sets

The data set used in this project came directly from Twitter:

On 17 October 2018 Twitter, in a blog post entitled ‘Enabling further research of information operations on Twitter’, released data sets containing: “3,841 accounts affiliated with the IRA, originating in Russia, and 770 other accounts, potentially originating in Iran. They include more than 10 million Tweets and more than 2 million images, GIFs, videos, and Periscope broadcasts, including the earliest on-Twitter activity from accounts connected with these campaigns, dating back to 2009.”

3. Research Questions

  • RQ1: How can the credibility of a tweet and user be determined?

  • RQ2: To what extent can qualitative and quantitative features of tweet text, users, and account activity predict how much a tweet is retweeted?

4. Methodology

The Twitter data sets contained more than 10 million tweets that the company believes were created by accounts connected to the IRA. This data contains information on which tweets received the most favorites and retweets. In addition, the data contains fields on a number of features of each tweet and the user who created it, for example the tweet text, hyperlinks in the tweet, the user’s location and profile description, when the account was created, and other metadata. Our goal in this project was to investigate what made these tweets credible, and indeed whether it is even possible to predict which tweets will be perceived as credible.

4.1 Data collection and sampling

We accessed the data through the Digital Methods Initiative’s Twitter Capture and Analysis Toolset (TCAT). Although the data goes back to 2009, in this project we focused on the data from 2017. This was a period of emerging revelations about misinformation campaigns, when users would theoretically be more on guard. Yet the IRA accounts continued to successfully propagate content on the platform perceived as credible.

To make our sample, we ordered users during this period by retweet count and then selected 15 users to study: five from the top of the list (the most successful), five from the middle, and five from the bottom. The topmost three and bottommost three were deliberately omitted to minimize the outlier effect. Our 15 user accounts comprised 13,371 tweets, from which a random sample of 498 tweets was then chosen for further manual analysis.
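The sampling procedure described above can be sketched in Python as follows. This is a minimal illustration and not the code used in the project: the function names `select_accounts` and `sample_tweets`, the input format (a mapping from user name to retweet count), the way the "middle" group is centred, and the fixed random seed are all assumptions.

```python
import random

def select_accounts(retweets_by_user, trim=3, per_group=5):
    """Pick top, middle and bottom accounts after trimming outliers.

    retweets_by_user: dict mapping a user name to a total retweet count
    (hypothetical input format).
    """
    # Rank users from most to least retweeted.
    ranked = sorted(retweets_by_user, key=retweets_by_user.get, reverse=True)
    # Drop the topmost and bottommost `trim` users to limit outlier effects.
    trimmed = ranked[trim:len(ranked) - trim]
    if len(trimmed) < 3 * per_group:
        raise ValueError("not enough users for three non-overlapping groups")
    # ASSUMPTION: the middle group is centred in the trimmed ranking.
    mid_start = (len(trimmed) - per_group) // 2
    return {
        "top": trimmed[:per_group],
        "middle": trimmed[mid_start:mid_start + per_group],
        "bottom": trimmed[-per_group:],
    }

def sample_tweets(tweets, k=498, seed=0):
    """Draw a reproducible random sample of k tweets for manual coding."""
    rng = random.Random(seed)
    return rng.sample(list(tweets), k)
```

Fixing the seed is a design choice that makes the manual-coding sample reproducible; the report does not state whether the original sample was seeded.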

4.2 The Credibility Metric (RQ1)

Our first research question (RQ1) addresses a methodological challenge: how can credibility in a tweet be measured? To do this we developed a structure that identified discrete features of each tweet, including aspects of its user and metadata, and specified how these features can be evaluated to determine credibility. Based on this and a review of the literature, we decided that credibility could potentially be assessed using characteristics that fall into six categories, referred to as levels of analysis: account level, tweet level, consumption, account activity level, network level and timing level. In this project the levels were synthesized into three broader levels, namely: Tweet, Account, and Activity. (Consumption, as measured by retweets, was used as a point of comparison.) After identifying measurable features pertaining to each level (Tweet, Account, and Activity), these features were divided into two groups:
1) those that could computationally be analysed, and
2) those that could be qualitatively analysed manually.

The divisions between levels of information and types of measurability can be seen in the illustration below, Credibility metrics. A Python-based computation was run to analyze the qualities that could be computed. A web-based questionnaire (Qualtrics) was developed to support the manual analysis. While all 13,371 tweets were analyzed computationally, a random sample of 498 tweets was manually analyzed by three members of the team.

Based on the available data, a metric with 11 features was used for computational analysis and 9 features for the manual analysis. For the manual part, the features were assessed through questions where YES was assigned the value 2, NO the value 1 and NA (not applicable) the value 0. The total score would determine the overall credibility of the analyzed tweet where a higher score would indicate more credibility. As a last step the score was normalized with regard to the occurrence of features within an account or tweet to allow for comparison.
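The manual scoring rule can be illustrated with a short Python sketch. The answer values (YES = 2, NO = 1, NA = 0) come from the description above; everything else is an assumption made for illustration. In particular, the report does not spell out the normalization, so the sketch assumes the total is divided by the best score achievable on the features that actually applied. The function name `credibility_score` and the feature names in the usage note are hypothetical.

```python
ANSWER_VALUES = {"YES": 2, "NO": 1, "NA": 0}

def credibility_score(answers):
    """Return (raw_total, normalized) for one manually coded tweet.

    answers: dict mapping a feature name to "YES", "NO" or "NA".
    """
    values = [ANSWER_VALUES[a.upper()] for a in answers.values()]
    raw_total = sum(values)
    # ASSUMPTION: normalize by the best score achievable on the features
    # that actually applied (non-NA), so that tweets with different numbers
    # of applicable features can be compared.
    applicable = sum(1 for v in values if v > 0)
    if applicable == 0:
        return raw_total, 0.0
    return raw_total, raw_total / (ANSWER_VALUES["YES"] * applicable)
```

For example, `credibility_score({"has_hashtag": "YES", "has_emoji": "NO", "has_link": "NA"})` gives a raw total of 3 and a normalized score of 0.75. Under this assumed normalization, a tweet with at least one applicable feature scores between 0.5 (all NO) and 1.0 (all YES).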

In addition to the discrete features used in the metric, we added a final question to the manual qualitative analysis in which the team member was asked to evaluate the overall credibility of the tweet. We called this 'face value credibility.' This item was added in recognition of the possibility that a tweet could meet all of the other criteria for credibility in our assessment (e.g. contain hashtags and emoji, come from an account with contact information and description, from which the user posts a variety of different types of tweets) and yet clearly be inauthentic or false. This item provided an additional validity check on our metrics.

5. Findings

The test of the metrics showed that the Twitter accounts which had the most followers generally also ranked high on our credibility metric, as seen in the figure below. This indicates that, with regard to RQ2, qualitative and quantitative features can be used to predict the 'credibility' of a tweet. Additionally, the computational and manual analyses found broadly the same trends. However, more tweets seem credible at face value than the metric would suggest, which is the opposite of what we suspected.
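One way to quantify a relationship such as the one between follower count and credibility score is a rank correlation. The sketch below computes Spearman's rank correlation using only the Python standard library. It is an illustration only: the report does not state which statistic, if any, was used, and the function names `average_ranks` and `spearman` are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5
```

A rank-based measure is used here because follower and retweet counts on social media tend to be heavily skewed; with only 15 accounts, any such coefficient would be indicative rather than conclusive.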

We found that a high proportion of the IRA disinformation tweets were credible according to our metric, pointing to how these 'trolls' might have initially gained traction on Twitter and further highlighting the difficulty of countering their efforts successfully. Although this study did not look specifically at the emotional valence or particular subject matter of the tweets, the analysis did turn up some tentative trends. We found that disinformation agents expressing opinions were often considered more credible than those sharing news. One of the most credible tweets according to our metric was the following tweet dated Feb. 26, 2017, sent from the account @TrayneshaCole, whose owner presented herself as an African American woman.

This message linked to another tweet containing photographs of African American women standing up to police during the Civil Rights Era and more recently in Black Lives Matter protests. This tweet rated high on the metric because it came from an authentic-looking account with a high number of followers (25,373) and the tweet contained emoji, emotive punctuation (!), and a hashtag. What wasn’t captured in our metric was that this tweet was also an opinion, not a statement of fact or otherwise easily labelled 'fake news.'

6. Discussion

Based on the high number of tweets that our metric identified as credible, it could be interpreted that the Russian disinformation campaign was a success.

Our results also confirmed that there is a strong relationship between the success of a disinformation agent’s efforts and the extent to which their tweets are retweeted. In other words, an account with more followers gets more retweets. We thus identified that the establishment of a large follower base could be used as a tactic to propagate disinformation.

A direction for future research could be to look at the type of followers an account has, as this could also be a factor influencing a tweet's amplification and thereby our identification of credibility. This is based on the fact that Twitter is a social space, a place where what is important is not just the size of one’s network but also who it is made up of. If one’s network is made up of influential users, a tweet would have a higher level of amplification. The friending of ‘influential’ users to boost the reach of one’s tweets could thus be interpreted as a tactic. Further research could investigate to what extent disinformation agents engage in this 'parasitic behavior.'

Due to limitations of this project, we were also not able to investigate the impact of the content type. However, this is an important factor. How does one measure the credibility of a tweet that expresses an opinion, especially an opinion shared by many others? Active participation in discourse by appropriating or using similar content could thus be used as a tactic to earn followers and popularity, and to mask another agenda.

As an extension to our findings, we also explored a potential gamification of the qualitative face value credibility measurement that would enable larger-scale implementation and generate more data than we were able to within the scope of this project. Looking ahead, gamification would need a suitable means to account for potentially insincere answers, as well as a user experience that is streamlined and intuitive enough for users to utilize it effectively. Should we succeed in this regard, it would open the way to a new credibility analysis paradigm in which the qualitative credibility metric can effectively be crowdsourced.

7. Conclusions

To conclude, in this project we established a systematic way of assessing the features of tweets, accounts, and account activity that enabled the creation of a credibility metric. What we found was that when we applied this metric to the 2017 tweets of accounts believed to be tied to the Russian IRA, there was a correlation between engagement and credible content. Thus, the tweets by Twitter accounts suspected of being from Russia’s IRA are a post-truth achievement.

Our exploration also demonstrates that the friend count and retweet count are related to credibility. In other words, the most credible-looking tweets were retweeted the most, and were tweeted by users with a high friend count. This is interesting as it highlights that post-truth subversive practices exploit the structure of the system, which on social media is based upon engagement and network metrics. However, further research is needed to explore such post-truth subversive practices in more detail.

8. References

Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Journal of Economic Perspectives, 31(2), 211-36.

Castillo, C., Mendoza, M., & Poblete, B. (2011). Information credibility on twitter. Proceedings of the 20th International Conference on World Wide Web - WWW ’11, (May 2014), 675. https://doi.org/10.1145/1963405.1963500

Gupta, A., Kumaraguru, P., Castillo, C., & Meier, P. (2014). TweetCred: Real-Time Credibility Assessment of Content on Twitter. https://doi.org/10.1007/978-3-319-13734-6_16

Gupta, M., Zhao, P., & Han, J. (2012). Evaluating Event Credibility on Twitter. In Proceedings of the 2012 SIAM International Conference on Data Mining. https://doi.org/10.1137/1.9781611972825.14

Knight Foundation. (2019). ‘Seven Ways Misinformation Spread during the 2016 Election’. Knight Foundation. Retrieved from https://knightfoundation.org/articles/seven-ways-misinformation-spread-during-the-2016-election.

Kwak, H., Lee, C., Park, H., & Moon, S. (2010, April). What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web (pp. 591-600).

Kumar, S. (2003). Tweeting is Believing? Understanding Microblog Credibility Perceptions. Neurology India, 51(2), 285–6; author reply 286. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/14571036

Lazar, J., & Meiselwitz, G. (2007). Understanding Web Credibility: A Synthesis of the Research Literature. Foundations and Trends® in Human-Computer Interaction, 1(2), 139–202. https://doi.org/10.1561/1100000007

Lazer, D. M., Baum, M. A., Benkler, Y., Berinsky, A. J., Greenhill, K. M., Menczer, F., ... & Schudson, M. (2018). The science of fake news. Science, 359(6380), 1094-1096.

Lee, Kevan. (3 July 2014). ‘The Science and Psychology of Twitter: Why We Follow and Share’. Social Blog, Retrieved from

Marres, N. (2018). Why we can’t have our facts back. Engaging Science, Technology, and Society, 4, 423-443.

Marwick, A. (2018). Why Do People Share Fake News? A Sociotechnical Model of Media Effects. Georgetown Law Technology Review, 474. Retrieved from https://georgetownlawtechreview.org/why-do-people-share-fake-news-a-sociotechnical-model-of-media-effects/GLTR-07-2018/

Mitra, T., & Gilbert, E. (2015). CREDBANK: A Large-Scale Social Media Corpus With Associated Credibility Annotations. Icwsm, 258–267. https://doi.org/10.1175/1520-0450(1993)032<0948:TLSISS>2.0.CO;2

Mitra, T., Wright, G. P., & Gilbert, E. (2017). A Parsimonious Language Model of Social Media Credibility Across Disparate Events. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing - CSCW ’17, 126–145. https://doi.org/10.1145/2998181.2998351

Menchen‐Trevino, E., & Hargittai, E. (2011). Young Adults’ Credibility Assessment of Wikipedia. Information, Communication and Society, 14(1), 1–25.

Oxford English Dictionary (2018). Retrieved from http://www.oed.com.

Patel, Neil. (7 Aug. 2014). 10 Twitter Tactics to Increase Your Engagement. Social Media Marketing | Social Media Examiner, Retrieved from https://www.socialmediaexaminer.com/twitter-tactics-to-increase-engagement/.

Phillips, W. (2018). Part 1: In Their Own Words: Trolling, Meme Culture, and Journalists’ Reflections on the 2016 US Presidential Election. The Oxygen of Amplification: Better Practices for Reporting on Extremists, Antagonists, and Manipulators Online. Data & Society.

Stein, Jaime. (2 Feb. 2015). How We Increased Our Twitter Engagement Rate by 180% in Two Months. Hootsuite Social Media Management, Retrieved from https://blog.hootsuite.com/how-to-increase-twitter-engagement/.

Twitter. (2018). Elections integrity: Twitter’s focus is on a healthy public conversation. [Data set]. Retrieved from https://about.twitter.com/en_us/values/elections-integrity.html#data.

Zubiaga, A., & Ji, H. (2014). Tweet, but verify: epistemic study of information verification on Twitter. Social Network Analysis and Mining, 4(1), 1–12. https://doi.org/10.1007/s13278-014-0163-y
Topic revision: r11 - 18 Jan 2019, JoannaSleigh