Within the context of Information Disorders (i.e., mis-, mal-, and disinformation), considerable attention has been directed in recent years to the concept of narratives, understood as the framework that conveys a particular understanding or interpretation of an event expressed through a story. Virtually all organizations involved in fact-checking or counter-disinformation have shifted their activity towards a narrative-centered approach, both to offer more comprehensive debunking to their external audience and to improve detection and pre-bunking capabilities within their internal operations.
The use of narratives has shown promising results, but some limitations prevent us from harnessing the full benefit of this approach. One significant limitation is the intangible nature of narratives, often due to the lack of a standardized definition, which relegates them to a qualitative role. While they offer a valuable method for analyzing the ever-increasing amount of information on the internet, they still rely heavily on domain-specific knowledge from human researchers. This dependency limits our capacity for automation and introduces the potential for biased assessments.
To bridge these gaps and provide a more objective and scalable approach, a quantitative dimension must be added. One way to achieve this result is to introduce elements of Narrative Intelligence, a methodology that intersects with OSINT processes and brings a data-first approach to exploring narratives.
The original dataset was collected on behalf of Sustainable Cooperation for Peace & Security (SCPS), an NGO focused on peacebuilding activities.
Data was collected from:
News Articles: 100k articles from Italy, Germany, the Netherlands, Romania, and Georgia, focusing on political topics one month before the European Elections.
Telegram Channels: 30 channels linked to Kremlin propaganda and Russian military bloggers. All content was originally in various languages and translated to English.
Objective: To validate the efficacy of Narrative Intelligence methodologies in extracting and analysing narratives from news articles and Telegram channels.
Are there patterns that can be observed within the dataset(s)?
Are there patterns that can be correlated with external sources, world events, or other significant circumstances?
How could the narratives, or some of their components, be visualised?
How could similar narratives be identified and clustered?
In Narrative Intelligence, a narrative is defined by a set of fundamental components (characters, setting, plot, conflict, themes, message, point of view) and enriched by contextual metadata (sentiment score, keywords, disambiguated entities) to provide a predictable, consistent, and machine-readable structure to the data.
While Narrative Intelligence was traditionally confined to conventional Machine Learning techniques, it now stands to gain significantly from the advent of Large Language Models, which offer unprecedented opportunities for extracting narratives from massive amounts of data. LLMs have demonstrated their efficacy in extracting such information when given a specific set of prompts with clear instructions on how a narrative is defined.
Therefore, we have decided to put this methodology to the test, applying it to a body of around 100k news articles in the month preceding the European Elections, focusing on political topics sourced from newspapers and news agencies in Italy, Germany, the Netherlands, Romania, and Georgia. We have applied this same methodology to 30 Telegram channels directly or indirectly linked to Kremlin propaganda, as well as Russian mil-bloggers. The two datasets resulting from this experiment will serve as the foundation for this project, aiming to validate the efficacy of Narrative Intelligence methodologies and explore potential use cases of a data-first, quantitative approach towards narratives.
Taking inspiration from the body of literature pertaining to the fields of communication studies, StratComm, and narratology, the schema of a narrative can be defined as follows:
characters: the entities who are protagonists of the story;
setting: time and place where the story takes place;
plot: beginning, middle, and end of an event’s sequence;
conflict: the point of divergence, or problem, from which the story develops; these narratives are likely to inherit the authority of true belief and preclude alternative ways of thinking about political or strategic narratives.
themes: the subjects that are explicitly or inherently mentioned in the story; themes can be described as the lexical ties between messages.
message: the moral of the story, or persuasion points, directed towards the reader; this incorporates a notion of the goals and ideological biases held by characters in a narrative and the various means they have to accomplish these goals.
pov: the point of view of the narrator, including invisible biases.
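The schema above can be sketched as a small data structure. This is a minimal illustration, not the project's actual schema: the field types, the metadata field names (`sentiment_score`, `keywords`, `entities`), and the `parse_narrative` helper are assumptions, and the mapping of the Brescia example onto individual components is our own reading of that text.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Narrative:
    """One extracted narrative; the first seven fields mirror the schema components."""
    characters: list[str]
    setting: str
    plot: str
    conflict: str
    themes: list[str]
    message: str
    pov: str
    # Contextual metadata. Names and types are assumptions made for this sketch.
    sentiment_score: float = 0.0
    keywords: list[str] = field(default_factory=list)
    entities: list[str] = field(default_factory=list)


REQUIRED_FIELDS = ("characters", "setting", "plot", "conflict", "themes", "message", "pov")


def parse_narrative(raw: str) -> Narrative:
    """Parse a JSON string (for example an LLM response) into a Narrative.

    Raises ValueError when a required component is missing, so that malformed
    extractions are caught before they enter the dataset.
    """
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing narrative components: {missing}")
    # Ignore keys that are not part of the schema instead of failing on them.
    known = {k: v for k, v in data.items() if k in Narrative.__dataclass_fields__}
    return Narrative(**known)


# Example built from the Brescia text quoted later in this report.
sample = json.dumps({
    "characters": ["President of the Republic Sergio Mattarella", "Brescian civil society"],
    "setting": "50 years ago, Brescia, Piazza della Loggia",
    "plot": "Massacre of Brescia followed by intimidation and neo-fascist attacks",
    "conflict": "Struggle against terrorism, hatred, and violence",
    "themes": ["Terrorism", "Unity", "Freedom", "Democracy", "Peace", "Justice"],
    "message": "Rejecting and isolating preachers of hatred, working for unity and peace",
    "pov": "President of the Republic, Sergio Mattarella",
})
narrative = parse_narrative(sample)
```

Validating each record against a fixed structure is what makes the data predictable and machine-readable: every downstream visualization can rely on the same fields being present.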
The research was separated into two parts.
The web dataset was categorized according to the different narrative components. We then used different visualization techniques to explore what these data could mean.
The data was visualized in the following ways:
Characters: people, places, and things
Relating biased sources to characters
Scatterplot of source bias and credibility per character
Setting
Plot: beginning, middle, and end
Conflict
Themes: 30 by occurrence
Italy, Germany, Netherlands, and Romania
Line Charts
Message: Embeddings with Sentence Transformer LLM
Reducing to 2 dimensions with UMAP
Manually identifying messages on scatterplot
Density plots per country
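The theme steps in the list above (top themes by occurrence, then line charts per country) can be approximated with a simple counting pass. This is a hedged sketch rather than the code used in the project: the record keys `country`, `date`, and `themes` are assumed names, the normalisation rule is our own choice, and the three records are toy data invented purely to show the shapes involved.

```python
from collections import Counter, defaultdict


def normalise(theme: str) -> str:
    """Lower-case and trim a theme label so spelling variants are counted together."""
    return theme.strip().lower()


def top_themes(records, n=30):
    """Return the n most frequent themes, counting each theme once per article."""
    counts = Counter()
    for record in records:
        counts.update({normalise(t) for t in record["themes"]})
    return counts.most_common(n)


def daily_theme_counts(records, theme):
    """Count articles per country and date that mention a theme.

    The result maps each country to a date-sorted list of (date, count) pairs,
    which is the series a line chart would plot.
    """
    wanted = normalise(theme)
    series = defaultdict(Counter)
    for record in records:
        if wanted in {normalise(t) for t in record["themes"]}:
            series[record["country"]][record["date"]] += 1
    return {country: sorted(days.items()) for country, days in series.items()}


# Toy records (hypothetical, for illustration only).
records = [
    {"country": "Italy", "date": "2024-05-10", "themes": ["Democracy", "Migration"]},
    {"country": "Italy", "date": "2024-05-10", "themes": ["democracy "]},
    {"country": "Germany", "date": "2024-05-11", "themes": ["War", "Democracy"]},
]
ranking = top_themes(records, n=30)
democracy_series = daily_theme_counts(records, "Democracy")
```

Counting a theme once per article keeps a single long article from dominating the ranking; counting every mention instead would be an equally defensible choice.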
A JSON sample of the narrative schema is available here. Adopting this schema, we have scraped and processed around 50k news articles in the 30 days up to the 2024 European Elections, focusing on political coverage from Italy, Germany, Romania, and the Netherlands. A sample of this dataset is available here. The same narrative extraction was performed on 30 Russian channels on Telegram in the same time span, resulting in a dataset of around 6k messages. Full access to both datasets will be offered for the duration of the project.
Part two of the group’s methodology was more experimental and used generative AI. The experiment had two separate elements: the first attempted to depict the main characters per country, and the second aimed to pull in and visualize multiple perspectives on a theme. In Depicting Main Characters Per Country, generative AI was used to visualize the narrative around the characters: we used web articles about Meloni, Wilders, and Scholz and created images that represent the narrative within the text.
The second experiment analyzed the themes and how they support multiple political perspectives. To surface different thematic narratives, DEMOCRACY was selected as a core theme, and one random article per day about that theme was drawn from the selected countries. This web article data was then visualised using Stable Diffusion.
Original Text:
“Struggle against terrorism, hatred, and violence in the face of threats to freedom and democracy.
Massacre of Brescia followed by plagues, intimidation, neo-fascist attacks. Response of Brescian civil society against threats and violence, black terrorism raising criminal action. Rejecting and isolating preachers of hatred, living the principles and values of the Constitution, working for unity and peace”
Prompt:
Subject: President of the Republic Sergio Mattarella, Brescian civil society, attackers, preachers of hatred, operators of mystification, sowers of discord, citizens
Medium: PHOTO-REALISTIC
Environment: 50 years ago, Brescia, Piazza della Loggia
Mood: very negative about Terrorism, Unity, Freedom, Democracy, Peace, Justice
Composition: Point of view of President of the Republic, Sergio Mattarella
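The prompt above follows a fixed template that maps narrative components onto image-prompt fields. A minimal sketch of how such a prompt could be assembled is shown below. The function name `build_image_prompt` and its parameters are hypothetical, and the sketch assumes that the mood line combines a sentiment label with the list of themes. Sending the resulting string to Stable Diffusion is outside the scope of this example.

```python
def build_image_prompt(characters, setting, sentiment, themes, pov, medium="PHOTO-REALISTIC"):
    """Assemble a text-to-image prompt from narrative components.

    characters -> Subject, setting -> Environment,
    sentiment + themes -> Mood, pov -> Composition.
    """
    lines = [
        f"Subject: {', '.join(characters)}",
        f"Medium: {medium}",
        f"Environment: {setting}",
        f"Mood: {sentiment} about {', '.join(themes)}",
        f"Composition: Point of view of {pov}",
    ]
    return "\n".join(lines)


# Values taken from the example prompt in this report.
prompt = build_image_prompt(
    characters=[
        "President of the Republic Sergio Mattarella",
        "Brescian civil society",
        "attackers",
        "preachers of hatred",
        "operators of mystification",
        "sowers of discord",
        "citizens",
    ],
    setting="50 years ago, Brescia, Piazza della Loggia",
    sentiment="very negative",
    themes=["Terrorism", "Unity", "Freedom", "Democracy", "Peace", "Justice"],
    pov="President of the Republic, Sergio Mattarella",
)
print(prompt)
```

Keeping the template in one function means every article is rendered with the same prompt structure, so differences between the generated images are more likely to reflect the narratives than the prompt wording.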
The main themes associated with more than one country were climate change, conflict, corruption, and democracy. Preliminary findings of this project suggest that issues of public salience span the European Union.
DEPICTING MAIN CHARACTERS PER COUNTRY
Visualization
Within the sample visualization, the AI art style showed a notable tendency towards the cartoonization of political figures. Different characters had different themes and interesting quirks integrated into each visualization set.
Geert Wilders
Surrounded by other characters in his representation
More dynamic expressions
Least ‘real’; the most cartoon-like of the set
Olaf Scholz
Most ‘real’ and neutral
Muted colours
Giorgia Meloni
Surrounded by faceless crowd
Not shown with other women
Only one presented in a more intimate, private environment
All countries chosen had a combination of hyper real and cartoon imagery
Combination of black-and-white and colour representations
Individual character in each - strong ties to one or two political figures
Germany
Images show themes of war, voting, international relations
Romania
Trump, war,
Italy
European flags, Migration, war, Giorgia Meloni
Our findings are the start of a larger body of research that needs to be done to benefit the Cyber-Peacebuilding (SCPS) program dedicated to the research of Information Disorders, with a focus on narratological methods and OSINT, to provide new investigative tools for researchers and cyber responders. Our work on cleaning and understanding 2024 EU election data is a first attempt at setting a new data-driven framework for conducting investigations into narratives, testing a quantitative approach to narrative analysis, and pinpointing a more exact definition of the terminology. This modelling of narratives attempted to establish an opinionated model for future cross-integration of datasets related to mis-, dis-, or malinformation.
Understanding this data set means understanding its biases and discussing its validity. There are many intersectional identities at play within this data, many demographics that situate the individual producer of messages within their respective countries' political climates, and, in turn, the mosaic that shapes the EU and Fortress Europe. The way that the data was collected deeply impacted our research questions and results. We produced visualizations that allow us to identify the relations between different narratives and their changes through time.
This project and, indeed, data science as a whole can be approached from a play standpoint. This analysis is based in part on the reflection of how we approached our original unruly data set. We experimented with the data via different cleaning methods and then through many visualizations. Our goal was ultimately to make the data more intuitive and clean, which unfortunately took the majority of our time.
The data was, in part, explored using narrative analysis. This was done to understand how narratives around the EU election were constructed, in particular which characters engaged with which themes. Contextualizing these characters within their recurring motifs gives us insight into the meanings people ascribe to political leaders and events, the performativity behind political values, and the general hopes and fears that drive people to vote in elections. It also helps provide examples of normative right-wing versus left-wing politics. A brief critique of the narrative framework is that it may induce narrative fatigue and that placing people’s data into narrative frameworks could be reductive. That said, it is still valuable: narrative enquiry can help us understand the human condition. Specifically, narrative methods are useful for understanding the past retrospectively and for connecting actions to their implications.
Although we were mainly doing thematic analysis, the research could be expanded to trace the change of political thought over the sample period, not just theme occurrence in Germany. As researchers, we were not looking for the facts, or even the objectivity of the data. Ultimately, given the condition the data was in when it was assigned to us, our goal was to create a usable data set. In the future, this data could be placed in conversation with research into fake news to understand the impact of fake news, or to cross-verify where people are getting their news and political commentary from.
This project has demonstrated the potential of integrating Narrative Intelligence methodologies with Large Language Models (LLMs) to systematically analyze and understand political narratives surrounding the 2024 European Elections. By employing a quantitative approach, we have been able to dissect and visualize the complex web of narratives present in a substantial dataset of news articles and Telegram messages. Our analysis has highlighted the recurring motifs and themes that shape public perception of political leaders and events, providing insights into the performative nature of political values and the underlying hopes and fears that influence voter behavior.
The findings underscore the importance of narratives in the context of information disorders, particularly in the realm of mis-, mal-, and disinformation. The project's approach offers a scalable and objective framework for narrative analysis, which could be instrumental in enhancing fact-checking and counter-disinformation efforts. However, the reliance on narrative frameworks also poses challenges, such as potential narrative fatigue and the reductive nature of categorizing complex human data into predefined schemas.
Despite these challenges, narrative inquiry remains a valuable tool for understanding the human condition and the evolution of political thought. The project's methodologies and findings lay the groundwork for future research, particularly in exploring the intersection of narratives with fake news and the sources of political commentary. By refining and expanding upon this research, we can continue to develop innovative tools and strategies for addressing the challenges posed by information disorders in the digital age.