The solution to filter bubbles? Artificial Intelligence!

22 June 2020

News coverage of the coronavirus pandemic has once again underlined the fact that in recent years we have increasingly started to live in our own filter bubble. Since algorithms on social media determine what we see in our newsfeed and traditional media are also putting a greater emphasis on personalization, people only get to see information that confirms their view of the world, which in turn could lead to more polarization in society. Tommaso Caselli, Assistant Professor of Computational Semantics at UG, and the Data Science team of the Centre for Information Technology (CIT) want to break our filter bubbles using... an algorithm.

Author: Jorn Lelong

When we talk about filter bubbles, we automatically think of social media. Here, algorithms determine which posts we get to see, based on the pages we, or our friends, like. In this way, Facebook, Instagram and Twitter create a personalized bubble for each of us, containing news and information that fits our profile.

Breaking filter bubbles

But this personalization is not limited to social media. For example, Blendle recommends articles based on interests or previously read articles. In recent years, traditional media have also been experimenting more and more with personalized newsletters, for example, in an attempt to retain readers. What consequences does that have for the way we read about the world? It is precisely this question that assistant professor Tommaso Caselli of the University of Groningen wants to try to answer with his project ‘Breaking filter bubbles’. A unique feature of the project is that it is an interdisciplinary collaboration: not only is Marcel Broersma, Professor of Media and Journalistic Culture, involved in the project, but Caselli is also being assisted by the CIT’s Data Science team.

A corpulent corpus

As an Assistant Professor of Computational Linguistics, Caselli has been developing algorithms to analyze linguistic phenomena for some time now. ‘In a previous project as a postdoc at the VU Amsterdam I examined how ten different sources told the same story. I wanted to expand on that project with news reports, but on a large scale.’ And you can take that literally. In this project, data scientist Dimitrios Soudis of the CIT can sink his teeth into a New York Times archive spanning two decades. To make it a little more manageable, for the time being they are limiting themselves to news articles about natural disasters. ‘These are relatively simple stories,’ says Caselli. ‘Later on, we also want to look at crime and political reporting, but first we need to get a clear picture of how news stories work.’ It would be impossible to retrieve all the reports about natural disasters from the enormous New York Times archive manually. So to do that, Soudis is using his expertise in Artificial Intelligence (AI). ‘We go to the NY Times website and search for articles tagged with the category “earthquakes”, for example. We download those articles, and use a mathematical model to calculate which articles in our corpus correspond with them.’
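The article does not say which mathematical model the team uses. As an illustration only, the idea of matching corpus articles against a set of tagged "seed" articles can be sketched with TF-IDF weighting and cosine similarity, a common baseline for this kind of task. All function names and the example texts below are hypothetical and are not taken from the project.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency, so no weight is zero.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(seed_articles, corpus):
    """Rank corpus articles by similarity to the summed seed vector.

    Returns (score, corpus_index) pairs, most similar first.
    """
    vectors = tfidf_vectors(seed_articles + corpus)
    seed_vecs = vectors[:len(seed_articles)]
    corpus_vecs = vectors[len(seed_articles):]
    # Sum the seed vectors; cosine similarity ignores the overall scale.
    centroid = Counter()
    for vec in seed_vecs:
        for term, weight in vec.items():
            centroid[term] += weight
    scores = [(cosine(centroid, vec), i) for i, vec in enumerate(corpus_vecs)]
    return sorted(scores, reverse=True)
```

In this sketch, the seed articles play the role of the texts tagged “earthquakes” on the website, and the ranking suggests which archive articles are likely about the same topic. A real pipeline would still need a similarity threshold and manual spot checks.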

Language still proves tricky

That’s the article collection part of the project. But according to Soudis, the real challenge of the project lies in the fact that language is not an exact science. ‘Algorithms still have a hard time understanding language. Just think of how often words like whirlwind, tsunami or landslide are used figuratively. So we have to filter them out.’ Computers are not yet able to understand language at a deeper level, so as a data scientist you have to be creative. Dimitrios Soudis came up with the idea of using frequencies. ‘We are creating a kind of dictionary containing terms related to natural disasters. Then we look at how often certain words appear in the selected articles and, more importantly, we look at the words with which they are used in a sentence. This allows us to unravel syntactic relationships between words.’
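A minimal sketch of the frequency idea described above, assuming a tiny hand-made dictionary and plain sentence-level co-occurrence counts. The term list, function names and example text are hypothetical. Counting neighbouring words is only a rough proxy: it does not parse sentences, so it does not recover syntactic relationships in the strict sense.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary of disaster-related terms (an assumption,
# not the project's actual word list).
DISASTER_TERMS = {"earthquake", "tsunami", "landslide", "whirlwind"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def term_frequencies(text, terms=DISASTER_TERMS):
    """Count how often each dictionary term occurs in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token in terms)


def cooccurrences(text, terms=DISASTER_TERMS):
    """For each dictionary term, count the other words in its sentences."""
    result = {}
    for sentence in split_sentences(text):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for term in terms.intersection(tokens):
            neighbours = result.setdefault(term, Counter())
            neighbours.update(t for t in tokens if t != term)
    return result
```

Words that keep company with “whirlwind” in sentences about careers or romances, rather than wind speeds or damage, hint at figurative use, which is the kind of signal that could help filter such articles out.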

Reconstructing patterns

This method allows them to look at news reports in a new way. Instead of looking at how an event is reported in individual articles, Caselli and Soudis are investigating the underlying patterns that appear in all news reports about natural disasters. ‘Regardless of a journalist’s individual style, news reports are written according to certain templates. When reporting on an earthquake, for example, you often mention the size of the earthquake, its location and the number of victims, or how the emergency services responded. That’s what we’re trying to reconstruct.’

From small to large

According to Caselli, once that model is up and running, it’s just a matter of expanding the corpus. ‘You have to start small, but eventually we want to see how other media report on the same event. Which sources do they mention, and which information do they provide first or possibly leave out? We can then examine the differences between tabloid and traditional media in our research, as well as social media.’

Google News 2.0

However, Caselli believes we will never completely break down these filter bubbles. ‘People will keep using and visiting the media they trust. So those filter bubbles are also of your own making. But you have to give people the opportunity to get the complete picture and make their own decisions. So we want to provide an overview: these are the facts, and these are the different perspectives in the media. A Google News 2.0, if you will.’

Last modified: 17 December 2024 08.57 a.m.
