Making news content analysis more multi-dimensional

Imagine trying to make sense of a large volume of news coverage on a single topic. Tools like Google’s GDELT help organize news, but they often fall short of giving researchers and journalists a deeper understanding of individual news events. In response, MediaFutures WP3 co-leader Fazle Rabbi, along with PhD candidate Bahareh Fatemi and professors Yngve Lamo and Andreas Lothe Opdahl, developed a model-based framework for content analysis that focuses on monitoring how news progresses by offering multiple perspectives on it.

This new system differs from conventional news analysis methods such as text mining and semantic technologies. Its multi-dimensional model provides a mechanism for comparative analysis, helping journalists understand the complexities of news stories by capturing tone, exploring different reporting angles, and tracing how articles are framed over time.

Four methods to analyze news

Their framework comprises various techniques, such as perspective comparison, progression of events, and variant analysis of news events, offering journalists a significantly deeper overview. What sets it apart is its high level of abstraction and the diverse set of dimensions available for user interaction. By employing pre-trained Large Language Models, they extract information from the event chain generated by Google’s GDELT project. These news events are organized using a dimensional meta-model based on location, event type, individuals, and country. Rabbi et al. showcased the effectiveness of their model by analyzing news about Niger and Gabon published in six selected newspapers over a period following the start of the coups in those countries.
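To make the idea of a dimensional organization more concrete, here is a minimal sketch of how events might be grouped along the four dimensions mentioned above. It is not the authors' implementation: the `NewsEvent` class, the `group_by` helper, and every record in `EVENTS` are hypothetical and invented purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsEvent:
    """Hypothetical event record; field names follow the four dimensions
    named in the article (location, event type, individuals, country)."""
    event_id: str
    source: str
    country: str
    location: str
    event_type: str
    individuals: tuple


def group_by(events, dimension):
    """Index event ids by the values of one dimension.

    Multi-valued dimensions (tuples, e.g. individuals) place the event
    under each of their values.
    """
    groups = defaultdict(list)
    for event in events:
        value = getattr(event, dimension)
        values = value if isinstance(value, tuple) else (value,)
        for v in values:
            groups[v].append(event.event_id)
    return dict(groups)


# Invented placeholder records, not data from the paper.
EVENTS = [
    NewsEvent("e1", "Outlet A", "Niger", "City 1", "coup", ("Person X",)),
    NewsEvent("e2", "Outlet B", "Niger", "City 1", "protest", ("Person X", "Person Y")),
    NewsEvent("e3", "Outlet A", "Gabon", "City 2", "coup", ("Person Z",)),
]

by_country = group_by(EVENTS, "country")
by_person = group_by(EVENTS, "individuals")
print(by_country)
print(by_person)
```

Switching the `dimension` argument gives a different view of the same events, which is the kind of user-driven exploration the dimensional meta-model is described as supporting.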

News events have been labeled using IPTC metadata standards for media topics. By combining attributes and relationships with domain knowledge in IPTC Media Topics, users can extract different perspectives from the knowledge graph. This framework integrates a computational model based on category theory, enabling the analysis of news events at a higher abstraction level. This includes capabilities such as comparing and categorizing events and analyzing the progression of events.

While Google GDELT and our model-based framework both address the challenge of handling a vast amount of news data, there are practical differences in their approaches and applications. Our framework relies on a novel application of category theory and knowledge graph for analyzing events in a transparent manner.

Fazle Rabbi

The framework operates through four methods:

Applying natural language processing techniques

Structuring news with a dimensional meta-model

Comparing content using a category theory approach

Conducting statistical analysis on article variants
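As a rough illustration of what comparing perspectives across outlets could look like, the sketch below contrasts two outlets by the event types they cover and by their average tone. It uses plain set operations and a mean as a simple stand-in; it does not reproduce the category theory approach from the paper. The `profile` and `compare` functions, the outlet names, and all tone values are assumptions made up for this example.

```python
from statistics import mean

# Invented records: (outlet, event_type, tone score). Not data from the paper.
ARTICLES = [
    ("Outlet A", "coup", -4.0),
    ("Outlet A", "sanctions", -2.0),
    ("Outlet B", "coup", -1.0),
    ("Outlet B", "protest", 1.0),
]


def profile(articles, outlet):
    """Summarize one outlet: which event types it covers and its mean tone."""
    rows = [(etype, tone) for o, etype, tone in articles if o == outlet]
    return {
        "event_types": {etype for etype, _ in rows},
        "mean_tone": mean(tone for _, tone in rows),
    }


def compare(articles, outlet_a, outlet_b):
    """Contrast two outlets by coverage overlap and difference in mean tone."""
    pa, pb = profile(articles, outlet_a), profile(articles, outlet_b)
    return {
        "shared": pa["event_types"] & pb["event_types"],
        "only_a": pa["event_types"] - pb["event_types"],
        "only_b": pb["event_types"] - pa["event_types"],
        "tone_gap": pa["mean_tone"] - pb["mean_tone"],
    }


result = compare(ARTICLES, "Outlet A", "Outlet B")
print(result)
```

A negative `tone_gap` here would mean the first outlet's coverage is, on average, more negative than the second's under whatever tone measure is used.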

Advancing computational journalism

Their model and framework serve multiple purposes, such as aiding multilingual news comparison and allowing users to compare perspectives from different news outlets over time. This comparison feature is made possible by the way the framework organizes information in a knowledge graph. The system also has the potential to contribute to fact-checking and the verification of news sources, thereby strengthening the credibility of reporting, and the model can be used to analyze bias and framing within news articles.

The application of category theory for analyzing events stored in a knowledge graph contributes to advancing the theoretical foundations of computational journalism.

Fazle Rabbi

Their framework offers a comprehensive structure for navigating news content, ensuring transparency and facilitating deeper comprehension across different dimensions and abstraction levels. The approach stands out by working at a high level of abstraction while giving users the flexibility to explore diverse dimensions. Unlike standard large language models (LLMs), their method also incorporates statistical analysis, enabling the discovery of intricate patterns and insights in news content.

We believe that the integration of generative AI and category theory can contribute to the evolution of journalism in the digital age, fostering transparency, accountability, and enriched news content for both journalists and readers.

Quote from the paper

WP3 is all about Media Content Production & Analysis

MediaFutures WP3 develops novel tools for computational journalism, aiming to produce high-quality generated content that is both trustworthy and engaging, as well as fact-checking software. The paper “A model-based framework for NEWS content analysis”, written by Fazle Rabbi (UiB), Bahareh Fatemi (UiB), Yngve Lamo (HVL), and Andreas Lothe Opdahl (UiB), has now been accepted to the 12th International Conference on Model-Based Software and Systems Engineering 2024.

Rabbi, F., Fatemi, B., Lamo, Y., & Opdahl, A. L. (2024). A model-based framework for NEWS content analysis. 12th International Conference on Model-Based Software and Systems Engineering.