Recommender systems often show users ‘more like this’. But if a user likes something, which features should you actually consider to find new, similar items? This is an ongoing challenge, including in the news domain. The goal of this study was not to optimize similarity-based functions but to evaluate how existing metrics perform in the domain of news articles. For example, news recommendation systems may require metrics that depend less on personal taste than those for movies or recipes.

Few studies have explored how users evaluate the recommendations they receive. Associate Professor Alain Starke, Vegard R. Solberg, Sebastian Øverhaug and Professor Christoph Trattner conducted three studies to understand how well similarity functions work in news recommender systems, focusing on human impressions.

Perception varies depending on the type of news

In the first study, participants assessed the overall similarity between randomly paired political news articles. In the second study, participants were asked to write down the criteria they considered most important in news recommendations, recognizing potential differences based on article type. The third study repeated the first but used news articles matched by topic, date, and named entities.

The studies found that both topic and named entities better represented similarity judgments, with stronger correlations between similarity judgments and similarity scores when articles were matched on these features. In contrast, matching based on date had little impact, suggesting this strategy may not be as effective for news recommendation. Overall, user perceptions of similarity in the news domain are best reflected by relying on the body text of articles, with topical matching having a stronger influence on recent events and named entities being more critical for sports news. To better reflect user perceptions, the researchers recommend that content-based news recommender systems should focus on the body text, with support from image embeddings, article categories, and the author.

The results also show that retrieving similar items is effective for news recommendations, though participants’ perceptions of similarity varied depending on whether the reference article was about sports or recent events. The strength of the correlation between human similarity judgments and the similarity scores produced by feature-specific functions was strongly affected by how news article pairs were matched.
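To make the comparison between human judgments and similarity scores more concrete, here is a minimal Python sketch. It is not the method used in the studies: it assumes a simple bag-of-words cosine similarity on body text as a stand-in for a feature-specific similarity function, and it uses invented article snippets and invented human ratings purely for illustration. The function names `cosine_similarity` and `pearson` are hypothetical helpers written for this example.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity between two texts (0.0 to 1.0)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Counter returns 0 for words missing from the other text.
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented example data (assumption): body-text snippets for three article
# pairs, plus made-up human similarity ratings on a 1-5 scale.
article_pairs = [
    ("the team won the cup final", "the team lost the cup final"),
    ("parliament passed the new budget", "the team won the cup final"),
    ("parliament debated the budget", "parliament passed the new budget"),
]
human_ratings = [4.5, 1.0, 4.0]

scores = [cosine_similarity(a, b) for a, b in article_pairs]
print("similarity scores:", [round(s, 2) for s in scores])
print("correlation with human ratings:", round(pearson(scores, human_ratings), 2))
```

A higher correlation would suggest that the chosen similarity function tracks what readers perceive as similar. With real data, the same comparison could be repeated for each feature (for example title, body text, or category) to see which one best reflects human judgments.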

New metrics based on LLMs

This paper laid the foundation for the paper “Shaping the Future of Content-based News Recommenders: Insights from Evaluating Feature-Specific Similarity Metrics” by Daniel Rosnes, Associate Professor Alain Starke, and Professor Christoph Trattner. Their study focused on evaluating similarity metrics by analyzing how people judge the similarity between national and local news. They compared new metrics based on large language models with traditional ones, explored differences in similarity judgments between national and local news, and identified the most effective content-based strategies for news recommendations.

The paper “Examining the Merits of Feature-specific Similarity Functions in the News Domain using Human Judgments” can be found here.