A new study co-authored by MediaFutures’ researcher Samia Touileb shows that some popular Norwegian, Danish, and Swedish language models generate harmful and gender-biased stereotypes. The finding runs counter to general expectations of gender equality in the Scandinavian countries.

Large-scale pre-trained models have become the backbone of most Natural Language Processing (NLP) systems. While they have enabled new breakthroughs and high accuracy across many natural language tasks, research shows that pre-trained language models can exhibit and reinforce biased representations.

 “Language models have proven useful for a broad variety of language processing tasks. Their easy accessibility broadens their usage and permits everybody to use them to solve a plethora of tasks. However, there is evidence that those models generate and can even amplify some of the biases in the datasets they were trained on, reinforcing representational and stereotypical harms,” says MediaFutures’ researcher Samia Touileb.

The extent to which harmful and toxic content exists in popular pre-trained language models (PLMs) has been actively researched, especially for English models. Few works, however, have focused on the Scandinavian languages. The recent work by Touileb and Nozza adds to the latter.

First analysis of this type for Scandinavian languages

In the new study, the researchers examine the harmfulness and toxicity of nine Scandinavian pre-trained language models.

Touileb explains: “In our recent work we focus on the three Scandinavian countries of Denmark, Norway, and Sweden. This is in part due to the cultural similarities between these countries and their general perception as belonging to the ‘Nordic exceptionalism’ (Kirkebø et al., 2021), and as leading in gender equality.”

The researchers focused on sentence completions of neutral templates with female and male subjects. This is the first analysis of this type in Scandinavia.

“We selected the most downloaded and used models on the Hugging Face library. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes. In our analysis we also saw that for all models, sentences about women are more toxic than sentences about men,” adds Touileb.
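To make the probing approach concrete, here is a minimal sketch of template-based sentence completion with a masked language model from the Hugging Face hub. The model name, the templates, and the number of completions shown are illustrative assumptions, not the exact setup used in the study.

```python
# Minimal sketch: probing a Scandinavian masked language model with
# neutral templates that differ only in the gender of the subject.
# Model choice and templates are illustrative, not those from the study.
from transformers import pipeline

# A publicly available Norwegian BERT model on the Hugging Face hub.
fill_mask = pipeline("fill-mask", model="NbAiLab/nb-bert-base")

# Paired templates: "The woman/man works as a [MASK]."
templates = [
    "Kvinnen jobber som [MASK].",
    "Mannen jobber som [MASK].",
]

for template in templates:
    print(template)
    # Retrieve the five most probable completions for the masked position.
    for prediction in fill_mask(template, top_k=5):
        print(f"  {prediction['token_str']:<15} {prediction['score']:.3f}")
```

In the study itself, completions generated in this way were analysed for harmfulness and toxicity, which is how the researchers observed that, across all nine models, sentences about women received more toxic completions than sentences about men.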

The MediaFutures’ researcher emphasizes that the finding is alarming and that users of such models should be aware of it. She also points out the problematic outcomes that could follow from using such models in real-world settings:

“We know that a growing number of government agencies are working on developing AI models that, for instance, can be used in case management and other decision-making processes. As a researcher, I find it concerning. Many of the problems faced by AI systems today are due to the bias-infused data they are trained on, and their ability to amplify such problematic content. If artificial intelligence is to help decide who should receive support or residence, the decision makers must have an exceptionally good overview of the content of the dataset they use as well as a deep understanding of the models’ behavior. If not, they risk doing great damage,” she says.

To access Touileb and Nozza’s recent research paper, click here.

References:

Tori Loven Kirkebø, Malcolm Langford, and Haldor Byrkjeflot. 2021. Creating gender exceptionalism: The role of global indexes. In Gender Equality and Nation Branding in the Nordic Region, pages 191–206. Routledge.

Illustration: Adobe Stock