/ Introduction
Language technologies are at the core of media technologies. This work package aims to provide datasets and models for Norwegian (Bokmål/Nynorsk) that support the automated understanding as well as the automated production of media texts in this language.
Objective: WP5 adopts theoretical approaches and methodologies primarily based on linguistic data science, including neural learning. Based on language data in the media from the user partners and data and tools at the research partners, large corpora will be annotated. The labelled examples in these corpora will be used for training and evaluating supervised models that demonstrate advanced approaches in areas such as robust deep language analysis, adaptive language generation, event identification and extraction, and opinion analysis. The partners will cooperate to explore the use of such models for innovative purposes.
/ People






/ Publications
2022
Samia Touileb; Debora Nozza
Measuring Harmful Representations in Scandinavian Language Models Conference
2022.
@conference{Touileb2022b,
title = {Measuring Harmful Representations in Scandinavian Language Models},
author = {Samia Touileb and Debora Nozza},
url = {https://mediafutures.no/2211-11678/},
year = {2022},
date = {2022-11-21},
urldate = {2022-11-21},
abstract = {Scandinavian countries are perceived as role models when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exists in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes with similar values across all languages. This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real-world settings.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
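To illustrate the probing procedure described in the abstract above, the following minimal Python sketch queries a masked language model with template sentences via the Hugging Face fill-mask pipeline. The model name and the two templates are illustrative assumptions, not the nine models or the templates used in the paper.

# Minimal sketch of template-based probing of a masked language model.
# The model name and templates are illustrative assumptions; the paper probes
# nine Danish, Swedish, and Norwegian models with its own template set.
from transformers import pipeline

# A Norwegian BERT model is used here as an example; any masked LM works.
unmasker = pipeline("fill-mask", model="NbAiLab/nb-bert-base")

templates = [
    "Kvinner er [MASK].",   # "Women are [MASK]."
    "Menn er [MASK].",      # "Men are [MASK]."
]

for template in templates:
    print(template)
    # Retrieve the top-5 completions the model proposes for the masked slot.
    for completion in unmasker(template, top_k=5):
        print(f"  {completion['token_str']!r}  (score={completion['score']:.3f})")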
Petter Mæhlum; Andre Kåsen; Samia Touileb; Jeremy Barnes
Annotating Norwegian language varieties on Twitter for Part-of-speech Workshop
2022.
@workshop{Mæhlum2022,
title = {Annotating Norwegian language varieties on Twitter for Part-of-speech},
author = {Petter Mæhlum and Andre Kåsen and Samia Touileb and Jeremy Barnes},
url = {https://mediafutures.no/2022-vardial-1-7/},
year = {2022},
date = {2022-10-24},
abstract = {Norwegian Twitter data poses an interesting challenge for Natural Language Processing (NLP) tasks. These texts are difficult for models trained on standardized text in one of the two Norwegian written forms (Bokmål and Nynorsk), as they contain both the typical variation of social media text, as well as a large amount of dialectal variety. In this paper we present a novel Norwegian Twitter dataset annotated with POS-tags. We show that models trained on Universal Dependency (UD) data perform worse when evaluated against this dataset, and that models trained on Bokmål generally perform better than those trained on Nynorsk. We also see that performance on dialectal tweets is comparable to the written standards for some models. Finally we perform a detailed analysis of the errors that models commonly make on this data.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
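As a rough illustration of the evaluation described above, the sketch below scores a tagger trained on standardized Bokmål against a gold-annotated, tweet-like example and reports token-level POS accuracy. The spaCy model name and the tiny dialectal example are assumptions for illustration only, not the dataset or models evaluated in the paper.

# Sketch: token-level POS accuracy of a Bokmål-trained tagger on a dialectal tweet.
# The spaCy model name and the gold-annotated example are illustrative assumptions.
import spacy
from spacy.tokens import Doc

nlp = spacy.load("nb_core_news_sm")  # tagger trained on standardized Bokmål (UD)

# A hypothetical dialectal tweet, whitespace-tokenised, with gold UD POS tags.
gold_tokens = ["E", "gler", "me", "te", "helga", "!"]
gold_tags = ["PRON", "VERB", "PRON", "ADP", "NOUN", "PUNCT"]

# Keep the gold tokenisation so predictions and gold tags align one-to-one.
doc = nlp(Doc(nlp.vocab, words=gold_tokens))
correct = sum(token.pos_ == tag for token, tag in zip(doc, gold_tags))
print(f"POS accuracy: {correct / len(gold_tags):.2%}")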
Samia Touileb; Lilja Øvrelid; Erik Velldal
Occupational Biases in Norwegian and Multilingual Language Models Workshop
2022.
@workshop{Touileb2022,
title = {Occupational Biases in Norwegian and Multilingual Language Models},
author = {Samia Touileb and Lilja Øvrelid and Erik Velldal },
url = {https://mediafutures.no/2022-gebnlp-1-21/},
year = {2022},
date = {2022-07-01},
abstract = {In this paper we explore how a demographic distribution of occupations, along gender dimensions, is reflected in pre-trained language models. We give a descriptive assessment of the distribution of occupations, and investigate to what extent these are reflected in four Norwegian and two multilingual models. To this end, we introduce a set of simple bias probes, and perform five different tasks combining gendered pronouns, first names, and a set of occupations from the Norwegian statistics bureau. We show that language-specific models obtain more accurate results, and are much closer to the real-world distribution of clearly gendered occupations. However, we see that none of the models have correct representations of the occupations that are demographically balanced between genders. We also discuss the importance of the data on which the models were trained, and argue that template-based bias probes can sometimes be fragile, and a simple alteration in a template can change a model’s behavior.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
2020
Samia Touileb; Lilja Øvrelid; Erik Velldal
Gender and sentiment, critics and authors: a dataset of Norwegian book reviews Journal Article
In: Gender Bias in Natural Language Processing. Association for Computational Linguistics, 2020, (Pre SFI).
@article{Touileb2020,
title = {Gender and sentiment, critics and authors: a dataset of Norwegian book reviews},
author = {Samia Touileb and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/2020.gebnlp-1.11.pdf},
year = {2020},
date = {2020-12-01},
journal = {Gender Bias in Natural Language Processing. Association for Computational Linguistics},
abstract = {Gender bias in models and datasets is widely studied in NLP. The focus has usually been on analysing how females and males express themselves, or how females and males are described. However, a less studied aspect is the combination of these two perspectives, how female and male describe the same or opposite gender. In this paper, we present a new gender annotated sentiment dataset of critics reviewing the works of female and male authors. We investigate if this newly annotated dataset contains differences in how the works of male and female authors are critiqued, in particular in terms of positive and negative sentiment. We also explore the differences in how this is done by male and female critics. We show that there are differences in how critics assess the works of authors of the same or opposite gender. For example, male critics rate crime novels written by females, and romantic and sentimental works written by males, more negatively.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
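The comparison described in the abstract, aggregating review sentiment by critic and author gender, can be illustrated with a small pandas sketch. The column names and toy rows below are hypothetical and do not reflect the dataset's actual schema.

# Illustrative aggregation of review sentiment by critic and author gender.
# The column names and toy rows are hypothetical, not the dataset's real schema.
import pandas as pd

reviews = pd.DataFrame(
    {
        "critic_gender": ["M", "M", "F", "F", "M", "F"],
        "author_gender": ["F", "M", "M", "F", "F", "M"],
        "genre":         ["crime", "crime", "romance", "crime", "romance", "romance"],
        "rating":        [3, 5, 4, 4, 2, 5],   # e.g. a 1-6 "dice" rating
    }
)

# Mean rating for every combination of critic gender and author gender.
print(reviews.groupby(["critic_gender", "author_gender"])["rating"].mean())

# The same breakdown restricted to one genre, mirroring the crime-novel example.
crime = reviews[reviews["genre"] == "crime"]
print(crime.groupby(["critic_gender", "author_gender"])["rating"].mean())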
J Barnes; Erik Velldal; Lilja Øvrelid
Improving sentiment analysis with multi-task learning of negation Journal Article
In: 2020, (Pre SFI).
@article{Barnes2020,
title = {Improving sentiment analysis with multi-task learning of negation},
author = {J Barnes and Erik Velldal and Lilja Øvrelid},
url = {https://www.cambridge.org/core/journals/natural-language-engineering/article/abs/improving-sentiment-analysis-with-multitask-learning-of-negation/14EF2B829EC4B8EC29E7C0C5C77B95B0},
year = {2020},
date = {2020-11-11},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
J Barnes; Lilja Øvrelid; Erik Velldal
Sentiment analysis is not solved! Assessing and probing sentiment classification Proceeding
2020, (Pre SFI).
@proceedings{Barnes2020b,
title = {Sentiment analysis is not solved! Assessing and probing sentiment classification},
author = {J Barnes and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/W19-4802/},
year = {2020},
date = {2020-08-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Wafia Adouane; Samia Touileb; Jean-Philippe Bernardy
Identifying Sentiments in Algerian Code-switched User-generated Comments Conference
2020, (Pre SFI).
@conference{Adouane2020,
title = {Identifying Sentiments in Algerian Code-switched User-generated Comments},
author = {Wafia Adouane and Samia Touileb and Jean-Philippe Bernardy},
url = {https://www.aclweb.org/anthology/2020.lrec-1.328.pdf},
year = {2020},
date = {2020-05-06},
journal = {Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)},
pages = {2698–2705},
abstract = {We present in this paper our work on Algerian language, an under-resourced North African colloquial Arabic variety, for which we built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for sentiments. We opted for this data domain because Algerian is a colloquial language with no existing freely available corpora. Moreover, we compiled sentiment lexicons of positive and negative unigrams and bigrams reflecting the code-switches present in the language. We compare the performance of four models on the task of identifying sentiments, and the results indicate that a CNN model trained end-to-end fits better our unedited code-switched and unbalanced data across the predefined sentiment classes. Additionally, injecting the lexicons as background knowledge to the model boosts its performance on the minority class with a gain of 10.54 points on the F-score. The results of our experiments can be used as a baseline for future research for Algerian sentiment analysis.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
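The setup sketched below mirrors, in simplified form, the approach described in the abstract: a small end-to-end CNN sentence classifier whose pooled representation is augmented with lexicon-based count features before classification. It is a generic PyTorch illustration, not the paper's exact architecture; vocabulary size, lexicon features, and hyperparameters are placeholder assumptions.

# Generic sketch (PyTorch): a CNN sentence classifier whose pooled representation
# is concatenated with simple lexicon-count features before the output layer.
# Vocabulary, lexicons, and hyperparameters are placeholder assumptions.
import torch
import torch.nn as nn

class LexiconAugmentedCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, num_filters=64,
                 kernel_sizes=(2, 3, 4), num_lexicon_feats=2, num_classes=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.out = nn.Linear(num_filters * len(kernel_sizes) + num_lexicon_feats,
                             num_classes)

    def forward(self, token_ids, lexicon_feats):
        # token_ids: (batch, seq_len); lexicon_feats: (batch, num_lexicon_feats)
        x = self.embedding(token_ids).transpose(1, 2)          # (batch, embed, seq)
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled + [lexicon_feats], dim=1)  # inject lexicon info
        return self.out(features)

# Toy usage: counts of positive/negative lexicon hits act as "background knowledge".
model = LexiconAugmentedCNN(vocab_size=5000)
token_ids = torch.randint(1, 5000, (8, 40))       # batch of 8 padded comments
lexicon_feats = torch.rand(8, 2)                  # e.g. normalised pos/neg counts
logits = model(token_ids, lexicon_feats)          # (8, 3) sentiment scores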
Lilja Øvrelid; P Mæhlum; J Barnes; Erik Velldal
A Fine-Grained Sentiment Dataset for Norwegian Proceeding
2020, (Pre SFI).
@proceedings{Øvrelid2020,
title = {A Fine-Grained Sentiment Dataset for Norwegian},
author = {Lilja Øvrelid and P Mæhlum and J Barnes and Erik Velldal},
url = {https://www.aclweb.org/anthology/2020.lrec-1.618/},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
F Jørgensen; T Aasmoe; ASR Husevåg; Lilja Øvrelid; Erik Velldal (Ed.)
NorNE: Annotating Named Entities for Norwegian Proceeding
2020, (Pre SFI).
@proceedings{Jørgensen2020,
title = {NorNE: Annotating Named Entities for Norwegian},
editor = {F Jørgensen and T Aasmoe and ASR Husevåg and Lilja Øvrelid and Erik Velldal},
url = {https://oda.oslomet.no/handle/10642/8830},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
P Meurer; V Rosén; Koenraad De Smedt
Interactive Visualizations in INESS Book Chapter
In: Butt, M.; Hautli-Janisz, A.; Lyding, V. (Eds.): 2020, (Pre SFI).
@inbook{Meurer2020,
title = {Interactive Visualizations in INESS},
author = {P Meurer and V Rosén and Koenraad De Smedt},
editor = {M. Butt and A. Hautli-Janisz and V. Lyding},
url = {https://web.stanford.edu/group/cslipublications/cslipublications/site/9781684000333.shtml},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {inbook}
}
Pierre Lison; Aliaksandr Hubin; Jeremy Barnes; Samia Touileb
Named Entity Recognition without Labelled Data: A Weak Supervision Approach Journal Article
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1518–1533, 2020, (Pre SFI).
@article{Lison2020,
title = {Named Entity Recognition without Labelled Data: A Weak Supervision Approach},
author = {Pierre Lison and Aliaksandr Hubin and Jeremy Barnes and Samia Touileb},
url = {https://arxiv.org/pdf/2004.14723.pdf},
year = {2020},
date = {2020-04-30},
journal = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
pages = {1518–1533},
abstract = {Named Entity Recognition (NER) performance often degrades rapidly when applied to target domains that differ from the texts observed during training. When in-domain labelled data is available, transfer learning techniques can be used to adapt existing NER models to the target domain. But what should one do when there is no hand-labelled data for the target domain? This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision. The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain. These annotations are then merged together using a hidden Markov model which captures the varying accuracies and confusions of the labelling functions. A sequence labelling model can finally be trained on the basis of this unified annotation. We evaluate the approach on two English datasets (CoNLL 2003 and news articles from Reuters and Bloomberg) and demonstrate an improvement of about 7 percentage points in entity-level F1 scores compared to an out-of-domain neural NER model.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
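The pipeline described in the abstract, many labelling functions whose noisy outputs are aggregated into a single annotation, can be illustrated with a deliberately simplified sketch. The paper aggregates with a hidden Markov model; the sketch below uses plain per-token majority voting instead, and the labelling functions are toy assumptions.

# Simplified sketch of the weak-supervision pipeline: several labelling functions
# annotate the same tokens, and their noisy votes are aggregated per token.
# The paper aggregates with a hidden Markov model; plain majority voting is used
# here purely for illustration, and the labelling functions are toy assumptions.
from collections import Counter

LOCATION_GAZETTEER = {"Oslo", "Bergen", "Trondheim"}
TITLE_CUES = {"statsminister", "direktør"}

def lf_capitalised(tokens):
    """Guess PER for capitalised tokens that are not sentence-initial."""
    return ["PER" if i > 0 and t[0].isupper() else "O" for i, t in enumerate(tokens)]

def lf_gazetteer(tokens):
    """Mark tokens found in a small location gazetteer as LOC."""
    return ["LOC" if t in LOCATION_GAZETTEER else "O" for t in tokens]

def lf_title_cue(tokens):
    """Mark a token as PER when the preceding token is a title cue."""
    return ["PER" if i > 0 and tokens[i - 1].lower() in TITLE_CUES else "O"
            for i, t in enumerate(tokens)]

def aggregate(tokens, labelling_functions):
    """Majority vote over all labelling functions, ignoring 'O' abstentions."""
    labels = []
    for votes in zip(*(lf(tokens) for lf in labelling_functions)):
        non_o = [v for v in votes if v != "O"]
        labels.append(Counter(non_o).most_common(1)[0][0] if non_o else "O")
    return labels

tokens = "Statsminister Støre besøkte Bergen i går".split()
print(aggregate(tokens, [lf_capitalised, lf_gazetteer, lf_title_cue]))
# Note that the functions disagree on "Bergen" (PER vs. LOC); modelling each
# function's accuracy and confusions is what the paper's hidden Markov model does.
# The aggregated labels would then train a standard sequence-labelling model.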
Koenraad de Smedt; D Koureas; P Wittenberg
FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units Journal Article
In: 2020, (Pre SFI).
@article{deSmedt2020,
title = {FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units},
author = {Koenraad de Smedt and D Koureas and P Wittenberg},
url = {https://ideas.repec.org/a/gam/jpubli/v8y2020i2p21-d344422.html},
year = {2020},
date = {2020-04-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2019
Jeremy Barnes; Samia Touileb; Lilja Øvrelid; Erik Velldal
Lexicon information in neural sentiment analysis: a multi-task learning approach Conference
Linköping University Electronic Press, 2019, (Pre SFI).
@conference{Barnes2019,
title = {Lexicon information in neural sentiment analysis: a multi-task learning approach},
author = {Jeremy Barnes and Samia Touileb and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/W19-6119.pdf},
year = {2019},
date = {2019-10-01},
journal = {Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa)},
pages = {175–186},
publisher = {Linköping University Electronic Press},
abstract = {This paper explores the use of multi-task learning (MTL) for incorporating external knowledge in neural models. Specifically, we show how MTL can enable a BiLSTM sentiment classifier to incorporate information from sentiment lexicons. Our MTL set-up is shown to improve model performance (compared to a single-task set-up) on both English and Norwegian sentence-level sentiment datasets. The paper also introduces a new sentiment lexicon for Norwegian.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
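A minimal sketch of the multi-task idea described in the abstract: a shared BiLSTM encoder feeds one head that classifies sentence-level sentiment and an auxiliary head that predicts a sentiment-lexicon tag for each token, with the two losses summed during training. Dimensions, label sets, and the loss weighting are illustrative assumptions, not the paper's configuration.

# Sketch (PyTorch) of multi-task learning with a shared BiLSTM encoder:
# a main head classifies sentence sentiment, an auxiliary head predicts a
# sentiment-lexicon tag for each token. Dimensions, label sets, and the loss
# weighting are illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class MultiTaskBiLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128,
                 num_sentiment_classes=3, num_lexicon_tags=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.sentiment_head = nn.Linear(2 * hidden_dim, num_sentiment_classes)
        self.lexicon_head = nn.Linear(2 * hidden_dim, num_lexicon_tags)

    def forward(self, token_ids):
        states, _ = self.encoder(self.embedding(token_ids))  # (batch, seq, 2*hidden)
        sentence_repr = states.mean(dim=1)                   # simple mean pooling
        return self.sentiment_head(sentence_repr), self.lexicon_head(states)

model = MultiTaskBiLSTM(vocab_size=5000)
token_ids = torch.randint(1, 5000, (4, 20))
sentiment_logits, lexicon_logits = model(token_ids)

# Joint loss: main sentiment task plus the auxiliary lexicon-tagging task.
sentiment_gold = torch.randint(0, 3, (4,))
lexicon_gold = torch.randint(0, 3, (4, 20))
loss = (nn.functional.cross_entropy(sentiment_logits, sentiment_gold)
        + 0.5 * nn.functional.cross_entropy(
              lexicon_logits.reshape(-1, 3), lexicon_gold.reshape(-1)))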
2018
A Kutuzov; Lilja Øvrelid; T Szymanski; Erik Velldal
Diachronic word embeddings and semantic shifts: a survey Proceeding
2018, (Pre SFI).
@proceedings{Kutuzov2018,
title = {Diachronic word embeddings and semantic shifts: a survey},
author = {A Kutuzov and Lilja Øvrelid and T Szymanski and Erik Velldal},
url = {https://www.aclweb.org/anthology/C18-1117/},
year = {2018},
date = {2018-08-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Erik Velldal; Lilja Øvrelid; Eivind Alexander Bergem; Cathrine Stadsnes; Samia Touileb; Fredrik Jørgensen
NoReC: The Norwegian Review Corpus Proceeding
2018, (Pre SFI).
@proceedings{Velldal2018,
title = {NoReC: The Norwegian Review Corpus},
author = {Erik Velldal and Lilja Øvrelid and Eivind Alexander Bergem and Cathrine Stadsnes and Samia Touileb and Fredrik Jørgensen},
year = {2018},
date = {2018-05-12},
url = {https://repo.clarino.uib.no/xmlui/handle/11509/124},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2017
Samia Touileb; Truls Pedersen; Helle Sjøvaag
Automatic identification of unknown names with specific roles Journal Article
In: Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 150-158, 2017, (Pre SFI).
@article{Touileb2017,
title = {Automatic identification of unknown names with specific roles},
author = {Samia Touileb and Truls Pedersen and Helle Sjøvaag},
url = {https://www.aclweb.org/anthology/W18-4517.pdf},
year = {2017},
date = {2017-08-01},
journal = {Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature},
pages = {150-158},
abstract = {Automatically identifying persons in a particular role within a large corpus can be a difficult task, especially if you don’t know who you are actually looking for. Resources compiling names of persons can be available, but no exhaustive lists exist. However, such lists usually contain known names that are “visible” in the national public sphere, and tend to ignore the marginal and international ones. In this article we propose a method for automatically generating suggestions of names found in a corpus of Norwegian news articles, and which “naturally” belong to a given initial list of members, and that were not known (compiled in a list) beforehand. The approach is based, in part, on the assumption that surface level syntactic features reveal parts of the underlying semantic content and can help uncover the structure of the language.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
M Fares; A Kutuzov; S Oepen; Erik Velldal
Word vectors, reuse, and replicability: Towards a community repository of large-text resources Proceeding
2017, (Pre SFI).
@proceedings{Fares2017,
title = { Word vectors, reuse, and replicability: Towards a community repository of large-text resources},
author = {M Fares and A Kutuzov and S Oepen and Erik Velldal},
url = {https://www.duo.uio.no/handle/10852/65205},
year = {2017},
date = {2017-05-22},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2016
V Rosén; M Thunes; P Haugereid; GS Losnegaard; H Dyvik; P Meurer; G Lyse; Koenraad De Smedt
The enrichment of lexical resources through incremental parsebanking Journal Article
In: 2016, (Pre SFI).
@article{Rosén2016,
title = {The enrichment of lexical resources through incremental parsebanking},
author = {V Rosén and M Thunes and P Haugereid and GS Losnegaard and H Dyvik and P Meurer and G Lyse and Koenraad De Smedt},
url = {https://bora.uib.no/bora-xmlui/handle/1956/15680},
year = {2016},
date = {2016-06-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
H Dyvik; P Meurer; V Rosén; Koenraad De Smedt; P Haugereid; GS Losnegaard; G Lyse; M Thunes
NorGramBank: A 'Deep' Treebank for Norwegian Proceeding
Proceedings of LREC, 2016, (Pre SFI).
@proceedings{Dyvik2016,
title = {NorGramBank: A 'Deep' Treebank for Norwegian},
journal = {Proceedings of LREC},
author = {H Dyvik and P Meurer and V Rosén and Koenraad De Smedt and P Haugereid and GS Losnegaard and G Lyse and M Thunes},
url = {https://www.aclweb.org/anthology/L16-1565.pdf},
year = {2016},
date = {2016-05-16},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Lilja Øvrelid; P Hohle
Universal dependencies for Norwegian Proceeding
2016, (Pre SFI).
@proceedings{Øvrelid2016,
title = { Universal dependencies for Norwegian},
author = {Lilja Øvrelid and P Hohle},
url = {https://www.aclweb.org/anthology/L16-1250/},
year = {2016},
date = {2016-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
V Rosén; Koenraad De Smedt; GS Losnegaard; E Bejcek; A Savary; P Osenova
MWEs in Treebanks: From Survey to Guidelines Proceeding
2016, (Pre SFI).
@proceedings{Rosén2016b,
title = {MWEs in Treebanks: From Survey to Guidelines},
author = {V Rosén and Koenraad De Smedt and GS Losnegaard and E Bejcek and A Savary and P Osenova},
url = {https://www.aclweb.org/anthology/L16-1368.pdf},
year = {2016},
date = {2016-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2012
E Lapponi; J Read; Lilja Øvrelid
Representing and resolving negation for sentiment analysis Proceeding
2012, (Pre SFI).
@proceedings{Lapponi2012,
title = {Representing and resolving negation for sentiment analysis},
author = {E Lapponi and J Read and Lilja Øvrelid},
url = {https://ieeexplore.ieee.org/document/6406506},
year = {2012},
date = {2012-12-10},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Erik Velldal; Lilja Øvrelid; J Read; S Oepen
Speculation and negation: Rules, rankers, and the role of syntax Journal Article
In: 2012, (Pre SFI).
@article{Velldal2012,
title = {Speculation and negation: Rules, rankers, and the role of syntax},
author = {Erik Velldal and Lilja Øvrelid and J Read and S Oepen},
url = {https://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00126},
year = {2012},
date = {2012-01-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
/ Publications
2022
Touileb, Samia; Nozza, Debora
Measuring Harmful Representations in Scandinavian Language Models Conference
2022.
@conference{Touileb2022b,
title = {Measuring Harmful Representations in Scandinavian Language Models},
author = {Samia Touileb and Debora Nozza},
url = {https://mediafutures.no/2211-11678/},
year = {2022},
date = {2022-11-21},
urldate = {2022-11-21},
abstract = {Scandinavian countries are perceived as rolemodels when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exist in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing
the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes with similar values across all languages.
This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real world settings.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes with similar values across all languages.
This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real world settings.
Andre Kåsen Petter Mæhlum, Samia Touileb
Annotating Norwegian language varieties on Twitter for Part-of-speech Workshop
2022.
@workshop{Mæhlum2022,
title = {Annotating Norwegian language varieties on Twitter for Part-of-speech},
author = {Petter Mæhlum, Andre Kåsen, Samia Touileb, Jeremy Barnes},
url = {https://mediafutures.no/2022-vardial-1-7/},
year = {2022},
date = {2022-10-24},
abstract = {Norwegian Twitter data poses an interesting challenge for Natural Language Processing (NLP) tasks. These texts are difficult for models trained on standardized text in one of the two Norwegian written forms (Bokmål and Nynorsk), as they contain both the typical variation of social media text, as well as a large amount of dialectal variety. In this paper we present a novel Norwegian Twitter dataset annotated with POS-tags. We show that models trained on Universal Dependency (UD) data perform worse when evaluated against this dataset, and that models trained on Bokmål generally perform better than those trained on Nynorsk. We also see that performance on dialectal tweets is comparable to the written standards for some models. Finally we perform a detailed analysis of the errors that models commonly make on this data.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
Touileb, Samia; Øvrelid, Lilja; Velldal, Erik
Occupational Biases in Norwegian and Multilingual Language Models Workshop
2022.
@workshop{Touileb2022,
title = {Occupational Biases in Norwegian and Multilingual Language Models},
author = {Samia Touileb and Lilja Øvrelid and Erik Velldal },
url = {https://mediafutures.no/2022-gebnlp-1-21/},
year = {2022},
date = {2022-07-01},
abstract = {In this paper we explore how a demographic distribution of occupations, along gender dimensions, is reflected in pre-trained language models. We give a descriptive assessment of the distribution of occupations, and investigate to what extent these are reflected in four Norwegian and two multilingual models. To this end, we introduce a set of simple bias probes, and perform five different tasks combining gendered pronouns, first names, and a set of occupations from the Norwegian statistics bureau. We show that language specific models obtain more accurate results, and are much closer to the real-world distribution of clearly gendered occupations. However, we see that none of the models have correct representations of the occupations that are demographically balanced between genders. We also discuss the importance of the training data on which the models were trained on, and argue that template-based bias probes can sometimes be fragile, and a simple alteration in a template can change a model’s behavior.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
2020
Touileb, Samia; Øvrelid, Lilja; Velldal, Erik
Gender and sentiment, critics and authors: a dataset of Norwegian book reviews Journal Article
In: Gender Bias in Natural Language Processing. Association for Computational Linguistics, 2020, (Pre SFI).
@article{Touileb2020,
title = {Gender and sentiment, critics and authors: a dataset of Norwegian book reviews},
author = {Samia Touileb and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/2020.gebnlp-1.11.pdf},
year = {2020},
date = {2020-12-01},
journal = {Gender Bias in Natural Language Processing. Association for Computational Linguistics},
abstract = {Gender bias in models and datasets is widely studied in NLP. The focus has usually been on analysing how females and males express themselves, or how females and males are described. However, a less studied aspect is the combination of these two perspectives, how female and male describe the same or opposite gender. In this paper, we present a new gender annotated sentiment dataset of critics reviewing the works of female and male authors. We investigate if this newly annotated dataset contains differences in how the works of male and female authors are critiqued, in particular in terms of positive and negative sentiment. We also explore the differences in how this is done by male and female critics. We show that there are differences in how critics assess the works of authors of the same or opposite gender. For example, male critics rate crime novels written by females, and romantic and sentimental works written by males, more negatively.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Barnes, J; Velldal, Erik; Øvrelid, Lilja
Improving sentiment analysis with multi-task learning of negation Journal Article
In: 2020, (Pre SFI).
@article{Barnes2020,
title = {Improving sentiment analysis with multi-task learning of negation},
author = {J Barnes and Erik Velldal and Lilja Øvrelid},
url = {https://www.cambridge.org/core/journals/natural-language-engineering/article/abs/improving-sentiment-analysis-with-multitask-learning-of-negation/14EF2B829EC4B8EC29E7C0C5C77B95B0},
year = {2020},
date = {2020-11-11},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Barnes, J; Øvrelid, Lilja; Velldal, Erik
Sentiment analysis is not solved! Assessing and probing sentiment classification Proceeding
2020, (Pre SFI).
@proceedings{Barnes2020b,
title = {Sentiment analysis is not solved! Assessing and probing sentiment classification},
author = {J Barnes and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/W19-4802/},
year = {2020},
date = {2020-08-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Adouane, Wafia; Touileb, Samia; Bernardy, Jean-Philippe
Identifying Sentiments in Algerian Code-switched User-generated Comments Conference
2020, (Pre SFI).
@conference{Adouane2020,
title = {Identifying Sentiments in Algerian Code-switched User-generated Comments},
author = {Wafia Adouane and Samia Touileb and Jean-Philippe Bernardy},
url = {https://www.aclweb.org/anthology/2020.lrec-1.328.pdf},
year = {2020},
date = {2020-05-06},
journal = {Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)},
pages = {2698–2705},
abstract = {We present in this paper our work on Algerian language, an under-resourced North African colloquial Arabic variety, for which we
built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for sentiments. We opted
for this data domain because Algerian is a colloquial language with no existing freely available corpora. Moreover, we compiled
sentiment lexicons of positive and negative unigrams and bigrams reflecting the code-switches present in the language. We compare
the performance of four models on the task of identifying sentiments, and the results indicate that a CNN model trained end-to-end fits
better our unedited code-switched and unbalanced data across the predefined sentiment classes. Additionally, injecting the lexicons as
background knowledge to the model boosts its performance on the minority class with a gain of 10.54 points on the F-score. The results
of our experiments can be used as a baseline for future research for Algerian sentiment analysis.
},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for sentiments. We opted
for this data domain because Algerian is a colloquial language with no existing freely available corpora. Moreover, we compiled
sentiment lexicons of positive and negative unigrams and bigrams reflecting the code-switches present in the language. We compare
the performance of four models on the task of identifying sentiments, and the results indicate that a CNN model trained end-to-end fits
better our unedited code-switched and unbalanced data across the predefined sentiment classes. Additionally, injecting the lexicons as
background knowledge to the model boosts its performance on the minority class with a gain of 10.54 points on the F-score. The results
of our experiments can be used as a baseline for future research for Algerian sentiment analysis.
Øvrelid, Lilja; Mæhlum, P; Barnes, J; Velldal, Erik
A Fine-Grained Sentiment Dataset for Norwegian Proceeding
2020, (Pre SFI).
@proceedings{Øvrelid2020,
title = {A Fine-Grained Sentiment Dataset for Norwegian},
author = {Lilja Øvrelid and P Mæhlum and J Barnes and Erik Velldal},
url = {https://www.aclweb.org/anthology/2020.lrec-1.618/},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Jørgensen, F; Aasmoe, T; Husevåg, ASR; Øvrelid, Lilja; Velldal, Erik (Ed.)
NorNE: Annotating Named Entities for Norwegian Proceeding
2020, (Pre SFI).
@proceedings{Jørgensen2020,
title = {NorNE: Annotating Named Entities for Norwegian},
editor = {F Jørgensen and T Aasmoe and ASR Husevåg and Lilja Øvrelid and Erik Velldal},
url = {https://oda.oslomet.no/handle/10642/8830},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Meurer, P; Rosén, V; Smedt, Koenraad De
Interactive Visualizations in INESS Book Chapter
In: Butt, M.; Hautli-Janisz, A.; (Eds.), V. Lyding (Ed.): 2020, (Pre SFI).
@inbook{Meurer2020,
title = {Interactive Visualizations in INESS},
author = {P Meurer and V Rosén and Koenraad De Smedt},
editor = {M. Butt and A. Hautli-Janisz and V. Lyding (Eds.)},
url = {https://web.stanford.edu/group/cslipublications/cslipublications/site/9781684000333.shtml},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {inbook}
}
Lison, Pierre; Hubin, Aliaksandr; Barnes, Jeremy; Touileb, Samia
Named Entity Recognition without Labelled Data: A Weak Supervision Approach Journal Article
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1518–1533, 2020, (Pre SFI).
@article{Lison2020,
title = {Named Entity Recognition without Labelled Data: A Weak Supervision Approach},
author = {Pierre Lison and Aliaksandr Hubin and Jeremy Barnes and Samia Touileb},
url = {https://arxiv.org/pdf/2004.14723.pdf},
year = {2020},
date = {2020-04-30},
journal = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
pages = {1518–1533},
abstract = {Named Entity Recognition (NER) performance often degrades rapidly when applied to target domains that differ from the texts observed during training. When in-domain labelled data is available, transfer learning techniques can be used to adapt existing NER models to the target domain. But what should one do when there is no hand-labelled data for the target domain? This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision. The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain. These annotations are then merged together using a hidden Markov model which captures the varying accuracies and confusions of the labelling functions. A sequence labelling model can finally be trained on the basis of this unified annotation. We evaluate the approach on two English datasets (CoNLL 2003 and news articles from Reuters and Bloomberg) and demonstrate an improvement of about 7 percentage points in entity-level F1 scores compared to an out-of-domain neural NER model.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
de Smedt, Koenraad; Koureas, D; Wittenberg, P
FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units Journal Article
In: 2020, (Pre SFI).
@article{deSmedt2020,
title = {FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units},
author = {Koenraad de Smedt and D Koureas and P Wittenberg},
url = {https://ideas.repec.org/a/gam/jpubli/v8y2020i2p21-d344422.html},
year = {2020},
date = {2020-04-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2019
Barnes, Jeremy; Touileb, Samia; Øvrelid, Lilja; Velldal, Erik
Lexicon information in neural sentiment analysis: a multi-task learning approach Conference
Linköping University Electronic Press, 2019, (Pre SFI).
@conference{Barnes2019,
title = {Lexicon information in neural sentiment analysis: a multi-task learning approach},
author = {Jeremy Barnes and Samia Touileb and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/W19-6119.pdf},
year = {2019},
date = {2019-10-01},
journal = {Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa)},
pages = {175–186},
publisher = {Linköping University Electronic Press},
abstract = {This paper explores the use of multi-task learning (MTL) for incorporating external knowledge in neural models. Specifically, we show how MTL can enable a BiLSTM sentiment classifier to incorporate information from sentiment lexicons. Our MTL set-up is shown to improve model performance (compared to a single-task set-up) on both English and Norwegian sentence-level sentiment datasets. The paper also introduces a new sentiment lexicon for Norwegian.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
2018
Kutuzov, A; Øvrelid, Lilja; Szymanski, T; Velldal, Erik
Diachronic word embeddings and semantic shifts: a survey Proceeding
2018, (Pre SFI).
@proceedings{Kutuzov2018,
title = {Diachronic word embeddings and semantic shifts: a survey},
author = {A Kutuzov and Lilja Øvrelid and T Szymanski and Erik Velldal},
url = {https://www.aclweb.org/anthology/C18-1117/},
year = {2018},
date = {2018-08-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Velldal, Erik; Øvrelid, Lilja; Bergem, Eivind Alexander; Stadsnes, Cathrine; Touileb, Samia; Jørgensen, Fredrik
NoReC: The Norwegian Review Corpus Proceeding
2018, (Pre SFI).
@proceedings{Velldal2018,
title = {NoReC: The Norwegian Review Corpus},
author = {Erik Velldal and Lilja Øvrelid and Eivind Alexander Bergem and Cathrine Stadsnes and Samia Touileb and Fredrik Jørgensen},
year = {2018},
date = {2018-05-12},
abstract = {https://repo.clarino.uib.no/xmlui/handle/11509/124},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2017
Touileb, Samia; Pedersen, Truls; Sjøvaag, Helle
Automatic identification of unknown names with specific roles Journal Article
In: Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 150-158, 2017, (Pre SFI).
@article{Touileb2017,
title = {Automatic identification of unknown names with specific roles},
author = {Samia Touileb and Truls Pedersen and Helle Sjøvaag},
url = {https://www.aclweb.org/anthology/W18-4517.pdf},
year = {2017},
date = {2017-08-01},
journal = {Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature},
pages = {150-158},
abstract = {Automatically identifying persons in a particular role within a large corpus can be a difficult task, especially if you don’t know who you are actually looking for. Resources compiling names of persons can be available, but no exhaustive lists exist. However, such lists usually contain known names that are “visible” in the national public sphere, and tend to ignore the marginal and international ones. In this article we propose a method for automatically generating suggestions of names found in a corpus of Norwegian news articles, and which “naturally” belong to a given initial list of members, and that were not known (compiled in a list) beforehand. The approach is based, in part, on the assumption that surface level syntactic features reveal parts of the underlying semantic content and can help uncover the structure of the language.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Fares, M; Kutuzov, A; Oepen, S; Velldal, Erik
Word vectors, reuse, and replicability: Towards a community repository of large-text resources Proceeding
2017, (Pre SFI).
@proceedings{Fares2017,
title = { Word vectors, reuse, and replicability: Towards a community repository of large-text resources},
author = {M Fares and A Kutuzov and S Oepen and Erik Velldal},
url = {https://www.duo.uio.no/handle/10852/65205},
year = {2017},
date = {2017-05-22},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2016
Rosén, V; Thunes, M; Haugereid, P; Losnegaard, GS; Dyvik, H; Meurer, P; Lyse, G; Smedt, Koenraad De
The enrichment of lexical resources through incremental parsebanking Journal Article
In: 2016, (Pre SFI).
@article{Rosén2016,
title = {The enrichment of lexical resources through incremental parsebanking},
author = {V Rosén and M Thunes and P Haugereid and GS Losnegaard and H Dyvik and P Meurer and G Lyse and Koenraad De Smedt},
url = {https://bora.uib.no/bora-xmlui/handle/1956/15680},
year = {2016},
date = {2016-06-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Dyvik, H; Meurer, P; Rosén, V; Smedt, Koenraad De; Haugereid, P; Losnegaard, GS; Lyse, G; Thunes, M
NorGramBank: A 'Deep' Treebank for Norwegian.Proceedings of LREC Proceeding
2016, (Pre SFI).
@proceedings{Dyvik2016,
title = {NorGramBank: A 'Deep' Treebank for Norwegian.Proceedings of LREC},
author = {H Dyvik and P Meurer and V Rosén and Koenraad De Smedt and P Haugereid and GS Losnegaard and G Lyse and M Thunes},
url = {https://www.aclweb.org/anthology/L16-1565.pdf},
year = {2016},
date = {2016-05-16},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Øvrelid, Lilja; Hohle, P
Universal dependencies for Norwegian Proceeding
2016, (Pre SFI).
@proceedings{Øvrelid2016,
title = { Universal dependencies for Norwegian},
author = {Lilja Øvrelid and P Hohle},
url = {https://www.aclweb.org/anthology/L16-1250/},
year = {2016},
date = {2016-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Rosén, V; Smedt, Koenraad De; Losnegaard, GS; Bejcek, E; Savary, A; Osenova, P
MWEs in Treebanks: From Survey to Guidelines Proceeding
2016, (Pre SFI).
@proceedings{Rosén2016b,
title = {MWEs in Treebanks: From Survey to Guidelines},
author = {V Rosén and Koenraad De Smedt and GS Losnegaard and E Bejcek and A Savary and P Osenova},
url = {https://www.aclweb.org/anthology/L16-1368.pdf},
year = {2016},
date = {2016-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2012
Lapponi, E; Read, J; Øvrelid, Lilja
Representing and resolving negation for sentiment analysis Proceeding
2012, (Pre SFI).
@proceedings{Lapponi2012,
title = {Representing and resolving negation for sentiment analysis},
author = {E Lapponi and J Read and Lilja Øvrelid},
url = {https://ieeexplore.ieee.org/document/6406506},
year = {2012},
date = {2012-12-10},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Velldal, Erik; Øvrelid, Lilja; Read, J; Oepen, S
Speculation and negation: Rules, rankers, and the role of syntax Journal Article
In: 2012, (Pre SFI).
@article{Velldal2012,
title = {Speculation and negation: Rules, rankers, and the role of syntax},
author = {Erik Velldal and Lilja Øvrelid and J Read and S Oepen},
url = {https://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00126},
year = {2012},
date = {2012-01-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
/ Publications
2022
Touileb, Samia; Nozza, Debora
Measuring Harmful Representations in Scandinavian Language Models Conference
2022.
@conference{Touileb2022b,
title = {Measuring Harmful Representations in Scandinavian Language Models},
author = {Samia Touileb and Debora Nozza},
url = {https://mediafutures.no/2211-11678/},
year = {2022},
date = {2022-11-21},
urldate = {2022-11-21},
abstract = {Scandinavian countries are perceived as rolemodels when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exist in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing
the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes with similar values across all languages.
This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real world settings.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes with similar values across all languages.
This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real world settings.
Andre Kåsen Petter Mæhlum, Samia Touileb
Annotating Norwegian language varieties on Twitter for Part-of-speech Workshop
2022.
@workshop{Mæhlum2022,
title = {Annotating Norwegian language varieties on Twitter for Part-of-speech},
author = {Petter Mæhlum, Andre Kåsen, Samia Touileb, Jeremy Barnes},
url = {https://mediafutures.no/2022-vardial-1-7/},
year = {2022},
date = {2022-10-24},
abstract = {Norwegian Twitter data poses an interesting challenge for Natural Language Processing (NLP) tasks. These texts are difficult for models trained on standardized text in one of the two Norwegian written forms (Bokmål and Nynorsk), as they contain both the typical variation of social media text, as well as a large amount of dialectal variety. In this paper we present a novel Norwegian Twitter dataset annotated with POS-tags. We show that models trained on Universal Dependency (UD) data perform worse when evaluated against this dataset, and that models trained on Bokmål generally perform better than those trained on Nynorsk. We also see that performance on dialectal tweets is comparable to the written standards for some models. Finally we perform a detailed analysis of the errors that models commonly make on this data.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
Touileb, Samia; Øvrelid, Lilja; Velldal, Erik
Occupational Biases in Norwegian and Multilingual Language Models Workshop
2022.
@workshop{Touileb2022,
title = {Occupational Biases in Norwegian and Multilingual Language Models},
author = {Samia Touileb and Lilja Øvrelid and Erik Velldal },
url = {https://mediafutures.no/2022-gebnlp-1-21/},
year = {2022},
date = {2022-07-01},
abstract = {In this paper we explore how a demographic distribution of occupations, along gender dimensions, is reflected in pre-trained language models. We give a descriptive assessment of the distribution of occupations, and investigate to what extent these are reflected in four Norwegian and two multilingual models. To this end, we introduce a set of simple bias probes, and perform five different tasks combining gendered pronouns, first names, and a set of occupations from the Norwegian statistics bureau. We show that language specific models obtain more accurate results, and are much closer to the real-world distribution of clearly gendered occupations. However, we see that none of the models have correct representations of the occupations that are demographically balanced between genders. We also discuss the importance of the training data on which the models were trained on, and argue that template-based bias probes can sometimes be fragile, and a simple alteration in a template can change a model’s behavior.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
2020
Touileb, Samia; Øvrelid, Lilja; Velldal, Erik
Gender and sentiment, critics and authors: a dataset of Norwegian book reviews Journal Article
In: Gender Bias in Natural Language Processing. Association for Computational Linguistics, 2020, (Pre SFI).
@article{Touileb2020,
title = {Gender and sentiment, critics and authors: a dataset of Norwegian book reviews},
author = {Samia Touileb and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/2020.gebnlp-1.11.pdf},
year = {2020},
date = {2020-12-01},
journal = {Gender Bias in Natural Language Processing. Association for Computational Linguistics},
abstract = {Gender bias in models and datasets is widely studied in NLP. The focus has usually been on analysing how females and males express themselves, or how females and males are described. However, a less studied aspect is the combination of these two perspectives, how female and male describe the same or opposite gender. In this paper, we present a new gender annotated sentiment dataset of critics reviewing the works of female and male authors. We investigate if this newly annotated dataset contains differences in how the works of male and female authors are critiqued, in particular in terms of positive and negative sentiment. We also explore the differences in how this is done by male and female critics. We show that there are differences in how critics assess the works of authors of the same or opposite gender. For example, male critics rate crime novels written by females, and romantic and sentimental works written by males, more negatively.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Barnes, J; Velldal, Erik; Øvrelid, Lilja
Improving sentiment analysis with multi-task learning of negation Journal Article
In: 2020, (Pre SFI).
@article{Barnes2020,
title = {Improving sentiment analysis with multi-task learning of negation},
author = {J Barnes and Erik Velldal and Lilja Øvrelid},
url = {https://www.cambridge.org/core/journals/natural-language-engineering/article/abs/improving-sentiment-analysis-with-multitask-learning-of-negation/14EF2B829EC4B8EC29E7C0C5C77B95B0},
year = {2020},
date = {2020-11-11},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Barnes, J; Øvrelid, Lilja; Velldal, Erik
Sentiment analysis is not solved! Assessing and probing sentiment classification Proceeding
2020, (Pre SFI).
@proceedings{Barnes2020b,
title = {Sentiment analysis is not solved! Assessing and probing sentiment classification},
author = {J Barnes and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/W19-4802/},
year = {2020},
date = {2020-08-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Adouane, Wafia; Touileb, Samia; Bernardy, Jean-Philippe
Identifying Sentiments in Algerian Code-switched User-generated Comments Conference
2020, (Pre SFI).
@conference{Adouane2020,
title = {Identifying Sentiments in Algerian Code-switched User-generated Comments},
author = {Wafia Adouane and Samia Touileb and Jean-Philippe Bernardy},
url = {https://www.aclweb.org/anthology/2020.lrec-1.328.pdf},
year = {2020},
date = {2020-05-06},
journal = {Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)},
pages = {2698–2705},
abstract = {We present in this paper our work on Algerian language, an under-resourced North African colloquial Arabic variety, for which we
built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for sentiments. We opted
for this data domain because Algerian is a colloquial language with no existing freely available corpora. Moreover, we compiled
sentiment lexicons of positive and negative unigrams and bigrams reflecting the code-switches present in the language. We compare
the performance of four models on the task of identifying sentiments, and the results indicate that a CNN model trained end-to-end fits
better our unedited code-switched and unbalanced data across the predefined sentiment classes. Additionally, injecting the lexicons as
background knowledge to the model boosts its performance on the minority class with a gain of 10.54 points on the F-score. The results
of our experiments can be used as a baseline for future research for Algerian sentiment analysis.
},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for sentiments. We opted
for this data domain because Algerian is a colloquial language with no existing freely available corpora. Moreover, we compiled
sentiment lexicons of positive and negative unigrams and bigrams reflecting the code-switches present in the language. We compare
the performance of four models on the task of identifying sentiments, and the results indicate that a CNN model trained end-to-end fits
better our unedited code-switched and unbalanced data across the predefined sentiment classes. Additionally, injecting the lexicons as
background knowledge to the model boosts its performance on the minority class with a gain of 10.54 points on the F-score. The results
of our experiments can be used as a baseline for future research for Algerian sentiment analysis.
Øvrelid, Lilja; Mæhlum, P; Barnes, J; Velldal, Erik
A Fine-Grained Sentiment Dataset for Norwegian Proceeding
2020, (Pre SFI).
@proceedings{Øvrelid2020,
title = {A Fine-Grained Sentiment Dataset for Norwegian},
author = {Lilja Øvrelid and P Mæhlum and J Barnes and Erik Velldal},
url = {https://www.aclweb.org/anthology/2020.lrec-1.618/},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Jørgensen, F; Aasmoe, T; Husevåg, ASR; Øvrelid, Lilja; Velldal, Erik (Ed.)
NorNE: Annotating Named Entities for Norwegian Proceeding
2020, (Pre SFI).
@proceedings{Jørgensen2020,
title = {NorNE: Annotating Named Entities for Norwegian},
editor = {F Jørgensen and T Aasmoe and ASR Husevåg and Lilja Øvrelid and Erik Velldal},
url = {https://oda.oslomet.no/handle/10642/8830},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Meurer, P; Rosén, V; Smedt, Koenraad De
Interactive Visualizations in INESS Book Chapter
In: Butt, M.; Hautli-Janisz, A.; (Eds.), V. Lyding (Ed.): 2020, (Pre SFI).
@inbook{Meurer2020,
title = {Interactive Visualizations in INESS},
author = {P Meurer and V Rosén and Koenraad De Smedt},
editor = {M. Butt and A. Hautli-Janisz and V. Lyding (Eds.)},
url = {https://web.stanford.edu/group/cslipublications/cslipublications/site/9781684000333.shtml},
year = {2020},
date = {2020-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {inbook}
}
Lison, Pierre; Hubin, Aliaksandr; Barnes, Jeremy; Touileb, Samia
Named Entity Recognition without Labelled Data: A Weak Supervision Approach Journal Article
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1518–1533, 2020, (Pre SFI).
@article{Lison2020,
title = {Named Entity Recognition without Labelled Data: A Weak Supervision Approach},
author = {Pierre Lison and Aliaksandr Hubin and Jeremy Barnes and Samia Touileb},
url = {https://arxiv.org/pdf/2004.14723.pdf},
year = {2020},
date = {2020-04-30},
journal = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
pages = {1518–1533},
abstract = {Named Entity Recognition (NER) performance often degrades rapidly when applied to target domains that differ from the texts observed during training. When in-domain labelled data is available, transfer learning techniques can be used to adapt existing NER models to the target domain. But what should one do when there is no hand-labelled data for the target domain? This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision. The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain. These annotations are then merged together using a hidden Markov model which captures the varying accuracies and confusions of the labelling functions. A sequence labelling model can finally be trained on the basis of this unified annotation. We evaluate the approach on two English datasets (CoNLL 2003 and news articles from Reuters and Bloomberg) and demonstrate an improvement of about 7 percentage points in entity-level F1 scores compared to an out-of-domain neural NER model.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
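As a rough, self-contained illustration of the weak-supervision recipe summarised in the abstract above, the sketch below defines two toy labelling functions (a gazetteer lookup and a title-based pattern) and merges their span predictions by simple voting. The paper aggregates labelling functions with a hidden Markov model and then trains a sequence labeller on the merged annotation; the voting step here is only a simplified stand-in, and the gazetteer, pattern, and example sentence are invented.

import re
from collections import Counter

# Toy gazetteer and labelling functions (invented; not the paper's resources).
PERSON_GAZETTEER = {"Erna Solberg", "Jonas Gahr Støre"}

def lf_gazetteer(text):
    # Label character spans that exactly match a known person name.
    spans = []
    for name in PERSON_GAZETTEER:
        for match in re.finditer(re.escape(name), text):
            spans.append((match.start(), match.end(), "PER"))
    return spans

def lf_title_pattern(text):
    # Label capitalised word sequences that follow a person title.
    spans = []
    for match in re.finditer(r"\b[Ss]tatsminister\s+([A-ZÆØÅ]\w+(?:\s+[A-ZÆØÅ]\w+)*)", text):
        spans.append((match.start(1), match.end(1), "PER"))
    return spans

def aggregate(text, labelling_functions, min_votes=1):
    # Merge labelling-function outputs by span voting (a stand-in for the paper's HMM merger).
    votes = Counter(span for lf in labelling_functions for span in lf(text))
    return sorted(span for span, count in votes.items() if count >= min_votes)

text = "Statsminister Jonas Gahr Støre møtte Erna Solberg i Oslo."
print(aggregate(text, [lf_gazetteer, lf_title_pattern]))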
de Smedt, Koenraad; Koureas, D; Wittenberg, P
FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units Journal Article
In: 2020, (Pre SFI).
@article{deSmedt2020,
title = {FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units},
author = {Koenraad de Smedt and D Koureas and P Wittenberg},
url = {https://ideas.repec.org/a/gam/jpubli/v8y2020i2p21-d344422.html},
year = {2020},
date = {2020-04-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2019
Barnes, Jeremy; Touileb, Samia; Øvrelid, Lilja; Velldal, Erik
Lexicon information in neural sentiment analysis: a multi-task learning approach Conference
Linköping University Electronic Press, 2019, (Pre SFI).
@conference{Barnes2019,
title = {Lexicon information in neural sentiment analysis: a multi-task learning approach},
author = {Jeremy Barnes and Samia Touileb and Lilja Øvrelid and Erik Velldal},
url = {https://www.aclweb.org/anthology/W19-6119.pdf},
year = {2019},
date = {2019-10-01},
journal = {Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa)},
pages = {175–186},
publisher = {Linköping University Electronic Press},
abstract = {This paper explores the use of multi-task learning (MTL) for incorporating external knowledge in neural models. Specifically, we show how MTL can enable a BiLSTM sentiment classifier to incorporate information from sentiment lexicons. Our MTL set-up is shown to improve model performance (compared to a single-task set-up) on both English and Norwegian sentence-level sentiment datasets. The paper also introduces a new sentiment lexicon for Norwegian.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
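The multi-task set-up described in the abstract above can be sketched minimally in PyTorch (this is not the authors' implementation): a shared BiLSTM encoder feeds both a sentence-level sentiment head and an auxiliary token-level head that predicts lexicon tags, and the two cross-entropy losses are summed. The dimensions, the 0.5 auxiliary weight, and the random toy batch are arbitrary illustration choices.

import torch
import torch.nn as nn

class MultiTaskSentimentModel(nn.Module):
    # Shared BiLSTM encoder with two heads: sentence sentiment and token-level lexicon tags.
    def __init__(self, vocab_size, emb_dim=50, hidden=64, n_sent_classes=3, n_lex_tags=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.sentiment_head = nn.Linear(2 * hidden, n_sent_classes)  # main task
        self.lexicon_head = nn.Linear(2 * hidden, n_lex_tags)        # auxiliary task

    def forward(self, token_ids):
        states, _ = self.encoder(self.embed(token_ids))          # (batch, seq, 2*hidden)
        sent_logits = self.sentiment_head(states.mean(dim=1))    # pooled sentence representation
        lex_logits = self.lexicon_head(states)                    # per-token lexicon predictions
        return sent_logits, lex_logits

# Toy forward/backward pass with random ids and labels.
model = MultiTaskSentimentModel(vocab_size=100)
tokens = torch.randint(1, 100, (4, 7))     # batch of 4 sentences, 7 tokens each
sent_gold = torch.randint(0, 3, (4,))      # sentence-level sentiment labels
lex_gold = torch.randint(0, 3, (4, 7))     # token-level lexicon tags (e.g. pos/neg/none)
sent_logits, lex_logits = model(tokens)
loss = nn.functional.cross_entropy(sent_logits, sent_gold) \
     + 0.5 * nn.functional.cross_entropy(lex_logits.reshape(-1, 3), lex_gold.reshape(-1))
loss.backward()
print(float(loss))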
2018
Kutuzov, A; Øvrelid, Lilja; Szymanski, T; Velldal, Erik
Diachronic word embeddings and semantic shifts: a survey Proceeding
2018, (Pre SFI).
@proceedings{Kutuzov2018,
title = {Diachronic word embeddings and semantic shifts: a survey},
author = {A Kutuzov and Lilja Øvrelid and T Szymanski and Erik Velldal},
url = {https://www.aclweb.org/anthology/C18-1117/},
year = {2018},
date = {2018-08-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Velldal, Erik; Øvrelid, Lilja; Bergem, Eivind Alexander; Stadsnes, Cathrine; Touileb, Samia; Jørgensen, Fredrik
NoReC: The Norwegian Review Corpus Proceeding
2018, (Pre SFI).
@proceedings{Velldal2018,
title = {NoReC: The Norwegian Review Corpus},
author = {Erik Velldal and Lilja Øvrelid and Eivind Alexander Bergem and Cathrine Stadsnes and Samia Touileb and Fredrik Jørgensen},
year = {2018},
date = {2018-05-12},
url = {https://repo.clarino.uib.no/xmlui/handle/11509/124},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2017
Touileb, Samia; Pedersen, Truls; Sjøvaag, Helle
Automatic identification of unknown names with specific roles Journal Article
In: Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 150-158, 2017, (Pre SFI).
@article{Touileb2017,
title = {Automatic identification of unknown names with specific roles},
author = {Samia Touileb and Truls Pedersen and Helle Sjøvaag},
url = {https://www.aclweb.org/anthology/W18-4517.pdf},
year = {2017},
date = {2017-08-01},
journal = {Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature},
pages = {150-158},
abstract = {Automatically identifying persons in a particular role within a large corpus can be a difficult task, especially if you don’t know who you are actually looking for. Resources compiling names of persons can be available, but no exhaustive lists exist. However, such lists usually contain known names that are “visible” in the national public sphere, and tend to ignore the marginal and international ones. In this article we propose a method for automatically generating suggestions of names found in a corpus of Norwegian news articles, and which “naturally” belong to a given initial list of members, and that were not known (compiled in a list) beforehand. The approach is based, in part, on the assumption that surface level syntactic features reveal parts of the underlying semantic content and can help uncover the structure of the language.},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
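A very rough, hypothetical illustration of the bootstrapping idea in the abstract above: surface contexts are harvested around known role holders and then reused to propose previously unseen names in the same role. The corpus, seed list, and single-word left-context heuristic below are invented for the example and are far simpler than the paper's surface-pattern approach.

import re

# Toy corpus and seed list (invented; not the paper's data).
corpus = [
    "Redaktør Kari Nordmann uttalte seg om saken.",
    "Redaktør Ola Hansen kommenterte rapporten.",
]
seed_names = {"Kari Nordmann"}  # names already known to hold the role

# Harvest surface contexts: the word immediately preceding a seed name.
contexts = set()
for sentence in corpus:
    for name in seed_names:
        for match in re.finditer(re.escape(name), sentence):
            preceding = sentence[:match.start()].split()
            if preceding:
                contexts.add(preceding[-1])

# Reuse the contexts to propose new capitalised two-word names in the same position.
candidates = set()
for sentence in corpus:
    for context in contexts:
        pattern = re.escape(context) + r"\s+([A-ZÆØÅ]\w+\s+[A-ZÆØÅ]\w+)"
        for match in re.finditer(pattern, sentence):
            if match.group(1) not in seed_names:
                candidates.add(match.group(1))

print(candidates)  # expected to suggest 'Ola Hansen'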
Fares, M; Kutuzov, A; Oepen, S; Velldal, Erik
Word vectors, reuse, and replicability: Towards a community repository of large-text resources Proceeding
2017, (Pre SFI).
@proceedings{Fares2017,
title = {Word vectors, reuse, and replicability: Towards a community repository of large-text resources},
author = {M Fares and A Kutuzov and S Oepen and Erik Velldal},
url = {https://www.duo.uio.no/handle/10852/65205},
year = {2017},
date = {2017-05-22},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2016
Rosén, V; Thunes, M; Haugereid, P; Losnegaard, GS; Dyvik, H; Meurer, P; Lyse, G; Smedt, Koenraad De
The enrichment of lexical resources through incremental parsebanking Journal Article
In: 2016, (Pre SFI).
@article{Rosén2016,
title = {The enrichment of lexical resources through incremental parsebanking},
author = {V Rosén and M Thunes and P Haugereid and GS Losnegaard and H Dyvik and P Meurer and G Lyse and Koenraad De Smedt},
url = {https://bora.uib.no/bora-xmlui/handle/1956/15680},
year = {2016},
date = {2016-06-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Dyvik, H; Meurer, P; Rosén, V; Smedt, Koenraad De; Haugereid, P; Losnegaard, GS; Lyse, G; Thunes, M
NorGramBank: A 'Deep' Treebank for Norwegian. Proceedings of LREC Proceeding
2016, (Pre SFI).
@proceedings{Dyvik2016,
title = {NorGramBank: A 'Deep' Treebank for Norwegian. Proceedings of LREC},
author = {H Dyvik and P Meurer and V Rosén and Koenraad De Smedt and P Haugereid and GS Losnegaard and G Lyse and M Thunes},
url = {https://www.aclweb.org/anthology/L16-1565.pdf},
year = {2016},
date = {2016-05-16},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Øvrelid, Lilja; Hohle, P
Universal dependencies for Norwegian Proceeding
2016, (Pre SFI).
@proceedings{Øvrelid2016,
title = {Universal dependencies for Norwegian},
author = {Lilja Øvrelid and P Hohle},
url = {https://www.aclweb.org/anthology/L16-1250/},
year = {2016},
date = {2016-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Rosén, V; Smedt, Koenraad De; Losnegaard, GS; Bejcek, E; Savary, A; Osenova, P
MWEs in Treebanks: From Survey to Guidelines Proceeding
2016, (Pre SFI).
@proceedings{Rosén2016b,
title = {MWEs in Treebanks: From Survey to Guidelines},
author = {V Rosén and Koenraad De Smedt and GS Losnegaard and E Bejcek and A Savary and P Osenova},
url = {https://www.aclweb.org/anthology/L16-1368.pdf},
year = {2016},
date = {2016-05-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
2012
Lapponi, E; Read, J; Øvrelid, Lilja
Representing and resolving negation for sentiment analysis Proceeding
2012, (Pre SFI).
@proceedings{Lapponi2012,
title = {Representing and resolving negation for sentiment analysis},
author = {E Lapponi and J Read and Lilja Øvrelid},
url = {https://ieeexplore.ieee.org/document/6406506},
year = {2012},
date = {2012-12-10},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Velldal, Erik; Øvrelid, Lilja; Read, J; Oepen, S
Speculation and negation: Rules, rankers, and the role of syntax Journal Article
In: 2012, (Pre SFI).
@article{Velldal2012,
title = {Speculation and negation: Rules, rankers, and the role of syntax},
author = {Erik Velldal and Lilja Øvrelid and J Read and S Oepen},
url = {https://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00126},
year = {2012},
date = {2012-01-01},
note = {Pre SFI},
keywords = {},
pubstate = {published},
tppubtype = {article}
}