Norwegian Language Technologies

/ Introduction

Language technologies are at the core of media technologies. This work package aims to provide datasets and models for Norwegian (Bokmål/Nynorsk) that support the automated understanding as well as the automated production of media texts in this language. 

Objective: WP5 adopts theoretical approaches and methodologies based primarily on linguistic data science, including neural learning. Large corpora will be annotated, drawing on media language data from the user partners and on data and tools from the research partners. The labelled examples in these corpora will be used to train and evaluate supervised models that demonstrate advanced approaches in areas such as robust deep language analysis, adaptive language generation, event identification and extraction, and opinion analysis. The partners will cooperate to explore the use of such models for innovative purposes.

/ People

Lilja Øvrelid

Work Package Leader

University of Oslo 

Koenraad De Smedt

Work Package Co-Leader

Erik Velldal

Key Researcher and Task Leader

University of Oslo 

Emiliano Guevara

Industry WP5 Co-Leader

Amedia

Samia Touileb

Researcher

Huiling You

PhD Candidate

University of Oslo 

/ Publications

2022

Samia Touileb; Debora Nozza

Measuring Harmful Representations in Scandinavian Language Models Conference

2022.

Petter Mæhlum; Andre Kåsen; Samia Touileb; Jeremy Barnes

Annotating Norwegian language varieties on Twitter for Part-of-speech Workshop

2022.

Samia Touileb; Lilja Øvrelid; Erik Velldal

Occupational Biases in Norwegian and Multilingual Language Models Workshop

2022.

2020

Samia Touileb; Lilja Øvrelid; Erik Velldal

Gender and sentiment, critics and authors: a dataset of Norwegian book reviews Journal Article

In: Gender Bias in Natural Language Processing. Association for Computational Linguistics, 2020, (Pre SFI).

J Barnes; Erik Velldal; Lilja Øvrelid

Improving sentiment analysis with multi-task learning of negation Journal Article

In: 2020, (Pre SFI).

J Barnes; Lilja Øvrelid; Erik Velldal

Sentiment analysis is not solved! Assessing and probing sentiment classification Proceeding

2020, (Pre SFI).

Wafia Adouane; Samia Touileb; Jean-Philippe Bernardy

Identifying Sentiments in Algerian Code-switched User-generated Comments Conference

2020, (Pre SFI).

Lilja Øvrelid; P Mæhlum; J Barnes; Erik Velldal

A Fine-Grained Sentiment Dataset for Norwegian Proceeding

2020, (Pre SFI).

F Jørgensen; T Aasmoe; ASR Husevåg; Lilja Øvrelid; Erik Velldal

NorNE: Annotating Named Entities for Norwegian Proceeding

2020, (Pre SFI).

P Meurer; V Rosén; Koenraad De Smedt

Interactive Visualizations in INESS Book Chapter

In: Butt, M.; Hautli-Janisz, A.; Lyding, V. (Eds.): 2020, (Pre SFI).

Pierre Lison; Aliaksandr Hubin; Jeremy Barnes; Samia Touileb

Named Entity Recognition without Labelled Data: A Weak Supervision Approach Journal Article

In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1518–1533, 2020, (Pre SFI).

Koenraad De Smedt; D Koureas; P Wittenberg

FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units Journal Article

In: 2020, (Pre SFI).

2019

Jeremy Barnes; Samia Touileb; Lilja Øvrelid; Erik Velldal

Lexicon information in neural sentiment analysis: a multi-task learning approach Conference

Linköping University Electronic Press, 2019, (Pre SFI).

2018

A Kutuzov; Lilja Øvrelid; T Szymanski; Erik Velldal

Diachronic word embeddings and semantic shifts: a survey Proceeding

2018, (Pre SFI).

Erik Velldal; Lilja Øvrelid; Eivind Alexander Bergem; Cathrine Stadsnes; Samia Touileb; Fredrik Jørgensen

NoReC: The Norwegian Review Corpus Proceeding

2018, (Pre SFI).

2017

Samia Touileb; Truls Pedersen; Helle Sjøvaag

Automatic identification of unknown names with specific roles Journal Article

In: Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 150-158, 2017, (Pre SFI).

M Fares; A Kutuzov; S Oepen; Erik Velldal

Word vectors, reuse, and replicability: Towards a community repository of large-text resources Proceeding

2017, (Pre SFI).

2016

V Rosén; M Thunes; P Haugereid; GS Losnegaard; H Dyvik; P Meurer; G Lyse; Koenraad De Smedt

The enrichment of lexical resources through incremental parsebanking Journal Article

In: 2016, (Pre SFI).

H Dyvik; P Meurer; V Rosén; Koenraad De Smedt; P Haugereid; GS Losnegaard; G Lyse; M Thunes

NorGramBank: A 'Deep' Treebank for Norwegian. Proceedings of LREC Proceeding

2016, (Pre SFI).

Lilja Øvrelid; P Hohle

Universal dependencies for Norwegian Proceeding

2016, (Pre SFI).

V Rosén; Koenraad De Smedt; GS Losnegaard; E Bejcek; A Savary; P Osenova

MWEs in Treebanks: From Survey to Guidelines Proceeding

2016, (Pre SFI).

2012

E Lapponi; J Read; Lilja Øvrelid

Representing and resolving negation for sentiment analysis Proceeding

2012, (Pre SFI).

Erik Velldal; Lilja Øvrelid; J Read; S Oepen

Speculation and negation: Rules, rankers, and the role of syntax Journal Article

In: 2012, (Pre SFI).
