More Room for Language: Investigating the Effect of Retrieval on Language Models

February 13 @ 12:15 - 13:00

The Language Technology Group (LTG) at the University of Oslo is organising weekly RAG seminars. The group does research on a number of topics related to Natural Language Processing (NLP), a subfield of Artificial Intelligence that enables computers to 'make sense' of human language. This week's speakers for the LTG research seminar are David Samuel, Lucas Charpentier and Sondre Wold, PhD candidates at the LTG.

They will present their work on investigating the effects of pretraining with retrieval on a language model.

The talk will be held in room 4118, Styrerom, from 12:15 to 13:00 on Tuesday 13th of February and will also be streamed on Zoom. You can join with this link: https://uio.zoom.us/j/66002313655?pwd=UlZBUnRPdG9zdytIWUxmczR3TlVOQT09


Retrieval-augmented language models pose a promising alternative to standard language modeling. During pretraining, these models search in a corpus of documents for contextually relevant information that could aid the language modeling objective. We introduce an 'ideal retrieval' methodology to study these models in a fully controllable setting. We conduct an extensive evaluation to examine how retrieval augmentation affects the behavior of the underlying language model. Among other things, we observe that these models: (i) store substantially less world knowledge in their weights, (ii) are better at understanding local context and inter-word dependencies, but (iii) are worse at comprehending global context.

