BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//MediaFutures - ECPv6.15.13.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:MediaFutures
X-ORIGINAL-URL:https://mediafutures.no
X-WR-CALDESC:Events for MediaFutures
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Europe/Oslo
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20230326T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20231029T030000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20240331T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20241027T030000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20250330T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20251026T030000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Europe/Oslo:20240903T123000
DTEND;TZID=Europe/Oslo:20240903T140000
DTSTAMP:20260729T041733Z
CREATED:20240826T092927Z
LAST-MODIFIED:20240826T092927Z
UID:19206-1725366600-1725372000@mediafutures.no
SUMMARY:Advancing Image and Video Understanding Beyond Traditional Paradigms: Vocabulary-Free Classification and Zero-Shot Temporal Localization
DESCRIPTION:The Department of Information Science and Media Studies and the I2S research group welcome you to this seminar with Paolo Rota of the University of Trento.\nThe rapid evolution of vision-language models is transforming the landscape of image and video understanding\, going beyond traditional classification and localization paradigms. We will explore two recent methodologies that challenge the conventional reliance on predefined vocabularies and training data. The first part of the talk introduces the concept of Vocabulary-Free Image Classification (VIC)\, a novel approach that assigns classes to images without the constraints of a fixed vocabulary. We will delve into the challenges of operating within an unconstrained semantic space containing millions of concepts and present Category Search from External Databases (CaSED)\, a training-free method that leverages external vision-language databases for efficient and accurate classification. In the second part\, we will shift focus to Test-Time Zero-Shot Temporal Action Localization (ZS-TAL)\, which tackles the problem of identifying and locating unseen actions in untrimmed videos without the need for annotated training data. We will introduce the Test-Time adaptation for Temporal Action Localization (T3AL) approach\, which adapts a pre-trained Vision and Language Model (VLM) to perform action localization in a self-supervised manner\, significantly improving generalization across diverse video domains. Finally\, we will show how LLMs can be used as a kind of orchestrator to solve research problems autonomously through visual programming.\nPaolo is an assistant professor at the Center for Mind and Brain (CIMeC) at the University of Trento. He received his Ph.D. from the same university and has worked as a postdoctoral Marie Curie fellow at TU Wien and as a postdoc at the Istituto Italiano di Tecnologia in Genoa. He also worked as an ML researcher at the ProM Facility in Rovereto.\nHe has been an assistant professor at the University of Trento since 2019 and started his tenure track in 2022. His research focuses on image and video classification using vision and language.
URL:https://mediafutures.no/event/advancing-image-and-video-understanding-beyond-traditional-paradigms-vocabulary-free-classification-and-zero-shot-temporal-localization/
LOCATION:Ulrike Pihls hus\, Seminarrom 2B
CATEGORIES:Events
ATTACH;FMTTYPE=image/png:https://mediafutures.no/wp-content/uploads/Frame-36.png
END:VEVENT
END:VCALENDAR