BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//MediaFutures - ECPv6.15.13.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://mediafutures.no
X-WR-CALDESC:Events for MediaFutures
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Europe/Oslo
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20240331T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20241027T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20250330T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20251026T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20260329T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20261025T010000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Europe/Oslo:20250929T091500
DTEND;TZID=Europe/Oslo:20250929T140000
DTSTAMP:20260404T175744Z
CREATED:20250905T080105Z
LAST-MODIFIED:20250911T090710Z
UID:21526-1759137300-1759154400@mediafutures.no
SUMMARY:Human-AI Interaction for Video Content: Designing and Engineering Multimodal Conversational Agents
DESCRIPTION:Our PhD Candidate Peter Andrews will defend his thesis on September 29th at the University of Bergen. \nThe trial lecture starts at 09:15 and the defense at 10:30. \nWe encourage you to attend his trial lecture and defense\, and learn more about HCI. \nTitle: \nHuman-AI Interaction for Video Content: Designing and Engineering Multimodal Conversational Agents \nAbstract: \n\nAs young adults increasingly shift away from conventional news sources\, interactive and AI-driven media present a new frontier for their engagement in news consumption. Young adults often prefer more interactive video content on streaming platforms\, challenging the traditional model of passive video consumption. Second screening\, interacting with a second device while watching a primary display\, has emerged to satisfy the need for interaction by providing additional content and context. However\, second screening can hinder comprehension\, revealing the need to synchronize the experience. \n\n\nThis thesis unifies the second screening experience with Computer Vision (CV) and Deep Learning (DL)\, thereby building an interactive video framework following the "From Video to Data" → "From Data to Narrative" → "From Narrative to Interaction" paradigm. The result is a Multimodal Conversational Agent (MCA) that can hyper-contextualize video content. This video framework encompasses three research questions: 1) How can recent advances in computer vision and artificial intelligence facilitate interaction with video content? 2) How can interactive video increase subjective understanding of the content? 3) How do young adults perceive the user experience of interactive video for news broadcasts? Answering these questions gives a better grasp of what is needed to build an end-to-end interactive video framework with AI. At the same time\, empirical research can show how the capabilities of the framework can improve user experience and comprehension.
\n\n\nTo address these questions\, I developed prototypes for interactive video in sports (football) and politics. I approached the video framework in a modular manner with four in-house design prototypes – FootyVision\, the Automated Commentary System (ACS)\, AiCommentator\, and AiModerator. Collectively\, these four prototypes demonstrate how CV- and NLP-based event detection and LLM-powered MCAs can synchronize and facilitate real-time interaction with video content. I tested the prototypes in lab-based mixed-methods studies and found that interactive video with an MCA can enhance engagement\, immersion\, and subjective understanding. However\, a Human-AI Interaction (HAI) trade-off arises between automation and user control. While a high degree of automation can tightly synchronize the experience\, it comes at the cost of user control. The affordances of MCAs include multimodal feedback and remediation. Multimodal feedback supports subjective understanding\, which aligns with the Cognitive Theory of Multimedia Learning (CTML). Remediation involves repurposing traditional roles in innovative ways. MCAs achieve this by transforming sports commentators and political moderators into remediated personas\, thus leading to increased engagement. Moreover\, MCAs can push the user into a more objective viewing state\, highlighting a trade-off between objectivity and emotional involvement. Finally\, trust is paramount in high-stakes environments where transparency is crucial. \n\n\nOverall\, my research challenges traditional linear media by integrating CV\, DL\, and NLP into an interactive framework that facilitates on-demand information augmented by the information space. However\, future systems must address key concerns regarding the aforementioned trade-offs and the management of cognitive load. I recommend variable autonomy and transparency to give the user control over the experience\, reinforcing both trust and understanding through Human-Centered AI (HCAI).
By synthesizing these findings within human-AI interaction and multimedia learning frameworks\, my work provides valuable insights for researchers\, developers\, and broadcasters looking to engage the next generation of news consumers through interactive video. \n\nOpponents:\n\nDr. Petra Isenberg (Research Director\, DR2)\, Laboratoire Interdisciplinaire des Sciences du Numérique\, Université Paris-Saclay\nProf. Huamin Qu\, Department of Computer Science and Engineering\, Hong Kong University of Science and Technology\n\nHead of committee:\nProf. Miroslav Bachinski\nModerator of the defense:\nProf. Bjørnar Tessem
URL:https://mediafutures.no/event/phd-defense/
LOCATION:Jusbygget\, Auditorium 3
CATEGORIES:Events
ATTACH;FMTTYPE=image/png:https://mediafutures.no/wp-content/uploads/Frame-131-3.png
END:VEVENT
END:VCALENDAR