Always more, always new, always short. The vast amount of freely available content on the internet has created fierce competition among creators. News agencies fight daily to win attention and stay relevant to their target groups. Automated content generation and analysis is our way of improving efficiency and supporting journalists in their reporting and storytelling. As a new member of work package 3, Adane Nega Tarekegn is already fully engaged in content generation and analysis, spanning unimodal to multimodal approaches, and in the creation of immersive content with the help of generative AI and deep learning techniques.
Adane has now presented three project proposals, which he will work on with our industry partners over the next four years.
Automated video summarization
One of his ideas is to use AI and deep learning to create video summaries together with one of MediaFutures' partners. Studies have shown that video length is essential for viewer engagement. Because short videos attract more clicks and higher viewer engagement, using AI and deep learning to automatically select the most important parts of video material and assemble them into collections of short video summaries can help our industry partners compete with strong creators on the market.
Adane does not only want to shorten videos; he also aims to extract material from different sources and combine it, like a collage, into a new, shorter video. Video editing tasks such as summarization and adapting the scale of content to different formats thus have the potential to improve viewer engagement and content consumption across diverse platforms.
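To make the idea of automatically selecting the most important parts of a video concrete, here is a minimal sketch in Python. It is not the method used in the project. It assumes that each video segment has already been given an importance score (for example by a trained model, which is not shown here), and the names `Segment` and `summarize` are hypothetical. The sketch simply picks the highest-scoring segments that fit into a target summary length and plays them back in their original order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # start time in seconds
    end: float    # end time in seconds
    score: float  # assumed importance score; how it is computed is out of scope

    @property
    def duration(self) -> float:
        return self.end - self.start


def summarize(segments, budget_seconds):
    """Greedily pick the highest-scoring segments that fit the time budget."""
    chosen = []
    remaining = budget_seconds
    # Consider the most important segments first.
    for seg in sorted(segments, key=lambda s: s.score, reverse=True):
        if seg.duration <= remaining:
            chosen.append(seg)
            remaining -= seg.duration
    # Return the chosen segments in their original chronological order.
    return sorted(chosen, key=lambda s: s.start)


if __name__ == "__main__":
    # Illustrative, made-up scores for a 60-second video.
    video = [
        Segment(0, 10, 0.2),
        Segment(10, 25, 0.9),
        Segment(25, 30, 0.6),
        Segment(30, 50, 0.8),
        Segment(50, 60, 0.4),
    ]
    for seg in summarize(video, budget_seconds=30):
        print(f"{seg.start:>4}s to {seg.end:>4}s (score {seg.score})")
```

A greedy selection like this is only one simple option. Because the segments are passed in as a plain list, the same function could also be given segments drawn from several source videos, in the spirit of the collage idea described above, although ordering across sources would then need more thought.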
Multi-modal content generation
Generative AI tools, such as ChatGPT for text generation or Midjourney and OpenAI’s DALL-E for creating visual content, primarily operate within a single modality. What is missing is multi-modal content generation, which considers text, audio, and other available contextual information in combination. Instead of using AI only to process the material, Adane wants to explore how different modalities can be combined, manipulated, and synthesized to produce coherent and engaging content, providing insight into the generation of compelling stories.
AI can make it easier and more affordable to tailor production to different aspects of a domain, such as politics, the economy, or lifestyle.
Immersive content using 3D reconstruction
As a third project, Adane wants to concentrate on developing AI-generated three-dimensional (3D) reconstructions of objects or scenes that were originally captured in 2D. 3D reconstruction is the task of recovering the 3D world from images or videos, and it has a wide range of applications. By applying deep learning and AI, 3D reconstruction can become more affordable and easier to use, enhancing storytelling and user engagement across various journalistic topics. Already used in crime reporting and for historical scenes, 3D enables stories to be interactive, immersive, and expressive from a viewer-controlled first-person perspective. Although AI-based methods are widely used in image processing and computer vision, they are not yet well investigated in 3D video reconstruction and immersive technologies, and this is what Adane wants to focus on during his postdoc.