Events
Talk Promotion (PhD defense): Neural Reconstruction and Rendering of Dynamic Real-World Content from Monocular Video
10.10.2025 10:00
- 10.10.2025 12:00
IZ 161
Speaker(s): Moritz Kappel
Modern imaging systems enable the preservation of memories and experiences in a compact digital format. Beyond static images, their ability to record video at high temporal resolution also preserves the dynamics and motion of the captured scene. With high-quality smartphone cameras widely available, countless videos are recorded and shared across the globe every day.
The widespread availability inherently entails an ever-growing demand for methods and tools to further enhance these dynamic videos. This thesis addresses the challenge of reconstructing and rendering dynamic representations of a scene captured in a single monocular video, enabling free spatiotemporal scene exploration and enhancing user engagement and immersion. While 3D reconstruction has been extensively studied for static image sequences and multi-view videos, the inherent absence of depth information in monocular videos (e.g., smartphone recordings) renders this problem highly ill-posed.
In this thesis, I explore three methods to overcome the challenges of monocular video-based scene reconstruction. For this purpose, I leverage recent advances in machine learning and neural rendering for interactive, photorealistic novel view synthesis, while resolving the ambiguities of scene motion and depth with additional data-driven priors. Each presented method employs its own combination of neural scene representation, rendering approach, and strategy for resolving monocular depth, tailored to the requirements of the given task: the first combines deep image translation networks with human pose estimation to generate highly realistic 2D human avatars from a temporal context; the second targets full 3D single-object reconstruction from monocularized multi-view video using neural radiance fields; and the third addresses full dynamic 3D reconstruction of casual video recordings via differentiable point rasterization initialized from monocular depth estimates.
Together, the presented techniques demonstrate how monocular videos can be enhanced for immersive digital experiences, advancing the possibilities of video-based scene reconstruction.
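As a rough illustration of the initialization step mentioned for the third method (a point cloud seeded from monocular depth estimates), the following Python sketch back-projects a depth map into 3D points under a pinhole camera model; the intrinsics and the random depth map are placeholder assumptions, not the thesis pipeline.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift a per-pixel depth map (H x W) into a 3D point cloud in camera
    coordinates. Intrinsics (fx, fy, cx, cy) are assumed known or estimated."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Example: initialize a point cloud from a (stand-in) monocular depth estimate
depth = np.random.uniform(1.0, 5.0, size=(480, 640))
points = backproject_depth(depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(points.shape)  # (307200, 3)
```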
Talk Neuronales Punkt-basiertes Rendering gestern und heute (Neural Point-Based Rendering, Then and Now)
10.10.2025 09:00
- 10.10.2025 09:45
G30
Speaker(s): Marc Stamminger
Point-based rendering was already researched intensively 25 years ago.
With the advent of neural rendering techniques, however, the topic has recently regained considerable importance.
With 3D Gaussian Splatting in particular, many of the old techniques have become highly relevant again.
The talk traces this development and presents results from the latest work in this area.
Talk MA-Talk: Moment-based Adaptive Fragment Density Reconstruction for Layered Order-Independent Transparency
12.09.2025 13:00
G30
Speaker(s): Ke Zhao
Talk BA-Talk: Designing and Implementing an Immersive Learning Space to Enhance Attention of Students with ADHD
03.09.2025 13:00
G30
Speaker(s): Nazli Cesur
Talk BA-Talk: Real-Time Control of Lighting Effects via Neural-Network-Based Detection of Salient Musical Features
01.08.2025 13:00
Hardstyle Lab
Speaker(s): Fakher Belkacem
This thesis describes the development of an interactive lighting control system that visualizes music in real-time. The system utilizes a Raspberry Pi, a microphone, and LEDs to process audio signals and generate visual effects. The core of the project is the implementation of a three-layer Deep Neural Network (DNN) trained directly on raw audio data in the time domain. This innovative approach differs from traditional methods that transform audio signals into the frequency domain.
The primary objective of the project was to apply theoretical knowledge in machine learning to a practical, hands-on project while overcoming unexpected challenges. Throughout the development process, various issues such as inconsistencies in data labeling and difficulties in model training were identified and resolved. These experiences highlighted the importance of maintaining a clear and simple vision during model training.
A notable achievement was the successful deployment of the model on a Raspberry Pi, demonstrating the system's capability to perform complex machine learning tasks on low-cost hardware. This work contributes to the fields of music visualization and interactive installations, offering potential applications in live performances and art exhibitions.
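As a rough sketch of the kind of network described above (a three-layer DNN operating directly on raw time-domain audio), the following PyTorch snippet is illustrative only; the window length, layer widths, and number of output effects are assumptions, not the configuration used in the thesis.

```python
import torch
import torch.nn as nn

class RawAudioDNN(nn.Module):
    """Three fully connected layers on raw time-domain samples.
    Window length and layer widths are illustrative placeholders."""
    def __init__(self, window_len=1024, hidden=256, num_effects=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(window_len, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_effects),  # e.g. one logit per lighting effect
        )

    def forward(self, x):          # x: (batch, window_len), samples in [-1, 1]
        return self.net(x)

model = RawAudioDNN()
window = torch.randn(1, 1024)      # stand-in for a microphone buffer
logits = model(window)
effect = logits.argmax(dim=-1)     # index of the lighting effect to trigger
```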
Talk Expanding Capabilities of 3D representations for XR
20.06.2025 11:00
G30
Speaker(s): Brent Zoomers
Novel view synthesis has seen rapid advancements, enabling the photorealistic rendering of unseen viewpoints from sparse input data. Among the latest methods, 3D Gaussian Splatting has emerged as a prominent technique due to its combination of high-fidelity reconstruction, fast training times, and editability thanks to its explicit representation. However, despite its academic success, real-world adoption remains limited. In this talk, Brent Zoomers explores the challenges that hinder the industrial application of such methods and presents his ongoing efforts to bridge this gap. He will share insights from his current research, discuss practical obstacles faced in real-world scenarios, and outline promising directions for future work aimed at making novel view synthesis viable for production settings.
Talk BA-Talk: Designing and Implementing an Augmented Reality Memory Palace to Enhance the Memory Performance of Individuals with ADHD
03.06.2025 15:00
G30
Speaker(s): Lara Maschkowitz
Children and adolescents with Attention Deficit Hyperactivity Disorder (ADHD) face challenges in the academic domain, particularly in learning. This Bachelor's thesis aims to prototypically implement a memory palace in Augmented Reality, tailored specifically to the needs of learners with ADHD, with the goal of improving their memory performance. To this end, a concept is derived from the existing literature and implemented as a prototype, followed by a critical discussion of both the concept and the prototype.
Talk BA-Talk: Designing, Developing and Exploring Virtual Reality Mindfulness Interventions to Reduce Mind Wandering in Individuals with ADHD
03.06.2025 14:00
G30
Speaker(s): Hannes Ast
This Bachelor's thesis designs, develops, and explores mindfulness interventions in virtual reality (VR) aimed at reducing mind wandering in individuals with Attention Deficit Hyperactivity Disorder (ADHD). Due to core symptoms such as inattention, impulsivity, and hyperactivity, people with ADHD often face greater challenges in academic contexts than students without ADHD and benefit from tailored support tools. A VR application was developed to address these issues by creating an immersive, controlled environment that minimizes external distractions and presents participants with a task requiring sustained focus on a single target. The results suggest that VR mindfulness can be an effective method for some individuals with ADHD and underscore the importance of personalization.
Talk MA-Talk: Semantic Segmentation of Harz Dead Trees using Multi-temporal High Resolution Optical Imagery
21.05.2025 14:00
G30
Speaker(s): Aditya Murti
Forests play a crucial role in maintaining ecological balance, supporting biodiversity, and providing resources for human use. Monitoring forest health, particularly in the face of threats such as tree mortality, is essential for effective forest management and urban planning. Land use and land cover (LULC) maps help monitor the health of forests, including tracking deforestation, reforestation, and changes due to natural disturbances such as wildfires or bark beetle infestations. Remote sensing (RS) technology has emerged as a powerful tool for environmental monitoring, offering time-series data that can capture changes in forest conditions over time.
The primary objective of this research is to develop a novel deep learning (DL) model that combines the benefits of the aforementioned models for dead-tree segmentation in multi-temporal remote sensing images of the Harz region. The model is then applied to generate monthly segmentation maps of the Harz region throughout the growing season of a year.
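To make the per-month inference concrete, here is a minimal, hypothetical PyTorch sketch that stacks monthly acquisitions along the channel axis and predicts a per-pixel dead-tree probability; the band count, layer sizes, and architecture are placeholder assumptions and not the model developed in the thesis.

```python
import torch
import torch.nn as nn

class MultiTemporalSegNet(nn.Module):
    """Toy encoder-decoder: monthly images are stacked along the channel axis
    and mapped to per-pixel dead-tree logits. Sizes are illustrative only."""
    def __init__(self, months=6, bands=4):
        super().__init__()
        in_ch = months * bands
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.Conv2d(32, 1, 1),  # logits for the dead-tree class
        )

    def forward(self, x):                 # x: (batch, months*bands, H, W)
        return self.decoder(self.encoder(x))

net = MultiTemporalSegNet()
stack = torch.randn(1, 24, 256, 256)      # six months x four spectral bands
mask = torch.sigmoid(net(stack))          # per-pixel dead-tree probability
```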
Talk Über Künstliche Intelligenz und Natürliche Dummheit (On Artificial Intelligence and Natural Stupidity)
23.04.2025 18:30
Roter Saal im Schloss, Braunschweig
Speaker(s): Marcus Magnor
Academy lecture (Akademie-Vorlesung) at the Schloss, hosted by the Braunschweigische Wissenschaftliche Gesellschaft
Talk MA-Talk: Improving Hybrid-Transparency 3D Gaussian Splatting through Exact Volumetric Rendering of Ellipsoidal Primitives
23.04.2025 12:00
G30
Speaker(s): Mathias Ivanov
Talk MA-Talk: Transferring traditional learning approaches to an immersive VR environment to enhance executive functions in students with ADHD
20.03.2025 12:00
- 20.03.2025 13:00
IZ G30
Speaker(s): Florian Krüger
Talk MA-Talk: Perception-aware Color Reduction
20.03.2025 11:00
G30
Speaker(s): Jing Wang
Color reduction is an image processing technique designed to reduce memory consumption and save transmission resources. Traditional color reduction methods, such as k-means, may ignore the perceptual characteristics of the human visual system and result in poor image quality.
To improve perceptual quality, a perceptual-loss-based method is used to generate the color-reduced image, bringing it closer to the original by optimizing the color palette and the mapping strategy.
The aim of this thesis is to develop color reduction methods that account for the human visual system and human perception by using perceptual loss functions, such as the Learned Perceptual Image Patch Similarity (LPIPS), to evaluate and optimize the color palette and mapping strategies.
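As a rough illustration of the optimization described above, the following sketch minimizes an LPIPS loss over a learnable color palette with a soft (differentiable) pixel-to-palette assignment; the palette size, softmax temperature, and optimizer settings are assumptions and not the thesis method.

```python
import torch
import lpips  # pip install lpips

loss_fn = lpips.LPIPS(net='alex')  # perceptual distance; expects inputs in [-1, 1]

def soft_quantize(img, palette, temperature=0.05):
    """Differentiably map each pixel to a convex combination of palette colors.
    img: (3, H, W) in [0, 1]; palette: (K, 3) learnable colors."""
    pixels = img.permute(1, 2, 0).reshape(-1, 3)             # (H*W, 3)
    dists = torch.cdist(pixels, palette)                      # (H*W, K)
    weights = torch.softmax(-dists / temperature, dim=-1)     # soft assignment
    quantized = weights @ palette                             # (H*W, 3)
    return quantized.reshape(img.shape[1], img.shape[2], 3).permute(2, 0, 1)

image = torch.rand(3, 128, 128)                  # stand-in input image
palette = torch.rand(16, 3, requires_grad=True)  # learnable 16-color palette
optimizer = torch.optim.Adam([palette], lr=1e-2)

for _ in range(200):
    optimizer.zero_grad()
    recon = soft_quantize(image, palette)
    # LPIPS takes (N, 3, H, W) tensors scaled to [-1, 1]
    loss = loss_fn(recon.unsqueeze(0) * 2 - 1, image.unsqueeze(0) * 2 - 1).mean()
    loss.backward()
    optimizer.step()
```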
Talk BA-Talk: Designing and Implementing a Serious Game for Teaching Concepts of Coding to Individuals with ADHD
31.01.2025 15:00
- 31.01.2025 16:00
IZ G30
Speaker(s): Joel Schaub
Talk Promotion (PhD defense): Ego-Motion Aware Immersive Rendering from Real-World Recorded Panorama Videos
31.01.2025 10:00
- 31.01.2025 12:00
IZ 161
Speaker(s): Moritz Mühlhausen
In this talk, we will explore how we can enhance the immersive experience in virtual reality (VR) by integrating natural motion effects — specifically, ego-motion-aware parallax effects — into real-world panoramic videos.
Traditional panoramic video allows users to view a scene in all directions, but it still limits the sense of presence, especially in VR, where true immersion requires not just looking around but also feeling as though you're moving within the space. This is where parallax comes in: the natural shift in perspective that occurs when we move our heads, which adds depth and realism to our surroundings.
The first part of this talk will focus on how we can use multiple panoramic images to simulate this motion effect. By applying image-warping techniques, we can approximate parallax, making the VR experience feel more dynamic and lifelike. Although this method doesn’t fully replicate the real-world motion, it significantly improves immersion.
The second part introduces a simpler but powerful approach using a single recording from a stationary omnidirectional stereo (ODS) camera. Such a camera captures images for the left and right eye simultaneously, providing built-in depth perception without the need for multiple cameras. This not only simplifies the capturing process but also allows immersive VR content to be created more efficiently.
This talk will demonstrate how these methods, whether using multiple cameras or a single ODS camera, can improve depth perception and realism in VR applications. These innovations can make VR experiences — from gaming to education — more engaging and lifelike by offering an experience that feels more connected to real-world motion.
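To illustrate the underlying idea of depth-driven parallax from a panorama, here is a minimal numpy sketch that displaces equirectangular pixels according to a per-pixel depth map and a head offset; the uniform depth and the coordinate conventions are illustrative assumptions rather than the presented systems.

```python
import numpy as np

def reproject_equirect(depth, head_offset):
    """Map each pixel of an equirectangular panorama to the direction it would
    have after the viewer translates by head_offset (x, y, z in meters).
    Returns new (longitude, latitude) per pixel; a renderer would then
    forward-warp colors along this mapping. depth is a per-pixel distance map."""
    h, w = depth.shape
    # Longitude in [-pi, pi), latitude in [-pi/2, pi/2]
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit viewing directions, scaled by depth, shifted by the head motion
    dirs = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)
    points = dirs * depth[..., None] - np.asarray(head_offset)
    # Re-project the displaced points back onto the viewing sphere
    new_lon = np.arctan2(points[..., 0], points[..., 2])
    new_lat = np.arcsin(points[..., 1] / np.linalg.norm(points, axis=-1))
    return new_lon, new_lat

depth = np.full((512, 1024), 3.0)                  # stand-in depth estimate
lon2, lat2 = reproject_equirect(depth, head_offset=(0.1, 0.0, 0.0))
```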
Talk MA-Talk: 4D Diffusion Priors for Robust Dynamic View Synthesis from Monocular Video
24.01.2025 13:00
- 24.01.2025 14:00
IZ G30
Speaker(s): Timon Scholz
Reconstructing dynamic scenes for novel view synthesis from only a single monocular input video is a severely under-constrained problem due to significant ambiguities in the inputs. Nonetheless, recent advancements in neural rendering have led to significant improvements in the field, especially when combined with high-quality priors for regularization. One prior recently explored for static scene reconstruction is diffusion models, which can generate photorealistic images even with very little guidance. Taming these models to produce multi-view-consistent outputs has proven challenging, but recent research has shown promising results. Because of these challenges, diffusion-based priors for dynamic novel view synthesis remain largely unexplored.
For this reason, I adapt the recent ViewCrafter model for static scene reconstruction as a diffusion prior for the state-of-the-art dynamic neural rendering model D-NPC. The presented approach models each timestep of the input video separately, producing diffusion-based novel views for the entire sequence. I then use these views to regularize the training gradients of the D-NPC model.
By evaluating this approach both qualitatively and quantitatively, I am able to show promising results for improving the quality of the reconstructed scene. However, my findings also indicate that currently available diffusion models do not provide sufficient consistency among views to always benefit the reconstruction. In fact, ViewCrafter fails to produce plausible results on many scenes and often yields geometrically inconsistent outputs, leading to excessive blurring and a significant loss of visual quality. By documenting these challenges, I hope to provide a baseline for future work on improving diffusion models for use as scene reconstruction priors.
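As a rough sketch of how diffusion-generated novel views can act as a regularizer, the following toy loss combines a photometric term on the input view with an auxiliary term pulling rendered novel views toward the diffusion output; the loss types and weighting are assumptions and not the exact D-NPC/ViewCrafter integration.

```python
import torch

def training_loss(rendered_input_view, input_frame,
                  rendered_novel_view, diffusion_novel_view,
                  prior_weight=0.1):
    """Combine the usual photometric loss on the input view with an auxiliary
    term that pulls rendered novel views toward diffusion-generated ones.
    prior_weight trades reconstruction fidelity against prior consistency."""
    photometric = torch.nn.functional.l1_loss(rendered_input_view, input_frame)
    prior = torch.nn.functional.l1_loss(rendered_novel_view, diffusion_novel_view)
    return photometric + prior_weight * prior
```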
Talk MA-Talk: Improving Surface Extraction from Neural Radiance Fields through Explicit Handling of Specular Reflections
13.01.2025 13:00
IZ G30
Speaker(s): Shangxiao Ye
Talk Promotions-Vor-Vortrag (PhD pre-defense talk): Neural Reconstruction and Rendering of Dynamic Real-World Content from Monocular Video
06.12.2024 13:00
- 06.12.2024 14:00
IZ G30
Speaker(s): Moritz Kappel
Dynamic video content is ubiquitous in our daily lives, with countless recordings shared across the globe. But how can we unlock the full potential of these casual captures? The challenge lies in reconstructing immersive representations from monocular videos, a task made difficult by the inherent lack of depth information. This talk explores three machine learning approaches that address this challenge through distinct rendering techniques and strategies to resolve monocular motion and depth ambiguities.
The initial method leverages image translation networks to synthesize human shape, structure, and appearance from pose and motion inputs, enabling temporally consistent human motion transfer with fine clothing dynamics. We will then explore neural radiance fields trained on monocularized multi-view videos for efficient single object reconstruction. Finally, we examine dynamic neural point clouds that incorporate learned priors, such as monocular depth estimation and object segmentation, to resolve ambiguities and enable fast, high-quality scene reconstruction and rendering.
Together, these techniques demonstrate how monocular videos can be transformed into immersive digital experiences, advancing the possibilities of video-based scene reconstruction.
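As a minimal illustration of how an object segmentation prior can help resolve ambiguities, the following sketch splits a back-projected point cloud into static and dynamic subsets using a foreground mask; the mask source and threshold are placeholder assumptions, not the presented method.

```python
import numpy as np

def split_static_dynamic(points, mask, threshold=0.5):
    """Partition a per-pixel point cloud into static background and dynamic
    foreground using a segmentation mask from an off-the-shelf network.
    points: (H*W, 3) back-projected points; mask: (H, W) foreground probability."""
    dynamic = mask.reshape(-1) > threshold
    return points[~dynamic], points[dynamic]

points = np.random.rand(480 * 640, 3)    # stand-in back-projected points
mask = np.random.rand(480, 640)          # stand-in segmentation output
static_pts, dynamic_pts = split_static_dynamic(points, mask)
```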
Talk Neural Point-based Radiance Field Rendering for VR
02.12.2024 14:00
- 02.12.2024 15:00
IZ G30
Speaker(s): Linus Franke
Point-based radiance field rendering has demonstrated impressive results for novel view synthesis, offering a compelling blend of rendering quality and computational efficiency, recently showcased with 3D Gaussian Splatting.
This talk will cover similar point-based representations with small neural splats, which allow for reconstruction with very fine detail. This concept is extended for VR rendering, exploiting the human perceptual system for acceleration.
Furthermore, the presentation will explore scene self-refinement capabilities, including techniques for point cloud completion, pose correction and photometric parameter optimization. These techniques address common issues in real-world capturing and significantly enhance rendering quality and temporal stability.
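To convey the general idea behind pose correction through a differentiable renderer, here is a toy PyTorch sketch that refines a 2D "pose" (a translation) by descending a photometric loss; the stand-in renderer and all parameters are assumptions, not the system shown in the talk.

```python
import torch

def render(scene, pose):
    # Toy stand-in for a differentiable renderer: the "scene" is an image and
    # the "pose" is a 2D translation applied via a differentiable grid sample.
    theta = torch.cat([torch.eye(2), pose.view(2, 1)], dim=1).unsqueeze(0)
    grid = torch.nn.functional.affine_grid(theta, scene.shape, align_corners=False)
    return torch.nn.functional.grid_sample(scene, grid, align_corners=False)

def refine_pose(scene, pose_init, target, steps=200, lr=1e-2):
    """Refine a camera pose by descending a photometric loss through the
    renderer -- the essence of the pose-correction idea mentioned above."""
    pose = pose_init.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.l1_loss(render(scene, pose), target)
        loss.backward()
        optimizer.step()
    return pose.detach()

scene = torch.rand(1, 3, 64, 64)                    # stand-in scene/image
target = render(scene, torch.tensor([0.1, 0.05]))   # "captured" frame
recovered = refine_pose(scene, torch.zeros(2), target)
```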
Talk MA-Talk: Exploration and Analysis of Flow Data in Augmented Reality
08.11.2024 13:00
IZ G30
Speaker(s): Anna-Lena Ehmer
Talk BA-Talk: Accelerated Rendering of Implicit Neural Point Clouds through Hardware Rasterization
28.10.2024 13:00
IZ G30
Speaker(s): Tim Stuppenhagen
Talk BA-Talk: Two-Plane-Parameterized Beam Acceleration Data Structure
24.10.2024 13:00
IZ G30
Speaker(s): Marius Werkmeister
Talk BA-Talk: Extension of the Unreal Engine to Generate Image Datasets with a Physically Plausible Range of Light Intensity Values
02.10.2024 14:00
IZ G30
Speaker(s): Maximilian Giller
Talk MA-Talk: Voice in Focus: Debunking and Identifying Audio Deepfakes in Forensic Scenarios
27.09.2024 13:00
IZ G30
Speaker(s): Maurice Semren
In today's media-dominated world, the use of Voice Conversion systems and manipulated audio samples (deep fakes) is becoming increasingly widespread. However, these methods can often spread misinformation and cause confusion. Although there are systems that can identify these fakes, as of now, there is no technology that can accurately identify the source speaker. Developing such systems could greatly assist law enforcement and discourage the misuse of this technology. This work focuses on identifying the original speaker in Voice Conversion deepfakes using a specific list of potential suspects. We examine various Voice Conversion systems, comparing their overall quality, how closely they resemble the target speaker, and how well they disguise the original speaker. Additionally, we compare results from a human perception experiment with machine-based metrics derived from Speaker Verification tools.
On average, the machine-based metrics appear to yield more accurate identification results than the human listeners, even when the participants and the speaker are familiar with each other.
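As a simple illustration of closed-set speaker identification with verification-style metrics, the following sketch scores a deepfake embedding against a suspect list by cosine similarity; the embeddings here are random stand-ins, and a real pipeline would obtain them from a pretrained speaker-verification model.

```python
import numpy as np

def identify_source_speaker(deepfake_embedding, suspect_embeddings):
    """Score a deepfake against a closed list of suspects by cosine similarity
    of speaker embeddings and return the best-matching suspect index.
    Embeddings are assumed to come from a speaker-verification model."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = [cosine(deepfake_embedding, e) for e in suspect_embeddings]
    return int(np.argmax(scores)), scores

# Stand-in embeddings; in practice these come from a verification network.
fake = np.random.rand(192)
suspects = [np.random.rand(192) for _ in range(5)]
best, scores = identify_source_speaker(fake, suspects)
```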
Talk BA-Talk: Learned Initialization of Neural Rendering Networks for Point-Based Novel View Synthesis
16.09.2024 13:00
G30
Speaker(s): Leon Overkämping