Vision Mini Conference 2025

Showcasing Cutting-Edge Research and Innovation

Oral Session 1

| Time | Title | Presenter |
| --- | --- | --- |
| 9:00-9:10 | Region-Based Representations | Savya Khosla |
| 9:10-9:20 | Mobile Manipulation | Arjun Gupta |
| 9:20-9:30 | Versatile Video Diffusion Editor for Videos and 3D Scenes | Junkun Chen |
| 9:30-9:40 | Versatile Multimodal Learning with Less Human Supervision | Shengcao Cao |
| 9:40-9:50 | Autoregressive Generative Models | Ziqi Pang |
| 9:50-10:00 | Scalable, Physics-Driven Generation of Human-Object Interactions | Sirui Xu |
| 10:00-10:10 | Robotic Manipulation by Imitating Generated Videos with Zero Training Demos | Shivansh Patel |
| 10:10-10:20 | C3T: Cross-modal Transfer Through Time for Human Action Recognition | Abhi Kamboj |

Coffee Break (10:20-10:30)

Oral Session 2

| Time | Title | Presenter |
| --- | --- | --- |
| 10:30-10:40 | Learning Predictive Visuomotor Coordination | Wenqi Jia |
| 10:40-10:50 | Weather simulation with video diffusion model | Chih-Hao Lin |
| 10:50-11:00 | Perceive, Simulate, Imitate: a Simple Framework for Cross-Embodiment Visual Imitation | Albert Zhai |
| 11:00-11:10 | Template Based Visual Program Distillation | Michal Shlapentokh-Rothman |
| 11:10-11:20 | LidarDM | Vlas Zyrianov |
| 11:20-11:30 | Tool-as-Interface: Learning Robot Policies from Human Tool Usage through Imitation Learning | Haonan Chen |
| 11:30-11:40 | Decomposing any image into 3D primitives with applications to controlled image synthesis | Vaibhav Vavilala |
| 11:40-11:50 | DiffVax: Optimization-Free Image Immunization Against Diffusion-Based Editing | Ozgur Kara |
| 11:50-12:00 | SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images | Zixuan Huang |

Lunch (12:00-1:15)

Panel Sessions

| Time | Session | Details | Moderators |
| --- | --- | --- | --- |
| 1:15-2:00 | Faculty Panel | Discussion with faculty members | Aditya Prakash & Zhen Zhu |
| 2:00-3:00 | Alumni Panel | Discussion with program alumni | Sethuraman T V & Savya Khosla |

Oral Session 3

| Time | Title | Presenter |
| --- | --- | --- |
| 3:10-3:20 | Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation | Bolin Lai |
| 3:20-3:30 | From PhysGen to PhysGen3D: Crafting a Miniature Interactive World from a Single Image | Shaowei Liu |
| 3:30-3:40 | Web Agents | Qianlan Yang |
| 3:40-3:50 | Unleashing Diffusion for Perception | Xin Xu |

Posters + Discussion (3:50-5:00)

Alumni Panel

Meet our alumni, who have made significant contributions to their fields.