Vision Mini Conference 2025

Showcasing Cutting-Edge Research and Innovation

Oral Session 1

Time	Title	Presenter
9:00-9:10	Region-Based Representations	Savya Khosla
9:10-9:20	Mobile Manipulation	Arjun Gupta
9:20-9:30	Versatile Video Diffusion Editor for Videos and 3D Scenes	Junkun Chen
9:30-9:40	Versatile Multimodal Learning with Less Human Supervision	Shengcao Cao
9:40-9:50	Autoregressive Generative Models	Ziqi Pang
9:50-10:00	Scalable, Physics-Driven Generation of Human-Object Interactions	Sirui Xu
10:00-10:10	Robotic Manipulation by Imitating Generated Videos with Zero Training Demos	Shivansh Patel
10:10-10:20	C3T: Cross-modal Transfer Through Time for Human Action Recognition	Abhi Kamboj
Coffee Break (10:20-10:30)

Oral Session 2

Time	Title	Presenter
10:30-10:40	Learning Predictive Visuomotor Coordination	Wenqi Jia
10:40-10:50	Weather Simulation with Video Diffusion Model	Chih-Hao Lin
10:50-11:00	Perceive, Simulate, Imitate: a Simple Framework for Cross-Embodiment Visual Imitation	Albert Zhai
11:00-11:10	Template Based Visual Program Distillation	Michal Shlapentokh-Rothman
11:10-11:20	LidarDM	Vlas Zyrianov
11:20-11:30	Tool-as-Interface: Learning Robot Policies from Human Tool Usage through Imitation Learning	Haonan Chen
11:30-11:40	Decomposing Any Image into 3D Primitives with Applications to Controlled Image Synthesis	Vaibhav Vavilala
11:40-11:50	DiffVax: Optimization-Free Image Immunization Against Diffusion-Based Editing	Ozgur Kara
11:50-12:00	SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images	Zixuan Huang
Lunch (12:00-1:15)

Panel Sessions

Time	Session	Details	Moderators
1:15-2:00	Faculty Panel	Discussion with faculty members	Aditya Prakash & Zhen Zhu
2:00-3:00	Alumni Panel	Discussion with program alumni	Sethuraman T V & Savya Khosla

Oral Session 3

Time	Title	Presenter
3:10-3:20	Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation	Bolin Lai
3:20-3:30	From PhysGen to PhysGen3D: Crafting a Miniature Interactive World from a Single Image	Shaowei Liu
3:30-3:40	Web Agents	Qianlan Yang
3:40-3:50	Unleashing Diffusion for Perception	Xin Xu
Posters + Discussion (3:50-5:00)

Alumni Panel

Meet our alumni, who have made significant contributions to their fields.

Unnat Jain

Research @ Skild AI

Bryan A. Plummer

Research @ Boston University

Kevin Karsch

Research @ Amazon

Bowen Cheng

Research @ OpenAI

Daniel McKee

Research @ Snapchat

Jason Ren

Research @ Apple

Anand Bhattad

Research @ TTIC

Chuhang Zou

Research @ Amazon