Siebel School Speaker Series Master Calendar

Machine Learning Seminar: Dr. Haoxiang Wang, "Beyond Diffusion: The Rise of LLM-Native Image Generation."

Event Type: Seminar/Symposium
Sponsor: CS 591 MLR Organizers
Virtual: Join online
Date: Oct 31, 2025, 2:00 - 3:15 pm
Speaker: Dr. Haoxiang Wang
Contact: Allison Mette
E-Mail: agk@illinois.edu
Originating Calendar: Siebel School Speakers Calendar

Abstract: The 2025 releases of OpenAI's GPT-4o image generation and Google's "Nano Banana" (Gemini 2.5 Flash Image) marked a clear paradigm shift, moving image generation away from diffusion models (e.g., Stable Diffusion) and toward "omni models": general-purpose transformers capable of understanding and generating across multiple modalities. This "LLM-native" approach integrates generation directly into the chat context, enabling superior controllability, robust text rendering, and intuitive language-based editing.

This talk provides a technical survey of this emerging field. We will discuss how these systems are typically created by finetuning existing large language models (LLMs) or vision-language models (VLMs) to acquire generative capabilities. We will analyze the core mechanisms, where a single transformer learns to emit visual tokens, often alongside text, using next-token prediction. We will review recent works along this line and systematically compare and contrast key architectural decisions that impact performance.

Bio: Haoxiang Wang is a Research Scientist at Luma AI. Prior to joining Luma, he was a Research Scientist at NVIDIA, where he worked on world models and vision-language models. Haoxiang completed his Ph.D. in Electrical and Computer Engineering at UIUC in 2024, under the supervision of Prof. Han Zhao and Prof. Bo Li. His Ph.D. research focused on several areas of machine learning, such as RLHF, OOD generalization, and multi-task learning. During his studies, he also interned at Apple, Amazon, and Waymo.
