Image generation with end-to-end training and benefits of a good VAE
Latent diffusion models underlie modern image generation; they require a variational auto-encoder (VAE) for image encoding and decoding, and a diffusion transformer for generation. While end-to-end training has been the spirit of deep learning, latent diffusion models are, surprisingly, not trained end-to-end, which creates representation bottlenecks. In this talk, I will introduce our work that jointly trains the VAE and the diffusion transformer, and show how it accelerates training and yields high-quality images. Further, I will discuss use cases where the resulting end-to-end trained VAEs bring significant benefits, including higher-quality text-to-image generation and automatic agentic search of diffusion transformer architectures. I will conclude with new perspectives.
Liang Zheng
Australian National University
Dr. Liang Zheng is an Associate Professor at the Australian National University and a Research Scientist at Canva. He is interested in representation learning for perception and generation. He has contributed many useful datasets and methods to the object re-identification field that were later adopted in wider domains. He is currently working on image generation, covering both pre-training and post-training. He is a Program Chair for ACM MM’24, MM’28, and AVSS’24, and a General Chair for AVSS’27 and DICTA 2027. He is a regular area chair for major conferences and an Associate Editor for TPAMI. He holds bachelor’s degrees in Biology and Economics and a PhD in Computer Science from Tsinghua University.