Recent Advancements in Image Generation and Understanding
- 2nd lecture of the 2026 Invited Seminar Series (Virtual) organized by IEEE Computer Society San Diego Chapter. Previous lectures: 2023, 2024, and 2025 invited seminar series.
- Co-sponsored by Media Partner: Open Research Institute (ORI)
Speakers
Aniket Roy of NEC Labs America
Recent Advancements in Image Generation and Understanding
In this talk, Dr. Roy will provide an overview of his research and then take a closer look at three recent works. Image generation has progressed rapidly in the past decade, evolving from Gaussian Mixture Models (GMMs) to Variational Autoencoders (VAEs), GANs, and more recently diffusion models, which have set new standards for quality. Dr. Roy will begin with DiffNat (TMLR’25), which draws inspiration from a simple yet powerful observation: the kurtosis concentration property of natural images. By incorporating a kurtosis concentration loss together with a perceptual guidance strategy, DiffNat can be plugged directly into existing diffusion pipelines, leading to sharper and more faithful generations across tasks such as personalization, super-resolution, and unconditional synthesis. Continuing the theme of improving quality under constraints, Dr. Roy will then discuss DuoLoRA (ICCV’25), which tackles the challenge of content–style personalization from just a few examples. DuoLoRA introduces adaptive-rank LoRA merging with cycle-consistency, allowing the model to better disentangle style from content. This not only improves personalization quality but also achieves it with 19× fewer trainable parameters, making it far more efficient than conventional merging strategies. Finally, Dr. Roy will turn to Cap2Aug (WACV’25), which directly addresses data scarcity. This approach uses captions as a bridge for semantic augmentation, applying cross-modal backtranslation (image → text → image) to generate diverse synthetic samples. By aligning real and synthetic distributions, Cap2Aug boosts both few-shot and long-tail classification performance on multiple benchmarks.
Biography:
Aniket is currently a PhD student in the Computer Science department at Johns Hopkins University under the guidance of Bloomberg Distinguished Professor Rama Chellappa. Prior to that, he earned a Master’s degree from the Indian Institute of Technology Kharagpur. He was recognized with the Best Paper Award at IWDW 2016 and the Markose Thomas Memorial Award for the best research paper at the Master’s level. During his PhD, he has explored few-shot learning, multimodal learning, diffusion models, LLMs, and LoRA merging, with publications in leading venues such as NeurIPS, ICCV, TMLR, WACV, and CVPR, and he has filed three US patents. He also gained industrial experience through multiple internships at Amazon, Qualcomm, MERL, and SRI International. He was named an Amazon Fellow (2023-24) at JHU and was selected to participate in the ICCV'25 and AAAI'26 doctoral consortia.
Agenda
- Invited talk by Dr. Aniket Roy of NEC Labs America
- Q/A Session