Generative audio and its applications
Join Dr. Ivan Tashev, Partner Software Architect at Microsoft Research, for an insightful session on "Generative Audio and Its Applications." Explore the cutting-edge role of audio in generative AI, from enhancing emotional connections to enabling multimodal experiences. Dr. Tashev will delve into innovative AI systems that generate and synchronize audio across different modalities, with real-world applications in fields like accessibility, entertainment, and beyond. This talk will feature key research from the Audio and Acoustics Research Group at Microsoft Research, shedding light on how generative audio is transforming the way we interact with technology.
Topic: Generative audio and its applications
Description:
Audio, including sound and music, has the power to foster emotional connections and promote social bonding. Although it is a vital human sense that complements vision, it remains relatively underexplored in generative AI research. By positioning audio as a key social-emotional layer of AI, we underscore its transformative potential in building more inspiring and context-aware systems.
This talk presents an overview of key approaches within the broader generative AI landscape, with a focus on the role of audio. Audio-language models can generate captions, labels, or free-form text from audio signals, enabling applications such as question answering. Moreover, generating audio from prior audio inputs or from other modalities—such as text, images, or video—opens the door to compelling multimodal models and experiences. For instance, AI systems can synchronize audio and video streams or produce coordinated audio-visual outputs.
The talk will be illustrated with research projects from the Audio and Acoustics Research Group at Microsoft Research.
Biography:
Dr. Ivan Tashev is a Partner Software Architect in Microsoft Research (MSR), Redmond, WA, USA, where he leads the Audio and Acoustics Research Group. His interests include multichannel signal processing, and machine learning and artificial intelligence for signal processing. He also coordinates the Brain-Computer Interfaces project in MSR. Dr. Tashev has published two books, two book chapters, and 100+ scientific papers, and is listed as an inventor on 50 US patents. He is an affiliate professor in the Department of Electrical and Computer Engineering at the University of Washington in Seattle, USA, and an honorary professor at the Technical University of Sofia, Bulgaria. Technologies created by Dr. Tashev are incorporated in many Microsoft products; he served as the audio architect for Kinect and for HoloLens. He is an IEEE Fellow and a member of the AES and ASA. More details about him can be found on his web page https://www.microsoft.com/en-
Location
- 901 12th Ave
- Seattle, Washington
- United States 98109-5210
- Building: Bannan
- Room Number: 629
Agenda
5:00 PM - 5:30 PM | Networking and light snacks
5:30 PM - 6:30 PM | Guest talk by distinguished speaker Dr. Ivan Tashev
6:30 PM - 7:00 PM | Networking and wrap-up