IEEE SPS DISTINGUISHED INDUSTRY SPEAKER PROGRAM TWIN CITIES SP/COM CHAPTER SEMINAR 10/15/2025
Lecture by Dr. Tomohiro Nakatani, Senior Distinguished Researcher at NTT Communication Science Laboratories, NTT, Inc., Japan, an IEEE Distinguished Industry Speaker, on the topic of "Convolutional Beamformer for Joint Denoising, Dereverberation, and Source Separation" to be held at WFA Education Center, Starkey (Eden Prairie).
Title: Convolutional Beamformer for Joint Denoising, Dereverberation, and Source Separation
Abstract:
When speech is captured by distant microphones in everyday environments, the signals are often contaminated by background noise, reverberation, and overlapping voices. The convolutional beamformer (CBF) is a signal processing technique that recovers clean, close-microphone-quality speech from such complex mixtures. By jointly performing denoising, dereverberation, and source separation, CBF enhances both human listening experiences and automatic speech recognition (ASR) accuracy. Potential applications include hearing assistive devices, meeting transcription systems, and other real-world speech technologies.
This talk begins by introducing the concept of CBF, including its formal definition, mechanism for joint enhancement, and optimization via maximum likelihood estimation. CBF is defined as a series of beamformers estimated at each frequency in the short-time Fourier transform (STFT) domain and convolved with the observed signal to achieve the desired enhancement. The presentation then describes how CBF can be factorized into Multichannel Linear Prediction (MCLP) for dereverberation and Beamforming (BF) for denoising and separation, highlighting the practical advantages of this decomposition. Related work is reviewed, including Weighted Prediction Error (WPE) dereverberation, mask-based beamforming, and guided source separation, with emphasis on strong results in challenging tasks such as the CHiME-8 distant ASR challenge.
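To make the definition concrete, the following is a minimal NumPy sketch of applying a convolutional beamformer in the STFT domain: at each frequency, a set of per-delay beamformers is convolved with the multichannel observation along the time axis. The shapes, the toy filter, and the function name are illustrative assumptions; the maximum-likelihood filter estimation discussed in the talk is not shown here.

```python
import numpy as np

def convolutional_beamformer(X, W):
    """Apply a convolutional beamformer in the STFT domain.

    X : (T, F, M) complex array -- observed multichannel STFT
        (T time frames, F frequency bins, M microphones).
    W : (L, F, M) complex array -- per-frequency filter taps
        (L convolutional taps; tap 0 acts on the current frame).
    Returns a (T, F) complex array: the enhanced single-channel STFT.
    """
    T, F, M = X.shape
    L = W.shape[0]
    Y = np.zeros((T, F), dtype=complex)
    for t in range(T):
        for d in range(min(L, t + 1)):
            # Inner product of the tap-d beamformer with frame t-d,
            # accumulated over delays: a convolution along time.
            Y[t] += np.einsum('fm,fm->f', np.conj(W[d]), X[t - d])
    return Y

# Toy usage with random data and a hand-set (not estimated) filter.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 4, 3)) + 1j * rng.standard_normal((10, 4, 3))
W = np.zeros((2, 4, 3), dtype=complex)
W[0, :, 0] = 1.0  # tap 0 simply passes microphone 0 through
Y = convolutional_beamformer(X, W)
print(Y.shape)  # (10, 4)
```

With this trivial filter the output reduces to the first microphone's STFT; the point of CBF is that jointly estimated taps instead cancel reverberation, noise, and interfering sources.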
Further extensions are presented, including blind CBF for unknown recording conditions, switching CBF for improved performance with a limited number of microphones, and integration with neural networks, notably the DiffCBF framework, which combines CBF with diffusion-based speech enhancement models. Experimental results demonstrate state-of-the-art speech quality, even with relatively few microphones and limited training data.
Location
- 6425 Flying Cloud Dr
- Eden Prairie, Minnesota
- United States 55344
- Building: William F Austin Center
- Room Number: Excelsior Room
- Co-sponsored by Starkey
Speakers
Dr. Tomohiro Nakatani of NTT Communication Science Laboratories, NTT, Inc., Japan
Biography:
Tomohiro Nakatani received the B.E., M.E., and Ph.D. degrees from Kyoto University, Kyoto, Japan, in 1989, 1991, and 2002, respectively.
He is currently a Senior Distinguished Researcher at NTT Communication Science Laboratories, NTT, Inc., Japan. In 2005, he was a Visiting Scholar at the Georgia Institute of Technology, USA, and from 2008 to 2017, he served as a Visiting Associate Professor in the Department of Media Science at Nagoya University, Japan. Since joining NTT as a Researcher in 1991, he has focused on developing audio signal processing technologies for intelligent human–machine interfaces, including dereverberation, denoising, source separation, and robust automatic speech recognition (ASR).
Dr. Nakatani served as an Associate Editor for the IEEE Transactions on Audio, Speech, and Language Processing from 2008 to 2010. He was a member of the IEEE SPS Audio and Acoustic Signal Processing Technical Committee from 2009 to 2014, the IEEE SPS Speech and Language Processing Technical Committee from 2016 to 2021, and the IEEE SPS Fellow Evaluating Committee in 2024 and 2025. He has been serving as an IEEE SPS Distinguished Industry Speaker since 2025. He was Co-Chair of the 2014 REVERB Challenge Workshop and General Co-Chair of IEEE ASRU 2017. His accolades include the 2005 IEICE Best Paper Award, the 2009 ASJ Technical Development Award, the 2012 Japan Audio Society Award, an Honorable Mention for the 2015 IEEE ASRU Best Paper Award, the 2017 Maejima Hisoka Award, and the 2018 IWAENC Best Paper Award. He has been an IEEE Fellow since 2021 and an IEICE Fellow since 2022.
Agenda
9:30 – 10:00 a.m. Meet and Greet
10:00 – 10:05 a.m. Welcome Remarks by Dr. Masahiro Sunohara
10:05 – 11:00 a.m. Convolutional Beamformer for Joint Denoising, Dereverberation, and Source Separation by Dr. Tomohiro Nakatani
Please register for in-person attendance by emailing your name and affiliation to Masahiro Sunohara at masahiro_sunohara@starkey.com. Online attendance does not require registration; a meeting link will be provided soon.