Superchapter: Joint Chapter of Communications, Information Theory, and Signal Processing Societies

 


A Joint Attention Decoding and Adaptive Beamforming Optimization Approach for the Cocktail Party Problem

 

The cocktail party problem remains one of the most difficult problems for hearing devices, even after decades of extensive research. One of the key challenges is to determine the desired talker in a cocktail party. Recently, researchers have successfully demonstrated the decoding of auditory attention using EEG, MEG or EMG. In addition, several research studies have attempted to incorporate the decoded auditory attention information into speech enhancement solutions. However, the existing solutions are suboptimal in the sense that auditory attention decoding is often performed separately from speech enhancement. In this talk, we propose a joint auditory attention decoding and multi-channel speech enhancement approach. The proposed approach eliminates the need to extract the speech envelope of each talker, which is itself a difficult problem in practice. Furthermore, the proposed solution is optimal in the sense that the attended talker’s speech is optimized using both microphone inputs and EEG inputs in a unified framework. We present preliminary results to demonstrate the effectiveness of the algorithm and discuss future research directions.



  Location

  • 1280 Main Street West
  • Hamilton, Ontario
  • Canada L8S 4K1
  • Building: ITB
  • Room Number: A113

  Contact

  • Dr. Jun Chen, Chair, Hamilton Superchapter



  Speakers

Dr. Tao Zhang of Starkey Laboratories, Inc.

Topic:

A Joint Attention Decoding and Adaptive Beamforming Optimization Approach for the Cocktail Party Problem


Biography:

Tao Zhang received his B.S. degree in physics from Nanjing University, Nanjing, China in 1986, his M.S. degree in electrical engineering from Peking University, Beijing, China in 1989, and his Ph.D. degree in speech and hearing science from The Ohio State University, Columbus, OH, USA in 1995. He joined the Advanced Research Department at Starkey Laboratories, Inc. as a Sr. Research Scientist in 2001, then managed the DSP department from 2004 to 2008 and the Signal Processing Research Department from 2008 to 2019. Since 2019, he has been Director of the Algorithms Department at Starkey Hearing Technologies, a global leader in providing innovative hearing technologies. He has received many prestigious awards at Starkey, including the Inventor of the Year Award, the Mount Rainier Best Research Team Award, the Most Valuable Idea Award, the Outstanding Technical Leadership Award and the Engineering Service Award.

He is a senior member of IEEE, the Signal Processing Society, and the Engineering in Medicine and Biology Society. He serves on the IEEE AASP Technical Committee, the industrial relationship committee, and the IEEE ComSoc North America Region Board. He is an IEEE SPS Distinguished Industry Speaker and the Chair of the IEEE Twin Cities Signal Processing and Communication Chapter.

His current research interests include audio, acoustic, and speech signal processing and machine learning; multimodal signal processing and machine learning for hearing enhancement and for health and wellness monitoring; psychoacoustics and room and ear canal acoustics; and ultra-low-power real-time embedded system design and device-phone-cloud ecosystem design. He has authored or coauthored 130+ presentations and publications, has received 23 approved patents, and has an additional 30+ patents pending.
