SHORT WEBINAR SERIES ON SELECTED TOPICS IN SIGNAL PROCESSING
The Montreal Chapter of the IEEE Signal Processing Society, in collaboration with STARaCom, cordially invites you to attend the following talks as part of its series of Short Webinars on Selected Topics in Signal Processing.
Date and Time
- Date: 18 Feb 2021
- Time: 05:00 PM to 06:00 PM
- All times are (GMT-05:00) Canada/Eastern

Location
- 3480 Boul. Robert-Bourassa
- Montreal, Quebec
- Canada H3A 0E9

Hosts
- Prof. Benoit Champagne, ECE Department, McGill University, Montreal
- Co-sponsored by STARaCom

Registration
- Starts 03 February 2021 12:00 PM
- Ends 18 February 2021 06:00 PM
- No Admission Charge
Speakers
Hongjiang Yu
Speech Enhancement with Deep Neural Networks Augmented Kalman Filtering System
Traditional Kalman filter (KF) based speech enhancement has attracted great interest from researchers because of its capability to enhance time-domain, non-stationary speech signals. The performance of Kalman filtering depends largely on how accurately the parameters of the clean-speech autoregressive (AR) model are estimated. However, clean speech is not accessible in practice, and parameters estimated from the noisy observation are not accurate enough to achieve good enhancement results. In recent years, deep neural network (DNN) based signal processing has greatly advanced research in speech enhancement, owing to the powerful learning capability of DNNs. As such, we first propose a DNN-augmented KF for speech enhancement, in which a DNN is trained to learn the mapping between noisy acoustic features and the parameters of the clean-speech AR model. The DNN provides more accurate parameters for KF-based denoising, which leads to better enhancement performance. We then apply DNN-based parameter estimation to two other advanced KFs, the subband KF and the colored-noise KF, to further improve the performance of the DNN-augmented Kalman filtering system. Finally, we will present experimental results for the proposed system under different noise conditions.
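As background to the abstract, the sketch below shows a minimal single-channel Kalman filter in NumPy, driven by frame-wise AR parameters, which are the quantities the talk proposes to estimate with a DNN. The function `kalman_enhance`, its arguments, and the companion-form state-space model are illustrative assumptions, not the speaker's implementation.

```python
import numpy as np

def kalman_enhance(noisy, a, q, r):
    """Kalman-filter one noisy speech frame (1-D float array).

    a : AR coefficients of the clean-speech model (in the proposed
        system these would come from a trained DNN); q : excitation
    variance; r : additive-noise variance. Hypothetical helper for
    illustration only.
    """
    p = len(a)
    # Companion-form transition: the state holds the last p speech samples,
    # and the newest sample follows the AR recursion s_k = sum_i a_i s_{k-i} + e_k.
    F = np.zeros((p, p))
    F[0, :] = a
    F[1:, :-1] = np.eye(p - 1)
    H = np.zeros((1, p)); H[0, 0] = 1.0   # we observe only the newest sample
    Q = np.zeros((p, p)); Q[0, 0] = q     # process noise enters via the excitation
    x = np.zeros((p, 1))
    P = np.eye(p)
    out = np.empty_like(noisy)
    for k, y in enumerate(noisy):
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the scalar noisy observation y_k = s_k + v_k
        S = (H @ P @ H.T)[0, 0] + r
        K = (P @ H.T) / S
        x = x + K * (y - (H @ x)[0, 0])
        P = (np.eye(p) - K @ H) @ P
        out[k] = x[0, 0]
    return out

# Toy usage with hypothetical AR(2) parameters for a single frame:
rng = np.random.default_rng(0)
clean = rng.standard_normal(160)              # stand-in for one short frame
noisy = clean + 0.3 * rng.standard_normal(160)
enhanced = kalman_enhance(noisy, a=np.array([0.7, -0.2]), q=1.0, r=0.09)
```

In the proposed system, `a` and `q` for each frame would be predicted by the trained DNN from noisy acoustic features; a conventional baseline would instead estimate them by linear prediction on the noisy frame itself, which is exactly the inaccuracy the DNN is meant to correct.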
Biography:
Hongjiang Yu received the M.Sc. degree from the National Engineering Research Center for Multimedia Software, Wuhan University, China, in 2016. He is currently pursuing the Ph.D. degree at Concordia University, Montreal, QC, Canada. His research interests include speech processing and deep learning.
Address: Department of Electrical and Computer Engineering, Concordia University, Montreal, Quebec, Canada
Loren Lugosch
Transducer Models for Speech Recognition
The Transducer is a neural network model suited to sequence transduction tasks in which there is a monotonic alignment between the input and output sequences, such as speech recognition. The model was introduced in 2012 but has mostly been neglected since then, possibly because of its large memory requirements during training. Recently, however, the Transducer has been found to perform excellently when implemented using modern neural network architectures, and it now holds the state of the art on the LibriSpeech speech recognition benchmark. In this talk, I'll introduce the Transducer in detail, describe its advantages over other end-to-end models for speech recognition (such as CTC models and attention models), and discuss some current work on reducing its training memory consumption.
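To make the memory point concrete, here is a minimal sketch of the Transducer loss using `torchaudio.functional.rnnt_loss` (available in torchaudio 0.10+), applied to random stand-in joiner outputs. The dimensions and the random logits are assumptions for illustration, not a model from the talk.

```python
import torch
import torchaudio

# Toy dimensions (assumed for illustration).
B, T, U, V = 2, 50, 10, 30   # batch, input frames, target length, vocab size (incl. blank)
blank = 0

# Joiner output: one logit vector per (input frame, output step) pair.
# A real model would combine encoder and prediction-network states here;
# random logits are enough to exercise the loss and its gradients.
logits = torch.randn(B, T, U + 1, V, requires_grad=True)
targets = torch.randint(1, V, (B, U), dtype=torch.int32)       # labels, excluding blank
logit_lengths = torch.full((B,), T, dtype=torch.int32)
target_lengths = torch.full((B,), U, dtype=torch.int32)

loss = torchaudio.functional.rnnt_loss(
    logits, targets, logit_lengths, target_lengths, blank=blank
)
loss.backward()
print(loss.item())
```

The four-dimensional logits tensor of shape (batch, T, U + 1, vocab) is the source of the training memory cost the talk addresses: it grows with the product of the input and output lengths, unlike the (batch, T, vocab) activations of a CTC model.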
Biography:
Loren Lugosch (https://lorenlugosch.github.io/) is a PhD student at McGill University and Mila working on sequence modeling and conditional computation in neural networks. He has a B.Eng. in computer engineering and linguistics and an M.Eng. in electrical engineering, both from McGill University. From July 2017 to January 2019, he worked on speech recognition as a research engineer at Fluent.ai.
Address: Department of Electrical and Computer Engineering, McGill University, Montreal, Quebec, Canada