Speech Recognition: What's Left?


Presented by the University of Pennsylvania and the IEEE Philadelphia Chapter of the Signal Processing Society





  Date and Time

  • Date: 30 Apr 2018
  • Time: 01:30 PM to 03:00 PM
  • All times are (GMT-05:00) US/Eastern

  Location

  • University of Pennsylvania
  • 3330 Walnut St.
  • Philadelphia, Pennsylvania
  • United States 19104
  • Building: Levine Hall
  • Room Number: 307

  Registration

  • Starts 05 April 2018 12:00 PM
  • Ends 30 April 2018 12:00 PM
  • All times are (GMT-05:00) US/Eastern
  • No Admission Charge


  Speakers

Dr. Michael Picheny

Topic:

Speech Recognition: What's Left?

Recent results on the SWITCHBOARD corpus suggest that, thanks to advances in Deep Learning, speech recognition systems now achieve Word Error Rates comparable to those of human listeners. Does this mean the speech recognition problem is solved and the community can move on to a different set of problems? In this talk, we examine speech recognition issues that still plague the community and compare and contrast them with what is known about human perception. We specifically highlight issues in accented speech, noisy/reverberant speech, speaking style, rapid adaptation to new domains, and multilingual speech recognition. We aim to demonstrate that, compared to human perception, there is still much room for improvement, so significant speech recognition research is still required from the community.
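The human-parity claims the talk questions are stated in terms of Word Error Rate (WER): the word-level edit distance (substitutions, insertions, and deletions) between a hypothesis transcript and the reference, divided by the number of reference words. A minimal sketch of the metric, for readers unfamiliar with it (the function name and example sentences below are illustrative, not from the talk):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("the" -> "a") over six reference words: WER = 1/6
print(wer("the cat sat on the mat", "the cat sat on a mat"))
```

Reported human WER on SWITCHBOARD is in the low single digits of percent, which is why matching it is treated as a milestone; the talk's point is that this single number hides conditions (accent, noise, domain shift) where machines still trail humans badly.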

Biography:

Michael Picheny (Fellow, IEEE) is the Senior Manager of the Speech Technologies Group in IBM Research AI, based at the IBM TJ Watson Research Center. Michael has worked in speech recognition since 1981, joining IBM after finishing his doctorate at MIT. He has been heavily involved in the development of almost all of IBM's recognition systems, ranging from the world's first real-time large-vocabulary discrete system through IBM's current product lines for telephony and embedded systems. He has published numerous journal and conference papers on almost all aspects of speech recognition.

He has received several awards from IBM for his work, including three Outstanding Technical Achievement Awards, two Research Division Awards, and most recently a Corporate Award. He is the co-holder of over 40 patents and was named a Master Inventor by IBM in 1995 and again in 2000.

Michael served as an Associate Editor of the IEEE Transactions on Acoustics, Speech, and Signal Processing from 1986-1989, was the chairman of the Speech Technical Committee of the IEEE Signal Processing Society from 2002-2004, and is a Fellow of the IEEE. He served as an Adjunct Professor in the Electrical Engineering Department of Columbia University in the spring of 2016, co-teaching a course in speech recognition. He is a Fellow of ISCA (International Speech Communication Association) and was a member of the ISCA board from 2005-2013. He was a co-organizer of the 2011 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).





Agenda

Lecture Starts: 1:00 PM