Audio Analytics – What We Can Get from Speech Beyond Speech Recognition, and is There Anything Useful in the Non-Speech Audio

#talk #signal #processing #audio #Speech #Recognition

Around half of the information humans exchange during interaction is not the meaning of speech itself. The speech audio signal carries information about the age, gender, and emotion of the speaker. In this talk, we will discuss the information that can be obtained from the speech signal, along with potential approaches and applications. We will then extend the scope to extracting information from non-speech audio: audio event detection and audio background recognition. Finally, we will discuss the technologies and algorithms for solving these problems: neural networks with supervised and unsupervised training, and the most commonly used features and cost functions.



  Date and Time




  • Date: 24 Aug 2023
  • Time: 10:00 AM to 11:00 AM
  • All times are (UTC+05:30) Chennai


  Speakers

Dr. Ivan Tashev, Partner Software Architect at Microsoft Research (MSR)

Topic:

Audio Analytics – What We Can Get from Speech Beyond Speech Recognition, and is There Anything Useful in the Non-Speech Audio

Biography:

Dr. Ivan Tashev is a Partner Software Architect at Microsoft Research (MSR), Redmond, WA, USA, where he leads the Audio and Acoustics Group. His interests include multichannel signal processing using machine learning and AI approaches. He also coordinates the Brain-Computer Interfaces project at MSR. Dr. Tashev is an affiliate professor in the Department of Electrical and Computer Engineering at the University of Washington in Seattle, USA, and an honorary professor at the Technical University of Sofia, Bulgaria. Technologies he created are incorporated in many Microsoft products; he served as the audio architect for Kinect and for HoloLens. He is an IEEE Fellow and a member of AES and ASA. More details can be found on his web page: https://www.microsoft.com/en-us/research/people/ivantash/

Address: Redmond, WA, USA