IEEE Tech Talk - From Speech to Summary: Speech Pipelines in Microsoft Copilot

#tech #talk #Microsoft #AI

Microsoft Copilot can automatically summarize meeting discussions and generate action items from a meeting. Understanding who is speaking, what they are saying, and when is crucial to making this work. This talk explores the processing of speech signals, focusing on speech separation, enhancement, automatic speech recognition (ASR), and speaker diarization. We will start with techniques for separating and enhancing speech to ensure clarity and quality. The discussion will then shift to multilingual ASR, addressing the complexities of accurately transcribing speech across languages. We will conclude with speaker diarization, which identifies and attributes speech segments to individual speakers, answering the critical questions of who is speaking and when.
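To make the flow of these stages concrete, below is a minimal, self-contained Python sketch of the pipeline described above. Every function here is a simplified placeholder invented for illustration (real systems use neural separation and enhancement models, multilingual ASR engines, and embedding-based diarization); it does not reflect Copilot's actual implementation.

```python
import numpy as np

# Illustrative pipeline sketch: separation -> enhancement -> ASR -> diarization.
# All stages are simplified placeholders, not production components.

def separate_speakers(mixture: np.ndarray, num_speakers: int = 2) -> list[np.ndarray]:
    """Speech separation: split a mixed signal into one stream per speaker."""
    # Placeholder: a real separator estimates per-speaker masks or filters.
    return [mixture / num_speakers for _ in range(num_speakers)]

def enhance(stream: np.ndarray) -> np.ndarray:
    """Speech enhancement: suppress noise and reverberation."""
    # Placeholder: amplitude clipping stands in for a denoising model.
    return np.clip(stream, -1.0, 1.0)

def recognize(stream: np.ndarray) -> str:
    """Multilingual ASR: transcribe speech, detecting the spoken language."""
    # Placeholder transcript.
    return "<transcript>"

def diarize(streams: list[np.ndarray], sample_rate: int = 16000) -> list[tuple[int, float, float]]:
    """Speaker diarization: attribute time segments to speakers (who spoke when)."""
    # Placeholder: each separated stream is treated as one speaker over its full duration.
    return [(spk, 0.0, len(s) / sample_rate) for spk, s in enumerate(streams)]

if __name__ == "__main__":
    mixture = np.random.randn(16000 * 5).astype(np.float32)  # 5 s of synthetic "audio"
    streams = separate_speakers(mixture)
    clean = [enhance(s) for s in streams]
    transcripts = [recognize(s) for s in clean]
    segments = diarize(clean)
    for (spk, start, end), text in zip(segments, transcripts):
        print(f"Speaker {spk}: {start:.1f}-{end:.1f} s -> {text}")
```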

 

Please register for the event using the link provided below. No CEUs or PDHs will be offered for this event.



  Date and Time

  • Date: 31 Oct 2024
  • Time: 12:30 AM UTC to 01:50 AM UTC

  Location

  • Address: 14820 NE 36th St, Redmond, Washington 98052, United States
  • Building: Microsoft Building #99
  • Room Number: Room #1505

  Hosts

  • Contact Event Hosts

  Registration

  • Starts 25 October 2024 07:00 AM UTC
  • Ends 31 October 2024 12:00 AM UTC
  • 0 in-person spaces left!
  • No Admission Charge


  Speakers

Dr. Sunit Sivasankaran

Topic:

From Speech to Summary: Speech Pipelines in Microsoft Copilot

Biography:

Sunit Sivasankaran is a Senior Applied Scientist at Microsoft, based in Redmond, US. He holds a PhD from INRIA-Nancy and Université de Lorraine, France, where he specialized in speech separation. Sunit has extensive experience in multichannel speech enhancement, automatic speech recognition (ASR), and speaker diarization, having previously worked at Microsoft Research, Inria, and Samsung Research Institute. His academic background includes a Master of Science by Research from the Indian Institute of Technology, Madras, and a Bachelor of Engineering from Rashtreeya Vidyalaya College of Engineering, Bangalore. He has been an active member of the IEEE community, serving as a reviewer for several conferences and journals.





Agenda

5:30 pm to 6:00 pm – Networking & Dinner

6:00 pm to 6:50 pm – Tech Talk