BEGIN:VCALENDAR
VERSION:2.0
PRODID:IEEE vTools.Events//EN
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:Asia/Kolkata
BEGIN:STANDARD
DTSTART:19451014T230000
TZOFFSETFROM:+0630
TZOFFSETTO:+0530
TZNAME:IST
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260204T183330Z
UID:97ACA14A-304B-41D5-B12B-2BB824937770
DTSTART;TZID=Asia/Kolkata:20240604T213000
DTEND;TZID=Asia/Kolkata:20240604T223000
DESCRIPTION:Video coding is a fundamental and ubiquitous technology in
  modern society. Generations of international video coding standards\,
  such as the widely deployed H.264/AVC and H.265/HEVC and the latest
  H.266/VVC\, provide essential means for enabling video conferencing\,
  video streaming\, video sharing\, e-commerce\, entertainment\, and many
  more video applications. These existing standards all rely on the
  fundamental theory of signal processing and information theory to
  encode generic video efficiently with favorable rate-distortion
  behavior. In recent years\, rapid advancement in deep learning and
  artificial intelligence technology has allowed people to manipulate
  images and videos using deep generative models. Among these\, of
  particular interest to the field of video coding is the application of
  deep generative models to compressing talking-face video at ultra-low
  bit rates. By focusing on talking faces\, generative models can
  effectively learn the inherent structure of the composition\,
  movement\, and posture of human faces and deliver promising results
  using very little bandwidth. At ultra-low bit rates\, when even the
  latest video coding standard H.266/VVC is apt to suffer from
  significant blocking artifacts and blurriness beyond the point of
  recognition\, generative methods can maintain clear facial features and
  vivid expressions in the reconstructed video. Further\, generative face
  video coding techniques are inherently capable of manipulating the
  reconstructed face and promise to deliver a more interactive
  experience. In this talk\, we start with a quick overview of
  traditional and deep learning-based video coding techniques. We then
  focus on face video coding with generative networks\, and present two
  schemes that send different deep information in the bitstream\, one
  sending compact temporal motion features and the other sending 3D
  facial semantics. We compare their compression efficiency and visual
  quality with those of the latest H.266/VVC standard\, and showcase the
  power of deep generative models in preserving vivid facial images with
  little bandwidth. We also present visualization results to exhibit the
  capability of the 3D facial semantics-based scheme in terms of
  interacting with the reconstructed face video and animating virtual
  faces.\n\nSpeaker(s): Dr. Yan Ye\n\nVirtual:
  https://events.vtools.ieee.org/m/422747
LOCATION:Virtual: https://events.vtools.ieee.org/m/422747
ORGANIZER:mailto:ieee.sps.sb.iitkgp@gmail.com
SEQUENCE:15
SUMMARY:IEEE SPS SBC Webinar: Face video compression with generative networ
 ks (By Dr. Yan Ye)
URL;VALUE=URI:https://events.vtools.ieee.org/m/422747
X-ALT-DESC;FMTTYPE=text/html:Description: <br /><p dir="ltr">Video coding
  is a fundamental and ubiquitous technology in modern society.
  Generations of international video coding standards\, such as the
  widely deployed H.264/AVC and H.265/HEVC and the latest H.266/VVC\,
  provide essential means for enabling video conferencing\, video
  streaming\, video sharing\, e-commerce\, entertainment\, and many more
  video applications. These existing standards all rely on the
  fundamental theory of signal processing and information theory to
  encode generic video efficiently with favorable rate-distortion
  behavior. In recent years\, rapid advancement in deep learning and
  artificial intelligence technology has allowed people to manipulate
  images and videos using deep generative models. Among these\, of
  particular interest to the field of video coding is the application of
  deep generative models to compressing talking-face video at ultra-low
  bit rates. By focusing on talking faces\, generative models can
  effectively learn the inherent structure of the composition\,
  movement\, and posture of human faces and deliver promising results
  using very little bandwidth. At ultra-low bit rates\, when even the
  latest video coding standard H.266/VVC is apt to suffer from
  significant blocking artifacts and blurriness beyond the point of
  recognition\, generative methods can maintain clear facial features and
  vivid expressions in the reconstructed video. Further\, generative face
  video coding techniques are inherently capable of manipulating the
  reconstructed face and promise to deliver a more interactive
  experience. In this talk\, we start with a quick overview of
  traditional and deep learning-based video coding techniques. We then
  focus on face video coding with generative networks\, and present two
  schemes that send different deep information in the bitstream\, one
  sending compact temporal motion features and the other sending 3D
  facial semantics. We compare their compression efficiency and visual
  quality with those of the latest H.266/VVC standard\, and showcase the
  power of deep generative models in preserving vivid facial images with
  little bandwidth. We also present visualization results to exhibit the
  capability of the 3D facial semantics-based scheme in terms of
  interacting with the reconstructed face video and animating virtual
  faces.</p>
END:VEVENT
END:VCALENDAR