BEGIN:VCALENDAR
VERSION:2.0
PRODID:IEEE vTools.Events//EN
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:Canada/Pacific
BEGIN:DAYLIGHT
DTSTART:20210314T030000
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
RRULE:FREQ=YEARLY;BYDAY=2SU;BYMONTH=3
TZNAME:PDT
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20211107T010000
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=11
TZNAME:PST
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210628T214540Z
UID:2D98CC2C-B662-42FD-A48D-C295679448D8
DTSTART;TZID=Canada/Pacific:20210628T120000
DTEND;TZID=Canada/Pacific:20210628T130000
DESCRIPTION:In this work\, we compress convolutional neural network (CNN) w
 eights post-training via transform quantization. Previous CNN quantization
  techniques tend to ignore the joint statistics of weights and activations
 \, producing sub-optimal CNN performance at a given quantization bit-rate\
 , or consider their joint statistics during training only and do not facil
 itate efficient compression of already trained CNN models. We optimally tr
 ansform (decorrelate) and quantize the weights post-training using a rate-
 distortion framework to improve compression at any given quantization bit-
 rate. Transform quantization unifies quantization and dimensionality reduc
 tion (decorrelation) techniques in a single framework to facilitate low bi
 t-rate compression of CNNs and efficient inference in the transform domain
 . We first introduce a theory of rate and distortion for CNN quantization\
 , and pose optimum quantization as a rate-distortion optimization problem.
  We then show that this problem can be solved using optimal bit-depth allo
 cation following decorrelation by the optimal End-to-end Learned Transform
  (ELT) we derive in this paper. Experiments demonstrate that transform qua
 ntization advances the state of the art in CNN compression in both retrain
 ed and non-retrained quantization scenarios. In particular\, we find that 
 transform quantization with retraining is able to compress CNN models such
  as AlexNet\, ResNet and DenseNet to very low bit-rates (1-2 bits).\n\nThi
 s talk is based on joint published work with Zhe Wang\, David Taubman and 
 Bernd Girod. Preprint is available at https://arxiv.org/abs/2009.01174.\n\
 nSpeaker(s): Dr. Sean I. Young\n\nVirtual: https://events.vtools.ieee.o
 rg/m/274143
LOCATION:Virtual: https://events.vtools.ieee.org/m/274143
ORGANIZER:mailto:ivan_bajic@ieee.org
SEQUENCE:1
SUMMARY:Transform Quantization for CNN Compression
URL;VALUE=URI:https://events.vtools.ieee.org/m/274143
X-ALT-DESC;FMTTYPE=text/html:Description: <br /><p>In this work\, we co
 mpress convolutional neural network (CNN) weights post-training via tra
 nsform quantization. Previous CNN quantization techniques tend to ignor
 e the joint statistics of weights and activations\, producing sub-optim
 al CNN performance at a given quantization bit-rate\, or consider their
  joint statistics during training only and do not facilitate efficient 
 compression of already trained CNN models. We optimally transform (deco
 rrelate) and quantize the weights post-training using a rate-distortion
  framework to improve compression at any given quantization bit-rate. T
 ransform quantization unifies quantization and dimensionality reduction
  (decorrelation) techniques in a single framework to facilitate low bit
 -rate compression of CNNs and efficient inference in the transform doma
 in. We first introduce a theory of rate and distortion for CNN quantiza
 tion\, and pose optimum quantization as a rate-distortion optimization 
 problem. We then show that this problem can be solved using optimal bit
 -depth allocation following decorrelation by the optimal End-to-end Lea
 rned Transform (ELT) we derive in this paper. Experiments demonstrate t
 hat transform quantization advances the state of the art in CNN compres
 sion in both retrained and non-retrained quantization scenarios. In par
 ticular\, we find that transform quantization with retraining is able t
 o compress CNN models such as AlexNet\, ResNet and DenseNet to very low
  bit-rates (1-2 bits).&nbsp\;</p>\n<p>This talk is based on joint publi
 shed work with Zhe Wang\, David Taubman and Bernd Girod. Preprint is av
 ailable at&nbsp\;<a href="https://arxiv.org/abs/2009.01174">https://arx
 iv.org/abs/2009.01174</a>.</p>\n<p>&nbsp\;</p>
END:VEVENT
END:VCALENDAR