BEGIN:VCALENDAR
VERSION:2.0
PRODID:IEEE vTools.Events//EN
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
DTSTART:20230312T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RRULE:FREQ=YEARLY;BYDAY=2SU;BYMONTH=3
TZNAME:EDT
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=11
TZNAME:EST
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240311T160605Z
UID:33532B43-AD2A-4559-B2EE-99D5B790644A
DTSTART;TZID=America/New_York:20231102T110000
DTEND;TZID=America/New_York:20231102T120000
DESCRIPTION:Q-learning\, which seeks to learn the optimal Q-function of a M
 arkov decision process (MDP) in a model-free fashion\, lies at the heart o
 f reinforcement learning practices. However\, theoretical understanding of
  its non-asymptotic sample complexity remains unsatisfactory\, despite si
 gnificant recent efforts. In this talk\, we first show a tight sample com
 plexity bound of Q-learning in the single-agent setting\, together with a
  matching lower bound to establish its minimax sub-optimality. We then sh
 ow how federated versions of Q-learning allow collaborative learning usin
 g data collected by multiple agents without central sharing\, where an im
 portance averaging scheme is introduced to unveil the blessing of heterog
 eneity.\n\nSpeaker(s): Dr. Yuejie Chi\n\nBldg: Department of Electrical a
 nd Computer Engineering\, PH0339\, Boston University\, 8 St. Mary’s Stre
 et\, Boston\, Massachusetts\, United States\, 02215
LOCATION:Bldg: Department of Electrical and Computer Engineering\, PH0339\,
  Boston University\, 8 St. Mary’s Street\, Boston\, Massachusetts\, Unite
 d States\, 02215
ORGANIZER:mailto:karen@ece.tufts.edu
SEQUENCE:9
SUMMARY:Sample Complexity of Q-learning: from Single-agent to Federated Lea
 rning
URL;VALUE=URI:https://events.vtools.ieee.org/m/380556
X-ALT-DESC;FMTTYPE=text/html:Description: <br /><p>Q-learning\, which seek
 s to learn the optimal Q-function of a Markov decision process (MDP) in a
  model-free fashion\, lies at the heart of reinforcement learning practic
 es.&nbsp\;However\, theoretical understanding of its non-asymptotic sampl
 e complexity remains unsatisfactory\, despite significant recent efforts.
  In this talk\, we first show a tight sample complexity bound of Q-learni
 ng in the single-agent setting\, together with a matching lower bound to
  establish its minimax sub-optimality. We then show how federated version
 s of Q-learning allow collaborative learning using data collected by mult
 iple agents without central sharing\, where an importance averaging schem
 e is introduced to unveil the blessing of heterogeneity.</p>
END:VEVENT
END:VCALENDAR