SPS Oregon Chapter Seminar: Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

#sps #optimization #seminar #reinforcementlearning

Learning to rank is a key problem in information retrieval and machine learning, and a core part of modern search engines and recommender systems. Off-policy learning to rank aims to optimize a ranker from implicit user feedback (e.g., clicks) collected by a deployed logging policy. However, existing off-policy learning-to-rank methods often make strong assumptions about how users generate the click data, i.e., the click model, and hence must be tailored to each specific click model. In this talk, I will introduce how to unify the ranking process under stochastic click models as a Markov Decision Process, and then discuss our work on leveraging offline reinforcement learning methods for click model-agnostic off-policy learning to rank.



  Date and Time

  • Date: 30 Nov 2023
  • Time: 07:00 PM UTC to 07:59 PM UTC

  Registration

  • Starts 22 November 2023 05:56 PM UTC
  • Ends 30 November 2023 06:56 PM UTC
  • No Admission Charge


  Speakers

Huazheng Wang

Topic:

Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective


Biography:

Huazheng Wang is an assistant professor in the School of Electrical Engineering and Computer Science at Oregon State University. He was a postdoctoral research associate at Princeton University. He received his Ph.D. in Computer Science from the University of Virginia and his B.E. from the University of Science and Technology of China. His research interests include reinforcement learning and information retrieval. His recent focus is developing efficient and robust reinforcement learning and multi-armed bandit algorithms with applications to online recommendation and ranking systems. He co-organized tutorials at KDD 2020 and SIGIR 2021 on interactive information retrieval and exploration. He is a recipient of the SIGIR 2019 Best Paper Award.