Distributional Policy Evaluation and Reachability

  • Speaker: Yulong Gao, Imperial College London
  • Time: Thursday 18 January 2024, 14:00-15:00
  • Venue: 909B Seminar Room

If you have a question about this talk, please contact Giordano Scarciotti.

Abstract: In this talk, we study system performance and behaviors from a distributional perspective. In the first part, we focus on characterizing the random return in distributional LQR. This return accumulates the discounted quadratic cost over an infinite horizon under random exogenous disturbances. We provide a closed-form expression for the random return and show that it is the fixed-point solution to the distributional Bellman equation. We analyse the fundamental properties of the new characterization and extend the results to partially observable linear systems. In the second part, we study distributional reachability for finite Markov decision processes (MDPs). Unlike standard probabilistic reachability notions, which are defined over MDP states or trajectories, we formulate reachability over the space of probability distributions. We propose two set-valued maps for the forward and backward distributional reachability problems and design an efficient and scalable sampling-based computation algorithm. These results provide an alternative way to interpret the dynamic behaviors of MDPs and to resolve some important control problems for them.
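For readers new to the two objects named in the abstract, the following is a minimal illustrative sketch and not the speaker's method. It assumes a scalar linear system with a fixed linear feedback gain and Gaussian disturbances, and draws truncated Monte Carlo samples of the discounted quadratic return (the talk instead gives a closed-form expression). It also pushes a state distribution one step forward through a finite Markov chain, i.e. an MDP under one fixed policy, which is only the simplest building block of reachability over distributions and not the set-valued maps of the talk. All function names, parameters, and numerical values are hypothetical.

```python
import random


def sample_return(a, b, k, q, r, gamma, x0, sigma, horizon, rng):
    """Draw one truncated sample of the discounted quadratic return.

    Assumed (hypothetical) scalar model: x_{t+1} = a*x_t + b*u_t + w_t,
    with feedback u_t = -k*x_t and disturbance w_t ~ N(0, sigma^2).
    The infinite-horizon sum is truncated after `horizon` steps.
    """
    x, total, discount = x0, 0.0, 1.0
    for _ in range(horizon):
        u = -k * x
        total += discount * (q * x * x + r * u * u)  # discounted stage cost
        x = a * x + b * u + rng.gauss(0.0, sigma)    # disturbed dynamics
        discount *= gamma
    return total


def push_forward(mu, P):
    """One step of the distribution dynamics mu' = mu P.

    P[i][j] is the probability of moving from state i to state j under a
    fixed policy; mu is a probability vector over the states.
    """
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]


# Empirical samples of the random return (illustrative parameter values).
rng = random.Random(0)
samples = [
    sample_return(0.9, 1.0, 0.5, 1.0, 0.1, 0.95, 1.0, 0.1, 200, rng)
    for _ in range(1000)
]
mean_return = sum(samples) / len(samples)

# Forward step of a state distribution in a two-state chain.
P = [[0.5, 0.5], [0.0, 1.0]]
mu1 = push_forward([1.0, 0.0], P)
```

The list `samples` approximates the distribution of the return rather than only its mean, which is the quantity a distributional treatment is concerned with. Iterating `push_forward` traces how a whole state distribution evolves under that single policy.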

Biography: Dr. Yulong Gao is a Lecturer in the Department of Electrical and Electronic Engineering at Imperial College London. He received the B.E. degree in Automation in 2013 and the M.E. degree in Control Science and Engineering in 2016, both from Beijing Institute of Technology, and the joint Ph.D. degree in Electrical Engineering in 2021 from KTH Royal Institute of Technology and Nanyang Technological University. He was a visiting student in the Department of Computer Science, University of Oxford, in 2019 and a Researcher at KTH from 2021 to 2022. From 2022 to 2023, he was a postdoctoral researcher at Oxford. His research interests include formal verification and control, machine learning, and applications to safety-critical systems.

This talk is part of the Control and Power Seminars series.
