Large Structured Bandits and Applications
If you have a question about this talk, please contact Professor Peter Cheung.

In bandit optimization problems, a decision maker aims to select actions sequentially, with (a priori) unknown average rewards, so as to maximize her cumulative reward over some finite time horizon. Since the 1930s, these problems have been used extensively in many areas to model and investigate the trade-off between exploitation (selecting actions that gave high rewards in the past) and exploration (playing actions whose rewards may be higher in the future). Most of the literature on bandits assumes that rewards are independent across actions, and that the set of actions is limited. In this talk, we introduce a class of bandit problems (i) with a very large action space (we cannot even sample all actions once within the time horizon), and (ii) with rewards that are correlated across actions. We explore possible research directions toward solving this novel class of sequential decision problems, and explain how these problems naturally arise in e-commerce systems (display ads, sponsored search auctions, …) and in the design of radio communication networks.

This talk is part of the Featured talks series.
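To make the exploration/exploitation trade-off from the abstract concrete, here is a minimal sketch of a classical bandit algorithm (UCB1) on a small set of independent Bernoulli actions. It is not the method presented in the talk; it illustrates the standard setting that the talk departs from. The function name `ucb1`, the reward means, and the horizon are all illustrative assumptions.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return (pull counts, total reward).

    arm_means are hypothetical success probabilities, unknown to the learner.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was played
    sums = [0.0] * n_arms   # cumulative reward collected per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initial exploration: play every arm once.
            arm = t - 1
        else:
            # Empirical mean (exploitation) plus confidence bonus (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total


counts, total = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts, total)
```

Note that the first loop branch plays every action once. This is exactly the step that becomes infeasible in setting (i) of the abstract, where the number of actions exceeds the time horizon, and the per-action statistics ignore the correlations described in setting (ii).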