Speaker
Description
Reinforcement learning offers a flexible framework for tackling problems where data is gathered in a dynamic context: actions influence future states. The classical reinforcement learning paradigm relies on a Markov assumption: the observed process depends on past states only through the last state and action. This condition may be too restrictive for real-world applications, where the dynamics may depend on earlier states. We get around this issue by augmenting the state space into a functional space, and we propose policies that depend on these functional states through functional regression models. We present an industrial application involving the automatic tint control of e-chromic frames, with the aim of minimizing the number of user interactions. The policy takes actions on an ordered set, using functional data estimated from successive Ambient Light Sensor (ALS) values, which are non-stationary. This is achieved by using an extension of an existing ordinal model with functional covariates as the policy. The non-stationary ALS signal describing the state is handled by means of a wavelet functional basis. Finally, the policy is improved using policy gradient methods.
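The sketch below illustrates the kind of wavelet-based state representation described above: a window of raw ALS readings is projected onto a discrete wavelet basis, and the concatenated coefficients serve as a finite-dimensional functional state. The PyWavelets library, the `db4` wavelet, the window length, and the decomposition level are illustrative assumptions, not details from the talk.

```python
import numpy as np
import pywt

def wavelet_features(als_window: np.ndarray, wavelet: str = "db4", level: int = 3) -> np.ndarray:
    """Project a window of ALS readings onto a wavelet basis and return the
    concatenated coefficients as a finite-dimensional functional state."""
    coeffs = pywt.wavedec(als_window, wavelet, level=level)  # [cA_n, cD_n, ..., cD_1]
    return np.concatenate(coeffs)

# Example: a 128-sample ALS window (values here are synthetic).
state = wavelet_features(np.random.default_rng(0).normal(size=128))
```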
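Similarly, here is a minimal sketch of how an ordinal (proportional-odds) policy over ordered tint levels could be paired with a REINFORCE-style policy gradient update. The parameterization, learning rate, and reward convention are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OrdinalPolicy:
    """Proportional-odds policy: P(A <= k | s) = sigmoid(theta_k - w @ s),
    with ordered cutpoints theta_0 < ... < theta_{K-2} over K tint levels."""

    def __init__(self, n_features: int, n_levels: int):
        self.w = np.zeros(n_features)
        self.theta = np.linspace(-1.0, 1.0, n_levels - 1)
        self.K = n_levels

    def probs(self, s):
        cum = np.concatenate(([0.0], sigmoid(self.theta - self.w @ s), [1.0]))
        return np.diff(cum)  # one probability per ordered tint level

    def sample(self, s):
        return int(rng.choice(self.K, p=self.probs(s)))

    def grad_log_prob(self, s, a):
        """Analytic gradient of log pi(a | s) w.r.t. (w, theta)."""
        cum = np.concatenate(([0.0], sigmoid(self.theta - self.w @ s), [1.0]))
        d = cum * (1.0 - cum)  # derivative of the sigmoid at each cutpoint
        p_a = cum[a + 1] - cum[a]
        g_theta = np.zeros_like(self.theta)
        if a < self.K - 1:
            g_theta[a] += d[a + 1] / p_a
        if a > 0:
            g_theta[a - 1] -= d[a] / p_a
        g_w = -s * (d[a + 1] - d[a]) / p_a
        return g_w, g_theta

def reinforce_update(policy, episode, lr=1e-2):
    """Plain REINFORCE with returns-to-go; `episode` is a list of
    (state, action, reward) triples. In practice the cutpoints would need
    a monotonicity constraint (e.g. a log-increment reparameterization)."""
    returns = np.cumsum([r for _, _, r in episode][::-1])[::-1]
    for (s, a, _), G in zip(episode, returns):
        g_w, g_theta = policy.grad_log_prob(s, a)
        policy.w += lr * G * g_w
        policy.theta += lr * G * g_theta

# Toy usage: reward of -1 whenever the user manually overrides the tint
# (an assumed reward convention, not stated in the abstract).
policy = OrdinalPolicy(n_features=8, n_levels=4)
s = rng.normal(size=8)
a = policy.sample(s)
reinforce_update(policy, [(s, a, -1.0)])
```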
Type of presentation | Talk |
---|---|
Classification | Both methodology and application |
Keywords | Reinforcement learning, functional data, ordinal regression |