15–19 Sept 2024
Leuven, Belgium
Europe/Berlin timezone

Learning Ordinal Preferences on Wearable Devices through Reinforcement Learning using Time-Dependent States

17 Sept 2024, 15:05
20m
Conference room 1

AI in Industry · AI in industry 2

Speaker

Mr Simon Weinberger (EssilorLuxottica & Université Lumière Lyon 2)

Description

Reinforcement learning offers a flexible framework for tackling problems where data are gathered in a dynamic context, that is, where actions influence future states. The classical reinforcement learning paradigm relies on a Markovian hypothesis: the observed states depend on the past only through the last state and action. This condition may be too restrictive for real-world applications, where the dynamics may also depend on earlier states. We get around this issue by augmenting the state space into a functional space, and we propose policies that depend on these functional states through functional regression models. We present an industrial application: the automatic tint control of e-chromic frames, with the aim of minimizing the number of user interactions. The policy selects actions from an ordered set, using functional data estimated from successive Ambient Light Sensor (ALS) values, which are non-stationary. To this end, we extend an existing ordinal model to functional covariates and use it as the policy. The non-stationary ALS signal describing the state is handled by means of a wavelet functional basis. Finally, the policy is improved using policy gradient methods.
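
The pipeline described in the abstract (a window of ALS values projected onto a wavelet basis, an ordinal model used as the policy, and a policy gradient update) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the wavelet family, the ordinal model, or the policy gradient variant, so the Haar basis, the cumulative-logit (proportional odds) model with fixed cutpoints, the REINFORCE update, and all function names, window lengths, and rewards below are assumptions chosen only for illustration.

```python
import numpy as np


def haar_coefficients(signal):
    """Orthonormal Haar decomposition of a signal whose length is a power of 2.

    Assumption: the abstract only says "wavelet functional basis"; Haar is
    used here because it is the simplest wavelet to write by hand.
    """
    x = np.asarray(signal, dtype=float)
    coeffs = []
    while x.size > 1:
        even, odd = x[0::2], x[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        x = (even + odd) / np.sqrt(2.0)             # coarser approximation
    coeffs.append(x)                                # final approximation term
    return np.concatenate(coeffs[::-1])             # coarse-to-fine ordering


def _cdf_at_cutpoints(features, beta, cutpoints):
    """P(action <= k) under a cumulative-logit model, padded with 0 and 1."""
    eta = features @ beta
    inner = 1.0 / (1.0 + np.exp(-(cutpoints - eta)))
    return np.concatenate([[0.0], inner, [1.0]])


def ordinal_probs(features, beta, cutpoints):
    """Probabilities over K ordered actions (K = len(cutpoints) + 1)."""
    return np.diff(_cdf_at_cutpoints(features, beta, cutpoints))


def grad_log_prob(features, beta, cutpoints, action):
    """Gradient of log pi(action | features) with respect to beta."""
    cdf = _cdf_at_cutpoints(features, beta, cutpoints)
    dens = cdf * (1.0 - cdf)  # logistic density at each cutpoint, 0 at the ends
    prob = cdf[action + 1] - cdf[action]
    d_eta = -(dens[action + 1] - dens[action]) / prob
    return d_eta * features


def reinforce_step(beta, cutpoints, episodes, lr=0.1):
    """One REINFORCE update; episodes is a list of (features, action, return).

    Assumption: cutpoints stay fixed and only beta is learned.
    """
    grad = np.zeros_like(beta)
    for features, action, ret in episodes:
        grad += ret * grad_log_prob(features, beta, cutpoints, action)
    return beta + lr * grad / len(episodes)


# Hypothetical usage: a synthetic 16-sample "ALS" window and 4 ordered levels.
rng = np.random.default_rng(0)
als_window = rng.normal(size=16)           # stand-in for real sensor data
state = haar_coefficients(als_window)      # wavelet representation of the state
beta = np.zeros(state.size)
cutpoints = np.array([-1.0, 0.0, 1.0])     # must be increasing
probs = ordinal_probs(state, beta, cutpoints)
action = rng.choice(probs.size, p=probs)   # sample an ordered action
reward = 1.0                               # made-up reward, e.g. no user correction
beta = reinforce_step(beta, cutpoints, [(state, action, reward)])
```

In this sketch the ordering of the actions is carried entirely by the increasing cutpoints, so a single linear score on the wavelet coefficients shifts probability mass towards higher or lower actions rather than treating them as unrelated classes.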

Type of presentation: Talk
Classification: Both methodology and application
Keywords: Reinforcement learning, functional data, ordinal regression

Primary authors

Dr Aurélie Le Cain (EssilorLuxottica)
Jairo Cugliari (Université Lumière Lyon 2)
Mr Simon Weinberger (EssilorLuxottica & Université Lumière Lyon 2)
