10–14 Sept 2023
Europe/Madrid timezone

A Feature Selection Method Based on Shapley Values Robust to Concept Shift in Regression

11 Sept 2023, 12:00
30m
Auditorium

Speaker

Carlos Sebastián Martínez-Cava (Fortia Energía - Universidad Politécnica de Madrid)

Description

Feature selection is one of the most important steps in any methodology for building a statistical learning model. Existing algorithms generally establish some criterion to select the most influential variables and discard those that contribute no relevant information to the model. This approach makes sense in a classical static setting where the joint distribution of the data does not vary over time. With real data, however, it is common to encounter dataset shift and, specifically, changes in the relationships between variables (concept shift). In that case, the influence of a variable cannot be the only indicator of its quality as a regressor, since the relationship learned in the training phase may not correspond to the current situation. We therefore propose a new feature selection methodology for regression problems that takes this into account, using Shapley values to study the effect that each variable has on the predictions. Five examples are analysed: four correspond to typical situations in which the method matches the state of the art, and one concerns electricity price forecasting in the Iberian market, where a concept shift has occurred. In this last case, the proposed algorithm improves the results significantly.
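As a rough illustration of the idea, and not the algorithm presented in the talk, the following Python sketch computes exact Shapley values for a small fixed model and flags features whose contributions increase the error on recent data. The function names, the zero baseline, the toy data, and the selection criterion (drop a feature if subtracting its Shapley contribution lowers the absolute error) are all assumptions made for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to a baseline point.

    Features outside a coalition are replaced by their baseline value.
    Enumerates all coalitions, so it is only practical for few features.
    """
    n = len(x)
    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_j = [x[k] if k in subset else baseline[k] for k in range(n)]
                with_j = list(without_j)
                with_j[j] = x[j]
                phi[j] += weight * (model(with_j) - model(without_j))
    return phi


def harmful_features(model, X_val, y_val, baseline):
    """Hypothetical criterion: flag feature j if removing its Shapley
    contribution from each prediction lowers the total absolute error."""
    n = len(baseline)
    base_err = 0.0
    err_without = [0.0] * n
    for x, y in zip(X_val, y_val):
        pred = model(x)
        phi = shapley_values(model, x, baseline)
        base_err += abs(y - pred)
        for j in range(n):
            err_without[j] += abs(y - (pred - phi[j]))
    return [j for j in range(n) if err_without[j] < base_err]


# Assumed toy setting: the model learned y = 2*x0 + x1 during training,
# but in the recent data the effect of x1 has flipped sign (concept shift).
model = lambda x: 2 * x[0] + 1 * x[1]
baseline = [0.0, 0.0]
X_val = [[1, 1], [2, -1], [-1, 2], [0.5, -2]]
y_val = [2 * a - b for a, b in X_val]

flagged = harmful_features(model, X_val, y_val, baseline)
print(flagged)
```

In this toy setting, x1 is influential in the trained model, so an influence-only criterion would keep it, yet its learned effect no longer matches the recent data and the sketch flags it.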

Classification: Mainly methodology
Keywords: Concept shift, Feature selection, Regression

Primary author

Carlos Sebastián Martínez-Cava (Fortia Energía - Universidad Politécnica de Madrid)

Co-author

Prof. Carlos González Guillén (Universidad Politécnica de Madrid)
