Speaker
Description
This work addresses the problem of identifying recurrent patterns in passengers' daily access to trains and stations in the railway system of Lombardy. People-counter data, i.e. the number of passengers boarding and alighting from each train at each station, are analysed to identify possible issues in the railway transport system and to help decision makers plan train schedules and improve service quality. To this end, we develop a general and flexible bi-clustering algorithm for complex data, i.e. an algorithm that simultaneously groups the rows and the columns of a data matrix in which each cell contains an object more complex than a scalar (e.g. functional data). We apply it to two analyses, one of stations and one of trains, each covering nine days. First, we study passengers' departures and arrivals at each station for each hour of the day over nine days. This allows us to identify subsets of stations that, on specific days, show similar patterns of departures and arrivals over the course of the day, and thus to point out station-day pairs that the railway service provider could manage homogeneously. Second, we study passengers' boarding, alighting, and occupancy for each scheduled train along its journey over a period of nine days, in order to identify groups of trains that, on specific days, show a similar usage profile across the stations of the line. The results reveal both overcrowded and underused situations, thereby helping the railway transport company to manage the service more effectively. The developed approach is flexible and scalable: it can readily be used to analyse larger datasets and railway systems in other regions.
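To make the data structure concrete, the following is a minimal sketch of a station-by-day matrix whose cells are curves rather than scalars, together with one possible way to score how homogeneous a candidate bi-cluster is. The abstract does not specify the algorithm's objective function, so the score used here (average squared distance of each cell's curve from the bi-cluster mean curve) is an assumption chosen for illustration, and the names `mean_curve`, `bicluster_score`, and the toy data are hypothetical.

```python
def mean_curve(curves):
    """Pointwise mean of a list of equal-length curves."""
    return [sum(values) / len(values) for values in zip(*curves)]


def bicluster_score(data, rows, cols):
    """Average squared distance between each cell's curve and the
    mean curve of the candidate bi-cluster (0 means identical curves).

    Assumed, illustrative criterion; not taken from the abstract.
    """
    cells = [data[i][j] for i in rows for j in cols]
    template = mean_curve(cells)
    sq_dists = [
        sum((x - m) ** 2 for x, m in zip(curve, template))
        for curve in cells
    ]
    return sum(sq_dists) / len(sq_dists)


# Toy data: 3 stations (rows) x 2 days (columns); each cell is a
# 4-point curve standing in for an hourly passenger-count profile.
commuter = [0.0, 10.0, 0.0, 10.0]
flat = [5.0, 5.0, 5.0, 5.0]
data = [
    [commuter, commuter],
    [commuter, commuter],
    [flat, commuter],
]

# Stations 0-1 on both days share one profile; adding station 2 mixes in
# a different profile on day 0 and raises the score.
homogeneous = bicluster_score(data, rows=[0, 1], cols=[0, 1])
mixed = bicluster_score(data, rows=[0, 1, 2], cols=[0, 1])
```

A bi-clustering procedure would search over row and column subsets for groups with a low score of this kind; the search strategy itself is not described in the abstract and is not sketched here.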