Speakers
Description
Spatial dependence in environmental data is an influential criterion in clustering, since it shapes how informative the resulting groups are. Classical methods do not consider spatial dependence, and ignoring this structure can produce unexpected results, such as groupings of curves that are not similar in shape or behavior.
In this work, clustering is performed using the modified k-means method for spatially correlated functional data, applied to NDVI (Normalized Difference Vegetation Index) data from the Ecuadorian Andes. NDVI is important because it is used mainly to measure biomass, assess crop health, and help forecast fire danger zones, among other applications.
To this end, quality indices are implemented to determine the appropriate number of groups. The analysis builds on the methodology of the hierarchical approach for functional data with spatial correlation. Given that the functional data belong to the Hilbert space of square-integrable functions, the distance between curves is measured with the $\mathcal{L}^2$ norm, and a reduced representation of the data is obtained through a finite Fourier-type basis. The empirical variogram is then calculated and a parametric theoretical model is fitted, so that the distance matrix between the curves can be weighted by the trace-variogram and the multivariogram calculated from the coefficients of the basis functions. This weighted matrix is then used to group the spatially correlated functional data. To validate the method, several simulation scenarios were carried out, achieving more than $80\%$ correct classification. This is complemented by an application to NDVI data, which yields five latitudinally distributed regions that are influenced by the hydrographic basins of Ecuador.
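The distance-weighting step can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not stated in the abstract: the Fourier-type basis is taken to be orthonormal, so the $\mathcal{L}^2$ distance between two curves equals the Euclidean distance between their coefficient vectors; the fitted parametric model is taken to be exponential with made-up parameters; and the weighting is taken to be the product of the curve distance and the variogram value at the spatial lag. All function names, coefficients, and site coordinates are hypothetical toy values.

```python
import math

def sq_l2_distance(a, b):
    """Squared L2 distance between two curves, given their coefficients
    in an orthonormal basis (assumption: the basis is orthonormal)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def l2_distance(a, b):
    return math.sqrt(sq_l2_distance(a, b))

def empirical_trace_variogram(coefs, sites, bin_width):
    """Half the mean squared L2 distance between curves, per spatial-lag bin.
    Returns {bin centre: variogram estimate}."""
    sums, counts = {}, {}
    n = len(coefs)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(sites[i], sites[j])
            k = int(h // bin_width)
            sums[k] = sums.get(k, 0.0) + sq_l2_distance(coefs[i], coefs[j])
            counts[k] = counts.get(k, 0) + 1
    return {(k + 0.5) * bin_width: sums[k] / (2 * counts[k]) for k in sorted(sums)}

def exponential_variogram(h, nugget, sill, range_):
    """Exponential model (assumed form); parameters are taken as already fitted."""
    if h == 0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-h / range_))

def weighted_distance_matrix(coefs, sites, variogram):
    """Curve distance multiplied by the variogram at the spatial lag
    (assumed weighting scheme)."""
    n = len(coefs)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(sites[i], sites[j])
            D[i][j] = D[j][i] = l2_distance(coefs[i], coefs[j]) * variogram(h)
    return D

# Toy data: three curves (two basis coefficients each) at three sites.
coefs = [[1.0, 0.0], [1.0, 0.5], [3.0, 2.0]]
sites = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]

gamma_hat = empirical_trace_variogram(coefs, sites, bin_width=2.0)

# Hypothetical fitted parameters, chosen only for illustration.
def fitted(h):
    return exponential_variogram(h, nugget=0.0, sill=1.0, range_=2.0)

D = weighted_distance_matrix(coefs, sites, fitted)
```

In this sketch, `gamma_hat` is what a parametric model would be fitted to, and `D` is the spatially weighted dissimilarity matrix that a clustering routine would consume. With a non-orthonormal basis, the squared distance would need the Gram matrix of the basis functions instead of a plain sum of squared coefficient differences.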
Keywords: K-means, functional data, spatial correlation