Description
Bayes classifiers rest on maximising the joint conditional PDF of the feature vector given the class value. Using copulae is the most flexible way of fitting joint distributions to data. In recent years, the problem of applying copulae in high dimensions has been approached with vine copulae. Nevertheless, their application in very high dimensions, on the order of several thousand, has not yet been studied on a large scale in the literature. The present work investigates the feasibility of Bayes classification based on copula modelling in problems with up to 5000 feature components and a relatively small sample-to-dimension ratio. To fit vine copulae successfully within a practical computation time in this setting, we use truncation and thresholding. In particular, the potential of thresholding has not yet been studied in classification approaches. We develop approaches for choosing the relevant thresholding levels. Simulation experiments show that the resulting classifiers are strongly competitive with other classifiers such as SVM.
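
The decision rule behind this can be sketched as follows. The notation is illustrative and not taken from the talk: x = (x_1, ..., x_d) is the feature vector, k the class label, f_{j|k} and F_{j|k} the marginal density and distribution function of the j-th feature within class k, and c_k the copula density fitted for class k (here presumably a truncated and thresholded vine). The factorisation follows from Sklar's theorem. The class priors \pi_k are an assumption; with equal priors the rule reduces to maximising the class-conditional density alone.

```latex
\hat{k}(x) = \arg\max_{k} \; \pi_k \, f(x \mid k),
\qquad
f(x \mid k) = c_k\bigl(F_{1|k}(x_1), \dots, F_{d|k}(x_d)\bigr) \prod_{j=1}^{d} f_{j|k}(x_j)
```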