Speaker
Description
Classification precision is particularly crucial in scenarios where the cost of an incorrect result is high, e.g. medical diagnosis, search engine results, and product quality control. A statistical model for analyzing the precision of classification in collaborative studies will be presented. Classification (categorical measurement) means that each collaborator reports the property of the object under study on a scale consisting of K mutually exclusive classes/categories that together cover the full spectrum of this property. We assume that, due to measurement/classification errors, a property belonging to category i can be classified by a collaborator into category j with probability P(j|i) (the confusion matrix). For every given i, these probabilities are distributed among collaborators according to a Dirichlet distribution, whereas the category counts from each collaborator's repeated classifications follow the corresponding multinomial distribution. Such a model is called the Dirichlet-multinomial model and is a generalization of the beta-binomial model for binary tests. We propose repeatability and reproducibility measures based on categorical variation and Hellinger distance analysis, present their unbiased estimators, and discuss possible options for statistical homogeneity/heterogeneity tests. Finally, the Bayesian approach to assessing the classification abilities of collaborators will also be discussed.
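The sampling scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `simulate_counts` and `hellinger`, the Dirichlet parameters `alpha`, and the numbers of collaborators and repeats are all hypothetical choices made for illustration. The `hellinger` function is the textbook Hellinger distance between two discrete distributions, not the repeatability or reproducibility measures proposed in the talk.

```python
import numpy as np


def simulate_counts(alpha, n_collaborators, n_repeats, seed=0):
    """Simulate classification counts for objects of one true category i.

    alpha: Dirichlet parameters of length K (hypothetical values).
    Returns (probs, counts), each of shape (n_collaborators, K).
    """
    rng = np.random.default_rng(seed)
    # Each collaborator draws its own probability row P(.|i) from a Dirichlet.
    probs = rng.dirichlet(alpha, size=n_collaborators)
    # Repeated classifications by a collaborator follow a multinomial
    # distribution with that collaborator's probability row.
    counts = np.array([rng.multinomial(n_repeats, p) for p in probs])
    return probs, counts


def hellinger(p, q):
    """Textbook Hellinger distance between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


# Hypothetical setup: K = 3 categories, most probability mass on category 1.
alpha = np.array([8.0, 1.0, 1.0])
probs, counts = simulate_counts(alpha, n_collaborators=5, n_repeats=20)

print(counts)                          # one row of category counts per collaborator
print(hellinger(probs[0], probs[1]))   # distance between two collaborators
```

With K = 2 the same scheme reduces to a beta-binomial model, which matches the abstract's remark about the binary test.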
Type of presentation | Talk |
---|---|
Classification | Mainly methodology |
Keywords | misclassification, uncertainty, precision |