Description
Machine learning models are often the basis of current automated systems. Trust in an automated system is typically justified only up to a certain degree: a moderately reliable system deserves less trust than a highly reliable one. Ideally, trust is calibrated, in the sense that a human interacting with a system neither overtrusts nor undertrusts it. To relate objective measures of reliability, such as classification accuracy, fairness or robustness measures, to perceived trust, the latter needs to be quantified. However, trust is not a unidimensional construct; several related facets determine it. Existing psychometric questionnaires have several shortcomings. Building on existing theories from a range of fields, we present a theoretically well-founded questionnaire that includes 30 five-point Likert scale items for six dimensions of trust: Global Trust, Integrity, Unbiasedness, Perceived Performance, Vigilance and Transparency. The Global Trust items are intended to be used as an economical short form. The questionnaire's performance has been evaluated in several studies, covering both an English and a German version. Here, we focus on the largest English-language sample (N = 883), which was used to derive the final TrustSix scale from a larger initial item pool. Perceived trust is measured for three vignettes (fictional automated systems), namely systems for skin cancer detection, poisonous mushroom detection and automated driving, each based on machine learning models. Special emphasis was placed on exploring the exact factorial structure of the latent variables and checking its stability across vignettes. A bifactor rotation revealed a Global Trust factor, along with five additional factors for the more specific trust dimensions.
The reliability of each five-item subscale is satisfactory (alpha = .76 to .96), as is the overall reliability of the main factor (hierarchical McDonald's omega = .75 to .80; total McDonald's omega = .97 to .98). Correlations with adjacent constructs indicate sufficient discriminant validity.
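
For readers unfamiliar with the subscale reliability coefficient reported above, the following is a minimal sketch of how Cronbach's alpha can be computed for one five-item subscale using the standard textbook formula. The function name `cronbach_alpha` and the response matrix `toy_responses` are hypothetical and purely illustrative; they are not the study's data or analysis code, and the omega coefficients would require a fitted factor model that is not shown here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses: list of respondents, each a list of k item scores
    (e.g. five-point Likert ratings coded 1..5).
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")

    # Sample variance of each item (column) across respondents.
    item_columns = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in item_columns)

    # Sample variance of the per-respondent sum score.
    total_variance = variance([sum(row) for row in responses])

    # alpha = k / (k - 1) * (1 - sum of item variances / variance of sum score)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings: six respondents, five items of one subscale.
toy_responses = [
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [5, 4, 5, 5, 4],
    [1, 2, 1, 2, 1],
    [4, 4, 3, 4, 4],
]

alpha = cronbach_alpha(toy_responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Alpha approaches 1 when the items of a subscale rise and fall together across respondents, which is why it is commonly read as an indicator of internal consistency.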