Predictive monitoring techniques produce a signal when the predicted probability of an undesirable event, such as machine failure, a heart attack, or death, is high. To classify the unknown outcome from the probabilities predicted by a statistical or machine learning model, a decision threshold must be chosen. In many cases, it is set to 0.5 by default. However, in many applications, for instance in healthcare and finance, the data exhibit characteristics such as class imbalance in the outcome variable. As a result, a threshold of 0.5 often does not yield acceptable model performance. In addition, the false alarm rate can become higher than is desirable in practice. To mitigate these issues, various threshold optimization approaches have been proposed in the literature, based on techniques such as bootstrapping and cross-validation. In this ongoing project, we study the suitability of some of these approaches for time-dependent data, as encountered in predictive monitoring settings. The goal is to provide guidance for practitioners and to promote the development of novel procedures. An illustration using real-world data will be provided.
Keywords: predictive monitoring, threshold tuning, false alarm rate
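
To make the role of the decision threshold concrete, the following minimal Python sketch tunes a threshold so that the false alarm rate stays below a chosen cap. It is not one of the bootstrapping or cross-validation procedures studied in the project. The function names, the cap `max_far`, the chronological split point, and all scores and labels are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: scores and labels are hand-written, hypothetical values.

def false_alarm_rate(scores, labels, threshold):
    """Fraction of non-events (label 0) that trigger a signal: FP / (FP + TN)."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        return 0.0
    return sum(s >= threshold for s in negatives) / len(negatives)


def sensitivity(scores, labels, threshold):
    """Fraction of events (label 1) that trigger a signal: TP / (TP + FN)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        return 0.0
    return sum(s >= threshold for s in positives) / len(positives)


def tune_threshold(scores, labels, max_far):
    """Lowest observed score usable as a threshold with false alarm rate <= max_far."""
    # The false alarm rate cannot increase as the threshold rises, so the first
    # candidate (in ascending order) that meets the cap is the lowest one.
    for candidate in sorted(set(scores)):
        if false_alarm_rate(scores, labels, candidate) <= max_far:
            return candidate
    return float("inf")  # no candidate meets the cap: never signal


# Hypothetical predicted probabilities in time order; events (label 1) are rare.
scores = [0.05, 0.10, 0.08, 0.30, 0.12, 0.45, 0.07, 0.20, 0.15, 0.35,
          0.09, 0.40, 0.11, 0.25, 0.06, 0.42, 0.13, 0.18, 0.38, 0.10]
labels = [0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
          0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

# Chronological split (an assumption of this sketch): tune on the earlier
# observations and evaluate on the later ones, so no future data leaks into tuning.
split = 12
tune_scores, tune_labels = scores[:split], labels[:split]
eval_scores, eval_labels = scores[split:], labels[split:]

threshold = tune_threshold(tune_scores, tune_labels, max_far=0.25)

for name, t in [("default 0.5", 0.5), ("tuned", threshold)]:
    print(name,
          "| threshold:", t,
          "| false alarm rate:", round(false_alarm_rate(eval_scores, eval_labels, t), 3),
          "| sensitivity:", round(sensitivity(eval_scores, eval_labels, t), 3))
```

With these made-up values, every event has a predicted probability below 0.5, so the default threshold never signals, whereas the tuned threshold detects the events while keeping the false alarm rate under the cap. The single chronological hold-out is only the simplest way to respect time order. Whether resampling schemes that ignore time order remain suitable for such data is the question the project addresses.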