Telemonitoring is the use of electronic devices such as smartphones to monitor patients remotely. It offers convenience and enables timely medical decisions. To support decision making for each patient, a model is needed to translate the data collected by the patient’s smartphone into a predicted disease severity score. Semi-supervised learning (SSL) is a viable approach to training a robust predictive model because it integrates labeled and unlabeled samples, leveraging all the data available from each patient. Using SSL for this problem raises two challenges that must be addressed simultaneously: (1) feature selection from high-dimensional, noisy telemonitoring data and (2) instance selection from many, possibly redundant, unlabeled samples. We propose S2SSL, a novel SSL model that performs feature and instance selection simultaneously. We present a real-data application of telemonitoring for patients with Parkinson’s disease using smartphone-collected activity data such as tapping and speaking. A total of 382 features were extracted from each patient’s activity data, and S2SSL was trained on 74 labeled and 563 unlabeled instances from 37 patients. On a validation dataset, the trained model achieved a correlation of 0.828 between the true and predicted disease severity scores.
Keywords: machine learning; semi-supervised learning; health care