Reliability and validity of the Roche PD Mobile Application v2
The Roche PD Mobile Application v1 was designed to measure the core motor signs of PD5,18, and was recently revised to v2, primarily to include two new active tests of bradykinesia (Hand Turning, Draw A Shape), as well as a test of psychomotor slowing (eSDMT) and a speech test. In addition, the original gait task was revised to a U-turn test, and a smartwatch was incorporated into the remote passive monitoring procedure. Preliminary test–retest reliability scores for the pre-specified sensor features from all active tests except Speech and eSDMT, and for both passive monitoring measures, were in the ‘excellent’ range22. Preliminary clinical validity was established via correlations with corresponding MDS-UPDRS item scores. We note that these findings are reassuring considering the continuous (sensor feature) versus ordinal (MDS-UPDRS) nature of the two datasets, and the lack of conceptually comparable MDS-UPDRS items for some active test features (e.g. Draw A Shape). Cross-correlations between sensor features and MDS-UPDRS subscale scores supported the convergent and divergent validity of bradykinesia and tremor sensor features. Most active test sensor features demonstrated sensitivity for subtle manifestations, discriminating individuals who received MDS-UPDRS item scores of 0 from those with item scores of 1. Measures of upper limb bradykinesia demonstrated known-groups validity, differentiating individuals in Hoehn and Yahr Stage I versus II. All lateralized sensor features discriminated the least from the most affected side of the body. The results from shared active tests and passive sensor features confirm previous findings with the Roche PD Mobile Application v15. Taken together, these results indicate that the Roche PD Mobile Application v2 may prove suitable for quantifying motor disease severity and tracking disease progression in the earlier stages of PD.
DHT measurement of bradykinesia
The Roche PD Mobile Application v2 contains three active tests designed to measure upper limb bradykinesia: Dexterity (finger tapping), Hand Turning (pronation/supination), and Draw A Shape. Pre-specified sensor features from all three tests correlated with their corresponding MDS-UPDRS upper limb bradykinesia item scores, and showed convergent and divergent validity in cross-correlations with MDS-UPDRS Part III subscale scores, correlating numerically most strongly with bradykinesia compared with all other subscale scores. These findings indicate that the Roche PD Mobile Application v2 bradykinesia tests indeed reflect the neurological concept of upper limb bradykinesia. Finger tapping and pronation/supination tasks are well-established assessments of upper limb bradykinesia as evidenced by their inclusion in both the UPDRS24 and MDS-UPDRS8. Over the last decade, different digitized variants of finger tapping and pronation/supination tests have been developed21. Despite methodological differences, studies of these DHT tasks generally showed good correspondence between finger tapping sensor features and respective clinical ratings, as well as the ability to differentiate healthy controls from individuals with early PD, and individuals with early PD from individuals with later-stage PD5,18,25,26,27,28, in line with the present findings. While the literature on digitized pronation/supination assessments is less rich than for finger tapping, available results also consistently demonstrate correlations with related clinical scores and the ability to differentiate healthy participants from individuals with PD16,23,29,30,31. Spiral drawing is traditionally used in behavioral neurology to assess fine motor impairment including bradykinesia and tremor32,33,34,35. DHT versions of spiral drawing demonstrated that time to completion correlated with clinician ratings of bradykinesia severity, and differentiated PD cases from controls34. 
The majority of previous DHT spiral drawing tasks used pens/digital pens to draw on regular paper or tablets, a more challenging motor task compared with the present finger drawing on smaller smartphone touch screens. In the present study, celerity, i.e. accuracy divided by the time taken to complete spiral shape tracing on the smartphone screen, was pre-specified to additionally consider the accuracy of directed fine motor movements in the unsupervised at-home setting. Spiral celerity correlated with MDS-UPDRS bradykinesia measures, although the strength of these correlations was numerically smaller compared with Finger Tapping and Hand Turning. This may be due to the relative difficulty of the latter two tasks compared with spiral drawing, which may have challenged individuals more, thereby revealing greater impairment. We note that additional sensor features (e.g. variability in drawing speed, hesitation), analyzed either individually or combined within and across shapes, are expected to provide additional meaningful information, as has been shown for PD and multiple sclerosis36,37.
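As an illustration of the celerity definition above (accuracy divided by completion time), the following Python sketch computes the ratio for one traced shape. The accuracy metric used here (fraction of drawn points lying within a tolerance of the reference path), the function names, and the tolerance value are assumptions made for illustration only; they are not the pre-specified feature definition used in the study.

```python
import math


def tracing_accuracy(drawn, reference, tolerance):
    """Fraction of drawn points within `tolerance` of the nearest reference point.

    Hypothetical accuracy metric; the study's own definition is not given here.
    """
    if not drawn or not reference:
        raise ValueError("drawn and reference must be non-empty")
    hits = 0
    for (x, y) in drawn:
        nearest = min(math.hypot(x - rx, y - ry) for (rx, ry) in reference)
        if nearest <= tolerance:
            hits += 1
    return hits / len(drawn)


def celerity(accuracy, duration_s):
    """Accuracy divided by time to complete the tracing (higher is better)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return accuracy / duration_s


# Example with made-up touch coordinates (arbitrary screen units).
reference_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
drawn_path = [(0.0, 0.1), (1.0, 0.1), (2.0, 1.0), (1.5, 0.2)]
acc = tracing_accuracy(drawn_path, reference_path, tolerance=0.25)
score = celerity(acc, duration_s=4.0)
```

Under this sketch, a participant who traces equally accurately but more slowly receives a lower celerity, which is the property that makes the ratio sensitive to bradykinesia.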
Passive monitoring with smartwatches
Passive monitoring with smartwatches provides a unique opportunity to explore slowing of upper limb movements during daily life. Here, sensor data segments during arm movements were identified from the circa 90% non-walking periods in the passive monitoring sensor data stream, using the squared magnitude of the accelerometer sensor movement as the sensor feature. This same feature has been related to decreased expressivity in patients with schizophrenia with negative symptoms38. Here, arm movement power was specifically related to the MDS-UPDRS bradykinesia subscore and item scores, as well as the rigidity subscore, and is in line with a slowing of hand movement in daily non-gait-related activities such as gesturing when speaking, eating, etc. These findings are consistent with previous research with wrist-worn wearables, which traditionally focused on arm swing during gait39,40,41, as well as multi-sensor systems used to measure the impact of bradykinesia on activities of daily living15,42. Thus, passively monitored motor behavior in daily life may facilitate our understanding of the effect and burden of PD on individuals’ daily lives.
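The squared-magnitude feature described above can be sketched as follows. This is a minimal Python illustration, assuming tri-axial accelerometer samples given as (x, y, z) tuples and assuming that the per-sample values are averaged over a segment; the averaging step, the function names, and the absence of any gravity removal or filtering are assumptions, not details confirmed by the study.

```python
def squared_magnitude(samples):
    """Per-sample squared magnitude x^2 + y^2 + z^2 of tri-axial acceleration."""
    return [x * x + y * y + z * z for (x, y, z) in samples]


def mean_movement_power(samples):
    """Average squared magnitude over one segment (assumed summary statistic)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    values = squared_magnitude(samples)
    return sum(values) / len(values)


# Example with made-up acceleration samples for one arm-movement segment.
segment = [(1.0, 2.0, 2.0), (0.0, 0.0, 1.0)]
power = mean_movement_power(segment)
```

In such a scheme, slower or smaller arm movements would yield lower power values for a segment.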
DHT measurement of bradyphrenia
The eSDMT43 is commonly applied to measure psychomotor slowing, or bradyphrenia, one of the earliest cognitive signs in PD, appearing up to 5 years prior to a PD dementia diagnosis20. However, as the test requires multiple cognitive functions, it is not surprising that it is sensitive to many forms of neurologic impairment44. Indeed, while SDMT performance is reduced in PD45, impairments are exacerbated in individuals with PD with concomitant vascular46 and amyloid47 imaging findings. A standard SDMT outcome measure, number of correct responses in 90 s, was pre-specified for the present analyses of the eSDMT, and showed ‘good’22 test–retest reliability (ICC = 0.75). However, it correlated only weakly (rho = −0.18) with the MDS-UPDRS item 1.1. assessing global cognitive impairment. This finding is surprising given the catch-all nature of both the eSDMT and MDS-UPDRS item 1.1., but may be accounted for by the fact that cognitive impairments were excluded during the screening process in the PASADENA study, leading to a truncation of range in both scores (see Supplementary Fig. 1). We note that we attempted to minimize the effect of bradykinesia on eSDMT scores by requiring a simple tap response on a number pad displayed at the bottom half of the smartphone screen. Nevertheless, to mitigate the risk of this confound, eSDMT performance could be controlled by a non-cognitively demanding motor test using a similar response format.
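For readers unfamiliar with the intraclass correlation coefficient (ICC) used to express test–retest reliability, the sketch below computes one common variant from repeated scores. The one-way random-effects, single-measurement form is an assumption chosen for simplicity; the ICC form actually used in the study is not stated in this section, and the function name and example scores are hypothetical.

```python
def icc_oneway(scores):
    """One-way random-effects, single-measurement ICC.

    `scores` is a list of rows, one per participant, each holding the same
    number k >= 2 of repeated measurements (e.g. two 2-week aggregates).
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 participants with equal numbers (>= 2) of repeats")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-participant and within-participant mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


# Example with made-up eSDMT correct-response counts from two periods.
example_scores = [[40, 42], [55, 53], [31, 35], [48, 47]]
example_icc = icc_oneway(example_scores)
```

The coefficient approaches 1 when differences between participants dominate differences between repeated measurements of the same participant.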
DHT measurement of voice and speech
Voice and speech impairments in PD are varied and generally summarized under the term dysarthria, and include resonatory, articulatory, phonatory, prosodic and respiratory components48. This symptomatology and its relevance to patients’ daily lives motivated the inclusion of a Sustained Phonation task in the suite of active tests, and the development of the novel Speech test. Voice jitter was pre-selected as a proxy of disordered vocal fold function for the sustained phonation test. In line with previous research49, increased voice jitter correlated weakly with MDS-UPDRS 3.1. (Speech) scores, and differentiated individuals with slight speech disturbances (MDS-UPDRS 3.1. score of 1) from those with no perceivable speech impairment at the site visit (MDS-UPDRS 3.1. score of 0). In the Speech active test, monotonicity (i.e. Mel Frequency Cepstral Coefficient 2 [MFCC2] variability) was selected as the sensor feature of prosodic deficits based on previous research demonstrating that this feature differentiated individuals with PD from healthy controls48. In the present study, MFCC2 variability correlated with MDS-UPDRS 3.1. (Speech) scores, and differentiated participants with MDS-UPDRS 3.1. scores of 0 and 1. The bulbar MDS-UPDRS Part III composite item score was designed to gauge the severity of motor impairments in body parts involved in speech production. Despite a truncation of range in this score (average < 3/20 points), MFCC2 variability correlated with the bulbar score, indicating that this feature may estimate the severity of motor impairments in the speech apparatus. Future research will further investigate the richly multi-faceted aspects of speech function to better understand motor and cognitive behavior in PD.
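To make the notion of voice jitter concrete, the sketch below computes local jitter, a widely used definition in which the mean absolute difference between consecutive glottal cycle durations is divided by the mean cycle duration. Whether the study used this exact jitter variant is an assumption, as are the function name and the example cycle durations; the extraction of cycle durations from audio is not shown.

```python
def local_jitter(periods_s):
    """Mean absolute difference of consecutive periods divided by mean period.

    `periods_s` holds consecutive glottal cycle durations in seconds.
    Returns a unitless ratio (multiply by 100 for percent).
    """
    if len(periods_s) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods_s, periods_s[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods_s) / len(periods_s)
    return mean_diff / mean_period


# Example with made-up cycle durations from a sustained vowel.
steady_voice = [0.010, 0.010, 0.010]
unsteady_voice = [0.010, 0.012, 0.010]
jitter_steady = local_jitter(steady_voice)
jitter_unsteady = local_jitter(unsteady_voice)
```

A perfectly periodic voice yields a jitter of zero, and larger cycle-to-cycle irregularity in vocal fold vibration yields larger values.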
DHT measurement of tremor, turning and balance
The Roche PD Mobile Application v2 aims to assess the broad array of motor signs in PD and related movement disorders. Thus, besides bradykinesia, speech, voice, and psychomotor slowing, tremor (rest, postural), turning during gait, and balance were also assessed. The rest and postural tremor active test features corresponded most strongly to the respective MDS-UPDRS concepts of tremor, as demonstrated by the highest correlation overall with any MDS-UPDRS item and subscale scores. This is consistent with similar DHT reports5,25,50. The novel U-turn test (which instructed individuals, if safe to do so, to walk several paces and make a U-turn at least five times) and the identification of turning while walking throughout the day in passive monitoring sensor data were motivated by findings that turning is particularly impaired in PD5,51,52. For example, a 360 degree walking turn and instrumented timed-up-and-go test showed strong reliability and discriminated controls from PD participants53,54. Similarly, sensor-based measures of turn speed in daily life differentiated PD individuals from controls55. In the present study, turn speed measured in both the active test and passive setting correlated with MDS-UPDRS 3.14. body bradykinesia item scores, but was not specifically related to MDS-UPDRS PIGD relative to other subscores. While neither measure of turn speed differentiated between less and more affected individuals on MDS-UPDRS body bradykinesia scores of 0 versus 1, both differentiated between individuals in Hoehn and Yahr Stage I versus II. Although participants were not instructed to ‘turn as fast as possible’, to ensure safe conduct of the active test, the U-turn test showed numerically higher correlations with body bradykinesia compared with passive turning speed, in line with a similar profile of performance (active testing) versus capacity (passive monitoring) scores previously demonstrated for gait speed56.
In the balance active test, the jerk sensor feature correlated with the MDS-UPDRS 3.12. postural stability item score, similar to previous reports5,57, and differentiated individuals with MDS-UPDRS item 3.12 scores of 0 versus 1, but failed to differentiate individuals in Hoehn and Yahr Stage I versus II. We speculate that this negative finding may reflect the low levels of gait and postural instability impairments in the present cohort (mean PIGD = 1).
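Jerk is the time derivative of acceleration. As a rough illustration of how a jerk-based balance feature could be derived from accelerometer data, the sketch below estimates jerk by finite differences and summarizes it as a root-mean-square value. The finite-difference estimate, the RMS summary, the function name, and the sampling rate are all assumptions; the study's actual jerk feature definition is not given in this section.

```python
import math


def rms_jerk(accel, fs_hz):
    """Root-mean-square jerk magnitude from tri-axial acceleration samples.

    `accel` is a list of (x, y, z) tuples sampled at `fs_hz`; jerk is
    approximated as the difference between consecutive samples times fs_hz.
    """
    if len(accel) < 2:
        raise ValueError("need at least two samples")
    if fs_hz <= 0:
        raise ValueError("fs_hz must be positive")
    squared = []
    for (x0, y0, z0), (x1, y1, z1) in zip(accel, accel[1:]):
        jx = (x1 - x0) * fs_hz
        jy = (y1 - y0) * fs_hz
        jz = (z1 - z0) * fs_hz
        squared.append(jx * jx + jy * jy + jz * jz)
    return math.sqrt(sum(squared) / len(squared))


# Example with made-up samples: acceleration rising linearly along one axis.
ramp = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ramp_jerk = rms_jerk(ramp, fs_hz=10.0)
```

Under this sketch, a person standing with smooth, slowly varying sway produces low jerk, whereas abrupt corrective movements produce high jerk.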
DHT composite scores
A composite summary score of individual features across diverse assessments is expected to provide a more robust measure of global PD severity and progression, especially given the heterogeneous nature of PD. Several DHT solutions besides the Roche PD Mobile Application v2 administer different motor active tests, and some additionally collect passive monitoring data5,17,18. Supplementary Table 2 provides a high-level comparison of these DHT solutions. All solutions contain active tests for tremor and tapping, but vary with respect to the inclusion of other upper limb, postural stability/gait, cognition, and voice/speech tests, and whether passively monitored motor data are collected. The power of combining different features across the tests in these DHTs has been shown via machine learning models that predict MDS-UPDRS total scores (Roche PD Mobile Application v1)58, yield a new score based on differentiation of ON and OFF L-dopa states59, or distinguish between healthy controls, individuals with idiopathic Rapid Eye Movement sleep behavior disorder, and individuals with PD16,60. A machine learning approach was also used to combine different HopkinsPD baseline sensor features to predict clinically significant events (e.g. falls, functional impairment) at the 18-month follow-up61. In contrast to data-driven approaches to composite score development, a clinical outcomes assessment approach could be applied whereby information from individuals with PD informs the selection of sensor features such that they optimally reflect what matters most to patients62.
Limitations
Several facets of the present study limit the generalizability of the findings. First, all individuals’ disease duration was < 2 years, and individuals were in Hoehn and Yahr Stages I or II. Thus, the applicability of the present findings to later-stage or prodromal PD is unknown. The reduced range of disease severities also appeared to limit the ranges of some DHT and clinical measures, which consequently limited the possibility to detect relationships between the two (Supplementary Fig. 1). Also, further research is necessary to better understand the suitability of this remote monitoring approach for later-stage patients with more severe cognitive or visual impairments. Second, since Roche PD Mobile Application v2 data are not yet available from neurologically normal individuals, sensor feature cut-off values differentiating normal from impaired motor behavior could not yet be calculated. It should also be noted that comparisons between DHT measures and clinical measures such as the MDS-UPDRS can be affected by limitations in the clinical measures; if an active test is not adequately reflected by a clinical measure, the ability to detect meaningful correlations is reduced. Finally, only two continuous 2-week periods of DHT data were analyzed; thus, the long-term adherence to the remote monitoring procedure and the ability of sensor features to detect changes over time remain to be established. To this end, it is critical to quantify and report test–retest reliabilities of sensor feature scores in order to assess a sensor feature’s potential to detect changes over time63 and any deviation from normal progression as a function of e.g. pharmacological interventions.
The Roche PD Mobile Application v2 was designed to measure the severity of early PD core motor signs and to provide information complementary to established clinical outcome measures. This remote monitoring approach enables high-frequency (i.e. daily) assessments with low average daily burden. The frequent measurement coupled with the high sensitivity of smartphone/smartwatch sensors may increase signal-to-noise of digital outcome measures for clinical research and provide novel insights into patients’ functioning in daily life.