Computer mouse use captures ataxia and parkinsonism, enabling accurate measurement and detection

Abstract
Background: Objective assessments of movement impairment are needed to support clinical trials and facilitate diagnosis. The objective of the current study was to determine if a rapid web‐based computer mouse test (Hevelius) could detect and accurately measure ataxia and parkinsonism. Methods: Ninety‐five ataxia, 46 parkinsonism, and 29 control participants and 229,017 online participants completed Hevelius. We trained machine‐learning models on age‐normalized Hevelius features to (1) measure severity and disease progression and (2) distinguish phenotypes from controls and from each other. Results: Regression model estimates correlated strongly with clinical scores (from r = 0.66 for UPDRS dominant arm total to r = 0.83 for the Brief Ataxia Rating Scale). A disease change model identified ataxia progression with high sensitivity. Classification models distinguished ataxia or parkinsonism from healthy controls with high sensitivity (≥0.91) and specificity (≥0.90). Conclusions: Hevelius produces a granular and accurate motor assessment in a few minutes of mouse use and may be useful as an outcome measure and screening tool.
© 2019 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society.


Supporting Data
Additional Supporting Information may be found in the online version of this article at the publisher's website.

Drug development efforts are underway for patients suffering from neurodegenerative diseases, including cerebellar ataxias, Parkinson's disease (PD), and Parkinson-plus syndromes. Key challenges for clinical trials include the ability to accurately diagnose early disease 1-4 and confidently measure disease change. These challenges arise in part because current assessments of neurodegenerative diseases are subjective, exhibit intra- and interrater differences, 5 and are poorly accessible because they have to be performed in a clinical setting by a movement disorders specialist.
Such challenges are amplified in children in whom norms for movement evolve rapidly with age. Furthermore, disease-tailored clinical scoring scales are limited in their ability to measure nonprototypical phenotypes, for example, in ataxia patients with bradykinesia. Because of the complex, heterogeneous, and overlapping phenotypes in neurodegenerative diseases, it would be advantageous to complement existing assessment methods with a readily available tool that could characterize movement across a number of phenotypes.
We have developed a rapid, computer mouse-based tool called Hevelius that quantifies arm function by extracting 32 features from continuous, target-driven computer mouse trajectories (see Supplementary Methods for task and analysis details). Here, we demonstrate the effectiveness of Hevelius (1) to accurately measure disease severity and (2) to distinguish patients with ataxia or parkinsonism from controls and from each other.
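To make the idea of extracting movement features from a mouse trajectory concrete, the following is a minimal sketch in Python. It is not the Hevelius feature set, whose 32 definitions are given in the Supplementary Methods. The function name `kinematic_features`, the sample format (time in seconds, position in pixels), and the finite-difference estimates are all assumptions made for illustration.

```python
import math


def kinematic_features(samples):
    """Compute a few illustrative features from one mouse movement.

    samples: list of (t_seconds, x_pixels, y_pixels) with strictly
    increasing timestamps. Hypothetical format, not the Hevelius format.
    """
    if len(samples) < 3:
        raise ValueError("need at least 3 samples")

    speeds = []      # mean speed over each inter-sample interval
    mid_times = []   # midpoint time of each interval
    path_length = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        path_length += step
        speeds.append(step / (t1 - t0))
        mid_times.append((t0 + t1) / 2)

    # Tangential acceleration: rate of change of speed between intervals.
    accels = [
        abs(v1 - v0) / (m1 - m0)
        for v0, v1, m0, m1 in zip(speeds, speeds[1:], mid_times, mid_times[1:])
    ]

    # Straight-line distance from the first to the last sample.
    straight = math.hypot(
        samples[-1][1] - samples[0][1], samples[-1][2] - samples[0][2]
    )
    return {
        "movement_time": samples[-1][0] - samples[0][0],
        "path_length": path_length,
        "straightness": straight / path_length if path_length else 1.0,
        "peak_speed": max(speeds),
        "peak_acceleration": max(accels),
    }
```

Because the position is sampled directly, speed and acceleration are obtained here by differencing. In practice, raw mouse samples are noisy and unevenly spaced, so a real pipeline would likely resample and smooth before differentiating.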

Participant Demographics
Data from 229,017 online participants were used to develop the normative data set. Participants self-reported coming from 158 countries, with the largest group coming from the United States (43.8%).
One hundred and eighty-nine patients were assessed using Hevelius in the clinic setting: 95 with cerebellar ataxia, 46 with parkinsonism, and 29 controls (see Table 1). Eighteen individuals with a progressive ataxia diagnosis (12 with spinocerebellar ataxia [SCA], 4 with ataxia-telangiectasia [A-T], and 2 with multiple system atrophy, cerebellar-type [MSA-C]) completed the task at an additional time point. For mixed movement disorders such as MSA, we relied on the treating neurologist's assessment to group the individual into ataxia versus parkinsonism. The dominant arm was equally or more affected than the nondominant arm in 82 of 141 individuals with ataxia or parkinsonism. Individuals with neurologic disease (median, 3.1 minutes) took longer than healthy controls (median, 1.9 minutes) to complete the task (F(1,185) = 19.99, P < 0.0001).

Summary Statistics for Online Participants
Supplementary Figure S3 (top) shows how 4 representative measures collected by Hevelius varied across the life span in the cross-sectional sample collected online. As expected, basic aspects of performance, such as overall efficiency (measured by movement time) or the ability to control movement speed (measured by normalized jerk), peaked in the late teens, that is, at the age of biological maturity. Ability to produce force (measured by peak acceleration) peaked later in life. 6 Finally, measures of error in gross motor performance (e.g., movement errors) generally declined throughout adulthood, consistent with prior findings. 7 Taken together, the clear relationships between age and performance in our online data, and their consistency with existing knowledge, provide compelling evidence of the validity of these baseline data.
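Because performance varies with age, a raw feature value is only interpretable relative to peers of similar age. One simple way to express this is a z-score against normative participants within an age window, sketched below in Python. This is an illustrative assumption and not necessarily the age-normalization procedure used for Hevelius (see Supplementary Methods). The function name `age_normalized_z` and the `window` parameter are hypothetical.

```python
import statistics


def age_normalized_z(value, age, norms, window=2.0):
    """Z-score of a feature value relative to similarly aged normative peers.

    norms: list of (age_years, feature_value) pairs from a normative sample.
    window: half-width in years of the age band treated as "peers"
            (hypothetical choice).
    """
    peers = [v for a, v in norms if abs(a - age) <= window]
    if len(peers) < 2:
        raise ValueError("not enough normative participants near this age")
    mean = statistics.fmean(peers)
    sd = statistics.stdev(peers)  # sample standard deviation
    if sd == 0:
        raise ValueError("normative peers show no variability")
    return (value - mean) / sd
```

With such a scheme, the same raw movement time can map to a positive z-score (slower than peers) for a young adult and a negative one for an older adult, which is the property that lets a single model be applied across age groups.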

Summary Statistics for Clinical Participants
Participants with ataxia and parkinsonism differed from age-matched online controls across a number of Hevelius movement features. In particular, features related to duration (movement time, execution time, number and duration of pauses, and click duration) were increased, and those related to movement control (distance from target at end of main submovement, noise-to-force ratio, and jerk) were impaired compared with online controls in both ataxia and parkinsonism (see Supplementary Table S2).
Participants with ataxia demonstrated additional impairments in features reflecting "dysmetria": direction changes, target reentries, movement error and variability, and deviation from task axis. Similarly, participants with parkinsonism, but not those with ataxia, showed decreased peak acceleration and peak speed, matching the phenotype of "bradykinesia." These observations are illustrated in Supplementary Figure S3.

Clinical Score Estimation
Table 2 shows the performance of regression models trained to predict clinical severity scores. For both ataxia and parkinsonism, we separately predicted dominant arm scores and total scores. We also introduced a disease-independent "common score": disease-specific dominant arm and total scores were normalized by the maximum score to obtain a value between 0 and 1. The estimates produced by the regression models correlated strongly with actual clinical scores. The correlation coefficient ranged from r = 0.66 for UPDRS dominant arm total to r = 0.83 for Brief Ataxia Rating Scale (BARS) total and common total score. The mean absolute error (MAE) for all models was <10% of the maximum score. The MAE ± standard deviation (SD) for Hevelius in estimating the BARS dominant arm score was 0.35 ± 0.30, comparable to the previously published MAE of 0.38 for expert clinicians asked to rate video recordings of the finger-nose-finger task. 8 Although Hevelius measures dominant arm performance, it is equally effective for predicting dominant arm score and total score. This is not surprising given that in our data set dominant arm score and total score were highly correlated (BARS, r = 0.89, P < 0.0001; UPDRS, r = 0.82, P < 0.0001; common score, r = 0.85, P < 0.0001).
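The quantities reported in this section (the common score, the MAE, and the correlation coefficient) can be written out as a short Python sketch. The function names are hypothetical, and the maximum score is passed in as a parameter rather than hard-coded, because the scale maxima are not stated in this text.

```python
import math


def common_score(score, max_score):
    """Normalize a disease-specific score to the range 0..1."""
    if not 0 <= score <= max_score:
        raise ValueError("score must lie between 0 and max_score")
    return score / max_score


def mean_absolute_error(actual, predicted):
    """Average absolute difference between clinical and estimated scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Dividing the MAE by the scale maximum gives the error as a fraction of the maximum score, which is the form in which the "<10%" criterion is stated.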
The results of the bootstrap analysis indicated high within-session reliability of the severity score estimates (Table 2).
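A percentile bootstrap over the trials of a single session is one way to obtain such within-session intervals. The Python sketch below illustrates the general technique under stated assumptions; it is not the authors' exact procedure. Here `estimate` is a hypothetical stand-in for "recompute the features from the resampled trials and apply the regression model", and `n_boot`, `alpha`, and `seed` are illustrative parameters.

```python
import random


def bootstrap_ci(trials, estimate, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a per-session estimate.

    trials: per-trial measurements from one assessment session.
    estimate: function mapping a list of trials to a single number.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    stats = []
    for _ in range(n_boot):
        # Resample trials with replacement, keeping the session length.
        resample = [rng.choice(trials) for _ in trials]
        stats.append(estimate(resample))
    stats.sort()
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[min(n_boot - 1, int(n_boot * (1 - alpha / 2)))]
    return lo, hi
```

A narrow interval indicates that the estimate changes little when a different subset of the session's trials is used, which is the sense of "within-session reliability" here.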

Classification Analyses
Classification models trained on data produced by Hevelius distinguished between individual disease classes (ataxia or parkinsonism) and healthy controls with high sensitivity (≥0.91) and specificity (≥0.90); see Table 2. As expected, different features were most informative for different phenotypes (see Supplementary Table S4). A model discriminating ataxia and parkinsonism patients also demonstrated good performance (sensitivity, 0.85; specificity, 0.91).
A model trained to discriminate between healthy controls and early-stage ataxia patients (BARS score of 0 in the dominant arm) yielded a sensitivity of 0.75 and specificity of 0.97.
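For reference, sensitivity and specificity follow directly from the confusion counts, as in the Python sketch below (function names are hypothetical). As a consistency check under one assumption, namely that all 29 controls entered the early-stage analysis (the denominator is not stated here), a single false positive gives a specificity of 28/29 ≈ 0.97.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; 1 = disease, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical check: 29 controls with exactly 1 false positive.
_, spec_check = sensitivity_specificity(tp=1, fn=0, tn=28, fp=1)
```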

Clinical Progression Estimation
A binary classification model trained to learn which session in a pair of sessions was more severe was applied to 18 individuals with a progressive ataxia diagnosis and a repeat session (12 with SCA, 4 with A-T, and 2 with MSA-C). The mean interval duration between sessions was 325 days with a range of 126-469 days. In these 18 individuals, the dominant arm BARS score increased (indicating disease progression) in 8 of 18, was unchanged in 9 of 18, and decreased (indicating improvement) in 1 of 18 (an individual with SCA-6). The classification model predicted that 17 of 18 individuals had increased dominant arm severity at the time of their second session. One of 18 was predicted by the model to have decreased severity on the second session (the same individual with SCA-6 who also showed improvement on BARS). These results suggest that Hevelius can sensitively capture progression in arm severity.
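The pairwise formulation can be illustrated with a small Python sketch: each training example is the difference between two sessions' feature vectors, labeled by which session is more severe, and a linear model without an intercept is fit by gradient descent on the logistic loss. This is an assumption-laden illustration, not the authors' model. How pairs were actually constructed, which features were used, and which learning algorithm was applied are not specified in this text, and all function names are hypothetical.

```python
import math
from itertools import permutations


def make_pairs(sessions):
    """Build (feature_difference, label) examples from scored sessions.

    sessions: list of (feature_vector, severity). The label is 1 when the
    second session of the pair is more severe and 0 when it is less severe.
    Pairs with equal severity carry no ordering information and are skipped.
    """
    pairs = []
    for (fa, sa), (fb, sb) in permutations(sessions, 2):
        if sa == sb:
            continue
        diff = [b - a for a, b in zip(fa, fb)]
        pairs.append((diff, 1 if sb > sa else 0))
    return pairs


def train_pairwise(pairs, lr=0.1, epochs=500):
    """Fit weights w so that sigmoid(w . diff) predicts the pair label."""
    w = [0.0] * len(pairs[0][0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for diff, label in pairs:
            z = sum(wi * di for wi, di in zip(w, diff))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, d in enumerate(diff):
                grad[i] += (p - label) * d  # gradient of the logistic loss
        w = [wi - lr * g / len(pairs) for wi, g in zip(w, grad)]
    return w


def second_is_more_severe(w, first, second):
    """Predict whether the second session is more severe than the first."""
    return sum(wi * (b - a) for wi, a, b in zip(w, first, second)) > 0
```

Omitting the intercept makes the prediction antisymmetric: swapping the two sessions flips the sign of the score, so the model cannot call both orderings "more severe".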

Discussion
Hevelius is a novel tool for performing objective, granular, and rapid assessments of dominant arm motor function. We have demonstrated that the tool can be used in children and adults and forms an interpretable and multidimensional representation of ataxia and parkinsonism.
We have shown that the 32 movement features computed from computer mouse trajectories are interpretable, capture several dimensions of motor control, and vary with development and aging (Supplementary Fig. S3). Regression models used these features to accurately estimate disease scores in individuals with ataxia or parkinsonism (Table 2), and another machine-learning model detected severity progression in 17 individuals with ataxia. Accuracy in estimating dominant arm score in ataxia participants was comparable to the accuracy of clinical experts. Furthermore, the tool was shown to have high intrasession reliability. Thus, Hevelius produces granular, accurate, reliable, and age-normalized assessments of arm function in ataxia and parkinsonism and may prove useful in related disorders affecting motor control.
An ideal screening tool for detecting early disease would not only coarsely discriminate disease from healthy states, but would also have disease specificity. It was for this reason that we tested the ability of Hevelius to distinguish between ataxia and parkinsonism (which it performed accurately; Table 2). In addition, Hevelius was able to accurately distinguish healthy individuals from the subset of ataxia participants who had no scorable abnormalities in the dominant arm, with only 1 false positive (Table 2). Thus, this tool could form part of an early screening technology, especially if combined with tools in additional domains, such as eye movement and speech analysis.
Table 2 note: Severity scores were unavailable for a small number of patients (in addition, for some ataxia patients only dominant arm scores were available, but not total scores); hence, the number of participants included in the regression analyses differs from the number included in the classification analyses. Median 95% within-session confidence intervals shown in columns 3 and 4 were estimated using the bootstrap method and reflect the sensitivity of score predictions to natural variability in performance during a single assessment session. Features selected by each regression model are shown in Supplementary Table S4.
Many technologies have been developed in the last decade and a half to enable objective assessments of motor performance of individuals with neurologic diseases. Most rely on accelerometers 9,10; however, other useful scalable approaches have included spiral drawing on a tablet 11 and keyboard typing. 12 Our approach complements prior work in important ways. First, a computer with a mouse is a highly accessible technology, more so than specialized wearable devices and even more so than smartphones, especially for adults aged 65 and older. 13 Second, although accelerometers give access to acceleration, our approach directly measures the hand's position and speed. This turns out to be important: of the 8 features used to discriminate disease from controls, 4 relied on position and 2 on speed (see Supplementary Table S4).
Another key feature is that Hevelius is scalable: the task took patients 2-6 minutes to complete and requires only a computer, a mouse, and an Internet connection, without the need for special software. The simplicity of the task and the automated scoring mean that no special expertise is needed to use Hevelius. Accessibility, along with a design that engaged intrinsic motivation (curiosity 14 and social comparison 15), facilitated the collection of data from 500,000 online volunteers in 4 months. This raises the possibility that Hevelius could be used in the future to collect longitudinal assessments from thousands of individuals with neurodegenerative disease in their home setting.
There are several limitations to the current study. First, the normative data were collected from a self-selected sample of online volunteers. It is possible that people who have the means and the time to access the Internet for personal reasons have better than average access to health care and, consequently, are healthier than average. Second, the largely cross-sectional design does not enable an assessment of learning effects over shorter time scales or of influences due to changes in the testing environment. Last, there were substantial age differences among the populations studied (ataxia, parkinsonism, controls). Despite age adjustment enabled by the normative data set, it is conceivable that not all age-related factors were fully removed, resulting in inflated performance estimates of classification models.