Parkinson’s disease (PD) is a neurodegenerative disease principally manifested by motor disabilities, such as postural instability, bradykinesia, tremor, and stiffness. In clinical practice, several diagnostic rating scales allow coarse measurement, characterization, and classification of disease progression. These scales, however, capture only pronounced changes in kinematic patterns, and the classification remains subjective, depending on the expertise of physicians. In addition, even for experts, disease analysis based on independent classical motor patterns lacks sufficient sensitivity to establish disease progression. Consequently, the disease diagnosis, stage, and progression could be affected by misinterpretations that lead to incorrect or inefficient treatment plans. This work introduces a multimodal non-invasive strategy based on video descriptors that integrates patterns from gait and eye-fixation modalities to assist PD quantification and to support the diagnosis and follow-up of the patient. The multimodal representation is achieved through a compact covariance descriptor that characterizes postural and temporal changes of both information sources to improve disease classification.
A multimodal approach is introduced as a computational method to capture movement abnormalities associated with PD. Two modalities (gait and eye fixation) are recorded in markerless video sequences. Each modality sequence is then represented, at each frame, by primitive features composed of (1) kinematic measures extracted from a dense optical flow, and (2) deep features extracted from a convolutional network. The spatial distributions of these features are compactly coded in covariance matrices, making it possible to map each particular dynamic onto a Riemannian manifold. The temporal mean covariance is then computed and submitted to a supervised Random Forest algorithm to obtain a disease prediction for a particular patient. The fusion of the gait and eye-fixation covariance descriptors, integrating deep and kinematic features, is evaluated to assess its contribution to disease quantification and prediction. In particular, in this study, gait quantification is associated with the typical patterns observed by the specialist, while ocular fixation, associated with early disease characterization, complements the analysis.
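The per-frame covariance coding and its temporal averaging on the manifold of symmetric positive-definite (SPD) matrices can be sketched as below. This is a minimal illustration, not the authors' implementation: the log-Euclidean mean is assumed here as one common choice of Riemannian mean, the regularization term and all function names are hypothetical, and the per-frame feature matrices stand in for the optical-flow and deep features described above.

```python
import numpy as np

def frame_covariance(F):
    """Covariance descriptor of one frame's features F (n_samples x d).

    Each row of F is one spatial sample (e.g. a pixel or region) described
    by d primitive features; the d x d covariance encodes their spatial
    distribution in that frame.
    """
    Fc = F - F.mean(axis=0)
    C = Fc.T @ Fc / (len(F) - 1)
    # Small diagonal regularization keeps the matrix strictly SPD
    # (an assumption for numerical stability, not from the paper).
    return C + 1e-6 * np.eye(F.shape[1])

def spd_log(C):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(np.log(w)) @ V.T

def spd_exp(S):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.exp(w)) @ V.T

def temporal_mean_covariance(frames):
    """Log-Euclidean mean of the per-frame covariance descriptors.

    Averaging in the log domain respects the SPD manifold geometry,
    unlike a plain arithmetic mean of the covariances.
    """
    logs = [spd_log(frame_covariance(F)) for F in frames]
    return spd_exp(np.mean(logs, axis=0))

# Synthetic example: 10 frames, 50 spatial samples, 4 features per sample.
rng = np.random.default_rng(0)
frames = [rng.standard_normal((50, 4)) for _ in range(10)]
M = temporal_mean_covariance(frames)  # 4 x 4 SPD descriptor per sequence
```

The resulting mean descriptor (or its upper-triangular vectorization) would then serve as the fixed-length input to a classifier such as a Random Forest, with descriptors from the two modalities concatenated (early fusion) or their predictions combined (late fusion).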
In a study conducted with 13 control subjects and 13 PD patients, the fusion of gait and ocular fixation, integrating deep and kinematic features, achieved an average accuracy of 100% for both early and late fusion. The classification probabilities show high confidence in the predicted diagnosis: the control subjects' probabilities are lower than 0.27 with early fusion and 0.30 with late fusion, while those of the PD patients are higher than 0.62 with early fusion and 0.51 with late fusion. Furthermore, higher probability outputs are correlated with more advanced stages of the disease, according to the H&Y scale.
A novel approach for fusing motion modalities captured in markerless video sequences was introduced. This multimodal integration achieved remarkable discrimination performance in a study conducted with PD patients and control subjects. The representation based on compact covariance descriptors of kinematic and deep features suggests that the proposed strategy is a potential tool to support diagnosis and subsequent monitoring of the disease. During fusion, it was observed that devoting greater attention to eye-fixation patterns may contribute to a better quantification of the disease, especially at stage 2.

Copyright © 2021 Elsevier B.V. All rights reserved.