Classification of headache disorders depends on patients' subjective self-reports and their interpretation by physicians. We aimed to apply objective, data-driven machine learning approaches to patient-reported symptoms and to test the feasibility of automated classification of headache disorders. The self-report data of 2162 patients were analyzed. Headache disorders were merged into five major entities. The patients were divided into training (n = 1286) and test (n = 876) cohorts. We trained a stacked classifier model with four layers of XGBoost classifiers. The first layer distinguished migraine from other headaches, the second distinguished tension-type headache (TTH) from the remainder, the third distinguished trigeminal autonomic cephalalgia (TAC) from the remainder, and the fourth distinguished epicranial headache from thunderclap headache. Each layer selected its own features from the self-reports using the least absolute shrinkage and selection operator (LASSO). In the test cohort, our stacked classifier achieved an accuracy of 81%, sensitivities of 88%, 69%, 65%, 53%, and 51%, and specificities of 95%, 55%, 46%, 48%, and 51% for migraine, TTH, TAC, epicranial headache, and thunderclap headache, respectively. We showed that a machine learning-based approach is applicable to the analysis of patient-reported questionnaires. Our results could serve as a baseline for future studies in headache research.
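
The four-layer routing and the per-class metrics described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: each layer in the study was an XGBoost classifier with LASSO-selected features, whereas the sketch uses stand-in threshold rules. The names `cascade_predict`, `one_vs_rest_metrics`, and `score_layer1` to `score_layer4`, and all toy values, are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Dict[str, float]
# A layer pairs the label assigned on a positive decision with a binary rule.
Layer = Tuple[str, Callable[[Features], bool]]


def threshold_rule(key: str, cutoff: float = 0.5) -> Callable[[Features], bool]:
    """Stand-in for a trained binary classifier (hypothetical, for illustration)."""
    return lambda x: x.get(key, 0.0) > cutoff


# Layer order follows the abstract: migraine, then TTH, then TAC,
# then epicranial versus thunderclap (thunderclap is the final fallback).
LAYERS: List[Layer] = [
    ("migraine", threshold_rule("score_layer1")),
    ("TTH", threshold_rule("score_layer2")),
    ("TAC", threshold_rule("score_layer3")),
    ("epicranial", threshold_rule("score_layer4")),
]


def cascade_predict(layers: Sequence[Layer], fallback: str, x: Features) -> str:
    """Return the label of the first layer whose decision is positive."""
    for label, is_positive in layers:
        if is_positive(x):
            return label
    return fallback


def one_vs_rest_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], label: str
) -> Tuple[float, float]:
    """Sensitivity and specificity for one class against all others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy inputs only; these are not study data.
toy_patients = [
    {"score_layer1": 0.9},
    {"score_layer2": 0.8},
    {"score_layer3": 0.7},
    {"score_layer4": 0.6},
    {},
]
toy_predictions = [cascade_predict(LAYERS, "thunderclap", p) for p in toy_patients]
```

Because each layer only sees cases that earlier layers rejected, errors made early propagate downward, which is one plausible reading of why the reported sensitivities fall from migraine to thunderclap headache.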
