Chronic myelogenous leukemia (CML) is a clonal stem cell disorder accounting for 15% of adult leukemias. We aimed to determine if machine learning models could predict CML using blood cell counts prior to diagnosis.
We identified patients with a diagnostic test for CML (BCR-ABL1) and at least 6 consecutive prior years of differential blood cell counts between 1999 and 2020 in the largest integrated health care system in the United States. Blood cell counts from different time periods prior to CML diagnostic testing were used to train, validate, and test machine learning models.
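As a rough illustration of grouping blood cell counts by time period before the diagnostic test, the sketch below assigns a blood draw to a window by how long before the BCR-ABL1 test it was taken. The window labels follow the periods reported in the results. The numeric cutoffs (including what counts as "at diagnosis"), the `WINDOWS` table, and the `assign_window` helper are assumptions made for illustration, not the study's definitions.

```python
from datetime import date

# Hypothetical window bounds, in years before the BCR-ABL1 test.
# Only the labels come from the abstract; the cutoffs are assumptions.
WINDOWS = [
    ("at diagnosis", 0.0, 0.1),
    ("0.5 to 1 years", 0.5, 1.0),
    ("2 to 5 years", 2.0, 5.0),
]


def assign_window(draw_date, test_date):
    """Return the label of the window containing a blood draw, or None.

    Draws taken after the test, or falling between windows, return None.
    """
    years_before = (test_date - draw_date).days / 365.25
    for label, lower, upper in WINDOWS:
        if lower <= years_before <= upper:
            return label
    return None


if __name__ == "__main__":
    test_day = date(2020, 1, 1)
    for draw_day in (date(2017, 1, 1), date(2019, 4, 1), date(2020, 1, 1)):
        print(draw_day, "->", assign_window(draw_day, test_day))
```

Counts grouped this way could then feed a separate model per window, which is one plausible reading of the design described above.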
The sample included 1,623 patients, with a BCR-ABL1 positivity rate of 6.2%. The predictive ability of machine learning models improved when trained with blood cell counts closer to the time of diagnosis: the area under the curve (AUC) was 0.59 to 0.67 at 2 to 5 years, 0.75 to 0.80 at 0.5 to 1 years, and 0.87 to 0.92 at diagnosis.
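For readers less familiar with the AUC, it can be read as the probability that a randomly chosen BCR-ABL1-positive patient receives a higher model score than a randomly chosen negative patient, with ties counted as one half. The sketch below computes it directly from that definition. The `auc` function and the example scores are illustrative assumptions, not the study's code or data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive test result, 0 for a negative one.
    scores: model outputs, where higher means more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up scores for four hypothetical patients.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the ranges reported above.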
Blood cell counts collected up to 5 years prior to diagnostic workup of CML successfully predicted the BCR-ABL1 test result. These findings suggest a machine learning model trained with blood cell counts could lead to diagnosis of CML earlier in the disease course compared to usual medical care.

© American Society for Clinical Pathology, 2021.
