Machine learning algorithms key to finding

A study from researchers at University College London suggested that machine learning may open the door to big data findings from small datasets — findings that may guide clinical decisions in rare diseases for which data are limited.

George A. Robinson, PhD, of the Centre for Adolescent Rheumatology Versus Arthritis, University College London, and colleagues used machine learning to analyze blood samples from 67 patients with juvenile-onset systemic lupus erythematosus (SLE) to identify an “immune cell signature that stratified patients according to their disease trajectory.” They reported their findings in an open-access article in The Lancet Rheumatology.

Currently, there are no approved juvenile-onset SLE-specific medications, “due mainly to the paucity of clinical trial data in children and adolescents, meaning that patients with juvenile-onset SLE are treated similarly to patients with adult-onset SLE. However, despite treatment, severe juvenile-onset SLE leads to early organ damage and unsatisfactory outcomes (eg, renal and CNS manifestations) for many patients, emphasizing the need for improved understanding of the immunological defects driving disease pathogenesis and clinically relevant patient stratification strategies for personalized treatment,” they explained.

“Immunophenotyping profiles (28 immune cell subsets) of peripheral blood mononuclear cells from patients with juvenile-onset SLE and healthy controls were determined by flow cytometry,” they wrote. “We used balanced random forest (BRF) and sparse partial least squares-discriminant analysis (sPLS-DA) to assess classification and parameter selection, and validation was by ten-fold cross-validation. We used logistic regression to test the association between immune phenotypes and k-means clustering to determine patient stratification. Retrospective longitudinal clinical data, including disease activity and medication, were related to the immunological features identified.”
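For readers unfamiliar with ten-fold cross-validation, the idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `stratified_k_folds` is hypothetical, and keeping the class mix similar across folds (stratification) is an assumption, since the article does not say how the folds were drawn. Only the cohort sizes (67 patients, 39 controls) come from the study.

```python
import random

def stratified_k_folds(labels, k=10, seed=0):
    """Return k lists of sample indices, each with a similar class mix.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped,
        # so fold sizes stay within one sample of each other.
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds

# Cohort sizes reported in the study: 67 patients (1) and 39 controls (0).
labels = [1] * 67 + [0] * 39
folds = stratified_k_folds(labels, k=10)

for fold_number, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fit on train_idx and scored on test_idx here.
    print(fold_number, len(train_idx), len(test_idx))
```

Each sample is held out exactly once, so every performance estimate comes from data the model did not see during fitting. That is still internal validation on a single cohort, not a substitute for an external dataset.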

Between Sept. 5, 2012, and March 7, 2018, peripheral blood was collected from 67 patients with juvenile-onset SLE and 39 healthy controls. The median age was 19 years for patients and 18 years for controls.

The BRF model identified patients and controls with 99.9% accuracy.
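The "balanced" in balanced random forest refers to how each tree's training sample is drawn: the same number of cases is sampled, with replacement, from every class, so the larger group cannot dominate. The sketch below shows only that sampling step, under the assumption that each class is sampled down to the size of the smaller one. The helper name `balanced_bootstrap` is hypothetical and this is not the authors' implementation.

```python
import random

def balanced_bootstrap(labels, rng):
    """Draw a bootstrap sample with equal numbers of indices from each class.

    Hypothetical helper for illustration; one such sample would be drawn
    for every tree in the forest.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Size of the smallest class sets the per-class sample size.
    n_minority = min(len(indices) for indices in by_class.values())

    sample = []
    for indices in by_class.values():
        # Sampling with replacement, as in an ordinary bootstrap.
        sample.extend(rng.choices(indices, k=n_minority))
    return sample

# Cohort sizes reported in the study: 67 patients (1) and 39 controls (0).
labels = [1] * 67 + [0] * 39
rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)

counts = {c: sum(1 for i in sample if labels[i] == c) for c in (0, 1)}
print(counts)  # {0: 39, 1: 39}
```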

“The top-ranked immunological features from the BRF model were confirmed using sPLS-DA and logistic regression, and included total CD4, total CD8, CD8 effector memory, and CD8 naive T cells, Bm1, and unswitched memory B cells, total CD14 monocytes, and invariant natural killer T cells,” they wrote.

Guided by those markers, Robinson and colleagues sorted the patients into four groups: “CD8 T-cell subsets were important in driving patient stratification, whereas B-cell markers were similarly expressed across the cohort of patients with juvenile-onset SLE. Patients with juvenile-onset SLE and elevated CD8 effector memory T-cell frequencies had more persistently active disease over time, as assessed by the SLE disease activity index 2000, and this was associated with increased treatment with mycophenolate mofetil and an increased prevalence of lupus nephritis. Finally, network analysis confirmed the strong association between immune phenotype and differential clinical features.”
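The grouping step described above relies on k-means clustering, which repeatedly assigns each patient to the nearest group center and then moves each center to the mean of its members. The following sketch runs that loop with four clusters on synthetic two-feature data. The data points, the `kmeans` function, and the random initialization are all assumptions made for illustration; none of the values are from the study.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iterations=100, seed=0):
    """Plain k-means (Lloyd's algorithm) with random initial centers."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids

# Synthetic stand-in for two immune-cell frequencies per patient (not study data).
data_rng = random.Random(42)
blob_centers = [(10, 10), (10, 40), (40, 10), (40, 40)]
points = [
    (data_rng.gauss(cx, 3), data_rng.gauss(cy, 3))
    for cx, cy in blob_centers
    for _ in range(15)
]

assignment, centroids = kmeans(points, k=4)
sizes = [assignment.count(c) for c in range(4)]
print(sizes)
```

Because k-means depends on its starting centers, analyses of this kind typically repeat the procedure from several random starts and keep the most compact solution.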

In a comment published with the study, May Y. Choi, MD, and Christopher Ma, MD, of the Cumming School of Medicine, at the University of Calgary in Calgary, Alberta, Canada, lauded the work by Robinson et al, noting the “findings of this study not only contribute to our current understanding of juvenile-onset SLE, but they also outline a potential future machine-learning roadmap for untangling the complex immunological underpinnings of other rheumatic conditions. The data density in this analysis comes not from the cohort size, but rather from detailed analysis of peripheral blood mononuclear cells (PBMCs) by flow cytometry. Subsequently, machine-learning algorithms were applied, and this strategy might be particularly useful for studying other rheumatic conditions such as vasculitis, myositis, or systemic sclerosis, which are relatively rare and highly heterogeneous, making accrual of multiple, large clinical cohorts challenging.”

But Choi and Ma cautioned that the lack of an external validation dataset was a significant limitation that raises questions about the generalizability of the findings. Those questions are “magnified when a small training dataset is used or when baseline parameters have highly skewed distributions, one of the recognized limitations of methods such as random forest,” they wrote.

Robinson and colleagues acknowledged that they were limited by using a small cohort from a single center. They wrote that it is “important for this analysis to be validated in external cohorts and done on a cohort by cohort basis to account for ethnicity changes between regions or countries and to prospectively validate the association between immune cell subsets and disease progression.”

That said, the authors concluded that the “application of machine-learning approaches to immune phenotyping data has identified immunological biomarkers that could potentially help to unravel underlying disease mechanisms in juvenile-onset SLE and explain the differences in long-term outcomes of patients with this disease. Such immunological signatures could facilitate better stratification of patients for optimal treatment choices and provide information to improve interventional clinical trial design.”

  1. Be aware that this study is the first to use machine-learning algorithms to identify a juvenile-onset SLE phenotype.

  2. Note that, using machine-learning methods, the researchers identified a juvenile-onset SLE immune cell signature that stratified patients according to their disease trajectory, which may facilitate stratification of patients for optimal treatment choices.

Peggy Peck, Editor-in-Chief, BreakingMED™

The study was funded by Lupus UK, The Rosetrees Trust, Versus Arthritis, and UK National Institute for Health Research University College London Hospital Biomedical Research Centre.

Robinson had no disclosures.

Choi is supported by the Gary S Gilkeson Career Development Award from the Lupus Foundation of America.
