Patients with advanced non-small cell lung cancer (NSCLC) are a heterogeneous group with limited life expectancy. For a study, researchers sought to develop methods to better identify patients likely to survive more than 90 days.

They examined 83 features in 106 treatment-naive patients with stage IV NSCLC and an Eastern Cooperative Oncology Group Performance Status (ECOG-PS) >1. Model selection and hyperparameter optimization were both performed with automated machine learning. For a second (“lite”) model, dimensionality reduction was carried out using 100-fold bootstrapping. Performance was assessed in an out-of-sample validation cohort using the C-statistic and accuracy. The “lite” model was then validated in a second independent, prospective cohort (N = 42). Network analysis (NA) was used to assess differences in feature centrality and connectedness.
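
As a minimal sketch of the two performance measures named above, the Python below computes a C-statistic (the share of survivor and non-survivor pairs that the model ranks correctly, equivalent to the area under the ROC curve for a binary outcome) and accuracy at a fixed cutoff. The function names, the 0.5 cutoff, and the eight example patients are assumptions made for illustration; none of them come from the study.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Share of (survivor, non-survivor) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg)
    )
    return concordant / (len(pos) * len(neg))


def accuracy(y_true, y_score, threshold=0.5):
    """Share of patients whose thresholded prediction matches the outcome."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(y_true, y_score))
    return hits / len(y_true)


# Made-up illustration data (not study data):
# 1 = survived more than 90 days, 0 = did not.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]

print(c_statistic(y_true, y_score))  # 14 of 16 pairs concordant -> 0.875
print(accuracy(y_true, y_score))     # 5 of 8 patients correct -> 0.625
```

The C-statistic does not depend on a cutoff, whereas accuracy does, which is one reason the two measures can diverge for the same model.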

The ExtraTrees Classifier was the selected algorithm, with a C-statistic of 0.82 (P<0.01) and an accuracy of 0.81 (P=0.01). With 16 variables, the “lite” model produced C-statistics of 0.84 (P<0.01) and 0.75 (P=0.039) for the first cohort and C-statistics of 0.706 (P<0.01) and 0.714 (P<0.01) for the second cohort. The feature networks of patients with poorer survival were more densely interconnected. In NA, prestige scores for features related to cachexia, inflammation, and quality of life differed significantly.

Machine learning can be helpful for assessing the prognosis of advanced NSCLC. Even with fewer features, the model remained highly accessible and had good metrics.