Untargeted plasma metabolomic profiling combined with machine learning (ML) may reveal metabolic profiles that help researchers understand the etiology of pediatric chronic kidney disease (CKD). For a study, researchers sought to identify metabolomic markers in pediatric CKD according to the underlying diagnosis: focal segmental glomerulosclerosis (FSGS), obstructive uropathy (OU), aplasia/dysplasia/hypoplasia (A/D/H), and reflux nephropathy (RN). Untargeted metabolomic quantification (GC-MS/LC-MS, Metabolon) was performed on plasma from 702 participants in the Chronic Kidney Disease in Children study (FSGS=63, OU=122, A/D/H=109, and RN=86). Features were selected using Lasso regression adjusted for clinical factors. Feature importance was then ranked with four approaches: logistic regression, support vector machine, random forest, and extreme gradient boosting. Models were trained on 80% of the cohort subgroups and validated on the held-out 20%. A feature was considered important if it was significant in at least two of the four modeling techniques. In addition, pathway enrichment analysis was used to identify metabolic subpathways associated with the cause of CKD.
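The "at least two of four models" selection rule can be illustrated with a minimal sketch. This is not the study's code: the function name `consensus_features`, the `min_models` parameter, and the metabolite names are hypothetical placeholders, and the per-model lists stand in for whatever significance criterion each model used.

```python
from collections import Counter

def consensus_features(important_by_model, min_models=2):
    """Return features flagged as important by at least `min_models` models."""
    counts = Counter()
    for features in important_by_model.values():
        # Use a set so a feature is counted at most once per model.
        counts.update(set(features))
    return sorted(name for name, n in counts.items() if n >= min_models)

# Hypothetical per-model lists of important features (placeholder names,
# not results from the study).
important_by_model = {
    "logistic_regression": ["metabolite_a", "metabolite_b"],
    "support_vector_machine": ["metabolite_b", "metabolite_c"],
    "random_forest": ["metabolite_a", "metabolite_d"],
    "extreme_gradient_boosting": ["metabolite_e"],
}

selected = consensus_features(important_by_model, min_models=2)
print(selected)  # ['metabolite_a', 'metabolite_b']
```

Requiring agreement between models with different inductive biases is one way to reduce the chance that a feature is retained only because of a quirk of a single algorithm.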

On the holdout subsets, ML models were assessed using the area under the receiver operating characteristic curve, the area under the precision-recall curve, the F1 score, and the Matthews correlation coefficient. The ML models outperformed no-skill prediction. Cause-specific metabolomic patterns were identified. The sphingomyelin-ceramide axis was associated with FSGS, as were individual plasmalogen metabolites and the plasmalogen subpathway. OU was associated with histidine metabolites derived from the gut microbiota.
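For readers unfamiliar with these metrics, the following sketch computes the F1 score and the Matthews correlation coefficient from a confusion matrix, together with the usual no-skill reference values (0.5 for the area under the receiver operating characteristic curve, and the positive-class prevalence for the area under the precision-recall curve). The labels and predictions are made-up illustrations, and the helper names are hypothetical; none of this reproduces the study's data or code.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def matthews_corrcoef(tp, tn, fp, fn):
    """Correlation between predicted and true labels; 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def no_skill_baselines(y_true):
    """Reference values expected from a classifier with no discriminative ability."""
    prevalence = sum(y_true) / len(y_true)
    return {"roc_auc": 0.5, "pr_auc": prevalence, "mcc": 0.0}

# Made-up holdout labels and predictions (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
baselines = no_skill_baselines(y_true)
print(f"F1={f1:.3f}  MCC={mcc:.3f}  no-skill PR AUC={baselines['pr_auc']:.2f}")
```

Because the diagnostic subgroups are imbalanced, the precision-recall baseline depends on prevalence, which is why it is reported alongside the fixed 0.5 baseline for the receiver operating characteristic curve.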

ML models revealed metabolomic markers specific to the etiology of CKD. Using ML approaches in combination with classical biostatistics, the investigators demonstrated that FSGS is associated with sphingomyelin-ceramide and plasmalogen dysmetabolism, whereas OU is associated with gut microbiome-derived histidine metabolites.