Although higher parathyroid hormone (PTH) levels were linked to increased mortality risk, there was little evidence on when PTH levels were expected to be elevated and should therefore be measured in the general population. Using demographic, lifestyle, and biochemical data from US adults, researchers sought to develop a machine learning-based prediction model for elevated PTH levels.

Adults aged 20 years or older whose serum intact PTH levels were measured as part of the National Health and Nutrition Examination Survey (NHANES) from 2003 to 2006 were included in this population-based study. Six machine learning prediction models (logistic regression with and without splines, lasso regression, random forest, gradient-boosting machines [GBMs], and SuperLearner) were trained on the NHANES 2003–2004 cohort (n = 4,096). The researchers then assessed the models’ performance in the NHANES 2005–2006 cohort (n = 4,112) using the area under the receiver operating characteristic curve (AUC).
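To make the evaluation metric concrete, the sketch below computes an AUC directly from its definition: the probability that a randomly chosen person with elevated PTH receives a higher predicted score than a randomly chosen person without it. It is only an illustration, not the study's code; the function name `auc` and the toy labels and scores are hypothetical and are not NHANES data.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation-cohort labels (1 = elevated PTH) and model scores.
y_val = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.70, 0.10, 0.15]
validation_auc = auc(y_val, scores)
print(f"validation AUC = {validation_auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for cohorts as large as those in the study.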

Of the 8,208 US adults, 753 (9.2%) had PTH levels higher than 74 pg/mL. Among the six algorithms tested, random forest (AUC [95% CI] = 0.79 [0.76-0.81]), GBM (AUC [95% CI] = 0.78 [0.75-0.81]), and SuperLearner (AUC [95% CI] = 0.79 [0.76-0.81]) had the highest AUCs. When cubic splines were used to model the estimated glomerular filtration rate (eGFR) in the logistic regression models, the AUC increased from 0.69 to 0.77. Calibration was best for the logistic regression model with splines (calibration slope [95% CI] = 0.96 [0.86-1.06]), while the other methods were less well calibrated. Of all the factors considered, eGFR was the most important predictor in both the random forest model and the GBM.
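The calibration slope reported above can be illustrated with a short sketch. One common way to obtain it, assumed here and not confirmed as the study's exact procedure, is to regress the observed outcome on the logit of the predicted probability with a logistic model; a slope near 1 suggests well-calibrated predictions, while a slope below 1 suggests predictions that are too extreme. The function name `calibration_slope` and the toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def calibration_slope(
    y: Sequence[int], p: Sequence[float], iters: int = 50, tol: float = 1e-10
) -> Tuple[float, float]:
    """Fit y ~ a + b * logit(p) by Newton-Raphson; returns (intercept a, slope b)."""
    x = [logit(pi) for pi in p]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            r = yi - mu
            g0 += r            # gradient w.r.t. intercept
            g1 += r * xi       # gradient w.r.t. slope
            h00 += w           # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Hypothetical outcomes: 1 of 5 events in the low-risk group, 4 of 5 in the high-risk group.
y_obs = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
p_good = [0.2] * 5 + [0.8] * 5      # predictions match the observed frequencies
p_extreme = [0.1] * 5 + [0.9] * 5   # predictions are too extreme

_, slope_good = calibration_slope(y_obs, p_good)
_, slope_extreme = calibration_slope(y_obs, p_extreme)
print(f"slope (well calibrated) = {slope_good:.3f}")
print(f"slope (too extreme)     = {slope_extreme:.3f}")
```

In this toy example the first set of predictions yields a slope of about 1, and the overconfident set yields a slope of about 0.63, which is the pattern a reader should have in mind when interpreting the reported value of 0.96.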

The researchers developed a prediction model using nationally representative US data, which may help detect elevated PTH accurately and quickly in routine clinical practice. Future research is needed to determine whether this prediction tool for elevated PTH can improve adverse health outcomes.