Chronic Kidney Disease (CKD) is a condition characterized by a progressive loss of kidney function over time, and it can be caused by many diseases. The most effective weapons against CKD are early diagnosis and treatment, which in most cases can only postpone the onset of complete kidney failure. CKD is graded according to the estimated Glomerular Filtration Rate (eGFR), and this grading helps to stratify patients for risk, follow-up and management planning. This study aims to predict how soon a CKD patient will need dialysis, thus allowing personalized care and strategic planning of treatment.
To accurately predict the time frame within which a CKD patient will need dialysis, we develop a computational model based on a supervised machine learning approach. Several techniques, for both the information extraction and the model training phases, are compared to determine which approaches are most effective. The models compared are trained on data extracted from the Electronic Medical Records of the Vimercate Hospital.
As the final model, we propose a set of Extremely Randomized Trees classifiers that consider 27 features, including creatinine level, urea, red blood cell count, eGFR trend (which is not even the most important feature), age and associated comorbidities. In predicting whether complete renal failure will occur within the next year rather than later, the model obtains a test accuracy of 94%, a specificity of 91% and a sensitivity of 96%. More and shorter time-frame intervals, down to a granularity of 6 months, can be specified without substantially worsening the model performance.
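For readers less familiar with the three reported metrics, the following minimal sketch shows how accuracy, sensitivity and specificity are derived from binary predictions. It is an illustration only and not the study's evaluation code: the function name `binary_metrics`, the `BinaryMetrics` container and the toy labels are assumptions, and the positive class (1) is assumed to mean "complete renal failure within the next year".

```python
from dataclasses import dataclass
from math import nan


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # true positive rate: TP / (TP + FN)
    specificity: float  # true negative rate: TN / (TN + FP)


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Assumption: label 1 means renal failure within the next year,
    label 0 means later. Returns NaN for a metric whose denominator is zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return BinaryMetrics(
        accuracy=(tp + tn) / total if total else nan,
        sensitivity=tp / (tp + fn) if (tp + fn) else nan,
        specificity=tn / (tn + fp) if (tn + fp) else nan,
    )


# Hypothetical toy labels, NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this setting, sensitivity measures how many patients who do reach renal failure within the year are flagged by the model, while specificity measures how many patients who do not are correctly left unflagged.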
The developed computational model provides nephrologists with valuable support in predicting the patient’s clinical pathway. The model’s promising results, coupled with the knowledge and experience of the clinicians, can lead to better personalized care and strategic planning of both patients’ needs and hospital resources.

Copyright © 2021. Published by Elsevier B.V.