FRIDAY, Nov. 11, 2016 (HealthDay News) — Modeling has identified the amount and type of data needed to detect prediagnostic heart failure, according to research published online Nov. 8 in Circulation: Cardiovascular Quality and Outcomes.
Kenney Ng, Ph.D., from the T.J. Watson Research Center in Cambridge, Mass., and colleagues used longitudinal electronic health records data to examine the performance of machine learning models trained to detect prediagnostic heart failure in primary care patients. The authors assessed model performance in relation to the prediction window length, observation window length, number of different data domains, number of patient records, and density of patient encounters. Data from 1,684 incident heart failure cases and 13,525 sex-, age category-, and clinic-matched controls were used.
The researchers observed that model performance improved as the prediction window length decreased, especially below two years, and as the observation window length increased, although gains plateaued beyond two years. Prediction also improved with larger training data sets, leveling off after 4,000 patients; with the use of more diverse data types (the combination of diagnosis, medication order, and hospitalization data was most important); and when data were confined to patients with 10 or more phone or face-to-face encounters in two years.
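The windowing scheme at the heart of these findings can be sketched in a few lines: an observation window of historical data ends some fixed time (the prediction window) before the heart failure index date, and only encounters inside that window feed the model. The function and field names below are illustrative, not taken from the study.

```python
from datetime import date, timedelta

def encounters_in_observation_window(encounters, index_date,
                                     observation_years=2, prediction_years=1):
    """Keep only encounters inside the observation window.

    The observation window ends `prediction_years` before the heart failure
    index date (the prediction window is the gap between the last usable
    data and the diagnosis) and extends back `observation_years` from there.
    """
    window_end = index_date - timedelta(days=int(365.25 * prediction_years))
    window_start = window_end - timedelta(days=int(365.25 * observation_years))
    return [e for e in encounters if window_start <= e["date"] < window_end]

# Illustrative patient record with a heart failure index date of 2015-06-01.
encounters = [
    {"date": date(2012, 1, 15), "type": "diagnosis"},        # before window
    {"date": date(2012, 9, 1), "type": "medication"},        # inside window
    {"date": date(2013, 11, 2), "type": "hospitalization"},  # inside window
    {"date": date(2015, 1, 20), "type": "diagnosis"},        # prediction gap
]
usable = encounters_in_observation_window(encounters, date(2015, 6, 1))
```

Under this scheme, the study's results correspond to shrinking the gap (`prediction_years`) and widening the history (`observation_years`): a shorter gap helps most under two years, and a longer history stops helping after about two years.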
“These empirical findings suggest possible guidelines for the minimum amount and type of data needed to train effective disease onset predictive models using longitudinal electronic health records,” the authors write.
Copyright © 2016 HealthDay. All rights reserved.