PloS one 2018 03 2313(3) e0194916 doi 10.1371/journal.pone.0194916
Spreadsheet software is increasingly used to implement systems science models informing health policy decisions, both in academia and in practice where technical capacity may be limited. However, spreadsheet models are prone to unintentional errors that may not always be identified using standard error-checking techniques. Our objective was to illustrate, through a methodologic case study analysis, the impact of unintentional errors on model projections by implementing parallel model versions.
We leveraged a real-world need to revise an existing spreadsheet model designed to inform HIV policy. We developed three parallel versions of a previously validated spreadsheet-based model; versions differed by the spreadsheet cell-referencing approach (named single cells; column/row references; named matrices). For each version, we implemented three model revisions (re-entry into care; guideline-concordant treatment initiation; immediate treatment initiation). After standard error-checking, we identified unintentional errors by comparing model output across the three versions. Concordant model output across all versions was considered error-free. We calculated the impact of unintentional errors as the percentage difference in model projections between model versions with and without unintentional errors, using +/-5% difference to define a material error.
We identified 58 original and 4,331 propagated unintentional errors across all model versions and revisions. Over 40% (24/58) of original unintentional errors occurred in the column/row reference model version; most (23/24) were due to incorrect cell references. Overall, >20% of model spreadsheet cells had material unintentional errors. When examining error impact along the HIV care continuum, the percentage difference between versions with and without unintentional errors ranged from +3% to +16% (named single cells), +26% to +76% (column/row reference), and 0% (named matrices).
Standard error-checking techniques may not identify all errors in spreadsheet-based models. Comparing parallel model versions can aid in identifying unintentional errors and promoting reliable model projections, particularly when resources are limited.