Primary nephrotic syndromes are rare diseases, and their rarity can make it difficult to reach adequate sample sizes for observational patient-oriented research and clinical trial enrollment. A computable phenotype may be a powerful tool for identifying patients with these diseases for research across multiple institutions.
A comprehensive algorithm of inclusion and exclusion ICD-9 and ICD-10 codes to identify patients with primary nephrotic syndrome was developed. The algorithm was executed against the PCORnet Common Data Model (CDM) at three institutions for the period from January 1, 2009 to January 1, 2018. At each institution, a random selection of 50 cases and 50 noncases (individuals not meeting case criteria who were seen within the same calendar year and were within 5 years of age of a case) was reviewed by a nephrologist, for a total of 150 cases and 150 noncases reviewed. The classification accuracy (sensitivity, specificity, positive and negative predictive value, F1 score) of the computable phenotype was determined.
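The inclusion and exclusion logic described above can be illustrated with a minimal sketch. The code prefixes below are placeholders chosen for illustration only (ICD-9 581 and ICD-10 N04 for nephrotic syndrome, ICD-9 710.0 and ICD-10 M32 for systemic lupus erythematosus as one example of a secondary cause); they are not the study's code list, which is not given here. The function names and the `(dx_type, code)` input shape are likewise assumptions, loosely modeled on a diagnosis table that stores a code alongside its coding system.

```python
# Hypothetical code lists, keyed by coding system ("09" = ICD-9, "10" = ICD-10).
# These prefixes are illustrative placeholders, NOT the study's actual algorithm.
INCLUSION = {"09": ("581",), "10": ("N04",)}
EXCLUSION = {"09": ("710.0",), "10": ("M32",)}


def matches(dx_type, code, table):
    """Return True if the code starts with any prefix listed for its coding system."""
    prefixes = table.get(dx_type, ())
    return code.strip().upper().startswith(prefixes)


def is_case(diagnoses):
    """Classify one patient from an iterable of (dx_type, code) pairs.

    A patient is flagged as a case when at least one diagnosis matches an
    inclusion prefix and no diagnosis matches an exclusion prefix.
    """
    diagnoses = list(diagnoses)
    included = any(matches(t, c, INCLUSION) for t, c in diagnoses)
    excluded = any(matches(t, c, EXCLUSION) for t, c in diagnoses)
    return included and not excluded
```

Keying the prefixes by coding system avoids accidental matches between ICD-9 and ICD-10 codes; a real implementation would likely also need date windows and encounter-level criteria, which this sketch omits.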
The algorithm identified a total of 2708 patients with nephrotic syndrome from 4,305,092 distinct patients in the CDM at all sites from 2009 to 2018. For all sites, the sensitivity, specificity, and area under the curve of the algorithm were 99% (95% CI, 97% to 99%), 79% (95% CI, 74% to 85%), and 0.9 (0.84 to 0.97), respectively. The most common causes of false positive classification were secondary FSGS (nine out of 39) and lupus nephritis (nine out of 39).
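Sensitivity, specificity, predictive values, and F1 score all derive from the 2x2 table of algorithm classification against chart review. The sketch below shows the standard definitions; the function name is hypothetical, and the counts in the usage lines are made-up illustrative values, not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard accuracy measures from 2x2 confusion-matrix counts.

    tp/fn: reviewer-confirmed disease, flagged / not flagged by the algorithm.
    fp/tn: reviewer-confirmed non-disease, flagged / not flagged by the algorithm.
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Illustrative counts only (not taken from the study).
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that predictive values computed from a sample with a fixed ratio of cases to noncases depend on that ratio and do not directly reflect the values expected in the full source population.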
This computable phenotype showed good classification accuracy in identifying both children and adults with primary nephrotic syndrome using only ICD-9 and ICD-10 codes, which are available across institutions in the United States. This may facilitate future screening and enrollment for research studies and enable comparative effectiveness research. Further refinements to the algorithm, such as the use of laboratory data or the addition of natural language processing, may help better distinguish primary from secondary causes of nephrotic syndrome.

Copyright © 2021 by the American Society of Nephrology.