Electronic health records (EHRs) are powerful tools for studying rare diseases. We developed and validated EHR algorithms to identify births to mothers with systemic lupus erythematosus (SLE births) across three centers. At each center, subjects were selected if they had ≥ 1 SLE ICD‐9 or ICD‐10‐CM code and ≥ 1 ICD‐9 or ICD‐10‐CM delivery code. We tested algorithms that combined SLE ICD‐9 or ICD‐10‐CM codes, antimalarial use, a positive antinuclear antibody (ANA) titer ≥ 1:160, and whether dsDNA or complements had ever been checked, using both rule‐based and machine learning methods. We also assessed the impact of subject race, case definition, and coding provider on algorithm performance.
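To make the rule‐based approach concrete, the following is a minimal sketch of how such an algorithm might be expressed. It is not the study's implementation: the record fields, the default threshold of three SLE codes, and the choice to accept any one clinical criterion are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeliveryRecord:
    """Hypothetical per-subject summary; all field names are illustrative."""
    sle_code_count: int                # number of SLE ICD-9 / ICD-10-CM codes
    antimalarial_use: bool             # ever prescribed an antimalarial
    ana_titer: Optional[int]           # reciprocal titer (160 means 1:160); None if never measured
    dsdna_or_complement_checked: bool  # dsDNA or complements ever checked


def has_clinical_evidence(rec: DeliveryRecord) -> bool:
    """True if any clinical criterion is met (combining with OR is an assumption)."""
    ana_positive = rec.ana_titer is not None and rec.ana_titer >= 160
    return rec.antimalarial_use or ana_positive or rec.dsdna_or_complement_checked


def flag_sle_birth(rec: DeliveryRecord,
                   min_sle_codes: int = 3,
                   require_clinical: bool = True) -> bool:
    """Flag a delivery as a probable SLE birth.

    min_sle_codes and require_clinical are tunable, assumed parameters;
    varying them yields a family of candidate algorithms to validate.
    """
    if rec.sle_code_count < min_sle_codes:
        return False
    return has_clinical_evidence(rec) if require_clinical else True
```

Varying `min_sle_codes` and `require_clinical` produces a set of candidate algorithms, each of which would then be compared against chart review.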
Algorithms performed similarly across all three centers. Increasing the required number of SLE codes, adding clinical data, and requiring an SLE code entered by a rheumatologist each increased the likelihood of identifying true SLE patients. All algorithms had higher positive predictive values (PPVs) in African American than in Caucasian SLE births. In the machine learning models, the total number of SLE codes and an SLE code from a rheumatologist were the most important variables for predicting SLE case status.
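The PPV comparison above can be sketched as follows, assuming each subject has an algorithm flag and a chart‐review confirmation. PPV is the number of true positives divided by all algorithm‐flagged subjects. The function names and the toy rows (with neutral group labels "A" and "B") are hypothetical and are not study data.

```python
from collections import defaultdict


def ppv(pairs):
    """PPV = TP / (TP + FP) over (flagged, confirmed) boolean pairs.

    Returns None when the algorithm flags no one, since PPV is undefined.
    """
    tp = fp = 0
    for flagged, confirmed in pairs:
        if flagged:
            if confirmed:
                tp += 1
            else:
                fp += 1
    if tp + fp == 0:
        return None
    return tp / (tp + fp)


def ppv_by_group(rows):
    """Compute PPV separately for each group in (group, flagged, confirmed) rows."""
    grouped = defaultdict(list)
    for group, flagged, confirmed in rows:
        grouped[group].append((flagged, confirmed))
    return {group: ppv(pairs) for group, pairs in grouped.items()}


# Made-up rows for illustration only: (group, flagged by algorithm, confirmed by chart review)
toy_rows = [
    ("A", True, True),
    ("A", True, True),
    ("A", True, False),
    ("A", False, True),   # missed case: affects sensitivity, not PPV
    ("B", True, True),
    ("B", True, False),
]
result = ppv_by_group(toy_rows)  # group A: 2/3, group B: 1/2
```

Unflagged subjects do not enter the PPV calculation, so a stricter algorithm can raise PPV while missing more true cases.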
In conclusion, we developed and validated algorithms that use multiple data types to identify SLE births in the EHR. The algorithms performed better in African American mothers than in Caucasian mothers.