Mismatch between federal and state data early in the pandemic to blame

Federal data from the CDC’s National Healthcare Safety Network (NHSN) understated total Covid-19 cases and deaths in nursing homes early in 2020, and failing to account for the gap may lead to wrong conclusions about nursing home outbreaks, a cross-sectional study suggested.

“We estimate that 44.7% of Covid-19 cases and 40.0% of Covid-19 deaths occurring prior to May 24 [2020] were not reported in the first NHSN submission,” wrote Karen Shen, PhD, of the Harvard University Department of Economics, and co-authors in JAMA Network Open. “These unreported cases and deaths had a significant influence on our estimates of total cases and deaths attributable to Covid-19 in nursing homes, accounting for 11.6% of cases and 14.0% of deaths in the year-end totals.”

The findings suggested 68,613 cases and 16,623 deaths in 2020 were omitted.

The study used information from 20 state health departments to evaluate and supplement federal data on Covid-19 cases and deaths in nursing homes. Required federal Covid-19 case and death reporting for nursing homes began May 24, 2020—3 months after the reported outbreak in Kirkland, Washington. Reporting required weekly incident cases and deaths, but retrospectively reporting cases and deaths from earlier in the pandemic was optional.

“Accounting for this delay is important when comparing the toll of the pandemic across places,” Shen and colleagues noted. “Consistent with the fact that states in the Northeast were hit hardest in the early months of the pandemic but generally experienced lower case and death rates in later months, we found that unreported cases and deaths represented a significantly larger share of year-end totals in the Northeast than in the South and West, where most cases and deaths occurred later.”

“To date, both academic and policymakers’ analyses of facility level determinants of infections and mortality have likely been limited owing to the reliance on federal estimates,” they continued. “In particular, use of the unadjusted federal data may help explain why some reports found an association between lower-rated nursing homes and Covid-19 outbreaks (a conclusion that guided early enforcement actions against nursing homes), while others did not.”

The researchers used data on all U.S. nursing homes beginning in late May 2020, drawn from sources including the NHSN Covid-19 Nursing Home Data set (weekly facility-level data on new and cumulative cases and deaths; a facility's first report may or may not have included retrospective cases and deaths).

They compared the federal data with facility-level data from 20 state health departments that required cases and deaths to be reported from the beginning of the pandemic. Within each state, they also compared facilities that contributed data to the study with those that did not. The state samples included 15,415 facilities overall, with case data from 12 states and 4,599 facilities (case data were unavailable for 10,816 facilities) and death data from 19 states and 7,405 facilities (death data were unavailable for 8,010 facilities).

Raw federal NHSN data for the earliest reporting period implied death rates of 5.0 and 4.8 deaths per 100 beds in New York and California, respectively. After accounting for unreported deaths, the group instead estimated 8.0 and 5.5 deaths per 100 beds, respectively.

“We did not find differences in nonreporting by facility characteristics (i.e., region, ownership, chain affiliation, or star rating) as of May 24,” the researchers wrote. “This implies that facilities of all types omitted previous cases and deaths in the first NHSN submission. This may demonstrate a widespread inability of nursing homes to reliably collect data early in the pandemic or that pressures to report fewer cases and deaths were common to all facilities.”

“The first key takeaway from these findings is that future research using the NHSN data must account for this significant data limitation,” noted Elizabeth White, PhD, APRN, of Brown University in Providence, Rhode Island, in an accompanying editorial. “Cumulative measures of Covid-19 prevalence or mortality are likely to be flawed without correcting for underreporting, and may particularly bias estimates for nursing homes that had their most severe outbreaks during the initial U.S. wave of the pandemic,” White continued. “A large body of research has already developed examining various facility- and community-level factors associated with Covid-19 outcomes in nursing homes using the NHSN data. The study by Shen and colleagues provides important context for evaluating the rigor of these studies.”


“The second, more humbling conclusion from these findings is that the true toll of Covid-19 on nursing home residents may never be known,” she added, noting that the most accurate data sources for nursing home cases and deaths during the initial U.S. pandemic wave are state health departments. However, “only approximately half of U.S. states collected and publicly released nursing home Covid-19 data during Spring 2020, and these states varied widely in the amount and quality of information reported,” White observed.

Limitations of the study include the use of extrapolation from sample states to non-sample states. “Although facilities in sample states and non-sample states differed significantly on several important characteristics (e.g., region, ownership, size), we do not find that these characteristics were associated with the likelihood of non-reporting; thus, we believe our extrapolation is reasonable,” Shen and co-authors noted.

In addition, reporting requirements and the definitions of a case and a death differed from state to state. In New York, for example, deaths that occurred outside the facility after a patient was sent to the hospital were not counted.

“Our data, which we have made publicly available, also offer the ability to credibly study the associations of facility responses and state and federal policy in the early months of the pandemic with slowing the spread in nursing homes, which is not possible with the federal data owing to missing data,” the researchers wrote.

  1. Federal data from the CDC’s National Healthcare Safety Network (NHSN) understated total Covid-19 cases and deaths in nursing homes early in 2020, and not taking this into account may lead to wrong conclusions about nursing home outbreaks, a cross-sectional study suggested.

  2. The researchers estimated that 44.7% of Covid-19 cases and 40.0% of Covid-19 deaths occurring prior to May 24, 2020, were not reported in the first NHSN submission, suggesting 68,613 cases and 16,623 deaths were omitted.

Paul Smyth, MD, Contributing Writer, BreakingMED™

Shen reported no disclosures.

White reported receiving grants from the National Institute on Aging.
