Next-generation sequencing (NGS) technologies have been applied to bacterial flora analysis. However, there is no standardized protocol, and the optimal clustering threshold for estimating bacterial species in respiratory infection specimens is unknown. This study investigated the optimal threshold for clustering 16S ribosomal RNA gene sequences into operational taxonomic units (OTUs) by comparing the results of NGS technology with those of the Sanger method, which has higher per-read sequence accuracy than NGS technology. The study included 45 patients with pneumonia with aspiration risks and 35 patients with lung abscess. When NGS reads were clustered at 100%, 99%, and 97% homology, the predominant phylotype agreed with that identified by the Sanger method in 78.8%, 71.3%, and 65.0% of specimens, respectively. For specimens dominated by the Streptococcus mitis group, which contains several important causative agents of pneumonia, Bray-Curtis dissimilarity to the Sanger results was significantly lower for OTUs obtained at the 100% clustering threshold than for those obtained at the 99% and 97% thresholds (medians of 0.35, 0.69, and 0.71, respectively; p < 0.05), indicating that the 100% threshold most closely reproduced the Sanger results. Clustering at 100% sequence identity is necessary when analyzing the microbiota of respiratory infections using NGS technology.
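For readers unfamiliar with the Bray-Curtis dissimilarity used above, the following is a minimal sketch of how it can be computed between two taxon profiles of the same specimen. It is not the authors' pipeline: the function name `bray_curtis`, the dictionary layout, and the counts are illustrative assumptions, and the abstract does not state whether raw counts or relative abundances were used. A value of 0 means identical profiles and 1 means no shared taxa.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two taxon-count profiles.

    Each profile is a dict mapping a taxon name to a non-negative count.
    Computed as sum(|a_i - b_i|) / sum(a_i + b_i) over all taxa.
    """
    taxa = set(counts_a) | set(counts_b)
    diff = sum(abs(counts_a.get(t, 0) - counts_b.get(t, 0)) for t in taxa)
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        raise ValueError("both profiles are empty")
    return diff / total


# Hypothetical counts for one specimen; NOT data from the study.
sanger = {"S. mitis group": 60, "Prevotella": 30, "Fusobacterium": 10}
ngs = {"S. mitis group": 40, "Prevotella": 30, "Veillonella": 30}

print(bray_curtis(sanger, ngs))
```

In a comparison like the one described, such a value would be computed per specimen for each clustering threshold against the Sanger profile, and the per-threshold distributions then compared.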
© 2021. The Author(s).
