Sequence characteristics are commonly used to explain the host adaptation, metabolism, genetic diversity, drug resistance, and infectivity of Mycobacterium tuberculosis, so exploring the codon usage pattern of its coding sequences is of great significance. In the present study, two hundred randomly selected complete genomes of Mycobacterium tuberculosis were downloaded from the National Center for Biotechnology Information database. Key codon usage parameters, namely the codon bias index, the effective number of codons, the relative synonymous codon usage, and the base composition, were counted or calculated for twenty-one specific functional genes. The differences in relative synonymous codon usage values among these functional genes, together with the sum of the standard deviations of the codon usage parameters, were used to evaluate the degree of divergence of the genes concerned. The results show that, among the genes concerned, 1) all genes are GC-rich sequences, and the codon usage frequencies for each amino acid are significantly biased; 2) genes with a high effective number of codons, such as the coding sequences of the mycobacterial membrane protein large family, usually show higher divergence; and 3) genes with lower divergence, such as ag85A and sigH, are usually highly conserved and are often used as drug target genes. These findings provide new insight into the evolution of Mycobacterium tuberculosis and into genetic engineering measures for preventing and controlling tuberculosis.
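As an illustration of two of the parameters named above, the following minimal sketch computes the base composition (GC fraction) and the relative synonymous codon usage (RSCU) of a single coding sequence. It assumes the conventional definition of RSCU, the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, and the standard codon-to-amino-acid assignment. The function names `gc_content` and `rscu` are hypothetical, and this is not the pipeline used in the study.

```python
from collections import Counter

# Standard codon-to-amino-acid assignment, built in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rscu(seq):
    """Return {codon: RSCU} for an in-frame coding sequence.

    Assumed definition: RSCU = observed count / (family total / family size).
    Stop codons and amino acids absent from the sequence are skipped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # RSCU is undefined when the amino acid never occurs
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values
```

Under this definition, an RSCU above 1 means a codon is used more often than expected under equal synonymous usage, and a value below 1 means it is used less often. In a GC-rich genome, codons ending in G or C would be expected to score above 1.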
Copyright © 2021. Published by Elsevier B.V.

Author