Cigarette smoking is a risk factor for the development of head and neck squamous cell carcinoma (HNSCC), partly due to tobacco-induced large-scale chromosomal copy-number alterations (CNAs). Identifying CNAs caused by smoking is essential for determining how gene expression from such regions affects tumor progression and patient outcome. We used The Cancer Genome Atlas (TCGA) whole genome sequencing data for HNSCC to directly identify amplified or deleted genes that correlate with smoking pack-years, based on linear modeling. Internal cross-validation identified 35 CNAs that correlated significantly with patient smoking, independent of human papillomavirus (HPV) status. The most abundant CNAs were chromosome 11q13.3-q14.4 amplification and 9p23.1/9p24.1 deletion. Evaluation of patient amplicons revealed four different patterns of 11q13 gene amplification in HNSCC resulting from breakage-fusion-bridge (BFB) events. Predictive modeling identified 16 genes from these regions that are associated with poorer overall and disease-free survival as pack-years increase, constituting a smoking-associated expression signature (SAES). Patients with altered expression of signature genes have an increased risk of death and greater cervical lymph node involvement. The identified SAES can be used as a novel predictor of increased disease aggressiveness and poor outcome in smoking-associated HNSCC.
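
The linear modeling step mentioned above can be illustrated with a minimal sketch. The abstract does not specify the model form, covariates, or software, so everything here is an assumption: a single-gene ordinary least squares fit of a copy-number value against pack-years, with a hypothetical helper `ols_fit` and made-up toy values that are not TCGA data. Adjustment for HPV status and the cross-validation described above are not shown.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    Hypothetical helper for illustration only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Toy, invented values: pack-years per patient and a copy-number
# log2 ratio for one gene (assumed representation, not from the study).
pack_years = [0, 10, 20, 40, 60]
log2_ratio = [0.0, 0.1, 0.2, 0.4, 0.6]

slope, intercept, r = ols_fit(pack_years, log2_ratio)
print(f"slope={slope:.4f} intercept={intercept:.4f} r={r:.4f}")
```

In a real analysis such a fit would be repeated per gene or region, with multiple-testing correction and covariates chosen by the analyst; those choices are outside what the abstract states.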
