This study seeks to (1) demonstrate how machine learning (ML) can be used for prediction modeling by predicting the primary treatment that patients with T1-2, N0-N1 oropharyngeal squamous cell carcinoma (OPSCC) receive and (2) assess the impact that patient, socioeconomic, regional, and institutional factors have on the treatment of this population.
A retrospective cohort of adults diagnosed with T1-2, N0-N1 OPSCC from 2004 to 2013 was obtained from the National Cancer Database. The data were split 80/20 into training and testing sets, respectively. Various ML algorithms were explored during model development. Area under the curve (AUC), accuracy, precision, and recall were calculated for the final model.
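The split and the four evaluation metrics named above can be illustrated with a minimal sketch. It is not the study's pipeline: the function names, the coding of surgery as 1 and radiation as 0, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and precision and recall are computed here for the positive class only, which may differ from the averaging used in the study.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test); 80/20 by default."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy_precision_recall(y_true, y_pred):
    """Binary metrics, treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. Quadratic in sample size, so this is only
    suitable for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): 1 = surgery, 0 = radiation.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy_precision_recall(y_true, y_pred))  # (0.75, 0.75, 0.75)
print(roc_auc(y_true, scores))                    # 0.9375
```

Note that AUC is computed from the model's continuous scores, whereas accuracy, precision, and recall depend on the chosen threshold.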
Among the 19,111 patients in the study, the mean (standard deviation) age was 61.3 (10.8) years, 14,034 (73%) were male, and 17,292 (91%) were white. Surgery was the primary treatment in 9,533 (50%) cases and radiation in 9,578 (50%) cases. To predict the primary treatment modality, the model relied heavily on T-stage, primary site, N-stage, grade, and type of treatment facility. The final model yielded an AUC of 78% (95% CI, 77-79%), accuracy of 71%, precision of 72%, and recall of 71%.
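The abstract does not state how the 95% confidence interval for the AUC was derived. One common approach is a percentile bootstrap over the test set, sketched below under that assumption. The function names, the number of resamples, and the toy data are hypothetical and are not taken from the study.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        # Resample cases with replacement, keeping labels and scores paired.
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # skip resamples that contain only one class
        aucs.append(roc_auc(labels, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy example (hypothetical data): 1 = surgery, 0 = radiation.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
lower, upper = bootstrap_auc_ci(y_true, scores, n_boot=200)
print(f"AUC {roc_auc(y_true, scores):.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

With a test set of several thousand patients, as an 80/20 split of this cohort would give, such an interval is expected to be narrow, which is consistent with the reported 77-79% range.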
This study created an ML model utilizing clinical variables to predict primary treatment modality for T1-2, N0-N1 OPSCC. This study demonstrates how ML can be used for prediction modeling while also highlighting that tumor- and facility-related variables impact the decision-making process on a national level.

© 2021 S. Karger AG, Basel.