In recent years, breath analysis has been used as a tool for lung cancer detection, and many gas sensors have been developed for this purpose. Although these sensors are fabricated from advanced materials, their medical application remains limited by inadequate detection performance. Here, we hypothesized that combining diverse types of sensors could improve detection performance. We fabricated an e-nose based on 10 gas sensors of 4 types and tested it directly, without gas pre-concentration, on breath samples from 153 healthy participants and 115 lung cancer patients. We also compared five feature extraction algorithms. The extracted features were then fed to two optimized clustering algorithms and three supervised classification strategies, and the performance of each was evaluated. “Breath-prints” were successfully obtained for all subjects. The combined features extracted by LDA and FastICA formed the best feature space. Within this feature space, both clustering algorithms grouped all “breath-prints” into exactly two clusters, with an Adjusted Rand Index greater than 0.95. Among the three supervised classification strategies, random forest with 3-fold cross-validation performed best, with a mean classification accuracy of 86.42% and an AUC of 0.87, somewhat better than many recently reported sensor arrays. We conclude that sensor diversity may help improve the performance of the e-nose, although the extent of its contribution still requires evaluation.
Copyright © 2020. Published by Elsevier Ltd.
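
The Adjusted Rand Index reported in the abstract measures how closely two partitions of the same samples agree, corrected for chance: 1 indicates identical groupings and values near 0 indicate agreement no better than random. The following is a minimal sketch of the standard pair-counting formula, assuming the clustering output is compared against the known healthy/patient labels; the function name and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same samples.

    Illustrative helper (hypothetical name), standard pair-counting form.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same samples")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: partitions agree trivially

    # Contingency counts n_ij and the marginals a_i (rows), b_j (columns).
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of sample pairs placed together in each cell / row / column.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # upper bound on agreement
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings form a single group
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Toy data only: 0 = healthy, 1 = patient; cluster ids are arbitrary.
    diagnosis = [0, 0, 0, 0, 1, 1, 1, 1]
    clusters = [1, 1, 1, 1, 0, 0, 0, 1]
    print(adjusted_rand_index(diagnosis, clusters))
```

Because the index depends only on which samples are grouped together, it is unaffected by how the clusters are numbered, which makes it suitable for scoring unsupervised output against diagnostic labels.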