Autism spectrum disorder (ASD), a group of complex neurodevelopmental disorders, has been reported to have a high overall prevalence, with an unprecedented rise since 2000. Because the pathomechanism of ASD remains unclear, it is challenging to diagnose individuals with ASD on the basis of clinical observations alone. Without the additional support of biochemical markers, this diagnostic difficulty can affect therapeutic decisions and, in turn, delay treatment. Recently, accumulating evidence has shown that both genetic abnormalities and chemical toxicants play important roles in the onset of ASD. In this work, a new multilabel classification (MLC) model is proposed to identify autistic risk genes and toxic chemicals in a large-scale data set. We first construct feature matrices and partially labeled networks for autistic risk genes and toxic chemicals from multiple heterogeneous biological databases. Evaluated with both global and local metrics, the simulation experiments demonstrate that the proposed model achieves superior classification performance compared with other state-of-the-art MLC methods. Through manual validation against existing studies, 60% and 50% of the top-20 predicted risk genes are confirmed to be associated with ASD and autistic disorder, respectively. To the best of our knowledge, this is the first computational tool to identify ASD-related risk genes and toxic chemicals, which could lead to better therapeutic decisions for ASD.
