Language abnormalities are a core symptom of schizophrenia-spectrum disorders and could serve as a diagnostic marker. Natural language processing enables quantification of language connectedness, which may be lower in schizophrenia-spectrum disorders. Here, we investigated the connectedness of spontaneous speech in patients with a schizophrenia-spectrum disorder and in controls, and determined its accuracy in classification. Using a semi-structured interview, speech of 50 patients with a schizophrenia-spectrum disorder and 50 controls was recorded. Language connectedness was calculated in a semantic word2vec model as the similarity of consecutive words in moving windows of increasing size (2-20 words). The mean, minimum, and variance of similarity were calculated per window size and used in a random forest classifier to distinguish patients from healthy controls. Classification based on connectedness reached 85% cross-validated accuracy, with 84% specificity and 86% sensitivity. The features that best discriminated patients from controls were the variance of similarity at window sizes between 5 and 10 words. We show impaired connectedness in the spontaneous speech of patients with schizophrenia-spectrum disorders, even in patients with low ratings of positive symptoms. Effects were most prominent at the level of sentence connectedness. The high sensitivity, specificity, and tolerability of this method show that language analysis is an accurate and feasible digital assistant in diagnosing schizophrenia-spectrum disorders.
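The windowed connectedness features described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how similarity is aggregated within a window, so the sketch assumes that each window's score is the average cosine similarity of adjacent word vectors inside it, and that the mean, minimum, and (population) variance are then taken across all windows of a given size. The function names (`cosine`, `window_similarities`, `connectedness_features`) and the toy two-dimensional vectors are hypothetical; in practice the vectors would come from a trained word2vec model.

```python
import math
from statistics import mean, pvariance


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def window_similarities(vectors, window):
    """One score per moving window of `window` words.

    Assumption: a window's score is the mean cosine similarity of the
    adjacent word pairs it contains (window - 1 pairs).
    """
    adjacent = [cosine(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    pairs = window - 1
    return [mean(adjacent[s:s + pairs]) for s in range(len(adjacent) - pairs + 1)]


def connectedness_features(vectors, window_sizes=range(2, 21)):
    """Mean, minimum, and variance of window scores for each window size (2-20 words)."""
    features = {}
    for w in window_sizes:
        sims = window_similarities(vectors, w)
        if not sims:  # transcript shorter than the window: skip this size
            continue
        features[f"mean_w{w}"] = mean(sims)
        features[f"min_w{w}"] = min(sims)
        features[f"var_w{w}"] = pvariance(sims)
    return features


# Toy stand-ins for word2vec embeddings of a four-word utterance (hypothetical data).
toy_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
toy_features = connectedness_features(toy_vectors)
```

For a transcript of at least 20 words, this yields 3 features for each of the 19 window sizes, that is, 57 features per speaker, which could then be passed to a random forest classifier.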
Copyright © 2021 The Author(s). Published by Elsevier B.V. All rights reserved.