Exploring Language Markers of Mental Health in Psychiatric Stories

Research from Leiden University and Utrecht University finds that Receptiviti’s LIWC-based science outperforms BERT and other neural-network-based approaches in predicting from language whether a person has a mental disorder.

In the Receptiviti LIWC Research Series, we highlight important research conducted using our platform and science that has implications for our customers' and partners' businesses and for society at large:

Diagnosing mental disorders is complex due to the genetic, environmental and psychological contributors and the individual risk factors. Language markers for mental disorders can help to diagnose a person. Research thus far on language markers and the associated mental disorders has been done mainly with the Linguistic Inquiry and Word Count (LIWC) program. In order to improve on this research, we employed a range of Natural Language Processing (NLP) techniques using LIWC, spaCy, fastText and RobBERT to analyse Dutch psychiatric interview transcriptions with both rule-based and vector-based approaches. Our primary objective was to predict whether a patient had been diagnosed with a mental disorder, and if so, the specific mental disorder type. Furthermore, the second goal of this research was to find out which words are language markers for which mental disorder. LIWC in combination with the random forest classification algorithm performed best in predicting whether a person had a mental disorder or not (accuracy: 0.952; Cohen’s kappa: 0.889). SpaCy in combination with random forest predicted best which particular mental disorder a patient had been diagnosed with (accuracy: 0.429; Cohen’s kappa: 0.304).
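
To make the two reported metrics concrete, here is a minimal Python sketch that computes accuracy and Cohen's kappa for a binary disorder/no-disorder prediction. The labels are hypothetical toy data, not data from the study, and the helper names `accuracy` and `cohen_kappa` are illustrative only. Kappa corrects accuracy for the agreement expected by chance, so it is never higher than accuracy.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    # Chance agreement: probability that both label sources pick the same
    # class if they were independent, summed over all classes.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # degenerate case: only one class present in both lists
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels (assumption: NOT taken from the study).
y_true = ["disorder"] * 6 + ["none"] * 4
y_pred = ["disorder"] * 5 + ["none"] + ["none"] * 3 + ["disorder"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy: {acc:.3f}; Cohen's kappa: {kappa:.3f}")
# prints "accuracy: 0.800; Cohen's kappa: 0.583"
```

In this toy case 8 of 10 predictions are correct, but 52% agreement would be expected by chance alone, which is why kappa (0.583) is noticeably lower than accuracy (0.800). The same relationship explains the gap between the reported values of 0.952 and 0.889.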


Spruit, Marco; Verkleij, Stephanie; Schepper, Kees; Scheepers, Floortje (2022). Exploring Language Markers of Mental Health in Psychiatric Stories. Applied Sciences, 12, 2179. doi:10.3390/app12042179