In Brief: This study shows that analyzing text from online crowdfunded loan requests can improve predictions of loan default by revealing insights into borrowers' psychological and personality traits. Incorporating Linguistic Inquiry and Word Count (LIWC) analysis into predictive models provides a more nuanced view of borrower risk and significantly enhances accuracy, potentially boosting lenders' ROI by up to 5.75% compared to models relying solely on financial and demographic data.
Background: Limitations of Traditional Loan Assessment
Traditionally, lenders have relied on financial metrics—such as credit history, income, and debt—as well as demographic information (e.g., gender, geographic location) and subjective insights from interactions with borrowers. However, these traditional data points often fall short in fully capturing a borrower’s ability to repay a loan. This study explores how analyzing the text of online crowdfunded loan requests can enhance predictions of loan default by providing additional insights into borrowers' personalities, psychological states, and intentions—factors not easily discernible from financial metrics alone.
The study found that integrating LIWC-based textual analysis into predictive models significantly improved their accuracy, potentially increasing lenders' return on investment (ROI) by up to 5.75% compared to models relying solely on financial and demographic data.
Approach: Predicting Loan Default
Prior research has demonstrated that personality traits, such as risk tolerance and impulsivity, significantly impact financial literacy and decision-making. These traits influence how people approach and manage financial risks. Additionally, research has established links between language use and personality traits, focusing on the "big five" personality dimensions: extroversion, agreeableness, conscientiousness, neuroticism, and openness. The study used a dataset of over 120,000 loan applications from Prosper, a prominent crowdfunding platform, to assess whether the language in loan requests could improve default predictions compared to traditional financial models. The researchers applied a multi-faceted analytical approach: a naive Bayes classifier to evaluate word associations with loan outcomes, Latent Dirichlet Allocation (LDA) topic modeling to identify key topics in loan requests, and a binary logistic regression model incorporating these topics. They also used LIWC to analyze the text along psychological and emotional dimensions.
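To make the word-association step more concrete, here is a minimal sketch of how a naive Bayes style analysis can score individual words by their association with default. It is not the study's code: the function name `word_log_odds`, the smoothing constant `alpha`, and the four example sentences are all hypothetical and exist purely for illustration.

```python
import math
from collections import Counter


def word_log_odds(texts, labels, alpha=1.0):
    """Score each word by its smoothed log-odds of appearing in
    defaulted (label 1) versus repaid (label 0) loan requests.

    alpha is a Laplace smoothing constant (an assumed default).
    """
    counts = {0: Counter(), 1: Counter()}
    for text, label in zip(texts, labels):
        counts[label].update(text.lower().split())

    vocab = set(counts[0]) | set(counts[1])
    # Smoothed total word count per class, as in multinomial naive Bayes.
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in (0, 1)}

    return {
        word: math.log((counts[1][word] + alpha) / totals[1])
        - math.log((counts[0][word] + alpha) / totals[0])
        for word in vocab
    }


# Made-up toy data, NOT drawn from the Prosper dataset.
texts = [
    "please help my family god bless",
    "i will consolidate debt and pay monthly",
    "need help after hardship please",
    "i will pay off credit card with lower rate",
]
labels = [1, 0, 1, 0]  # 1 = defaulted, 0 = repaid

scores = word_log_odds(texts, labels)
# Words with the highest scores lean toward the "defaulted" class.
top_default_words = sorted(scores, key=scores.get, reverse=True)[:3]
```

A positive score means the word is relatively more frequent in defaulted requests, and a negative score means it is more frequent in repaid ones. A real analysis would need far more data, proper tokenization, and out-of-sample validation.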
Significance of LIWC in the Analysis
A crucial aspect of the study was the use of Linguistic Inquiry and Word Count (LIWC), which analyzes language and sorts words into psychologically meaningful categories. LIWC enabled the researchers to examine how specific aspects of language correlate with loan default risk. By analyzing word frequencies across different LIWC categories, the study identified psychological indicators within language that are associated with borrowers who defaulted. In total, fourteen LIWC categories were statistically significantly tied to repayment behaviour. All of them are language-based psychological and personality indicators that traditional financial metrics often miss, which highlights the importance of psychological insights in assessing loan default risk.
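The idea of category-level word frequencies can be sketched in a few lines. The example below is only an illustration of dictionary-based counting: `TOY_CATEGORIES` is a hypothetical three-category word list, not the actual LIWC dictionary (which is proprietary and far larger), and `category_rates` is an assumed helper name rather than part of any LIWC or Receptiviti API.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only.
# The real LIWC dictionary is proprietary and much more extensive.
TOY_CATEGORIES = {
    "family": {"family", "mother", "son", "daughter"},
    "religion": {"god", "bless", "pray"},
    "first_person_singular": {"i", "me", "my"},
}


def category_rates(text, categories=TOY_CATEGORIES):
    """Return, for each category, the percentage of words in `text`
    that belong to that category's word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in categories}
    counts = Counter(tokens)
    return {
        name: 100.0 * sum(counts[word] for word in words) / len(tokens)
        for name, words in categories.items()
    }


# Made-up example sentence, not taken from any real loan request.
rates = category_rates("God bless you, I need help for my family")
```

Per-category percentages like these could then be added as extra features alongside financial and demographic variables in a default-prediction model, which is the general pattern the study describes.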
Findings: Predicting Borrowers Who Default
The study found that loan requests from borrowers who defaulted included language related to personal hardship, family issues, religious references, and pleas for assistance. These requests frequently “differ in their usage of pronouns and tenses, and those seemingly harmless pronouns have the ability to predict future economic behaviours.” Interestingly, defaulters displayed writing styles consistent with those of liars and extroverts. The findings indicate that defaulting borrowers’ writing contains psychological markers that traditional financial metrics alone do not capture.
Implications For Lenders and Financial Institutions
By integrating insights from textual data with traditional loan assessment methods, lenders gain a deeper, more nuanced understanding of borrower risk. This enhanced perspective improves their ability to predict defaults and make more informed funding decisions. The approach is particularly valuable in online crowdfunding environments, where personal interactions are sparse and traditional financial measures fall short. Incorporating psychological and emotional factors into predictive models marks a significant advance in risk management, giving lenders a more sophisticated way to assess the probability of loan default.
By leveraging Receptiviti’s LIWC framework, lenders can significantly enhance their predictive accuracy and reduce risk associated with loan defaults. As the lending landscape evolves, incorporating such advanced analytical approaches into traditional due diligence processes can lead to more effective and holistic risk evaluation, addressing the limitations of conventional financial models.