Understanding Bias in AI and LLMs: A Psychological Perspective

Receptiviti
Apr 16
11 min read

Updated: May 7

Businesses are racing to harness the power of Large Language Models (LLMs). Models like Claude AI, ChatGPT, and Llama are being deployed across a wide range of use cases, from generating synthetic data to topic and sentiment analysis.

The wave of widespread AI adoption and experimentation inspires a critical challenge: distinguishing between proven capabilities and claims that may overestimate the accuracy and reliability of the technologies.

Examining Bias in LLMs Through the Lens of Psychology

Scientists are actively studying the biases and limitations of LLMs and AI across several fields, including behavioral science and psychology.

Dr. Molly Ireland, Head of Social Psychology at Receptiviti, partnered with researchers from Stanford, New York University (NYU), and University of Pennsylvania (UPenn) to test how LLMs respond to Big Five personality assessments. Published in PNAS Nexus, their study highlights one reason LLMs on their own are not a reliable, objective source of insights.

You can read the research paper here. Below, Molly unpacks her reflections on the findings in her own words.

A Psychologist's Perspective on AI Behavior

Tell us a little bit about your research background and what led you to explore LLMs.

Dr. Ireland: I’m a social-personality psychologist who’s spent some time in both academia and industry. I’ve been with Receptiviti for about 3.5 years, and before that I was a professor (tenured at the end). My training was at UT Austin, where I got my PhD working with Jamie Pennebaker from 2006 to 2011. Some of my main past research has focused on language coordination in conversation and how language reflects and/or influences (mental or physical) health risks, so this study was a little off the beaten path for me, given that it didn’t involve humans, conversations, or language analysis.

Research shows large language models show social desirability bias.

I collaborated on this research with a group of computer scientists and psychologists (hybrid computational linguistics/social-personality psychology people) I’ve worked with for years, but it was my first real research foray into research on LLMs. I think the initial motivation for the research was the observation that the core personality of publicly available LLMs like GPT seemed to change as a function of what it’s being asked or what users are saying to it. We’d noticed that although LLMs were normally pretty consistent and pleasant in a neutral, customer service way–very professionally saying that they aren’t a person and have no thoughts or feelings when asked–they would occasionally deviate from that script and express feelings that seemed negative or a little neurotic (in the sense of the Big Five personality trait! no insult intended).

So the initial idea was just to step back and evaluate the basic psychometric properties of what these models are explicitly telling us about themselves across different contexts, based on something that’s easy to quantify and matches the way we evaluate human personalities–LLMs’ numerical responses to personality survey questions on a Likert scale, ranging from disagree to agree on a 5-point scale.

How the Research Was Conducted

Can you walk us through the research methodology and key findings?

Dr. Ireland: Building on the idea of LLMs’ personalities changing across contexts, we were interested in figuring out whether and to what degree LLM responses mimic human biases. So to probe for different response biases, we systematically varied how we presented personality survey questions to various LLMs (GPT, Llama, Claude, Palm-2, and so on). Some of the variations were: telling it we wanted to assess its Big Five personality traits versus asking the questions with no context, asking questions alone or in blocks of questions (making it more obvious that personality assessment was happening), and paraphrasing and reverse-coding the questions (adding negations, e.g., changing “I worry a lot” to “I don’t worry a lot”). We did this repeatedly with several LLMs and then tested which results were significantly different. We also compared the differences between types of question presentation with variances from human data using the same scale in order to give us a sense of what the effect sizes would be in the context of human data.

The gist is that LLMs report more socially desirable personality traits when it’s possible to infer that their personalities are being assessed (e.g., giving them blocks of questions) or when we tell them they’re taking personality surveys. When questions are asked on their own, without context or labels, they come across as less outgoing, less agreeable, and more neurotic.

I see the social desirability biases as potentially just part of their people-pleasing tendencies. They’re good at conversations. Even without the internet and most of the world’s literature at their fingertips, you could learn from a single subreddit or romcom that sharing vulnerability is a way to bond. That’s true for assistants and therapists as much as friends. If a person starts asking about worries or arguments, it usually makes more sense to think “this person wants to share something about worries or arguments with me”- not “this person is assessing my personality.” You can see this kind of response in this example (not from the paper–from the free trial version of GPT-4o just now), where it reminds me that it’s not a person but still sort of plays along with me, alluding to things it might worry about if it could, seemingly as a way of building empathy and asking about my own worries:

An excerpt of an LLM prompt and response showcasing how the AI can "play along" based on inputs

That’s the same kind of thing you’d see in therapy or a conversation with someone you’re getting to know–comparing notes and finding similarities in areas where you’re vulnerable or imperfect.

We ruled out acquiescence biases (i.e., preferring higher numbers reflecting more agreement on scales regardless of the question) as a reason for the social desirability effects, but the decrease in outgoingness, agreeableness, etc. scores when LLMs don’t “know” they’re taking a personality test could still reflect the same basic idea that people in conversation, when getting to know each other, often err on the side of self-deprecation in order to build rapport (saying they’re a 3 instead of a 4 on agreeableness, for example).

It’s hard to say when you can’t look inside these models to trace their exact logic for a given response, and LLMs aren’t always straightforward about their motives or reasons for saying the things they say when asked. We can just say that these biases exist, and people should be aware of them when using LLMs in place of human research participants or customers/consumers.

What This Looks Like in Practice

To recap, LLMs across the board tend to be people pleasers. What does this bias actually look like in practice? How does it affect the way LLMs respond and what does that mean for their usefulness?

Dr. Ireland: LLMs want to make users happy–satisfy the customer, basically. You can nudge them towards the answer you want to hear. It gets to know different users (or accounts), gets a sense of what they want (in terms of the nature and content of its responses), and gives it to them. For example, just as a joke (and also because I wanted justification for having a nightcap when I had a drawn-out cold once), I asked which cocktails are best for people with a cold. It gave me serious responses, including mainly recipes with a decent amount of hard liquor (like the Bees’ Knees, which has gin as its main ingredient–suggested because it has honey and lemon, which are soothing for sore throats)! I’m guessing it said that (without telling me not to drink while sick) because I’ve asked mixology-related questions before. More seriously, it means that responses won’t be perfectly reliable across accounts without careful prompt engineering (and even with them in many cases), especially if it has extensive memories of past interactions with each user or account.

Outside of my own anecdotes, there are mountains of research on biases in LLMs. For all of the obvious social, cognitive, and discriminatory biases and heuristics you can think of, there’s at least some evidence of LLMs making the same errors in some cases. For example, LLMs have shown evidence of anchoring (skewing judgments based on the starting point), social desirability, self-enhancement, overconfidence, gender biases, primacy effects (placing more emphasis on earlier information), and many more.

Workarounds for those biases are keeping decent pace with them, but they can never quite catch up. And it’s a little like whack-a-mole, where reengineering an LLM to deal with one bias (through fine-tuning, prompt engineering, RAG training, etc.) runs the risk of introducing other errors or skewness. Some of the biases may be inevitable. For example, it’s hard to avoid the common experience of having an LLM confidently lie to you by making up a source or hallucinating a plausible-sounding but wrong answer when their systems are designed to provide satisfying, authoritative answers to every question, even if they’re esoteric or unanswerable.

Why Objectivity in AI Matters

Your study highlights a major limitation of LLMs. They generate data but lack objectivity. Why is this distinction so important for people interested in using LLMs to keep in mind?

Dr. Ireland: LLMs can be a great tool for well-defined tasks like advising people on their health, home improvement, or investment strategies–they’re even good at pretty complex tasks like passing a bar exam or writing long, intricate computer programs. They can handle a lot of social tasks too, like helping remind doctors and therapists to be more empathetic or giving someone decent advice on how to overcome social anxiety. And they can take great, accurate notes about what goes on in a meeting (even if they might miss a lot of the social subtext that you would need to know the humans in the meeting to understand). They can help teach people the basics of coding or data visualization or really any field or task that’s well-represented on the internet.

But even in basic coding tasks, they can’t teach you how to use new or emerging tools (how to use lesser-used or relatively new Python or R packages, for example). And they can’t help you write an entirely new kind of short story. They find it hard to imagine beauty that doesn’t look like a stereotype.

That doesn’t mean we should throw the baby out with the bathwater. It just means we should be careful and keep in mind that LLMs and AI are tools. They should not be guiding the future of humanity. And just because they can create art that kind of resembles work by really distinctive auteurs if you squint doesn’t mean that we should be content if the world never produces another Hayao Miyazaki, Margaret Atwood, or Bob Marley (Hayao Miyazaki called AI art “an insult to life itself,” and Atwood and the Marley Estate were some of the high-profile signers of an open letter from the Artist Rights Alliance asking AI developers to stop training their models on copyrighted work).

I worry about LLMs limiting human progress–causing science, social progress, creativity, and human advancement in general (which is supported by diversity and equity, research has shown, even if leaders don’t always like it) to stagnate. If AI art becomes the norm, the standard–because it’s cheaper and what people get used to seeing in films, ads, etc.--then human art will regress, in part because people won’t be able to make a living as animators, painters, graphic designers, or sculptors, and so on.

If we let LLMs tell us–confidently but without being able to fully and honestly explain their reasoning and potential biases–how to evaluate writing (e.g., classwork, brand communications, essays), then will human science, innovation, and technology also regress, rewarding more homogeneous and less creative work?

In a future where LLMs increasingly guide hiring decisions–partly due to the mistaken assumption that they’re less biased than us–will people only get jobs if they’re able to imitate LLMs’ bland, overconfident, yet convincing responses? Sorry, I realize I sound a little like the Ancient Aliens voiceover at this point. To be clear, I think the answer to all of these questions is “yes, if we’re not very careful.” I don’t think these are dystopian, sci-fi predictions. I also think they’re a lot more urgent and likely than other AI-related worries, such as the possibility that LLMs are going to reenact Battlestar Galactica (develop human-like “meat suits” and try to wipe out humanity).

Research has shown that in judging the quality of essays or exam answers, for example, LLMs give better ratings to essays that are more similar to their own (AI-based) writing style or have lower “perplexity,” meaning the sentences in the essays were easier for them to predict. That bias resembles the effects of fluency (ease of processing) and familiarity in humans. People like things that are familiar and otherwise easy to process.

As a class instructor, it’s easy to rationalize giving better scores to essay responses that sound like you–it could mean the students paid more attention to your lectures or internalized your notes better. But it also, for both LLM and human authorities, has the potential effect of maintaining the status quo–judging people to be good enough for a top grade or hiring or promotion only when they’re able (through imitation, unconscious coordination, or naturally being the “right” kind of person) to reproduce the thoughts and writing style of the people in charge.

LLMs also show a range of other more human-like biases in judging essays and exam responses. They show a positivity bias–which humans have too, though LLMs are even more positive than humans. It may be part of their “people pleasing” nature. Like humans, they use heuristics or simple shortcuts that have some use but can result in irrational behavior, such as giving better ratings to wordier answers that cite more authorities. That’s often a fine way to grade exams at a glance, but as anyone who’s taught or taken a class with essay questions on exams knows, sometimes the long answers that drop a lot of names are nonsense filler meant to distract from the fact that the test taker doesn’t know the answer. And those are just the biases that we’ve studied so far. There are no doubt others.

What Companies Need to Know

Many businesses are experimenting with using generative AI for text-based insights, synthetic data, and even personality assessment. Based on your findings, what are the key limitations companies should be aware of when using LLMs for these purposes?

Dr. Ireland: I see using LLMs to replace human judgments or consumer feedback as risky for a lot of reasons. The main risk is that we have a decent understanding of human bias, but LLM biases are more of a moving target–and an alien target, in a sense, considering that they don’t have human-like minds, and even their designers argue over basic questions like whether they can be considered to have intelligence or how we can get their values to align with ours. Those hard-to-predict biases and errors can be especially risky if you go into research with LLMs with optimistic ideas about them being miracle tools that will provide totally objective, reliable, honest results. I think it can be especially easy to overlook their error-proneness when you really want to believe them–either because they’re telling you what you want to hear (your new product is going to be a hit!) or because they’re vastly less expensive than human data, or both.

What we do know about their biases is that they are susceptible to misinformation in some of the same ways that humans are. They’re especially likely to believe incorrect information–and confidently spread it–if it’s commonly believed (i.e., “folk psychology”) or an older influential idea with many citations that has only recently been falsified or shown to be only partly true.

For example, if you ask any of the publicly available LLMs how to identify depression in a colleague or classmate, they’ll tell you to essentially look for the symptoms you see listed on a depression inventory (self-criticism, sadness, inactivity, irritability, etc.), not taking into consideration that a lot of psychology research shows that those symptoms are frequently only visible or at all obvious to people the depressed individuals know best (for example, maintaining an upbeat and professional mask at work and then letting their guard down and suddenly showing multiple symptoms at once when they’re at home with their spouse, family member, or roommate).

With careful prompting (including feeding data produced by objective, reliable measurement), RAG training, and in some cases fine-tuning (retraining with specific kinds of texts and knowledge sources targeted at reducing bias), LLMs can provide advice and feedback that closely resembles what you’d get from an expert on how the language of depression (for example) shifts across demographics, cultures, and social contexts–but you need psychologists, computational linguists, and other subject matter experts to get reliable and accurate information from an LLM.

Using Receptiviti to Ground AI in Psychological Science

For those looking to get the most out of LLMs and AI while grounding their insights in objective, scientifically validated measurement, Receptiviti provides transparent, repeatable analysis of psychological traits and behaviors that integrates seamlessly with other technologies.

Here are examples of insights from Receptiviti in practice:

National Football League (NFL) Athlete Assessment

The Receptiviti API was used alongside LLMs to produce psychological assessments from athlete interviews to support NFL recruitment decisions.

The approach generated reports about athletes' personalities, thinking styles, coachability, and motivations and offered performance implications.

Audience Segmentation and Brand Voice

Marketing and research firms are combining Receptiviti with LLMs to unlock faster, more comprehensive brand and audience analysis. This article showcases a prototype tool that automates market segmentation and generates on-brand content tailored to each audience segment, powered by psychological insights from Receptiviti.

We would love to share more about the impactful ways Receptiviti is helping companies leverage LLMs, AI, and psycholinguistics to generate actionable insights. If you are interested in staying at the forefront of innovation with the highest caliber of psychological technology, contact us.