
Cognitive State as Infrastructure: Why AI Needs an Explicit User-State Measurement Layer

  • Writer: Jonathan Kreindler, Receptiviti Co-Founder
  • Apr 22
  • 5 min read

Updated: Apr 23


The user-state signals that influence model behavior are not available to the systems meant to govern, evaluate, and improve the model's behavior.

AI systems respond to users' cognitive and emotional states. They do so implicitly, often with good intuition, but with varying degrees of accuracy and consistency. The core problem is that what they're responding to remains largely implicit, unstructured, and invisible to developers, evaluators, and the systems meant to govern model behavior.


When a user sounds frustrated, a well-tuned model will often soften its tone. When language suggests overload, it may acknowledge overwhelm. When prompts feel urgent, responses tend to become more direct. None of this was explicitly designed—it emerges from training on human communication, where psychological signals are embedded in language. Models infer user state constantly, but opaquely, with inconsistent accuracy, as a side effect of generation.


Current systems have some visibility into their own reasoning (via probing or, in advanced setups, auxiliary prediction heads), yet the underlying user-state signals that influence model behavior remain crude, unstructured, and largely unavailable to the systems meant to govern, evaluate, or improve it. The consequences range from poorly calibrated responses that further overwhelm users to more serious cases where a system inadvertently exacerbates user distress it can't see.


The Signal is There. It's Just Invisible.


Because these signals remain invisible, it becomes extremely difficult to understand or systematically improve how models align with, support, and help users. There is no stable variable for cognitive load in evaluation pipelines, no structured representation of anxiety or confusion in fine-tuning, and no reliable way to track whether a model responds differently under pressure versus calm reasoning.


If a system can’t explicitly observe the conditions under which it makes decisions, it can’t reliably optimize for them. What often appears as model inconsistency in benchmarks may actually reflect unobserved variation in interaction state. Today, that variation is inferred implicitly and inconsistently inside the model. Instrumentation makes it explicit, stable, and inspectable.


Making User-State Explicit


The solution is conceptually straightforward, but it changes the system in a fundamental way: treat interaction state as measurable infrastructure. Not a model of the user, but a structured description of the conversation as it unfolds.


The goal is not to build persistent representations of individuals. Proper system design makes that impossible.


Instead, the system captures transient properties of the interaction - things like cognitive load, analytical thinking, emotional anxiety, risk orientation, confidence, and attention - derived from the language in the current exchange. Represented as structured variables or normalized vectors, these become stable, measurable, and comparable dimensions that the system can inspect and act upon.
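
As a sketch of what that representation might look like, here is a minimal normalization step in Python. The dimension names and the population norms used for z-scoring are illustrative assumptions, not Receptiviti's actual feature set:

```python
# Illustrative dimension names; a real deployment would draw these
# from a validated linguistic psychology framework.
DIMENSIONS = [
    "cognitive_load", "analytical_thinking", "anxiety",
    "risk_orientation", "confidence", "attention",
]

# Hypothetical population means and standard deviations for z-scoring.
NORMS = {dim: (0.5, 0.15) for dim in DIMENSIONS}

def state_vector(raw_scores: dict) -> dict:
    """Normalize raw per-dimension scores to z-scores, clipped to +/-3."""
    vector = {}
    for dim in DIMENSIONS:
        mean, std = NORMS[dim]
        z = (raw_scores[dim] - mean) / std
        vector[dim] = max(-3.0, min(3.0, z))  # clip to a stable range
    return vector
```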


This Isn't Theoretical


In controlled A/B tests with OpenAI's GPT-5 in Study Mode, the only change was injecting a compact vector of 12 psychological signals (z-scored, clipped ±3), derived from Receptiviti's validated linguistic psychology framework. No retraining. No style prompts. The vector was added directly as context via prompt injection, and GPT-5 adapted on its own.
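
The study's exact injection format isn't reproduced here, but a minimal sketch of the general pattern, assuming an OpenAI-style chat message list and hypothetical system-prompt wording, looks like this:

```python
import json

def build_messages(user_prompt: str, vector: dict) -> list:
    """Prepend the state vector as context; the model adapts on its own."""
    system = (
        "User-state signals for this exchange (z-scored, clipped to +/-3):\n"
        + json.dumps(vector, indent=2)
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]
```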


The tests involved student-AI tutoring dialogues spanning five subjects: algebra, probability, Newton's laws, chemistry, and compound interest. Responses were scored independently by five blinded LLMs on clarity, reasoning & scaffolding, cognitive load management, reassurance & confidence support, and alignment with student needs.


Results:


  • Overall educational effectiveness: Baseline 8.08 → Vector-informed 9.00 (+11.4% on a 0–10 scale).

  • Largest gains: Reassurance & Confidence Support (+18.8%), Alignment with Student Needs (+15.8%).

  • All evaluators scored the vector-informed responses at or above baseline across every dialogue and metric.


When cognitive load was high, responses became more constrained, structured, and action-oriented. When analytical thinking was lower, explanations simplified accordingly. Same model, same tasks; only user-state visibility changed.

Methodology and full results available on request.


What Changes Across the System


Explicit measurement turns implicit, probabilistic inference into stable, highly usable signals:


> Evaluation becomes conditional:

Instead of asking whether response A is better than B in the abstract, we can ask whether it is better for a user with high cognitive load and low analytical processing. Unobserved user-state variation likely explains a meaningful portion of apparent "inconsistency" in current benchmarks.
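
A minimal sketch of what state-conditional scoring could look like, with illustrative bucket thresholds and field names:

```python
from collections import defaultdict
from statistics import mean

def state_bucket(state: dict) -> str:
    """Coarse bucket per evaluated response; thresholds are illustrative."""
    load = "high_load" if state["cognitive_load"] > 1.0 else "low_load"
    proc = "low_analytic" if state["analytical_thinking"] < -1.0 else "analytic"
    return f"{load}/{proc}"

def conditional_scores(results: list) -> dict:
    """results: [{'state': {...}, 'score': float}, ...] from an eval run."""
    groups = defaultdict(list)
    for r in results:
        groups[state_bucket(r["state"])].append(r["score"])
    # Mean score per state bucket, instead of one aggregate number.
    return {bucket: mean(scores) for bucket, scores in groups.items()}
```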


> Training becomes more targeted:

We can condition fine-tuning or RL on explicit state vectors, generate synthetic data conditioned on specific cognitive/emotional profiles, or train auxiliary prediction heads to forecast state trajectories.
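
As one hypothetical example, a fine-tuning record could carry its state vector as a serialized system message so the model learns state-dependent behavior; the record layout here is an assumption, not a documented format:

```python
import json

def training_record(state: dict, prompt: str, preferred_response: str) -> dict:
    """One fine-tuning example conditioned on an explicit state vector."""
    return {
        "messages": [
            {"role": "system", "content": "user_state: " + json.dumps(state)},
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": preferred_response},
        ]
    }

# Synthetic data can then be generated per target profile, e.g. sampling
# dialogues conditioned on {"cognitive_load": 2.0, "anxiety": 1.5}.
```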


> Safety and alignment improve through context-dependence:

Reward models can incorporate state signals, reducing false positives in distress detection and enabling more nuanced policies (e.g., distinguishing exploratory questions from distress).
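
A toy sketch of that kind of state-gated policy, with illustrative thresholds and signal names:

```python
def triage(content_flagged: bool, state: dict) -> str:
    """Gate a keyword-level flag with measured state instead of
    firing unconditionally."""
    if not content_flagged:
        return "none"
    # Low anxiety plus high analytical thinking is more consistent with
    # an exploratory or academic question than with acute distress.
    if state["anxiety"] < 0.5 and state["analytical_thinking"] > 1.0:
        return "exploratory"
    return "possible_distress"
```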


For production systems, explicit user state becomes infrastructure. It enables real-time logging of vectors, monitoring for drift across millions of sessions, and A/B testing on state-conditional metrics. This supports scalable governance: auditing how the model behaves under high cognitive load, setting policies tied to measurable states rather than vague heuristics, and tracking long-term cognitive outcomes at scale.
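
A minimal drift-monitoring sketch over logged vectors, assuming per-session state dicts and an arbitrary alerting threshold:

```python
from statistics import mean, stdev

def drift_report(reference: list, current: list, dims: list) -> dict:
    """Compare current-window means against a reference window,
    dimension by dimension. Each list holds per-session state dicts."""
    report = {}
    for d in dims:
        ref = [s[d] for s in reference]
        cur = [s[d] for s in current]
        sigma = stdev(ref) or 1e-9  # guard against a constant reference
        shift = (mean(cur) - mean(ref)) / sigma
        report[d] = {"shift_sigmas": round(shift, 2), "drift": abs(shift) > 0.5}
    return report
```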


Trajectory, Not Snapshots in Time


A single message is a snapshot in time; what matters far more is the trajectory. When you measure across an interaction, or across sequences of interactions (where continuity is explicitly scoped and controlled), you begin to see patterns that are invisible at the individual turn level: cognitive load rising over successive turns, analytical thinking dropping as fatigue sets in, emotional signals intensifying, engagement diminishing. Once you have telemetry on dimensions like these, the system can respond more appropriately - simplifying before overload occurs, or shifting from passive to active prompting when engagement drops.
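
A minimal sketch of that kind of turn-level telemetry; the window size, trend rule, and threshold are illustrative assumptions:

```python
from collections import deque

class LoadTracker:
    """Track cognitive load across turns and adapt before overload."""

    def __init__(self, window: int = 5, threshold: float = 1.0):
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def update(self, cognitive_load: float) -> str:
        self.scores.append(cognitive_load)
        recent = list(self.scores)[-3:]
        # A rising trend that crosses the threshold triggers simplification
        # before the user is actually overloaded.
        if (len(recent) == 3 and recent[0] < recent[1] < recent[2]
                and recent[2] > self.threshold):
            return "simplify"
        return "continue"
```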


In safety contexts, a substantial body of research shows that human language changes in predictable ways before high-risk behaviors like self-harm and suicide, including shifts in the rates of self-references, increases in absolutist language, and changes in emotional expression. Each signal on its own is probabilistic, but directional change across a trajectory provides indicators of risk that are simply not visible to models today. This approach can also help reduce false positives: a one-off mention of suicide without a supporting trajectory should be handled differently from a pattern of risk indicators that emerges across sessions.
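
A toy sketch of trajectory gating, with hypothetical signal names and an assumed three-turn requirement:

```python
def trajectory_risk(turns: list, min_consecutive: int = 3) -> bool:
    """turns: per-turn signal dicts, oldest first. A one-off elevated
    turn never flags; a sustained directional pattern does."""
    streak = 0
    for t in turns:
        elevated = (t.get("self_reference", 0.0) > 1.0
                    and t.get("absolutist_language", 0.0) > 1.0)
        streak = streak + 1 if elevated else 0
        if streak >= min_consecutive:
            return True
    return False
```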


On Privacy: Measurement for User Benefit, Not Platform Optimization


Social media showed what happens when psychological and engagement signals are measured to maximize time-on-platform. Here the goal is the opposite - to constrain system behavior to better serve the user, producing more appropriate responses, reducing overload, and enabling auditable, governable safety.


Today, systems already infer and react to these signals with zero logging or transparency. Making interaction state explicit creates the possibility of actual governance: clear policies for what is measured and why, opt-in frameworks that give users control, audit trails that allow behavior to be reviewed and challenged, and constraints on how these signals can be used. That is more protective than a system that reacts in ways nobody can understand.


But interaction state must be treated as instrumentation, not identity. The signals must be transient, derived only from the current interaction, and not tied to persistent profiles. They should be logged in a way that is auditable but not attributable to individuals, and designed so they cannot be reverse-engineered into user models.
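
One way to satisfy those constraints, sketched with an assumed salted-hash scheme:

```python
import hashlib
import os
import time

# Rotated per deployment window so tokens cannot be joined across windows.
SALT = os.urandom(16)

def audit_record(session_id: str, turn: int, state: dict) -> dict:
    """Log the vector keyed by a salted, truncated hash: auditable as a
    session grouping, not attributable to an individual."""
    token = hashlib.sha256(SALT + session_id.encode()).hexdigest()[:12]
    return {
        "session": token,
        "turn": turn,
        "ts": int(time.time()),
        "state": state,  # transient signals only; no raw language, no identity
    }
```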


Properly implemented, this creates a controllable interface for debugging, evaluation, and governance, without introducing representations of people.


Cognitive State as Infrastructure: A Forward-Looking View


A standardized cognitive state schema (a compact, privacy-preserving vector derived from user language) could function as a shared layer across AI systems, much like how context windows and system prompts became standard infrastructure. Once interaction state signals are structured and portable, they can integrate with multimodal inputs and support evaluation frameworks that currently have no way to condition on user context. 
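
No such standard exists yet, but a minimal sketch of what a portable schema and payload might look like (all field names, version tags, and ranges are assumptions):

```python
STATE_SCHEMA = {
    "version": "0.1",
    "scaling": "z-scored, clipped to [-3, 3]",
    "scope": "current exchange only; no persistent identity",
    "dimensions": [
        "cognitive_load", "analytical_thinking", "anxiety",
        "risk_orientation", "confidence", "attention",
    ],
}

# A payload any compliant system could emit or consume.
example_payload = {
    "schema_version": "0.1",
    "vector": {
        "cognitive_load": 1.8, "analytical_thinking": -0.6, "anxiety": 0.9,
        "risk_orientation": -0.2, "confidence": -1.1, "attention": 0.4,
    },
}
```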


Once interaction-state measurement is in place, many of the problems that currently look like model limitations turn out to be measurement problems. And measurement problems, unlike model limitations, are solvable.
