
AI’s Moneyball Moment: Why Better Signal Extraction Will Matter More Than Scaling

  • Writer: Jonathan Kreindler, Receptiviti Co-Founder
  • Nov 5
  • 7 min read


One of the biggest challenges in creating new technologies is that roadmaps are often driven by assumptions and hypotheses. Right now, the roadmap for the next generation of AI is being built on a foundational hypothesis that requires hundreds of billions of dollars in investment, with no guarantee that it will succeed. Yet leaders like Microsoft, OpenAI, Google, and Meta are pouring massive capital into computational infrastructure based on the premise that scaling compute and data alone will continue to generate foundational breakthroughs.


What First-Principles Thinking Reveals About Scaling


But what if these aspirational breakthroughs require more than scale? Applying first-principles thinking to this problem suggests that, in all likelihood, the next great leap in AI won’t come from processing more data, but from applying analytical methods different from those currently used, specifically methods that have been validated through decades of research in psycholinguistics and cognitive science. These methods can unlock dramatic improvements to models that scale alone can’t, and they can likely also generate a far greater return on the hundreds of billions already invested in AI infrastructure.


A recent test of this hypothesis produced surprising and exciting results: adding a 12-dimension psychological signal to OpenAI’s GPT Study Mode using validated psycholinguistic methods improved GPT’s educational effectiveness by 4%, its reasoning and scaffolding by 6.3%, and its ability to accommodate the user’s cognitive load by 5.9%. Results were based on 25 blinded evaluations, and the improvements required no model retraining, no additional compute, and no style prompting: just a compact vector of validated psychological signals generated by analyzing the student’s language, signals that the model was fundamentally unable to extract on its own.
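
To make the mechanism concrete, here is a minimal Python sketch of what injecting a compact psychological signal vector into a model’s context could look like. Everything here is hypothetical: the psycholinguistic_profile function, the dimension names, and the prompt wording are illustrative placeholders, not the study’s actual implementation or the Receptiviti API.

```python
from typing import Dict

def psycholinguistic_profile(text: str) -> Dict[str, float]:
    """Placeholder: return z-scored psychological dimensions derived from the
    student's language. A real implementation would call a validated
    psycholinguistic framework; these stub values are for illustration only."""
    return {
        "cognitive_load": 1.4,       # well above the population mean
        "analytic_thinking": -0.3,
        "confidence": -1.1,
        "stress": 0.9,
        # ... a real vector would carry all 12 validated dimensions
    }

def build_tutor_prompt(student_message: str) -> str:
    """Prepend the compact signal vector to the model's context so it can
    condition its scaffolding on it, with no retraining or extra compute."""
    profile = psycholinguistic_profile(student_message)
    signal = ", ".join(f"{name}={z:+.1f}" for name, z in profile.items())
    return (
        "You are a tutor. Psychological signal for this student "
        f"(z-scores vs. population norms): {signal}. "
        "Adjust pacing, scaffolding, and cognitive load accordingly.\n\n"
        f"Student: {student_message}"
    )

print(build_tutor_prompt("I've reread this proof three times and I still don't get it."))
```

The design point is that the signal rides along as ordinary context, which is why the approach requires no additional model scale.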


This is the same dynamic that shows up whenever a system stops getting returns from brute-force scaling: Improvements to GitHub Copilot increasingly come from better understanding of code structure rather than ingesting more repositories, and in medical AI, models that are trained on smaller, expert-curated datasets often outperform those trained on massive datasets. The commonality here is that refining the extraction process typically leads to better results than adding more data or scaling compute.


The Counterargument and Its Limit


Scaling has been extraordinarily effective in delivering more advanced AI model capabilities, as evidenced by the performance of models like GPT-5, Claude Sonnet 4.5 and Gemini 2.5 Pro. But these leaps also occurred while there was still plenty of untapped signal remaining in training data, signal that could be exploited by scale alone. The question now isn’t whether scaling works; it’s whether we’re reaching a point of diminishing returns where scaling alone won’t extract more signal.


The GPT Study Mode case provides concrete proof that scaling alone isn’t sufficient: a 6.3% improvement in reasoning from better signal extraction, achieved without any additional model scale. If similar gains can be realized in other domains by adjusting methodology rather than by investing in brute-force scaling, the implications for the economics of AI and for how the next breakthroughs will be realized are significant, to say the least. We may just be at that inflection point, and a growing number of prominent AI leaders are saying the same thing:


  • Ilya Sutskever (OpenAI): “the age of scaling” is behind us and the industry must now focus on “scaling the right thing.”

  • Demis Hassabis (DeepMind): “Scale only gets you so far. The biggest breakthroughs will require more than just more compute.”

  • Gary Marcus (NYU): “The current strategy of merely making A.I. bigger is deeply flawed scientifically, economically and politically” and “One of the keys to this may be training and developing A.I. in ways inspired by the cognitive sciences.”

  • Yann LeCun (Meta): “You cannot just assume that more data and more compute means smarter AI.”


The industry seems to be moving cautiously when it comes to embracing alternative analytical methods, which is understandable given the massive capital investments already made in scaling. But this is not an either-or choice: scaling compute and embracing alternative analytical methods should be complementary. The risk isn’t that scaling will stop working; it’s that high-value signals are going unexploited, which decreases the return on the massive compute investments and unnecessarily limits the capabilities of AI models.


The broader AI research community is independently reaching the same conclusion. Efforts like World Models reflect a growing recognition that methodological innovation, not more compute alone, is now the limiting factor in AI progress.


Why Language Data Remains Underexploited


To understand why language data is so valuable for AI and why it’s been underexploited, we need to recognize that language is much more than strings of words we use to communicate. Viewed through an AI lens, language is essentially an encoding layer that captures the universe of documented human knowledge, culture, and reasoning. Decades of research in behavioral science, clinical psychology, and psycholinguistics have proven that important cognitive and psychological information is reliably encoded in everyday language.


The problem is that current AI systems don’t decode these signals because they rely on the text analysis methodologies that have dominated Natural Language Processing (NLP) for decades, methodologies that focus almost entirely on the literal, semantic meaning of words. While focusing on semantic meaning was sufficient for applications like search and machine translation, today we are increasingly asking LLMs to perform tasks that require a nuanced understanding of human context, intent, and cognition, which makes the semantic-only approach insufficient, no matter how much compute power is thrown at it.


The iceberg analogy is horribly overused, but it’s the perfect way to explain the issue: The semantic information that machines read from language is the tip of the iceberg, and the huge hidden mass of ice below the waterline contains all the context, intent, psychological information, social factors, and human experience that dictate why those words were used. This is why two people will often describe the exact same experience differently. Psycholinguistics offers the ability to analyze the language and decode the psychological signals that explain why they speak differently, so you can understand not just what they’re saying, but their underlying cognitive state, their level of stress or confidence, their openness to persuasion, and even their readiness to act on information.


While current LLMs can learn some of these patterns by processing massive amounts of text, they do so inconsistently and without a theoretical framework for understanding what the patterns actually mean. They can detect distress or adjust their tone, but they aren’t designed to reliably infer a person’s underlying cognitive state, intentions, or needs. For example, because of the NLP techniques they rely on, LLMs focus mainly on content words (nouns, verbs) and treat function words (pronouns, prepositions) as statistical filler. Yet decades of research in psycholinguistics and cognitive science have proven that it’s the way people use function words that carries the psychological signals. An LLM can recognize that certain word patterns are associated with depression, but it can’t explain why. That is the critical difference between identifying a correlation and understanding causation, and it’s what psycholinguistics offers that scaling alone can’t deliver.
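
As a toy illustration of where that signal lives, the following Python sketch counts a few function-word categories and reports usage rates. The categories and word lists are simplifications introduced here for illustration; validated psycholinguistic frameworks rely on much larger, empirically tested dictionaries and population norms, and raw rates alone are not psychological measurements.

```python
import re

# Illustrative only: tiny word lists standing in for the validated dictionaries
# that real psycholinguistic frameworks use. The point is that the signal lives
# in word categories a semantics-focused pipeline treats as filler.
FUNCTION_WORDS = {
    "first_person": {"i", "me", "my", "mine", "myself"},
    "articles": {"a", "an", "the"},
    "prepositions": {"in", "on", "at", "of", "with", "for", "to", "from"},
    "negations": {"no", "not", "never", "none", "don't", "can't"},
}

def function_word_rates(text: str) -> dict:
    """Return each category's usage as a rate per 100 tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = max(len(tokens), 1)
    return {
        category: round(100 * sum(t in words for t in tokens) / total, 1)
        for category, words in FUNCTION_WORDS.items()
    }

print(function_word_rates("I just don't think I can get through this on my own."))
```

In a real pipeline these raw rates would be compared against population baselines before any psychological interpretation is made.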


Why LLMs Can’t Quantify Psychological States


The assumption that LLMs will eventually learn to extract psychological signals through scale misunderstands the problem. Psycholinguistic signals aren’t hidden patterns waiting to be discovered through more training data; they’re structured relationships between language features and validated psychological constructs, relationships established through decades of empirical research.


If an AI industry goal is to safely and reliably deploy AI in high-stakes domains like clinical health and education, these applications will need to do more than infer that someone “seems overwhelmed”: they must be able to quantify these inferences, measure where a user falls relative to population norms, and track that change over time. Psycholinguistic frameworks enable this quantification by returning standardized z-scores for validated psychological constructs, which provides the measurement foundation needed to apply AI in many high-stakes domains (a minimal sketch of this measurement layer follows below):


  1. Quantify against norms: Know whether a person’s anxiety markers are at the 60th or 95th percentile.

  2. Track change consistently: Measure whether psychological markers are increasing or decreasing over time.

  3. Set evidence-based cutoffs: Use thresholds to trigger escalation or intervention.


Current LLMs fundamentally can’t do this because they aren’t designed to quantify psychological constructs against validated norms or to provide the consistent measurement needed to capture individual differences and change over time.
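
To ground what that measurement layer might look like, here is a minimal Python sketch that converts a raw language-derived marker into a z-score against a population norm, a percentile, and an escalation flag. The marker name, norm values, and cutoff are assumptions made up for illustration, not outputs of any validated instrument.

```python
from math import erf, sqrt

# Hypothetical norm and cutoff; real norms come from validated reference samples
# and real escalation thresholds are evidence-based.
POPULATION_NORMS = {"anxiety_markers": {"mean": 2.1, "sd": 0.8}}
ESCALATION_Z = 1.65  # roughly the 95th percentile of a standard normal

def score_marker(name: str, raw_value: float) -> dict:
    """Quantify a raw marker against its population norm."""
    norm = POPULATION_NORMS[name]
    z = (raw_value - norm["mean"]) / norm["sd"]
    percentile = 100 * 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF
    return {
        "z": round(z, 2),
        "percentile": round(percentile, 1),
        "escalate": z >= ESCALATION_Z,
    }

# Tracking change over time is just scoring the same marker across sessions.
print(score_marker("anxiety_markers", 3.4))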


The Moneyball Parallel: Undervalued Signal


What we’re witnessing in AI today echoes the “Moneyball” revolution that transformed Major League Baseball in 2002. It began when the Oakland A’s became successful despite having one of the smallest payrolls in baseball, by recognizing that the sport was over-indexing on traditional, visible metrics like batting average and stolen bases, while undervaluing more predictive statistics like on-base percentage (how often a player reaches base) and slugging percentage (total bases per at-bat). The A’s found success in signals that all other MLB teams were ignoring, and the AI industry is now in the same position:


The AI industry is over-indexing on massive compute and scale while undervaluing deeper, richer psychological signals encoded in human language

The takeaway from Moneyball is that the most efficient path to the next generation of AI capabilities is methodological: extracting and quantifying the human signals in language that scale alone cannot surface.


What This Means for AI Companies


Competitive Differentiation: As AI capabilities become increasingly commodified, depth of understanding will replace raw performance as the competitive moat.


Risk Mitigation and Safety: Missing critical psychological signals in sensitive domains creates liability risks. Understanding these signals enables more reliable reasoning and establishes the foundation for trusted, safe AI systems.


Economic Force Multiplier: Integrating psycholinguistics isn’t a replacement for scaling; it’s a “force multiplier” that has the potential to deliver performance improvements equivalent to billions in compute, at a fraction of the cost.


Strategic Timing: As scaling bumps into the law of diminishing returns, the competitive advantage will shift to AI systems that integrate methods enabling deeper human understanding, creating moats that scale alone cannot replicate.


Beyond Scale


In focusing almost exclusively on compute, AI leaders are overlooking validated science from fields like psycholinguistics that can unlock the human signals from language that are critical for the next generation of AI systems. Extracting these signals enables LLMs to reason more reliably, personalize more effectively, and respond more safely to users’ cognitive and psychological states - providing the path to greater AI-based intelligence, safety, and trust.


The race to super-intelligent AI won’t be won by whoever builds the biggest machine. It will be won by whoever applies the most rigorous analytical methods to extract signal from the data those machines already process.
