A new multi-university study (Zurich, Amsterdam, Duke, NYU) has delivered one of the most revealing findings of 2025:
it is significantly harder for AI models to imitate human emotion than human intelligence.

And the biggest giveaway isn’t grammar, complexity, or coherence.
It’s something far more mundane:

AI is simply too polite.

Across Twitter/X, Bluesky, and Reddit, the researchers’ classifiers could detect AI-written replies with 70–80% accuracy, not because the language was robotic, but because the emotional tone wasn’t human enough.
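
The article does not describe the paper's detection pipeline in detail, so as a rough illustration of what "70–80% accuracy" means in practice, here is a minimal sketch of a human-vs-AI reply classifier; the TF-IDF features, logistic regression model, and function name are assumptions, not the study's actual setup.

```python
# Minimal sketch of a human-vs-AI reply classifier (illustrative, not the paper's pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def detection_accuracy(human_replies, ai_replies):
    """Estimate how separable human and AI replies are with a simple linear classifier."""
    texts = list(human_replies) + list(ai_replies)
    labels = [0] * len(human_replies) + [1] * len(ai_replies)  # 0 = human, 1 = AI

    # Word uni- and bi-grams pick up phrasing and tone markers ("great point",
    # "I appreciate") rather than topic alone; logistic regression keeps it interpretable.
    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    return cross_val_score(clf, texts, labels, cv=5, scoring="accuracy").mean()
```

On balanced data, a mean cross-validated accuracy around 0.7–0.8 from a sketch like this would correspond to the detection rates the study reports.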


1. The Core Problem: Humans Are Messier Than AI

The study found a persistent pattern:

AI replies are too friendly, too safe, and too emotionally consistent.

Real humans online are chaotic, contradictory, petty, sarcastic, passive-aggressive, blunt, or outright toxic — often unpredictably so.

AI models, especially instruction-tuned ones, are optimized to avoid offense and reduce harm. That means:

  • lower negativity
  • fewer emotional spikes
  • fewer impulsive reactions
  • no genuine irritation or frustration

These traits make AI output easily distinguishable, even when the writing itself is fluent.

The researchers summarized it clearly:

“Affective tone and emotional expression remain reliable tells.”


2. The Toxicity Paradox

The paper tested nine popular open-weight models, including Llama 3.1 (8B and 70B), Mistral 7B, Qwen, Gemma, and DeepSeek-R1 distill variants.

All models had significantly lower toxicity scores than real human replies on every platform tested.
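
The article does not name the toxicity scorer used in the paper; as one illustration, an off-the-shelf model such as Detoxify (an assumed choice, not necessarily the study's tool) could be used to compare average toxicity across the two sets of replies.

```python
# Illustrative toxicity comparison; Detoxify is one off-the-shelf scorer,
# not necessarily the one used in the study.
from statistics import mean

from detoxify import Detoxify


def mean_toxicity(replies, scorer=None):
    """Average toxicity score (0-1) over a list of reply texts."""
    scorer = scorer or Detoxify("original")
    return mean(scorer.predict(list(replies))["toxicity"])


# Pattern reported by the study (human_replies / ai_replies are placeholder names):
# mean_toxicity(ai_replies) < mean_toxicity(human_replies) on every platform tested.
```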

The counterintuitive twist:

Trying to make models more human-like made them less accurate semantically.

When optimized for realism, replies drifted further from what humans actually said.
When optimized for semantic similarity, they became easier to detect as AI.

The two goals — sounding human and matching human content — turned out to be competing objectives, not aligned ones.
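
To make that tension concrete, here is a hedged sketch of how the "matching human content" axis might be measured with sentence embeddings; the sentence-transformers library and the MiniLM model are assumed choices, and detectability would be something like the classifier sketched earlier.

```python
# Sketch of the "semantic similarity" axis; model choice and names are assumptions.
from sentence_transformers import SentenceTransformer, util

_embedder = SentenceTransformer("all-MiniLM-L6-v2")


def semantic_similarity(generated_reply: str, real_human_reply: str) -> float:
    """Cosine similarity between a generated reply and what the human actually wrote."""
    emb = _embedder.encode([generated_reply, real_human_reply], convert_to_tensor=True)
    return float(util.cos_sim(emb[0], emb[1]))


# The competing axis ("sounding human") could be 1 - detection_accuracy(...) from the
# earlier sketch; restated in these terms, prompting that raises one score tends to
# lower the other.
```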


3. Bigger Isn’t Better, and Instruction Tuning Makes It Worse

One of the most striking conclusions:

Large models were not better at hiding.

Llama 3.1 70B performed no better than the 8B version — sometimes worse.

Even more surprisingly:

Instruction-tuned models were worse at mimicking humans.

This has a clear logic:
Instruction tuning teaches the model to be helpful, polite, aligned, and safe — precisely the opposite of how people behave in open social media discussions.

Base models (non-instruction-tuned) were consistently harder to identify.


4. Platform Differences: Twitter Is Easiest to Fake, Reddit the Hardest

The researchers observed clear platform-specific patterns:

  • Twitter/X: AI most successful at blending in
  • Bluesky: moderate detectability
  • Reddit: easiest platform for classifiers to flag AI replies

This reflects differences in training data and conversational style.
Twitter’s brevity and high noise-to-signal ratio favor models trying to hide.

Reddit’s longer, more context-heavy conversations expose even subtle stylistic artifacts.


5. Simple Tricks Work Better Than Complex Optimization

The researchers tried several optimization strategies:

What worked:

  • providing specific examples of the user’s past posts
  • retrieving relevant context
  • minimizing structural mismatches (length, complexity)

What failed:

  • personality descriptions
  • fine-tuning for mimicry
  • sophisticated optimization routines

Sometimes the simplest “few-shot” examples did more to obscure the model’s signature than advanced methods.
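
As a rough sketch of that simple trick, the snippet below prepends a handful of the target user's real past posts as few-shot examples and matches the user's typical reply length; the prompt wording and function name are assumptions, since the study's exact format is not given here.

```python
# Few-shot prompt built from a user's past posts (illustrative format, not the paper's).
def build_mimicry_prompt(past_posts, thread_context, max_examples=5):
    """Assemble a prompt that shows the model how this specific user actually writes."""
    examples = "\n".join(f"- {p}" for p in past_posts[:max_examples])
    # Matching typical reply length addresses the "structural mismatch" point above.
    target_len = sum(len(p.split()) for p in past_posts) // max(len(past_posts), 1)

    return (
        "Here are recent posts by the user you are imitating:\n"
        f"{examples}\n\n"
        "Reply to the following thread in the same voice, "
        f"in roughly {target_len} words:\n"
        f"{thread_context}\n"
        "Reply:"
    )
```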


6. Broader Implications: Authenticity Is Harder Than Intelligence

The study highlights a fundamental tension in LLM development:

Human-like style and semantic accuracy are not the same — and may not even be compatible.

Even as models become more capable, they remain distinctly artificial in one key domain:

spontaneous emotional behavior

Humans online display irregular affect, impulsive tone shifts, flashes of anger or sarcasm, and general unpredictability.
AI models cannot (and are not allowed to) replicate these reliably.

This raises questions for:

  • AI detection frameworks
  • platform moderation
  • disinformation monitoring
  • AI-human interaction research
  • future model alignment strategies

And it also contains an ironic twist:

The more “aligned” and safe models become, the easier they are to detect.

In other words, the very mechanisms designed to civilize AI ultimately expose it.


Conclusion

The study’s central insight is surprisingly intuitive:

Intelligence is easy to fake.
Humanity is hard.

People are inconsistent, emotional, sometimes toxic, often contradictory — and AI simply isn’t allowed to be.

Until models can emulate the full spectrum of human affect (including the messy parts), they’ll continue to stand out in online conversations, no matter how fluent, structured, or intelligent they sound.

By Pressi Editor
