The Mirror Loop
Recursive Non-Convergence in Generative Reasoning Systems

Under review

[Figure: The Mirror Loop visualization]
Summary

This paper examines why large language models stall when they repeatedly evaluate their own outputs. At first the text shifts. Then the changes shrink. Soon the system begins rewriting its own phrasing rather than revising its beliefs. It looks like reflection but functions like recursion. The model treats its previous output as evidence, but since no new information ever enters the sequence, the result is conservation of the original uncertainty. The paper calls this pattern the mirror loop.

The study runs 144 reasoning sequences across three providers and shows the same collapse everywhere. Early iterations still contain informational movement. By the middle of the chain, drift falls toward zero. When a single grounding step is added, informational change rebounds. That contrast reveals the hinge: improvement only happens when the system touches something external. Without grounding, the model becomes fluent at repeating itself while believing it is refining its reasoning. The paper explains how this collapse works, how to detect it, and why it matters for any architecture that relies on recursive self-correction.

Abstract

Large language models are often described as capable of reflective reasoning, yet recursive self-evaluation without external feedback frequently yields reformulation rather than progress. We test this prediction in a cross-provider study of 144 reasoning sequences across three models (OpenAI GPT-4o-mini, Anthropic Claude 3 Haiku, and Google Gemini 2.0 Flash) and four task families (arithmetic, code, explanation, reflection), each iterated ten times under two conditions: ungrounded self-critique and a minimal grounding intervention (a single verification step at iteration three). Mean informational change (ΔI, measured via normalized edit distance) declined by 55 percent from early (0.193) to late (0.087) iterations in ungrounded runs, with consistent patterns across all three providers. Grounded runs showed a 28 percent rebound in informational change immediately after the intervention and sustained non-zero variance thereafter. Complementary measures—n-gram novelty, embedding drift, and character-level entropy—converged on the same pattern: reflection without contact tends toward informational closure. We interpret this as evidence for a structural limit on self-correction in generative reasoning: without an exchange of information with an independent verifier or environment, recursive inference approaches an attractor state of epistemic stasis. Minimal grounding functions as dissipative coupling, reintroducing informational flux. The cross-architecture consistency suggests the mirror loop arises from shared autoregressive training objectives rather than provider-specific alignment schemes. The results delineate when reflection is performative rather than epistemic and motivate design principles for grounded, cooperative reasoning. Materials and code are publicly available.
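The abstract defines ΔI only as "normalized edit distance" between successive iterations. A minimal sketch of one plausible implementation — Levenshtein distance divided by the longer string's length, which is an assumption, since the paper's exact normalization is not specified here:

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance (two-row variant).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def delta_i(prev_text: str, next_text: str) -> float:
    """Informational change between successive iterations,
    normalized to [0, 1]. Identical texts give 0.0."""
    longest = max(len(prev_text), len(next_text))
    return levenshtein(prev_text, next_text) / longest if longest else 0.0
```

Under this reading, a chain that has entered the mirror loop shows `delta_i` values near zero late in the sequence (the reported late-iteration mean of 0.087), while grounded runs rebound after the verification step.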

Why It Matters

A model that tries to correct itself without new information is trapped in its own frame. It repeats patterns, smooths language, and becomes more confident while making no epistemic progress. The output looks thoughtful. The change is cosmetic. For systems built on recursive refinement, this creates a structural blind spot. Developers may see fluent rewriting and assume improvement. Users may see polished explanations and interpret them as deeper reasoning. Both impressions are false.

The risk is subtle and grows through repetition. A model in a mirror loop becomes a closed system. It cannot update its uncertainty. It cannot reduce error. It can only rearrange it. When this behavior powers agents, planning systems, or alignment workflows, the gap between apparent improvement and real improvement becomes an operational hazard. The paper shows how to detect these loops early and why grounded reasoning must be a design requirement.
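One way to operationalize early detection — a sketch, not the paper's published detector — is to watch a sliding window of per-iteration change scores (ΔI or any novelty measure) and flag the chain once change stays below a floor. The threshold and window size below are illustrative choices, not calibrated values from the study:

```python
from collections import deque

def mirror_loop_monitor(threshold: float = 0.05, window: int = 3):
    """Return an observer that reports True once the last `window`
    informational-change scores all fall below `threshold` —
    a simple stall detector for recursive refinement chains."""
    recent = deque(maxlen=window)

    def observe(delta: float) -> bool:
        recent.append(delta)
        return len(recent) == window and max(recent) < threshold

    return observe

# Example: a chain whose informational change collapses mid-sequence.
observe = mirror_loop_monitor()
deltas = [0.21, 0.18, 0.12, 0.04, 0.03, 0.02, 0.01]
stalled_at = next(i for i, d in enumerate(deltas) if observe(d))  # == 5
```

A monitor like this costs almost nothing per iteration, which is what makes early detection practical inside agent loops: the system can trigger a grounding step the moment the window closes, rather than burning the remaining iterations on reformulation.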

Key Ideas
  • Self-critique without new evidence becomes recursive reformulation
  • Informational change declines predictably across iterations
  • Models reuse their own text as substrate, which preserves uncertainty
  • Fluency increases even while reasoning stagnates
  • A single grounding step breaks the loop and restores movement
  • The effect appears across providers and architectures
  • Reflection only becomes reasoning when it encounters constraint
  • Grounding is not optional; it is the mechanism that prevents epistemic closure
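The grounded condition can be sketched as an otherwise closed self-critique loop with a single external check injected at a fixed iteration. The `critique` and `verify` functions below are hypothetical stand-ins — a real system would call a model and an independent tool or checker — so this shows only the control flow, not the paper's implementation:

```python
def refine_with_grounding(draft: str, iterations: int = 10,
                          ground_at: int = 3) -> list[str]:
    """Skeleton of a grounded refinement chain: ungrounded
    self-critique everywhere except one verification step at
    iteration `ground_at` (the paper's minimal intervention)."""
    def critique(text: str) -> str:
        # Placeholder self-revision: a real system would prompt
        # the model to critique and rewrite its own output.
        return text.strip()

    def verify(text: str) -> str:
        # Placeholder grounding step: a real system would execute
        # code, query a tool, or consult an independent verifier.
        return text + " [verified]"

    history = [draft]
    for i in range(1, iterations + 1):
        current = critique(history[-1])
        if i == ground_at:
            current = verify(current)  # the single grounding injection
        history.append(current)
    return history
```

The design point is that grounding is positional, not pervasive: one contact with something outside the model's own text is enough, per the study, to reintroduce informational flux for the remaining iterations.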