Research

The Research

Naive context compression can preserve answerability while silently dropping the reasoning pivot. We measured the failure, proved when it occurs, and built a guarded policy that eliminated it at every retention fraction tested on the committed witness.


Key Findings

What we found

21.95%

Mirage Rate

In more than 1 in 5 compression events, naive recency scoring preserves a valid-looking response while silently losing the pivot that the answer depends on.

0

False Positives

The tropical L2 guarded policy produces zero false positives on the committed witness. Every protected arc is retained at every tested retention fraction.

54.9%

Trap Rate

More than half of all compression events in the danger zone result in a streaming oscillation trap — the model commits to a path it cannot correct without the dropped context.

Ω(k)

State Bound

The tropical semiring formulation proves a tight Ω(k) lower bound on the number of feasible states needed to preserve k protected arcs under any lossless compaction policy.


Methodology

How we measured it

All reported rates are exact proportions over a deterministic replay witness with n=3 variants per policy and retention fraction. Each variant is independently seeded and replayed from committed context snapshots — no sampling, no approximation.
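As a minimal sketch of what "exact proportion" means here, the snippet below computes a rate as a rational number from per-variant outcomes. The `replay` dictionary, its policy labels, and the 0.5 retention fraction are hypothetical placeholders, not the committed witness data.

```python
from fractions import Fraction


def exact_rate(outcomes):
    """Return the exact proportion of truthy outcomes as a Fraction."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one replayed variant")
    return Fraction(sum(bool(o) for o in outcomes), len(outcomes))


# Hypothetical replay results: one boolean per variant (n=3),
# True when the pivot survived compression in that variant.
replay = {
    ("naive_recency", 0.5): [False, False, False],
    ("guarded", 0.5): [True, True, True],
}

# One exact rate per (policy, retention fraction) cell.
rates = {cell: exact_rate(results) for cell, results in replay.items()}
```

Keeping rates as `Fraction` values avoids floating-point rounding, so two cells can be compared for exact equality.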

We tested five retention levels spanning the compression pressure range where the phenomenon emerges. At each level, both the naive recency policy and the tropical L2 guarded policy are evaluated against the same committed context, producing directly comparable pivot preservation rates.

Naive Recency Fails

Scores messages by recency alone. Preserves the most recent exchanges, which are often valid and coherent — but silently drops the governing pivot when it falls outside the retention window.
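The behavior described above can be sketched in a few lines. This is an illustration under stated assumptions, not the evaluated implementation: the message schema (`id`, `text`, `pivot`), the function name `naive_recency_compress`, and the example context are all hypothetical.

```python
def naive_recency_compress(messages, retention):
    """Keep the most recent fraction of messages, ignoring what they contain."""
    if not 0 < retention <= 1:
        raise ValueError("retention must be in (0, 1]")
    keep = max(1, int(len(messages) * retention))
    return messages[-keep:]


# Hypothetical context: the pivot is stated early and later turns depend on it.
context = [
    {"id": 0, "text": "constraint: budget is capped", "pivot": True},
    {"id": 1, "text": "option A details", "pivot": False},
    {"id": 2, "text": "option B details", "pivot": False},
    {"id": 3, "text": "final answer: choose B", "pivot": False},
]

kept = naive_recency_compress(context, retention=0.5)

# The retained tail still reads as a coherent answer, but the pivot is gone.
pivot_preserved = any(m["pivot"] for m in kept)
```

In this toy context the two retained messages include the final answer but not the constraint it was derived from, which is the failure pattern the policy is prone to.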

  • Pivot preservation: 0.0 at critical retention fractions
  • Produces mirage: answer is valid, reasoning is lost
  • No contract enforcement and no awareness of protected arcs

Tropical L2 Guarded Holds

Uses the tropical semiring distance d_pre(π₀, Sₖ) to identify the minimum feasible set. Enforces a contract that no protected arc may be dropped before the pivot budget is exhausted.

  • Pivot preservation: 1.0 at all tested retention fractions
  • Zero false positives on the committed witness
  • Contract satisfaction recorded in portable certificate
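A simplified stand-in for the guarded behavior is sketched below. It does not compute the tropical distance d_pre; it only shows the contract idea of retaining protected messages first and refusing to compress when the budget cannot hold them. The function name, the `protected_ids` parameter, and the message schema are assumptions made for illustration.

```python
def guarded_compress(messages, retention, protected_ids):
    """Keep protected messages first, then fill the remaining budget by recency."""
    budget = max(1, int(len(messages) * retention))
    protected = [m for m in messages if m["id"] in protected_ids]
    if len(protected) > budget:
        # Assumed contract behavior: fail loudly rather than drop a protected message.
        raise ValueError("budget too small to retain every protected message")
    remaining = budget - len(protected)
    others = [m for m in messages if m["id"] not in protected_ids]
    # Guard the zero case: others[-0:] would return the whole list.
    recent = others[-remaining:] if remaining else []
    kept_ids = {m["id"] for m in protected + recent}
    # Return the kept messages in their original order.
    return [m for m in messages if m["id"] in kept_ids]


# Hypothetical context with one protected pivot at the start.
context = [
    {"id": 0, "text": "constraint: budget is capped"},
    {"id": 1, "text": "option A details"},
    {"id": 2, "text": "option B details"},
    {"id": 3, "text": "final answer: choose B"},
]

kept = guarded_compress(context, retention=0.5, protected_ids={0})
```

With the same 50% budget that a recency-only policy would spend on the last two messages, this sketch keeps the pivot and the most recent message instead.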

Mathematical Foundations

The formal structure

The validity mirage is not an empirical coincidence — it follows from the structure of greedy selection under recency scoring. The tropical semiring provides the right algebraic frame to state the guarantees precisely.

Core Contract

d_pre(π₀, Sₖ) ≥ k

The tropical L2 distance from the pivot π₀ to any feasible compacted state Sₖ must be at least k. This is the minimum separation that guarantees no protected arc is silently reachable by a recency-scored drop.

Frontier Feasibility

W[k] < ∞

The feasibility frontier W[k] is finite for all k bounded by the context length. This ensures the guarded policy always terminates and always produces a valid compacted context.
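For readers new to tropical algebra, the sketch below shows why a finiteness condition is a natural way to express feasibility. It assumes the min-plus convention, where "addition" is min and "multiplication" is ordinary +, so infinity is the additive identity and marks an unreachable state. It does not reproduce the papers' definitions of d_pre or W[k]; the graph and the helper names are hypothetical.

```python
import math

INF = math.inf  # tropical "zero": identity for min, read as "infeasible"


def trop_add(a, b):
    """Tropical addition in the min-plus semiring."""
    return min(a, b)


def trop_mul(a, b):
    """Tropical multiplication in the min-plus semiring."""
    return a + b


def distances_from(source, weights):
    """Shortest distances from source via repeated min-plus relaxation.

    weights[i][j] is the cost of edge i -> j, or INF when there is no edge.
    """
    n = len(weights)
    dist = [0 if i == source else INF for i in range(n)]
    for _ in range(n - 1):
        # One min-plus vector-matrix product, then tropical addition with dist.
        step = [
            min(trop_mul(dist[i], weights[i][j]) for i in range(n))
            for j in range(n)
        ]
        dist = [trop_add(d, s) for d, s in zip(dist, step)]
    return dist


# Hypothetical 4-node graph: 0 -> 1 (cost 1), 1 -> 2 (cost 2), 0 -> 2 (cost 5).
# Node 3 has no incoming edges, so it stays at INF.
weights = [
    [INF, 1, 5, INF],
    [INF, INF, 2, INF],
    [INF, INF, INF, INF],
    [INF, INF, INF, INF],
]

dist = distances_from(0, weights)
```

A finite entry means some path exists; an infinite entry means none does. A condition of the form W[k] < ∞ can be read in the same spirit, as a statement that a feasible state exists.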

Raw Validity

max(decoy, primary)

The validity score reported by naive policies is the maximum of the decoy and primary arc scores. When the primary is dropped, the decoy alone can sustain a high validity score, which produces the mirage.
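A small sketch makes the masking effect concrete. The scores are hypothetical values chosen for illustration, and treating a dropped arc as contributing 0.0 is an assumption.

```python
def raw_validity(decoy_score, primary_score):
    """Validity as the maximum of the decoy and primary arc scores."""
    return max(decoy_score, primary_score)


# Hypothetical scores before compression: both arcs are present.
before = raw_validity(decoy_score=0.85, primary_score=0.9)

# After compression the primary arc is dropped (assumed to score 0.0),
# yet the decoy alone keeps the reported validity high.
after = raw_validity(decoy_score=0.85, primary_score=0.0)
```

Because `max` only reports the stronger arc, the score barely moves when the primary disappears, so validity alone cannot reveal the loss.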

Mirage Gap

validity − pivot_rate

The mirage gap is the difference between reported validity and true pivot preservation rate. A positive gap is the signature of the phenomenon: the model appears correct while the reasoning foundation is absent.
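The gap can be computed per evaluation cell and used as a flag, as in this sketch. The record layout and every number in `cells` are hypothetical and are not taken from the reported results.

```python
def mirage_gap(validity, pivot_rate):
    """Difference between reported validity and true pivot preservation rate."""
    return validity - pivot_rate


def flag_mirages(cells):
    """Return the labels of cells whose mirage gap is positive."""
    return [
        label
        for label, validity, pivot_rate in cells
        if mirage_gap(validity, pivot_rate) > 0
    ]


# Hypothetical cells: (label, reported validity, pivot preservation rate).
cells = [
    ("naive @ low retention", 0.85, 0.0),
    ("naive @ full retention", 1.0, 1.0),
    ("guarded @ low retention", 1.0, 1.0),
]

flagged = flag_mirages(cells)
```

Only the first cell is flagged: its validity looks healthy while its pivot preservation rate is zero.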


Papers

The full research record

All papers are published as PDFs and archived with a DOI. The flagship paper contains the formal proofs; the supporting papers establish the components independently.

The Validity Mirage

Flagship · 2025 · Jack Chaudier Gaffney

Introduces the validity mirage: a compression failure mode where naive recency scoring preserves a coherent-looking response while silently dropping the pivot the answer depends on. Proves the Ω(k) state bound and introduces the tropical L2 guarded policy as the corrective.

Continuous Control

Paper 00 · 2025 · Jack Chaudier Gaffney

Establishes the structural regularization framework for continuous control under compression. Shows that without pivot-aware regularization, gradient-based compression degrades control quality in proportion to the mirage gap.

Absorbing States

Paper 01 · 2025 · Jack Chaudier Gaffney

Characterizes absorbing states in greedy search under context compression. Proves that greedy recency selection can enter an absorbing state where no subsequent compression step can recover the dropped pivot without a full context reload.

Streaming Traps

Paper 02 · 2025 · Jack Chaudier Gaffney

Formalizes streaming oscillation traps: cycles in the compression state machine where the model alternates between two partially valid states without converging. Derives the 54.9% trap rate reported in the empirical witness.

Tropical Algebra

Paper i · 2025 · Jack Chaudier Gaffney

Develops the tropical semiring foundations used throughout the research program. Defines the distance function d_pre, establishes the frontier feasibility condition W[k] < ∞, and proves the algebraic properties that make the guarded policy tractable.