feat(content-router): accept any real compression (remove min-savings floor) by chopratejas · Pull Request #1771 · headroomlabs-ai/headroom

chopratejas · 2026-07-03T21:25:40Z

Description

The compression acceptance gate rejected any compression that saved less than ~15% (min_ratio interpolated 0.85 at low context pressure → 0.65 under pressure). That floor was a crude proxy for "big enough to justify busting the prefix cache," but it dropped genuine token savings — notably lossless code/log folds that shrink <15% (the ratio_too_high rejections).

This makes the gate accept any real shrink (ratio < 1.0): any token saved is worth taking. The two guards that actually protect correctness are untouched:

Reversibility gate — lossy, unmarked tool output still stays verbatim (accuracy; Compressing tool_result content corrupts command output and makes agents act on false information #1307).
Net-cost policy (HEADROOM_NET_COST_POLICY=1, opt-in) — precisely accounts for the prefix-cache-bust economics (savings × expected-reads vs one-time suffix re-write) when a session wants that protection.

Lowering the two values back to 0.85/0.65 restores the savings floor.

Closes #

Type of Change

New feature (non-breaking change that adds functionality)
Performance improvement

Changes Made

ContentRouterConfig.min_ratio_relaxed: 0.85 → 1.0
ContentRouterConfig.min_ratio_aggressive: 0.65 → 1.0
Gate now accepts any compression_ratio < 1.0 at every context pressure; reversibility + net-cost guards unchanged.

Testing

Unit tests pass (pytest)
Linting passes (ruff check)
Type checking passes (mypy headroom)
New tests added for new functionality (existing gate-mechanism tests cover it; they pass explicit min_ratio values and are unaffected)
Manual testing performed

Test Output

# content-router + compression suites (default-config paths):
tests/test_transforms/test_content_router.py ......... 139 passed
# broad compression sweep (-k compress/router/crush/kompress/lossless/ccr/savings/...):
1 failed, 1940 passed, 54 skipped in 181.25s
#   the 1 failure = test_lossless_mode::test_router_lossless_search_no_marker_and_recoverable
#   — a local fastembed-cache state-leak flake; passes in isolation (1 passed in 3.15s),
#   and is in lossless mode which bypasses this gate entirely.
ruff check headroom/transforms/content_router.py  -> All checks passed!
mypy headroom/transforms/content_router.py         -> Success: no issues found

Real Behavior Proof

Environment: local worktree, Python 3.12, PYTHONPATH pinned to the branch checkout.
Exact command / steps: ran the content-router acceptance-gate suites and a compression-adjacent sweep against the branch; verified the flaky test passes in isolation.
Observed result: gate-mechanism tests (explicit min_ratio) unaffected; no default-floor test regressed; blocks that previously produced ratio_too_high at ratios in [0.85, 1.0) are now accepted.
Not tested: no live end-to-end proxy run was performed for this specific change; the behavioral effect (more router:* acceptances, fewer ratio_too_high) is inferred from the gate logic + suite.

Review Readiness

I have performed a self-review
This PR is ready for human review

Checklist

My code follows the project's style guidelines
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective (existing gate tests cover the mechanism)
New and existing unit tests pass locally with my changes
I have updated the CHANGELOG.md if applicable (handled at release time)

Additional Notes

Deliberate tradeoff (discussed and chosen): without the net-cost policy enabled, accepting sub-15% wins can be net-negative on prompt-cached sessions, because compressing a block invalidates the cached suffix (a one-time re-write, at 1.25× on Anthropic). If that shows up in practice, enable HEADROOM_NET_COST_POLICY=1 (the precise economics guard) or restore a floor by lowering the two min_ratio_* values.

… floor) The acceptance gate rejected compressions saving <15% (min_ratio 0.85 at low context pressure, 0.65 under pressure) — a crude proxy for "big enough to justify busting the prefix cache." It dropped genuine token savings, including lossless code/log folds that shrink <15% (the ratio_too_high rejects). Set min_ratio to 1.0 at every pressure: accept ANY real shrink (ratio < 1.0); any token saved is worth taking. The two guards that actually matter are untouched — the reversibility gate keeps lossy-unmarked tool output verbatim (accuracy), and the opt-in net-cost policy (HEADROOM_NET_COST_POLICY=1) precisely accounts for cache-bust economics when enabled. Lower the values to restore a savings floor.

github-actions · 2026-07-03T21:25:54Z

PR governance

This PR follows the template and is marked ready for human review.

Unit mismatch: the apply() acceptance gate computes compression_ratio from len(text.split()) (word count), but a lossless search/log fold cuts TOKENS by collapsing a repeated path prefix into one heading — word count stays flat or rises (the heading adds a word). So the gate saw ratio >= 1.0 and discarded every free, recoverable win as ratio_too_high. Raising the floor to 1.0 (#1771) didn't help; the word-ratio was already >= 1.0. Measure lossless results (strategy_chain has a lossless_* entry) by REAL TOKEN count via the tokenizer already in scope — not words, not bytes — so a fold is accepted iff it genuinely reduces tokens. Gate + result cache use this ratio. Lossy strategies are unchanged (word count tracks their savings) and the reversibility gate is untouched (LOG/SEARCH/DIFF aren't lossy-unmarked). The excluded and bash-search paths already bypass this gate; this fixes the main strategy dispatch. Regression test drives the full router.apply() path and asserts fewer TOKENS (compress()/_apply_strategy_to_content bypass the gate, which is why prior unit tests missed it).

…ate (#1772) ## Description Unit-mismatch bug in the compression acceptance gate. `router.apply()` computes `compression_ratio` from `len(text.split())` (word count), but a **lossless** search/log fold (`compact_lossless`) saves **bytes** by collapsing a repeated path prefix into a single heading — word count stays flat or even *rises* (the heading adds a word). So the gate saw `ratio ≥ 1.0` and discarded every free, byte-recoverable win as `ratio_too_high`. (Raising the floor to 1.0 in #1771 did **not** fix this — the word-ratio was already ≥ 1.0.) Measure lossless results (those whose `strategy_chain` carries a `lossless_*` entry) by **byte ratio** at the gate and in the result cache — the real saving. Lossy strategies are unchanged (word count tracks their token savings), and the reversibility gate is untouched (`LOG`/`SEARCH`/`DIFF` aren't in `LOSSY_UNMARKED_STRATEGIES`). The excluded-tool and bash-search paths already bypass this gate via `continue`; this fixes the **main strategy dispatch** (the lossless-mode `LOG`/`SEARCH`/`DIFF` path). Follow-up to #1771. Closes # ## Type of Change - [x] Bug fix (non-breaking change that fixes an issue) ## Changes Made - At the `apply()` acceptance gate: compute `accept_ratio` = byte ratio for lossless results (`strategy_chain` has `lossless_*`), else the existing word ratio. Gate + result-cache entry now use `accept_ratio`. - Added an end-to-end regression test that drives the full `router.apply()` path. ## Testing - [x] Unit tests pass (`pytest`) - [x] Linting passes (`ruff check`) - [x] Type checking passes (`mypy headroom`) - [x] New tests added for new functionality - [ ] Manual testing performed ### Test Output ```text tests/test_lossless_mode.py::test_router_apply_accepts_lossless_search_byte_measured PASSED tests/test_content_router_tool_role_reversibility.py .......... (10 passed) # broader (pre-move) sweep on the same change: tests/test_lossless_mode.py / test_transforms/test_content_router.py / test_lossless_excluded_compaction.py / test_bash_search_lossless_fold.py — 121 passed ruff check headroom/transforms/content_router.py -> All checks passed! mypy headroom/transforms/content_router.py -> Success: no issues found ``` ## Real Behavior Proof - Environment: local worktree, Python 3.12, `PYTHONPATH` pinned to the branch. - Exact command / steps: new regression test constructs a single-file grep result, runs it through `ContentRouter(lossless=True).apply(...)`, and asserts the tool output is byte-smaller and recovers exactly (`search_unheading(out) == original`). - Observed result: before this fix the fold was rejected (`out == original`, counted `ratio_too_high`); after, it's applied (`len(out) < len(original)`, marker-free, byte-exact recovery). The test also asserts the fold's word count is ≥ the original's, so the test is meaningless if "fixed" by word count. - Not tested: no live end-to-end proxy run; validated via the full `apply()` path in unit tests. ## Review Readiness - [x] I have performed a self-review - [x] This PR is ready for human review ## Checklist - [x] My code follows the project's style guidelines - [x] I have performed a self-review of my code - [x] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective - [x] New and existing unit tests pass locally with my changes - [ ] I have updated the CHANGELOG.md if applicable (handled at release time) ## Additional Notes Why prior tests missed it: `compress()` and `_apply_strategy_to_content` return the folded result directly and never touch the `apply()` acceptance gate, so the existing lossless-mode unit tests (which call those) passed while the real proxy path silently discarded the fold. The new test exercises `apply()` end-to-end.

chopratejas requested review from DevanshiVyas and JerrettDavis as code owners July 3, 2026 21:25

github-actions Bot added the status: ready for review Pull request body is complete and the author marked it ready for human review label Jul 3, 2026

chopratejas merged commit 6c31db9 into main Jul 3, 2026
29 checks passed

chopratejas mentioned this pull request Jul 3, 2026

fix(content-router): token-measure lossless folds at the acceptance gate #1772

Merged

16 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(content-router): accept any real compression (remove min-savings floor)#1771

feat(content-router): accept any real compression (remove min-savings floor)#1771
chopratejas merged 1 commit into
mainfrom
tejas/accept-any-compression

chopratejas commented Jul 3, 2026

Uh oh!

github-actions Bot commented Jul 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

chopratejas commented Jul 3, 2026

Description

Type of Change

Changes Made

Testing

Test Output

Real Behavior Proof

Review Readiness

Checklist

Additional Notes

Uh oh!

github-actions Bot commented Jul 3, 2026

PR governance

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant