feat(content-router): accept any real compression (remove min-savings floor) #1771
Merged
Conversation
… floor) The acceptance gate rejected compressions saving <15% (min_ratio 0.85 at low context pressure, 0.65 under pressure) — a crude proxy for "big enough to justify busting the prefix cache." It dropped genuine token savings, including lossless code/log folds that shrink <15% (the ratio_too_high rejects). Set min_ratio to 1.0 at every pressure: accept ANY real shrink (ratio < 1.0); any token saved is worth taking. The two guards that actually matter are untouched — the reversibility gate keeps lossy-unmarked tool output verbatim (accuracy), and the opt-in net-cost policy (HEADROOM_NET_COST_POLICY=1) precisely accounts for cache-bust economics when enabled. Lower the values to restore a savings floor.
chopratejas added a commit that referenced this pull request on Jul 3, 2026
Unit mismatch: the apply() acceptance gate computes compression_ratio from len(text.split()) (word count), but a lossless search/log fold cuts TOKENS by collapsing a repeated path prefix into one heading — word count stays flat or rises (the heading adds a word). So the gate saw ratio >= 1.0 and discarded every free, recoverable win as ratio_too_high. Raising the floor to 1.0 (#1771) didn't help; the word-ratio was already >= 1.0. Measure lossless results (strategy_chain has a lossless_* entry) by REAL TOKEN count via the tokenizer already in scope — not words, not bytes — so a fold is accepted iff it genuinely reduces tokens. Gate + result cache use this ratio. Lossy strategies are unchanged (word count tracks their savings) and the reversibility gate is untouched (LOG/SEARCH/DIFF aren't lossy-unmarked). The excluded and bash-search paths already bypass this gate; this fixes the main strategy dispatch. Regression test drives the full router.apply() path and asserts fewer TOKENS (compress()/_apply_strategy_to_content bypass the gate, which is why prior unit tests missed it).
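The unit mismatch described in this commit message can be reproduced with a toy example. This is only a sketch: `fold_search_output` is a hypothetical stand-in for the real lossless fold, the file path is made up, and `count_tokens` is a crude regex tokenizer used here in place of the project's actual tokenizer.

```python
import re


def fold_search_output(lines: list[str]) -> str:
    """Toy lossless fold (hypothetical): hoist a shared 'path:' prefix into one heading.

    Assumes each line looks like 'path:lineno: text'. If the lines do not all
    share the first line's path, the input is returned unchanged.
    """
    path = lines[0].split(":", 1)[0]
    if not all(line.startswith(path + ":") for line in lines):
        return "\n".join(lines)
    body = [line[len(path) + 1:] for line in lines]
    return "\n".join([path, *body])


def count_words(text: str) -> int:
    """Whitespace word count, the unit the gate used before the fix."""
    return len(text.split())


def count_tokens(text: str) -> int:
    """Crude stand-in tokenizer (an assumption): word runs or single punctuation marks."""
    return len(re.findall(r"\w+|[^\w\s]", text))


lines = [
    "pkg/module/file.py:10: alpha",
    "pkg/module/file.py:42: beta",
    "pkg/module/file.py:57: gamma",
]
original = "\n".join(lines)
folded = fold_search_output(lines)

print(count_words(original), count_words(folded))    # 6 7
print(count_tokens(original), count_tokens(folded))  # 33 16
```

Under these assumptions the word ratio is 7/6 (the heading adds a word), so a word-based gate rejects the fold, while the token ratio is 16/33 and a token-based gate accepts it.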
chopratejas added a commit that referenced this pull request on Jul 3, 2026
…ate (#1772)

## Description

Unit-mismatch bug in the compression acceptance gate. `router.apply()` computes `compression_ratio` from `len(text.split())` (word count), but a **lossless** search/log fold (`compact_lossless`) saves **bytes** by collapsing a repeated path prefix into a single heading, so word count stays flat or even *rises* (the heading adds a word). So the gate saw `ratio ≥ 1.0` and discarded every free, byte-recoverable win as `ratio_too_high`. (Raising the floor to 1.0 in #1771 did **not** fix this: the word-ratio was already ≥ 1.0.)

Measure lossless results (those whose `strategy_chain` carries a `lossless_*` entry) by **byte ratio** at the gate and in the result cache, which is the real saving. Lossy strategies are unchanged (word count tracks their token savings), and the reversibility gate is untouched (`LOG`/`SEARCH`/`DIFF` aren't in `LOSSY_UNMARKED_STRATEGIES`). The excluded-tool and bash-search paths already bypass this gate via `continue`; this fixes the **main strategy dispatch** (the lossless-mode `LOG`/`SEARCH`/`DIFF` path).

Follow-up to #1771.

Closes #

## Type of Change

- [x] Bug fix (non-breaking change that fixes an issue)

## Changes Made

- At the `apply()` acceptance gate: compute `accept_ratio` = byte ratio for lossless results (`strategy_chain` has `lossless_*`), else the existing word ratio. Gate + result-cache entry now use `accept_ratio`.
- Added an end-to-end regression test that drives the full `router.apply()` path.

## Testing

- [x] Unit tests pass (`pytest`)
- [x] Linting passes (`ruff check`)
- [x] Type checking passes (`mypy headroom`)
- [x] New tests added for new functionality
- [ ] Manual testing performed

### Test Output

```text
tests/test_lossless_mode.py::test_router_apply_accepts_lossless_search_byte_measured PASSED
tests/test_content_router_tool_role_reversibility.py .......... (10 passed)
# broader (pre-move) sweep on the same change:
tests/test_lossless_mode.py / test_transforms/test_content_router.py / test_lossless_excluded_compaction.py / test_bash_search_lossless_fold.py -> 121 passed
ruff check headroom/transforms/content_router.py -> All checks passed!
mypy headroom/transforms/content_router.py -> Success: no issues found
```

## Real Behavior Proof

- Environment: local worktree, Python 3.12, `PYTHONPATH` pinned to the branch.
- Exact command / steps: new regression test constructs a single-file grep result, runs it through `ContentRouter(lossless=True).apply(...)`, and asserts the tool output is byte-smaller and recovers exactly (`search_unheading(out) == original`).
- Observed result: before this fix the fold was rejected (`out == original`, counted `ratio_too_high`); after, it's applied (`len(out) < len(original)`, marker-free, byte-exact recovery). The test also asserts the fold's word count is ≥ the original's, so the test is meaningless if "fixed" by word count.
- Not tested: no live end-to-end proxy run; validated via the full `apply()` path in unit tests.

## Review Readiness

- [x] I have performed a self-review
- [x] This PR is ready for human review

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective
- [x] New and existing unit tests pass locally with my changes
- [ ] I have updated the CHANGELOG.md if applicable (handled at release time)

## Additional Notes

Why prior tests missed it: `compress()` and `_apply_strategy_to_content` return the folded result directly and never touch the `apply()` acceptance gate, so the existing lossless-mode unit tests (which call those) passed while the real proxy path silently discarded the fold. The new test exercises `apply()` end-to-end.
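The `accept_ratio` selection described in this commit might look roughly like the sketch below. It is not the project's code: the function names other than `accept_ratio`, the representation of `strategy_chain` as a list of strings, and the entry names `"lossless_search"` and `"search"` are assumptions made for illustration.

```python
def word_ratio(original: str, compressed: str) -> float:
    """Ratio of whitespace-separated word counts (the pre-existing measure)."""
    return len(compressed.split()) / max(len(original.split()), 1)


def byte_ratio(original: str, compressed: str) -> float:
    """Ratio of UTF-8 byte lengths."""
    return len(compressed.encode("utf-8")) / max(len(original.encode("utf-8")), 1)


def accept_ratio(original: str, compressed: str, strategy_chain: list[str]) -> float:
    """Byte ratio when the chain has a 'lossless_*' entry, word ratio otherwise."""
    is_lossless = any(step.startswith("lossless_") for step in strategy_chain)
    if is_lossless:
        return byte_ratio(original, compressed)
    return word_ratio(original, compressed)


def gate(original: str, compressed: str, strategy_chain: list[str],
         min_ratio: float = 1.0) -> str:
    """Keep the compressed text only if its ratio is strictly below min_ratio."""
    ratio = accept_ratio(original, compressed, strategy_chain)
    return compressed if ratio < min_ratio else original


original = "a/b/c.py:1: x\na/b/c.py:2: y"
folded = "a/b/c.py\n1: x\n2: y"

# Same fold, two measurements: accepted by bytes, rejected by words.
print(gate(original, folded, ["lossless_search"]) == folded)  # True
print(gate(original, folded, ["search"]) == original)         # True
```

Here the fold shrinks 27 bytes to 18 while the word count goes from 4 to 5, which is the case the regression test is described as covering.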
Description
The compression acceptance gate rejected any compression that saved less than ~15% (`min_ratio` interpolated from 0.85 at low context pressure to 0.65 under pressure). That floor was a crude proxy for "big enough to justify busting the prefix cache," but it dropped genuine token savings, notably lossless code/log folds that shrink <15% (the `ratio_too_high` rejections).

This makes the gate accept any real shrink (`ratio < 1.0`): any token saved is worth taking. The two guards that actually protect correctness are untouched:

- Reversibility gate: keeps lossy-unmarked tool output verbatim (accuracy).
- Net-cost policy (`HEADROOM_NET_COST_POLICY=1`, opt-in): precisely accounts for the prefix-cache-bust economics (savings × expected-reads vs one-time suffix re-write) when a session wants that protection.

Lowering the two values back to `0.85`/`0.65` restores the savings floor.

Closes #
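As a rough illustration of the gate change, the sketch below models the threshold as a linear interpolation between the two config values. `GateConfig`, `min_ratio_for`, and `accepts` are hypothetical names, and both the linear form and the `[0, 1]` pressure scale are assumptions; only the field names and the old/new values come from this PR.

```python
from dataclasses import dataclass


@dataclass
class GateConfig:
    """Hypothetical stand-in for the two ContentRouterConfig fields changed here."""
    min_ratio_relaxed: float = 1.0     # was 0.85 before this PR
    min_ratio_aggressive: float = 1.0  # was 0.65 before this PR


def min_ratio_for(pressure: float, cfg: GateConfig) -> float:
    """Interpolate the threshold linearly; pressure is clamped to [0, 1] (assumed scale)."""
    p = min(max(pressure, 0.0), 1.0)
    return cfg.min_ratio_relaxed + p * (cfg.min_ratio_aggressive - cfg.min_ratio_relaxed)


def accepts(compression_ratio: float, pressure: float, cfg: GateConfig) -> bool:
    """Accept only when the ratio is strictly below the interpolated threshold."""
    return compression_ratio < min_ratio_for(pressure, cfg)


old = GateConfig(min_ratio_relaxed=0.85, min_ratio_aggressive=0.65)
new = GateConfig()

print(accepts(0.92, pressure=0.0, cfg=old))  # False: an 8% saving was below the old floor
print(accepts(0.92, pressure=0.0, cfg=new))  # True: any real shrink is accepted
print(accepts(1.0, pressure=0.0, cfg=new))   # False: no shrink, still rejected
```

With both values at 1.0 the interpolation is flat, so context pressure no longer affects acceptance.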
Type of Change
Changes Made
- `ContentRouterConfig.min_ratio_relaxed`: `0.85 → 1.0`
- `ContentRouterConfig.min_ratio_aggressive`: `0.65 → 1.0`
- … `compression_ratio < 1.0` at every context pressure; reversibility + net-cost guards unchanged.

Testing

- Unit tests pass (`pytest`)
- Linting passes (`ruff check`)
- Type checking passes (`mypy headroom`)
- … `min_ratio` values and are unaffected)

Test Output

Real Behavior Proof

- … `PYTHONPATH` pinned to the branch checkout.
- … `min_ratio`) unaffected; no default-floor test regressed; blocks that previously produced `ratio_too_high` at ratios in `[0.85, 1.0)` are now accepted.
- … `router:*` acceptances, fewer `ratio_too_high`) is inferred from the gate logic + suite.

Review Readiness
Checklist
Additional Notes
Deliberate tradeoff (discussed and chosen): without the net-cost policy enabled, accepting sub-15% wins can be net-negative on prompt-cached sessions, because compressing a block invalidates the cached suffix (a one-time re-write, at 1.25× on Anthropic). If that shows up in practice, enable `HEADROOM_NET_COST_POLICY=1` (the precise economics guard) or restore a floor by lowering the two `min_ratio_*` values.
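To make the tradeoff concrete, here is a back-of-the-envelope version of "savings × expected-reads vs one-time suffix re-write". It is a simplified sketch, not the actual net-cost policy: `net_saving` and its parameters are hypothetical, costs are expressed in units of one base-price input token, and each saved token is valued at full base price per read, which is an optimistic assumption.

```python
def net_saving(tokens_saved: int, expected_reads: float, suffix_tokens: int,
               write_multiplier: float = 1.25) -> float:
    """Net benefit of compressing a block, in base-price input-token units.

    benefit       = tokens saved on each of the expected future reads
    one_time_cost = re-writing the invalidated cached suffix once, at the
                    cache-write multiplier (1.25x per the note above)
    A positive result means the compression pays for the cache bust.
    """
    benefit = tokens_saved * expected_reads
    one_time_cost = suffix_tokens * write_multiplier
    return benefit - one_time_cost


# Small win in front of a long cached suffix: net-negative.
print(net_saving(tokens_saved=40, expected_reads=5, suffix_tokens=3000))    # -3550.0

# Large win with the same suffix: net-positive.
print(net_saving(tokens_saved=2000, expected_reads=5, suffix_tokens=3000))  # 6250.0
```

The first case is the situation the note warns about: a sub-15% win can cost more in suffix re-writes than it saves unless the block is read many times.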