Skip to content

feat(content-router): accept any real compression (remove min-savings floor)#1771

Merged
chopratejas merged 1 commit into
mainfrom
tejas/accept-any-compression
Jul 3, 2026
Merged

feat(content-router): accept any real compression (remove min-savings floor)#1771
chopratejas merged 1 commit into
mainfrom
tejas/accept-any-compression

Conversation

@chopratejas

Copy link
Copy Markdown
Collaborator

Description

The compression acceptance gate rejected any compression that saved less than ~15% (min_ratio interpolated 0.85 at low context pressure → 0.65 under pressure). That floor was a crude proxy for "big enough to justify busting the prefix cache," but it dropped genuine token savings — notably lossless code/log folds that shrink <15% (the ratio_too_high rejections).

This makes the gate accept any real shrink (ratio < 1.0): any token saved is worth taking. The two guards that actually protect correctness are untouched:

Lowering the two values back to 0.85/0.65 restores the savings floor.

Closes #

Type of Change

  • New feature (non-breaking change that adds functionality)
  • Performance improvement

Changes Made

  • ContentRouterConfig.min_ratio_relaxed: 0.85 → 1.0
  • ContentRouterConfig.min_ratio_aggressive: 0.65 → 1.0
  • Gate now accepts any compression_ratio < 1.0 at every context pressure; reversibility + net-cost guards unchanged.

Testing

  • Unit tests pass (pytest)
  • Linting passes (ruff check)
  • Type checking passes (mypy headroom)
  • New tests added for new functionality (existing gate-mechanism tests cover it; they pass explicit min_ratio values and are unaffected)
  • Manual testing performed

Test Output

# content-router + compression suites (default-config paths):
tests/test_transforms/test_content_router.py ......... 139 passed
# broad compression sweep (-k compress/router/crush/kompress/lossless/ccr/savings/...):
1 failed, 1940 passed, 54 skipped in 181.25s
#   the 1 failure = test_lossless_mode::test_router_lossless_search_no_marker_and_recoverable
#   — a local fastembed-cache state-leak flake; passes in isolation (1 passed in 3.15s),
#   and is in lossless mode which bypasses this gate entirely.
ruff check headroom/transforms/content_router.py  -> All checks passed!
mypy headroom/transforms/content_router.py         -> Success: no issues found

Real Behavior Proof

  • Environment: local worktree, Python 3.12, PYTHONPATH pinned to the branch checkout.
  • Exact command / steps: ran the content-router acceptance-gate suites and a compression-adjacent sweep against the branch; verified the flaky test passes in isolation.
  • Observed result: gate-mechanism tests (explicit min_ratio) unaffected; no default-floor test regressed; blocks that previously produced ratio_too_high at ratios in [0.85, 1.0) are now accepted.
  • Not tested: no live end-to-end proxy run was performed for this specific change; the behavioral effect (more router:* acceptances, fewer ratio_too_high) is inferred from the gate logic + suite.

Review Readiness

  • I have performed a self-review
  • This PR is ready for human review

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective (existing gate tests cover the mechanism)
  • New and existing unit tests pass locally with my changes
  • I have updated the CHANGELOG.md if applicable (handled at release time)

Additional Notes

Deliberate tradeoff (discussed and chosen): without the net-cost policy enabled, accepting sub-15% wins can be net-negative on prompt-cached sessions, because compressing a block invalidates the cached suffix (a one-time re-write, at 1.25× on Anthropic). If that shows up in practice, enable HEADROOM_NET_COST_POLICY=1 (the precise economics guard) or restore a floor by lowering the two min_ratio_* values.

… floor)

The acceptance gate rejected compressions saving <15% (min_ratio 0.85 at low
context pressure, 0.65 under pressure) — a crude proxy for "big enough to
justify busting the prefix cache." It dropped genuine token savings, including
lossless code/log folds that shrink <15% (the ratio_too_high rejects).

Set min_ratio to 1.0 at every pressure: accept ANY real shrink (ratio < 1.0);
any token saved is worth taking. The two guards that actually matter are
untouched — the reversibility gate keeps lossy-unmarked tool output verbatim
(accuracy), and the opt-in net-cost policy (HEADROOM_NET_COST_POLICY=1)
precisely accounts for cache-bust economics when enabled. Lower the values to
restore a savings floor.
@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

PR governance

This PR follows the template and is marked ready for human review.

@github-actions github-actions Bot added the status: ready for review Pull request body is complete and the author marked it ready for human review label Jul 3, 2026
@chopratejas chopratejas merged commit 6c31db9 into main Jul 3, 2026
29 checks passed
chopratejas added a commit that referenced this pull request Jul 3, 2026
Unit mismatch: the apply() acceptance gate computes compression_ratio from
len(text.split()) (word count), but a lossless search/log fold cuts TOKENS by
collapsing a repeated path prefix into one heading — word count stays flat or
rises (the heading adds a word). So the gate saw ratio >= 1.0 and discarded
every free, recoverable win as ratio_too_high. Raising the floor to 1.0 (#1771)
didn't help; the word-ratio was already >= 1.0.

Measure lossless results (strategy_chain has a lossless_* entry) by REAL TOKEN
count via the tokenizer already in scope — not words, not bytes — so a fold is
accepted iff it genuinely reduces tokens. Gate + result cache use this ratio.
Lossy strategies are unchanged (word count tracks their savings) and the
reversibility gate is untouched (LOG/SEARCH/DIFF aren't lossy-unmarked). The
excluded and bash-search paths already bypass this gate; this fixes the main
strategy dispatch.

Regression test drives the full router.apply() path and asserts fewer TOKENS
(compress()/_apply_strategy_to_content bypass the gate, which is why prior unit
tests missed it).
chopratejas added a commit that referenced this pull request Jul 3, 2026
…ate (#1772)

## Description

Unit-mismatch bug in the compression acceptance gate. `router.apply()`
computes `compression_ratio` from `len(text.split())` (word count), but
a **lossless** search/log fold (`compact_lossless`) saves **bytes** by
collapsing a repeated path prefix into a single heading — word count
stays flat or even *rises* (the heading adds a word). So the gate saw
`ratio ≥ 1.0` and discarded every free, byte-recoverable win as
`ratio_too_high`. (Raising the floor to 1.0 in #1771 did **not** fix
this — the word-ratio was already ≥ 1.0.)

Measure lossless results (those whose `strategy_chain` carries a
`lossless_*` entry) by **byte ratio** at the gate and in the result
cache — the real saving. Lossy strategies are unchanged (word count
tracks their token savings), and the reversibility gate is untouched
(`LOG`/`SEARCH`/`DIFF` aren't in `LOSSY_UNMARKED_STRATEGIES`). The
excluded-tool and bash-search paths already bypass this gate via
`continue`; this fixes the **main strategy dispatch** (the lossless-mode
`LOG`/`SEARCH`/`DIFF` path).

Follow-up to #1771.

Closes #

## Type of Change

- [x] Bug fix (non-breaking change that fixes an issue)

## Changes Made

- At the `apply()` acceptance gate: compute `accept_ratio` = byte ratio
for lossless results (`strategy_chain` has `lossless_*`), else the
existing word ratio. Gate + result-cache entry now use `accept_ratio`.
- Added an end-to-end regression test that drives the full
`router.apply()` path.

## Testing

- [x] Unit tests pass (`pytest`)
- [x] Linting passes (`ruff check`)
- [x] Type checking passes (`mypy headroom`)
- [x] New tests added for new functionality
- [ ] Manual testing performed

### Test Output

```text
tests/test_lossless_mode.py::test_router_apply_accepts_lossless_search_byte_measured PASSED
tests/test_content_router_tool_role_reversibility.py .......... (10 passed)
# broader (pre-move) sweep on the same change:
tests/test_lossless_mode.py / test_transforms/test_content_router.py /
test_lossless_excluded_compaction.py / test_bash_search_lossless_fold.py — 121 passed
ruff check headroom/transforms/content_router.py  -> All checks passed!
mypy headroom/transforms/content_router.py         -> Success: no issues found
```

## Real Behavior Proof

- Environment: local worktree, Python 3.12, `PYTHONPATH` pinned to the
branch.
- Exact command / steps: new regression test constructs a single-file
grep result, runs it through `ContentRouter(lossless=True).apply(...)`,
and asserts the tool output is byte-smaller and recovers exactly
(`search_unheading(out) == original`).
- Observed result: before this fix the fold was rejected (`out ==
original`, counted `ratio_too_high`); after, it's applied (`len(out) <
len(original)`, marker-free, byte-exact recovery). The test also asserts
the fold's word count is ≥ the original's, so the test is meaningless if
"fixed" by word count.
- Not tested: no live end-to-end proxy run; validated via the full
`apply()` path in unit tests.

## Review Readiness

- [x] I have performed a self-review
- [x] This PR is ready for human review

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective
- [x] New and existing unit tests pass locally with my changes
- [ ] I have updated the CHANGELOG.md if applicable (handled at release
time)

## Additional Notes

Why prior tests missed it: `compress()` and `_apply_strategy_to_content`
return the folded result directly and never touch the `apply()`
acceptance gate, so the existing lossless-mode unit tests (which call
those) passed while the real proxy path silently discarded the fold. The
new test exercises `apply()` end-to-end.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status: ready for review Pull request body is complete and the author marked it ready for human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant