feat(content-router): lossless-excluded compaction (grep/log/json) + enable in coding/general personas #1762
Merged
…ed grep output Excluded tools (Read/Grep/Glob/Write/Edit) are protected from *lossy* compression for accuracy. But grep-shaped output of an excluded tool can still be search-folded (path:line:content -> ripgrep --heading form), which is byte-recoverable: search_unheading reproduces the original exactly. Measured ~36% off real code-grep with zero information loss; Read/source and glob path-lists pass through untouched (not search-shaped -> no-op). Gated on the dedicated _try_detect_search detector rather than the general classifier, which labels grep-over-a-codebase SOURCE_CODE (the matched lines are code) and would otherwise reject the exact case this targets. Off by default via ContentRouterConfig.compact_excluded_search. Closes the OpenCode native-Grep savings gap that RTK (shell-only + lossy) does not cover; the fold is lossless, so it is safe for the accuracy-first coding workload.
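A minimal sketch of the fold described above, to make "byte-recoverable" concrete. The names `fold_search` and `unfold_search` are hypothetical stand-ins (the PR's own `_try_detect_search` and `search_unheading` are not shown on this page), and the regex is an assumed, deliberately strict notion of "search-shaped". The fold is kept only when it shrinks the text and the round trip reproduces the input exactly.

```python
import re
from itertools import groupby

# Hypothetical, strict shape check: every line must look like path:line:content.
_MATCH = re.compile(r"([^:\n]+):(\d+):(.*)")


def unfold_search(folded: str) -> str:
    """Rebuild path:line:content lines from the heading form."""
    body = folded[:-1] if folded.endswith("\n") else folded
    out = []
    for block in body.split("\n\n"):
        path, *rows = block.split("\n")
        out.extend(f"{path}:{row}" for row in rows)
    return "\n".join(out) + ("\n" if folded.endswith("\n") else "")


def fold_search(text: str) -> str:
    """Fold grep-style output into a ripgrep --heading-like form.

    Returns the input unchanged when it is not search-shaped, when folding
    does not shrink it, or when the round trip is not byte-exact.
    """
    body = text[:-1] if text.endswith("\n") else text
    parsed = [_MATCH.fullmatch(line) for line in body.split("\n")]
    if not body or not all(parsed):
        return text  # source code, glob path-lists, counts: no-op
    blocks = []
    # Group consecutive matches from the same file under one path heading.
    for path, group in groupby(parsed, key=lambda m: m.group(1)):
        rows = [f"{m.group(2)}:{m.group(3)}" for m in group]
        blocks.append("\n".join([path, *rows]))
    folded = "\n\n".join(blocks) + ("\n" if text.endswith("\n") else "")
    if len(folded) < len(text) and unfold_search(folded) == text:
        return folded
    return text
```

The savings come from printing each path once per run of matches, so output with many matches per file shrinks the most, while one-match-per-file output falls through unchanged.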
… search/log/json
Extends the excluded-tool lossless compaction beyond grep, dispatched by detected shape (excluded tools stay out of *lossy* compression for accuracy):
- SEARCH (grep) -> ripgrep --heading fold. Byte-lossless. Gated on _try_detect_search (the general/Magika classifier calls grep-over-code SOURCE_CODE and would miss it).
- LOG -> ANSI strip + run-collapse. Byte-lossless modulo non-semantic ANSI.
- JSON -> whitespace-minify. DATA-lossless (json.loads equals the original object) but NOT byte-exact: a read-then-Edit(old_string) on the same JSON file could miss; documented and gated.
Renames compact_excluded_search -> compact_excluded_lossless. Source code and glob path-lists match nothing -> verbatim. 10 tests (byte-exact for search/log, data-equal for json, no-op on source/glob, off-by-default). ruff + mypy clean.
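The LOG tier above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `compact_log`, `expand_log`, and the ` [xN]` run marker are all hypothetical, and only CSI-style ANSI sequences are stripped. The sketch reports whether run-collapse was applied so a caller knows whether expansion is meaningful.

```python
import re
from itertools import groupby

# Assumption: only CSI escape sequences (colors, cursor moves) count as ANSI.
_ANSI = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")
# Hypothetical run marker appended to a collapsed line: "<line> [x3]".
_RUN = re.compile(r"(.*) \[x(\d+)\]")


def compact_log(text: str) -> tuple[str, bool]:
    """Strip ANSI, then collapse runs of identical consecutive lines.

    Returns (compacted_text, collapsed). When a line already looks like
    the run marker, collapsing would be ambiguous, so only the ANSI strip
    is applied and collapsed is False.
    """
    stripped = _ANSI.sub("", text)
    lines = stripped.split("\n")
    if any(_RUN.fullmatch(line) for line in lines):
        return stripped, False
    out = []
    for line, group in groupby(lines):
        n = sum(1 for _ in group)
        marked = f"{line} [x{n}]"
        # Collapse only when the marked form is actually shorter.
        if n > 1 and len(marked) < n * len(line) + (n - 1):
            out.append(marked)
        else:
            out.extend([line] * n)
    return "\n".join(out), True


def expand_log(text: str) -> str:
    """Inverse of the run-collapse step (valid when collapsed is True)."""
    out = []
    for line in text.split("\n"):
        m = _RUN.fullmatch(line)
        out.extend([m.group(1)] * int(m.group(2)) if m else [line])
    return "\n".join(out)
```

Expansion reproduces the ANSI-stripped text, not the raw input, which is what "byte-lossless modulo non-semantic ANSI" amounts to in this sketch.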
…neral personas Wire compact_excluded_lossless through the persona path: add it to AgentSavingsProfile (default off), set it True for the coding + general workload personas, emit it in proxy_env + proxy_pipeline_kwargs, and honor it as a per-request override in ContentRouter.apply (_runtime_compact_excluded_lossless). So HEADROOM_SAVINGS_PROFILE=coding now losslessly folds excluded grep/log (byte-lossless) and minifies excluded json (data-lossless) — recovering the Read/Grep-heavy savings the exclude list otherwise fully protects, with no accuracy loss on the byte-lossless tiers. End-to-end: coding-persona kwargs fold a real grep tool output 36% (router:excluded:lossless_search), recoverable.
PR governance: This PR does not yet satisfy the required template fields. Please update the PR body, or move the PR back to draft while it is still in progress.
48fc85c to 0d77e37
…p the gate Excluded tools (Read/Grep/Glob/Edit/...) are protected only from *lossy* compression. Applying information-preserving compaction to them (grep -> ripgrep --heading fold; logs -> ANSI strip + run-collapse; JSON -> whitespace-minify) is always accuracy-safe, so it needs no feature gate. Remove compact_excluded_lossless everywhere (ContentRouterConfig field, _runtime flag, internal gate, persona flag, HEADROOM_COMPACT_EXCLUDED_LOSSLESS env, proxy_pipeline_kwargs entry) and always fold excluded output by shape in every path: Anthropic /v1/messages + OpenAI chat via router.apply excluded branches, and OpenAI Responses (Codex) via the direct _lossless_compact_excluded call. Non-excluded tools are untouched (lossless mode -> lossless, else SmartCrusher, CCR -> CCR).
…t grep) bash is not an excluded tool, so a search run through it takes the lossy strategy path — the gap that leaves bash-heavy harnesses (opencode, Codex) uncompressed losslessly. Detect the *command* behind a shell tool result (peeling wrappers: rtk/sudo/env/timeout, and sh -c "..."), and when it is a read-only search (grep/egrep/rg/git grep/ag/ack), fold the output with the same byte-lossless ripgrep --heading transform excluded Grep gets, instead of lossy compression. The command whitelist is only a gate to *attempt*: compact_lossless verifies reversibility and returns the input unchanged when it can't safely shrink, so a mis-gated command (grep -l path-lists, grep -c counts) falls through with no accuracy risk. Non-search bash (cat/build/mutate/diff) and all MCP/native tools are untouched. Wired into both the OpenAI chat (string) and Anthropic (block) router paths; command set + shell tool names are config-driven. Measured on real grep output: ~46% token savings, byte-exact.
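The command gate described above might look roughly like this. It is a sketch only: `is_search_command` is a hypothetical name, the wrapper handling is simplified (for example, `sudo` options and bare `VAR=value` prefixes are not peeled, so those cases conservatively return False), and pipelines are not parsed. As the commit message notes, the gate only decides whether to attempt the fold; the reversibility check downstream is what protects accuracy.

```python
import shlex

# Command and wrapper sets taken from the commit message; the PR makes them
# config-driven, here they are plain constants for illustration.
SEARCH_COMMANDS = {"grep", "egrep", "rg", "ag", "ack"}
WRAPPERS = {"rtk", "sudo", "env", "timeout"}


def is_search_command(command: str, depth: int = 0) -> bool:
    """Return True when the command behind a shell call is a read-only search."""
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: do not attempt the fold
    while argv:
        head = argv[0].rsplit("/", 1)[-1]  # /usr/bin/grep -> grep
        if head in {"sh", "bash"} and len(argv) >= 3 and argv[1] == "-c":
            # Peel sh -c "..." and inspect the inner command (bounded depth).
            return depth < 3 and is_search_command(argv[2], depth + 1)
        if head == "env":
            argv = argv[1:]
            while argv and "=" in argv[0]:  # skip VAR=value assignments
                argv = argv[1:]
            continue
        if head == "timeout":
            argv = argv[1:]
            while argv and argv[0].startswith("-"):  # skip --flag options
                argv = argv[1:]
            argv = argv[1:]  # skip the duration argument
            continue
        if head in WRAPPERS:
            argv = argv[1:]
            continue
        if head == "git":
            return len(argv) > 1 and argv[1] == "grep"
        return head in SEARCH_COMMANDS
    return False
```

A command such as `grep -c foo` would pass this gate even though its output is a count; in the design described above that is harmless, because output that is not search-shaped is returned unchanged.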
CI TestExcludeTools failures surfaced the real contract: excluded tools include Read/Edit, the read-then-edit path. JSON minify is data-lossless but byte-*different*, so a Read that returns minified JSON breaks a later Edit whose old_string was copied from the pretty file on disk. That violates the no-accuracy-loss invariant this work is built on. Restrict the excluded fold to BYTE-lossless shapes only (search + log, both byte-recoverable and not edited byte-for-byte); leave JSON verbatim. The exclude-mechanism tests (custom set, glob) now pass unchanged (their byte-identity holds), which confirms the corrected contract. Search/log fold coverage stays in the dedicated fold tests. bash-search fold is unaffected (search-only). Removes the now-dead _minify_json_data_lossless.
…data-lossless contract Reverses the previous over-correction. JSON whitespace-minify IS lossless (json.loads equals the original object — same information, fewer tokens); it's free savings and stays. The right fix for the CI failures was never to drop compression, but to update the stale TestExcludeTools assertions that expected byte-identity: excluded tools are protected from *lossy* compression, and now get information-preserving compaction (search fold, log collapse, JSON minify). Tests now assert recovery, not byte-identity: JSON -> json.loads equality + router:excluded:lossless_json; search -> search_unheading byte-exact + router:excluded:lossless_search. Restores _minify_json_data_lossless. Caveat (documented, not gated): minified JSON is byte-different, so a read-then-Edit(old_string) on the same JSON file could miss; the data is fully preserved. Acceptable trade for the token savings.
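The data-lossless contract and its Edit caveat can be illustrated with a short sketch. `minify_json` here is a hypothetical stand-in for the PR's `_minify_json_data_lossless`, whose body is not shown on this page; the sketch assumes "data-lossless" means `json.loads` of the output equals `json.loads` of the input.

```python
import json


def minify_json(text: str) -> str:
    """Whitespace-minify JSON; return the input unchanged if unsafe.

    Data-lossless in the json.loads sense: the parsed objects are equal,
    but the bytes differ (whitespace, and possibly number formatting).
    """
    try:
        obj = json.loads(text)
    except ValueError:
        return text  # not JSON: leave verbatim
    compact = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    # Keep the minified form only if it is smaller and parses back equal.
    if len(compact) < len(text) and json.loads(compact) == obj:
        return compact
    return text


pretty = '{\n  "name": "demo",\n  "tags": ["a", "b"]\n}\n'
compact = minify_json(pretty)

# Same information after parsing...
same_data = json.loads(compact) == json.loads(pretty)
# ...but an old_string copied from the pretty file no longer matches.
old_string = '"name": "demo"'
edit_would_match = old_string in compact
```

In this example `same_data` is true while `edit_would_match` is false, which is the read-then-Edit caveat the commit message documents.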
Builds on the now-merged personas (#1732). Two pieces:
1. Lossless compaction for EXCLUDED tool output
Excluded tools (Read/Grep/Glob/Write/Edit) stay out of lossy compression, but their output is compacted by detected shape:
- grep/search -> ripgrep --heading fold: byte-lossless (search_unheading recovers)
- log -> ANSI strip + run-collapse: byte-lossless modulo non-semantic ANSI
- json -> whitespace-minify: data-lossless (json.loads equal), NOT byte-exact
Source code + glob path-lists → verbatim. grep gated on _try_detect_search (the general/Magika classifier calls grep-over-code SOURCE_CODE and would miss it). Off by default (compact_excluded_lossless).
2. Enable it in the coding/general personas
compact_excluded_lossless=True on the coding + general profiles, threaded via proxy_env + proxy_pipeline_kwargs + a per-request ContentRouter.apply override. So HEADROOM_SAVINGS_PROFILE=coding auto-folds excluded grep/log/json.
Why
The coding persona was getting only ~2.5% savings on OpenCode because its dominant traffic (Grep/Read) is excluded, and RTK (shell-only, lossy) never sees OpenCode's native tools. This recovers those savings losslessly.
Measured (end-to-end via coding-persona kwargs, real rg output)
41,589 → 26,562 chars (−36%), router:excluded:lossless_search, byte-recoverable.
Accuracy
grep/log = byte-lossless → edit-safe. json = data-lossless (edit-caveat for read-then-edit-JSON, documented). Read of source code → untouched (tested).
47 tests (personas + all three tiers + persona-enablement + end-to-end). ruff + mypy clean. No personas duplication — rebased onto main after #1732 landed. Supersedes #1755.
🤖 Generated with Claude Code