Skip to content

feat(proxy): opt-in compression for catch-all passthrough routes#1699

Open
tenderdeve wants to merge 1 commit into
headroomlabs-ai:mainfrom
tenderdeve:feat/1546-compress-passthrough
Open

feat(proxy): opt-in compression for catch-all passthrough routes#1699
tenderdeve wants to merge 1 commit into
headroomlabs-ai:mainfrom
tenderdeve:feat/1546-compress-passthrough

Conversation

@tenderdeve

Copy link
Copy Markdown
Contributor

Description

Requests whose path doesn't match a built-in API route fall through to handle_passthrough, which forwarded the body verbatim — bypassing ContentRouter/Kompress/TOIN entirely. Wrapper-proxy setups that front Headroom on custom paths (e.g. /api/codex-proxy/<key>/v1/responses) got zero compression on coding-agent traffic and hit context-limit 400s in long sessions. This adds an opt-in flag that routes OpenAI Responses-shaped passthrough bodies through the same compression path the native /v1/responses handler uses.

Closes #1546

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Performance improvement
  • Code refactoring (no functional changes)

Changes Made

  • Added ProxyConfig.compress_passthrough (default False) + --compress-passthrough CLI flag + HEADROOM_COMPRESS_PASSTHROUGH=1 env.
  • handle_passthrough: when enabled, POST requests whose path ends in /responses with an OpenAI Responses-shaped body are compressed via the existing _compress_openai_responses_payload_in_executor before forwarding; stale Content-Length is dropped so httpx recomputes it.
  • New _maybe_compress_passthrough_responses helper — fail-open: non-JSON, non-Responses payloads, unmodified results, and any compressor error forward the original body unchanged.
  • Documented the flag in docs/content/docs/proxy.mdx.

Testing

  • Unit tests pass (pytest)
  • Linting passes (ruff check .)
  • Type checking passes (mypy headroom)
  • New tests added for new functionality
  • Manual testing performed

Test Output

$ .venv/bin/python -m pytest tests/test_compress_passthrough.py -q
collected 6 items
tests/test_compress_passthrough.py ......                                [100%]
============================== 6 passed in 0.35s ===============================

$ .venv/bin/ruff check headroom/proxy/handlers/openai.py headroom/proxy/models.py headroom/proxy/server.py tests/test_compress_passthrough.py
All checks passed!

Real Behavior Proof

  • Environment: macOS arm64, Python 3.14, repo .venv.
  • Exact command / steps: .venv/bin/python -m pytest tests/test_compress_passthrough.py -q — covers a Responses-shaped body being compressed, non-JSON passthrough, non-Responses (messages) payload untouched, unmodified-result short-circuit, compressor-error fail-open, and ProxyConfig().compress_passthrough is False default. Plus import smoke: ProxyConfig(compress_passthrough=True), server/handler modules import, helper present.
  • Observed result: 6 passed; flag defaults off; enabled path reuses the native Responses compressor and never raises out to the request.
  • Not tested: live end-to-end through a real second proxy to a real upstream (no external wrapper proxy / upstream credentials in sandbox); the compression call is the same one /v1/responses already exercises in CI.

Review Readiness

  • I have performed a self-review
  • This PR is ready for human review

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the CHANGELOG.md if applicable

Additional Notes

Scoped to OpenAI Responses-shaped bodies (the reporter's exact case). Anthropic /messages and OpenAI /chat/completions passthrough compression are natural follow-ups — deliberately left out to keep this change focused and fail-safe. CHANGELOG is release-managed, left unchecked.

@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

PR governance

This PR follows the template and is marked ready for human review.

@github-actions github-actions Bot added status: ready for review Pull request body is complete and the author marked it ready for human review status: ci failing Required or reported CI checks are failing and removed status: ready for review Pull request body is complete and the author marked it ready for human review labels Jul 2, 2026

@JerrettDavis JerrettDavis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for keeping this opt-in and fail-open. The shape of the feature makes sense, but this currently regresses the existing passthrough path when the handler/proxy object does not have a config attribute.

CI is failing in tests/test_proxy_handler_helpers.py with multiple existing passthrough tests raising:

AttributeError: 'OpenAIHandlerMixin' object has no attribute 'config'

The new guard uses getattr(self.config, "compress_passthrough", False), but Python has to resolve self.config before getattr can return the default. Please make the feature flag lookup tolerant of a missing config object, for example by first reading config = getattr(self, "config", None) and then checking getattr(config, "compress_passthrough", False). That keeps the default-off behavior truly no-op for the existing passthrough paths and should make the old tests meaningful again.

Requests whose path doesn't match a built-in API route fall through to
`handle_passthrough`, which forwarded the body verbatim — bypassing the whole
compression pipeline. Wrapper-proxy setups that front Headroom on custom paths
(e.g. `/api/codex-proxy/<key>/v1/responses`) therefore got zero compression on
coding-agent traffic and hit context-limit 400s in long sessions (headroomlabs-ai#1546).

Add `--compress-passthrough` (env `HEADROOM_COMPRESS_PASSTHROUGH=1`), off by
default. When enabled, POST passthrough requests whose path ends in
`/responses` and carry an OpenAI Responses-shaped body are routed through the
same `_compress_openai_responses_payload_in_executor` path the native
`/v1/responses` handler uses, then forwarded to the unknown upstream. Stale
Content-Length is dropped so httpx recomputes it.

Fail-open by construction: non-JSON, non-Responses payloads, unmodified
results, and any compressor error forward the original body unchanged, so
opting in can never drop a catch-all request. Anthropic `/messages` and
`/chat/completions` passthrough compression are left as follow-ups.

Closes headroomlabs-ai#1546
@tenderdeve tenderdeve force-pushed the feat/1546-compress-passthrough branch from 41bce60 to 389fc80 Compare July 3, 2026 06:36
@tenderdeve

Copy link
Copy Markdown
Contributor Author

@JerrettDavis updated in 389fc80 — the flag lookup now resolves config = getattr(self, "config", None) first, then getattr(config, "compress_passthrough"/"optimize", False), so a missing config attribute is treated as feature-off and the existing verbatim passthrough path stays a true no-op. Added a regression test for the missing-config case; the previously-failing tests/test_proxy_handler_helpers.py passthrough tests are green again. Also rebased on main to clear the conflicts.

@github-actions github-actions Bot added status: ready for review Pull request body is complete and the author marked it ready for human review and removed status: ci failing Required or reported CI checks are failing labels Jul 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status: ready for review Pull request body is complete and the author marked it ready for human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature Request: Compress passthrough/catch-all proxy routes (add custom route compression)

2 participants