[BUG] #1701

Description

@wjxn13

Bug: headroom proxy becomes unresponsive after processing one request on Windows 11

Environment:

  • headroom-ai version: 0.28.0 (installed from PyPI wheel)
  • Python version: 3.11.3
  • OS: Windows 11 (build 26200)
  • Install method: pip install headroom-ai[all]

Setup:
Using Headroom as a compression proxy (not MCP), with the following launch command:

headroom proxy --port 8787 --anthropic-api-url https://api.deepseek.com/anthropic --host 127.0.0.1

The proxy sits between CC Switch (a third-party proxy manager on port 15721) and the upstream provider (DeepSeek's Anthropic-compatible API):

Claude Code → CC Switch (:15721) → Headroom (:8787) → api.deepseek.com/anthropic

Workaround applied: HEADROOM_DETECT_BACKEND=python (set via environment variable before launch, per #713/#845).
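
For reproduction, the launch plus workaround can be scripted. The following is a minimal sketch, assuming `headroom` is on PATH; the helper names `build_env` and `launch` are illustrative and not part of Headroom. It sets the variable only for the child process, so the parent shell is left unchanged.

```python
import os
import subprocess

# Same command line as in the report above.
PROXY_CMD = [
    "headroom", "proxy",
    "--port", "8787",
    "--anthropic-api-url", "https://api.deepseek.com/anthropic",
    "--host", "127.0.0.1",
]


def build_env(base=None):
    """Copy the environment and add the workaround variable."""
    env = dict(os.environ if base is None else base)
    env["HEADROOM_DETECT_BACKEND"] = "python"
    return env


def launch():
    """Start the proxy as a child process and return its handle."""
    return subprocess.Popen(PROXY_CMD, env=build_env())
```

Calling `launch()` returns a `subprocess.Popen` object, so the PID can be recorded and the process killed once it stops responding.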

Symptoms:

  1. Process starts fine — /livez and /readyz return HTTP 200, upstream status shows "healthy".
  2. First request takes ~10 minutes — then succeeds with minimal compression:
    input_tokens_original: 81864 → input_tokens_optimized: 77833
    tokens_saved: 4031 (4.9%)
    optimization_latency_ms: 609972 (~610 seconds)
  3. No compression transforms were applied: all 76 transforms are router:protected:, router:excluded:tool, or read_lifecycle:stale:. No router:text, router:smart_crusher, or other compressing transform appears.
  4. Process becomes unresponsive ("zombie") after that request: the PID is still alive and port 8787 is still listening, but all HTTP requests (/livez, /readyz, /health, /stats) hang until the client times out. The process stays in this state until killed.
  5. Restart → same pattern repeats — works for one request (~610s), then zombie.
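
The figures in symptom 2 are internally consistent. A quick check of the arithmetic, using only the values reported above (the variable names are mine):

```python
# Values copied from the proxy's stats for the first request.
input_tokens_original = 81864
input_tokens_optimized = 77833
optimization_latency_ms = 609972

tokens_saved = input_tokens_original - input_tokens_optimized
saved_pct = 100 * tokens_saved / input_tokens_original
latency_s = optimization_latency_ms / 1000
latency_min = latency_s / 60

print(f"tokens_saved = {tokens_saved}")
print(f"saved_pct    = {saved_pct:.1f}%")
print(f"latency      = {latency_s:.0f} s ({latency_min:.1f} min)")
```

So the reported 4031 tokens and 4.9% match the raw counts, and 609972 ms is a little over 10 minutes spent in the optimization step alone.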

Diagnostic output of /readyz on fresh start:
{"service":"headroom-proxy","status":"healthy","ready":true,"version":"0.28.0","checks":{"upstream":{"enabled":true,"ready":true,"status":"healthy","url":"https://api.deepseek.com/anthropic","error":null}}}

After the zombie state: /readyz times out with no response.

What I've tried:

Questions:

  1. Why does compression take 610 seconds with only 4.9% savings and no actual compression transform?
  2. What causes the proxy to become unresponsive after processing one request? This looks like an asyncio event loop stall or worker thread exhaustion.
  3. Is there a way to enable verbose logging to pinpoint where the hang occurs?
