[BUG] [PROXY] Long multi-turn coding task: only 0.3% compression #1696

## Description

When I use the Headroom proxy for a long, multi-turn agentic coding task in a large JavaScript repository, the token savings are far lower than advertised: I get 0.3% versus the expected 60%-95%.

## To Reproduce

Full installation:

```bash
pip install "headroom-ai[all]" --break-system-packages
pip install --upgrade litellm --break-system-packages
```

Run the proxy with this command:

```bash
headroom proxy --code-aware --openai-api-url http://localhost:13305/api
```

<img width="978" height="956" alt="Image" src="https://github.com/user-attachments/assets/77d3fd67-4a5c-4f7d-b1f6-f09251fc1c44" />

Using the proxy endpoint with KiloCode, I get 0.3% token savings over a long coding session:

<img width="1258" height="1171" alt="Image" src="https://github.com/user-attachments/assets/a885267f-b45f-4512-ac1f-8731a5ee0717" />

## Environment

- **Headroom version**: 0.28.0
- **Python version**: 3.14.4
- **OS**: Ubuntu 26.04
- **LLM Provider**: OpenAI-compatible Lemonade Server

## Additional Context

<img width="328" height="214" alt="Image" src="https://github.com/user-attachments/assets/1eef6bd7-7853-4380-a23e-6b23eed61d38" />

Dump of stats endpoint: [stats.json](https://github.com/user-attachments/files/29596164/stats.json)
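
For anyone cross-checking the 0.3% figure against the stats dump, here is a minimal sketch of how I assume the savings percentage is derived: tokens removed divided by tokens before compression. The function and parameter names are hypothetical and are not taken from Headroom's API or from the field names in `stats.json`.

```python
def savings_percent(tokens_before: int, tokens_after: int) -> float:
    """Return the share of tokens removed by compression, as a percentage.

    Hypothetical helper for illustration; not part of Headroom.
    """
    if tokens_before <= 0:
        # Nothing was sent, so there is nothing to save.
        return 0.0
    return 100.0 * (tokens_before - tokens_after) / tokens_before


# Illustrative numbers only, not values from the attached stats.json.
print(f"{savings_percent(100_000, 99_700):.1f}%")
```

If the proxy computes savings differently (for example, per request rather than over the whole session), the number shown in the dashboard may not match this calculation.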
