
fix(anthropic): align Claude 4.x fallback pricing & context limits to current rates #1485

Closed
KennethWKZ wants to merge 4 commits into headroomlabs-ai:main from KennethWKZ:feat/anthropic-opus-4-8-pricing

Conversation

@KennethWKZ

@KennethWKZ KennethWKZ commented Jun 27, 2026

Contributor

Summary

Aligns the Anthropic fallback pricing and context-limit tables in headroom/providers/anthropic.py with the current anthropic.com/pricing rates (verified 2026-06-27). Several Claude 4.x entries were carrying stale or incorrect rates, and the newer Sonnet 4.5 / 4.6 models had no fallback metadata at all.

Changes

All values $ / MTok; cached_input = prompt-cache read = 0.1× input.

| Tier | Model | Before | After | Context |
|---|---|---|---|---|
| Opus | claude-opus-4-8 | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 1M |
| Opus | claude-opus-4-7 | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 1M |
| Opus | claude-opus-4-6 | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 1M |
| Opus | claude-opus-4-5-20251101 | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 200K |
| Sonnet | claude-sonnet-4-6 | (new) | $3 / $15 / $0.30 | 1M |
| Sonnet | claude-sonnet-4-5 | (new) | $3 / $15 / $0.30 | 200K |
| Haiku | claude-haiku-4-5-20251001 | $0.80 / $4 / $0.08 (3.5 rates) | $1 / $5 / $0.10 | 200K |

claude-sonnet-4-20250514 and all Claude 3.x / 3.5.x entries were already correct — left unchanged.

The Sonnet 4.6 entry is functional, not cosmetic: it ships in the long-context pricing tier (1M window), and without an explicit entry the sonnet pattern default would report 200K.
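
To illustrate why the explicit entry matters, here is a minimal sketch of an explicit-entry-then-pattern-default lookup. The names `ANTHROPIC_CONTEXT_LIMITS` and `_PATTERN_DEFAULTS` appear in this PR, but the shapes of those tables and the body of `get_context_limit` below are assumptions for illustration, not the actual contents of headroom/providers/anthropic.py.

```python
import re

# Explicit per-model entries, using the Context column from the table above.
ANTHROPIC_CONTEXT_LIMITS = {
    "claude-opus-4-8": 1_000_000,
    "claude-sonnet-4-6": 1_000_000,
    "claude-sonnet-4-5": 200_000,
}

# Assumed shape of the regex fallbacks for unknown models: (pattern, limit).
_PATTERN_DEFAULTS = [
    (re.compile(r"opus"), 200_000),
    (re.compile(r"sonnet"), 200_000),
]


def get_context_limit(model: str) -> int:
    """Return the explicit limit if one exists, else the family default."""
    if model in ANTHROPIC_CONTEXT_LIMITS:
        return ANTHROPIC_CONTEXT_LIMITS[model]
    for pattern, default_limit in _PATTERN_DEFAULTS:
        if pattern.search(model):
            return default_limit
    raise KeyError(f"no context limit known for {model!r}")


# With the explicit entry the 1M window is reported; a model name that only
# matches the sonnet pattern falls back to the conservative 200K default.
print(get_context_limit("claude-sonnet-4-6"))        # 1000000
print(get_context_limit("claude-sonnet-9-unknown"))  # 200000
```

Under this assumed lookup order, deleting the `claude-sonnet-4-6` key would silently change the reported window from 1M to 200K, which is the behavior the explicit entry prevents.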

Tests

Replaced the original fragile parity-with-Opus-4.7 assertion with parametrized absolute-value tier checks:

  • test_current_opus_tier_pricing — Opus 4.5–4.8
  • test_current_sonnet_tier_pricing — Sonnet 4 / 4.5 / 4.6
  • test_current_haiku_tier_pricing — Haiku 4.5
  • test_get_context_limit_claude_sonnet_4_6 — pins the 1M context window

Full suite: 24 passed, ruff clean.
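
The difference between the two assertion styles can be sketched as follows. This is an illustration under stated assumptions, not the PR's test code: the tuple layout `(input, output, cached_input)` and the helper names are hypothetical, and plain functions stand in for the parametrized test cases.

```python
import math

# (input, output, cached_input) in $ / MTok. The tuple layout is an assumption.
OPUS_TIER = (5.0, 25.0, 0.50)

STALE_TABLE = {
    "claude-opus-4-7": (15.0, 75.0, 1.50),
    "claude-opus-4-8": (15.0, 75.0, 1.50),
}
CURRENT_TABLE = {
    "claude-opus-4-7": OPUS_TIER,
    "claude-opus-4-8": OPUS_TIER,
}


def parity_check(table):
    """Old style: 4.8 only has to equal 4.7, whatever 4.7 happens to be."""
    return table["claude-opus-4-8"] == table["claude-opus-4-7"]


def absolute_check(table, models, expected):
    """New style: every listed model must equal the pinned tier values."""
    return all(table[m] == expected for m in models)


def cache_read_is_tenth_of_input(rates):
    """Check the cached_input = 0.1 x input relationship stated above."""
    input_rate, _output_rate, cached_rate = rates
    return math.isclose(cached_rate, 0.1 * input_rate)


models = ["claude-opus-4-7", "claude-opus-4-8"]
print(parity_check(STALE_TABLE))                         # True: stale rates slip through
print(absolute_check(STALE_TABLE, models, OPUS_TIER))    # False: stale rates are caught
print(absolute_check(CURRENT_TABLE, models, OPUS_TIER))  # True
```

The parity check passes on the stale table because both entries are wrong in the same way; pinning absolute values is what makes the stale rate visible.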

Notes

  • _PATTERN_DEFAULTS (the opus / sonnet / haiku regex fallbacks for unknown models) are intentionally left unchanged — they are conservative heuristics, not model-specific entries. Happy to retune in a follow-up.
  • Source of truth: https://docs.anthropic.com/en/docs/about-claude/pricing

Register the newest Opus tier (claude-opus-4-8) in ANTHROPIC_CONTEXT_LIMITS
(1M) and ANTHROPIC_PRICING (Opus tier: $15/$75/$1.50 per Mtok), matching
the claude-opus-4-7 entry it sits beside. Without this, the proxy cannot
size the context window or estimate cost for the model.

Add context-limit and pricing-parity tests alongside the existing 4-7 ones.
@github-actions

Contributor

PR governance

This PR does not yet satisfy the required template fields:

  • Missing required section Real Behavior Proof.
  • Missing required section Review Readiness.
  • Check at least one verification item in Testing.
  • Paste real command output or artifact links in Testing > Test Output.
  • Check I have performed a self-review before requesting human review.
  • Check This PR is ready for human review or convert the PR back to draft.

Please update the PR body, or move the PR back to draft while it is still in progress.

@github-actions github-actions Bot added the "status: needs author action" label (Pull request body or readiness checklist still needs author updates) Jun 27, 2026
KennethWKZ added a commit to KennethWKZ/headroom that referenced this pull request Jun 27, 2026
…s, aws login deps

Layer the team's tooling on top of mhaitana's
feat/bedrock-application-inference-profile-arn-support (ARN converse routing,
named-profile wiring to the LiteLLM completion calls, au. region prefix,
regression tests, and the Claude Code + Bedrock guide that resolves the
review blockers):

- headroom/providers/anthropic.py: register claude-opus-4-8 context limit
  (1M) + Opus-tier pricing (from headroomlabs-ai#1485).
- pyproject.toml: bedrock extra = boto3>=1.41.0 + botocore[crt]>=1.41.0.
  aws login (IAM Identity Provider / console-login, DPoP) requires boto3
  1.41+ with the AWS Common Runtime (CRT) per AWS docs; include bedrock in [all].
- docker-compose.bedrock.yml: Bedrock overlay; passes --bedrock-profile
  explicitly so the named profile reaches LiteLLM completions (not just
  startup discovery); mounts host ~/.aws read-only for the login token cache.
- docker-compose.provider.yml: generic Anthropic-compatible upstream overlay
  (GLM/z.ai, DeepSeek).
- docs/bedrock-cli-setup.md: CLI (non-Docker) Bedrock guide; documents
  CLAUDE_CODE_USE_BEDROCK=0 to prevent Claude Code bypassing the proxy.
- .env.example: documented overlay vars.

@JerrettDavis JerrettDavis left a comment

Collaborator

Thanks for adding the new model metadata. The context limit matches the current Anthropic docs, but the pricing is stale for Opus 4.8: Anthropic's current pricing page lists Claude Opus 4.8 at $5 / MTok input and $25 / MTok output, with cached input at the corresponding cache-write/read rates, not the older $15/$75 Opus tier used here. The PR body and parity test also say it should match 4.7, which is exactly the part that needs updating now.

Please update ANTHROPIC_PRICING["claude-opus-4-8"] and the test expectation to the current Opus 4.8 values. Reference: https://docs.anthropic.com/en/docs/about-claude/pricing

Opus 4.8, 4.7, and 4.6 were stored at the deprecated Opus tier
($15 input / $75 output / $1.50 cache-read), but all three have
shipped at the current Opus tier since 4.5: $5/MTok input,
$25/MTok output, and a $0.50/MTok cache read (0.1x input).
Verified against the live anthropic.com/pricing table.

Replaces the fragile parity-with-4.7 assertion with explicit
absolute-value checks across 4.6-4.8 so the rate is pinned rather
than silently inherited.

Refs headroomlabs-ai#1485
@KennethWKZ

KennethWKZ commented Jun 27, 2026

Contributor Author

This now aligns the entire Claude 4.x line in the fallback pricing / context-limit tables to current rates:

Opus tier — $5 / $25 / $0.50 cache-read (was the deprecated $15 / $75 / $1.50 Opus 4.1 tier):

  • claude-opus-4-8, claude-opus-4-7, claude-opus-4-6, claude-opus-4-5-20251101

Sonnet tier — $3 / $15 / $0.30 cache-read:

  • claude-sonnet-4-6 (new — and functional: ships with the 1M context window; the sonnet pattern default would otherwise report 200K)
  • claude-sonnet-4-5 (new, 200K)
  • claude-sonnet-4-20250514 (unchanged, already correct)

Haiku tier — $1 / $5 / $0.10 cache-read:

  • claude-haiku-4-5-20251001 (was carrying Haiku 3.5 rates $0.80 / $4 / $0.08)

Context windows: added Sonnet 4.6 = 1M (long-context tier); Opus 4.5–4.8 already correct.

Tests: replaced the original fragile parity-with-4.7 assertion with parametrized absolute-value tier checks (Opus 4.5–4.8, Sonnet 4–4.6, Haiku 4.5) plus a Sonnet 4.6 1M-context test.

Left intentionally unchanged: _PATTERN_DEFAULTS (the opus/sonnet/haiku regex fallbacks). They're conservative heuristic estimates for unknown models, not model-specific entries — happy to retune them too if you want unknown-model estimates to track current tiers.

Happy to split into focused PRs if you'd rather keep the review surface tighter.

KennethWKZ added a commit to KennethWKZ/headroom that referenced this pull request Jun 27, 2026
Opus 4.8, 4.7, and 4.6 were stored at the deprecated Opus tier
($15 input / $75 output / $1.50 cache-read), but all three have
shipped at the current Opus tier since 4.5: $5/MTok input,
$25/MTok output, and a $0.50/MTok cache read (0.1x input).
Verified against the live anthropic.com/pricing table.

Replaces the fragile parity-with-4.7 assertion with explicit
absolute-value checks across 4.6-4.8 so the rate is pinned rather
than silently inherited.

Refs headroomlabs-ai#1485
Extend the fallback model tables to cover the newer Sonnet releases
and correct the last stale Opus entry, all per the current
anthropic.com/pricing page (verified 2026-06-27):

- claude-sonnet-4-6: $3/$15/$0.30, 1M context window (ships in the
  long-context pricing tier — the sonnet pattern default would
  otherwise report 200K)
- claude-sonnet-4-5: $3/$15/$0.30, 200K context
- claude-opus-4-5-20251101: corrected from the deprecated $15/$75
  Opus tier to the current $5/$25/$0.50 tier shared by 4.5-4.8

Adds Sonnet context-limit and tier-pricing test coverage; extends
the Opus tier test to include 4.5.

Refs headroomlabs-ai#1485
KennethWKZ added a commit to KennethWKZ/headroom that referenced this pull request Jun 27, 2026
Extend the fallback model tables to cover the newer Sonnet releases
and correct the last stale Opus entry, all per the current
anthropic.com/pricing page (verified 2026-06-27):

- claude-sonnet-4-6: $3/$15/$0.30, 1M context window (ships in the
  long-context pricing tier — the sonnet pattern default would
  otherwise report 200K)
- claude-sonnet-4-5: $3/$15/$0.30, 200K context
- claude-opus-4-5-20251101: corrected from the deprecated $15/$75
  Opus tier to the current $5/$25/$0.50 tier shared by 4.5-4.8

Adds Sonnet context-limit and tier-pricing test coverage; extends
the Opus tier test to include 4.5.

Refs headroomlabs-ai#1485
claude-haiku-4-5-20251001 was carrying Haiku 3.5 rates
($0.80/$4/$0.08); the current anthropic.com/pricing page lists
Haiku 4.5 at $1/MTok input, $5/MTok output, cache read $0.10
(0.1x input). Adds a Haiku tier test.

Refs headroomlabs-ai#1485
@github-actions github-actions Bot added the "status: ci failing" label (Required or reported CI checks are failing) Jun 27, 2026
KennethWKZ added a commit to KennethWKZ/headroom that referenced this pull request Jun 27, 2026
claude-haiku-4-5-20251001 was carrying Haiku 3.5 rates
($0.80/$4/$0.08); the current anthropic.com/pricing page lists
Haiku 4.5 at $1/MTok input, $5/MTok output, cache read $0.10
(0.1x input). Adds a Haiku tier test.

Refs headroomlabs-ai#1485
@KennethWKZ

Contributor Author

Consolidated update — supersedes my earlier piecemeal comments. All rates verified against the live anthropic.com/pricing table on 2026-06-27.

This now aligns the entire Claude 4.x line in the fallback pricing / context-limit tables to current rates:

Opus tier — $5 / $25 / $0.50 cache-read (was the deprecated $15 / $75 / $1.50 Opus 4.1 tier):

  • claude-opus-4-8, claude-opus-4-7, claude-opus-4-6, claude-opus-4-5-20251101

Sonnet tier — $3 / $15 / $0.30 cache-read:

  • claude-sonnet-4-6 (new — and functional: ships with the 1M context window; the sonnet pattern default would otherwise report 200K)
  • claude-sonnet-4-5 (new, 200K)
  • claude-sonnet-4-20250514 (unchanged, already correct)

Haiku tier — $1 / $5 / $0.10 cache-read:

  • claude-haiku-4-5-20251001 (was carrying Haiku 3.5 rates $0.80 / $4 / $0.08)

Context windows: added Sonnet 4.6 = 1M (long-context tier); Opus 4.5–4.8 already correct.

Tests: replaced the original fragile parity-with-4.7 assertion with parametrized absolute-value tier checks (Opus 4.5–4.8, Sonnet 4–4.6, Haiku 4.5) plus a Sonnet 4.6 1M-context test.

Left intentionally unchanged: _PATTERN_DEFAULTS (the opus/sonnet/haiku regex fallbacks). They are conservative heuristic estimates for unknown models, not model-specific entries — happy to retune them too if you want unknown-model estimates to track current tiers.

Happy to split into focused PRs if you would rather keep the review surface tighter.

@KennethWKZ KennethWKZ changed the title from "feat(anthropic): add claude-opus-4-8 context limit and pricing" to "fix(anthropic): align Claude 4.x fallback pricing & context limits to current rates" Jun 27, 2026
@github-actions github-actions Bot removed the "status: ci failing" label (Required or reported CI checks are failing) Jun 27, 2026

@KennethWKZ KennethWKZ left a comment

Contributor Author

done

@KennethWKZ KennethWKZ requested a review from JerrettDavis June 27, 2026 08:23
chopratejas pushed a commit that referenced this pull request Jun 28, 2026
## Description

`pip install headroom-ai[bedrock]` cannot serve users who authenticate
with `aws login` (IAM Identity Provider / console-login, DPoP).
Resolving those credentials requires the AWS Common Runtime (CRT);
without `awscrt`, botocore raises `MissingDependencyException`.

The AWS docs state the requirement as: **"Boto3 version 1.41.0 or later
with AWS Common Runtime (CRT)"** — i.e. both a modern boto3 floor and
CRT (installed via the `[crt]` extra).

## Type of Change

- [x] Bug fix (non-breaking)

## Changes Made

- `pyproject.toml` `bedrock` extra: bump `boto3>=1.28.0` →
`boto3>=1.41.0`, add `botocore[crt]>=1.41.0` (installs `awscrt`).
- `uv.lock`: regenerated — adds `awscrt`, resolves `boto3` to 1.42.x.

No code changes — the bedrock backend already passes `aws_profile_name`
through to the LiteLLM calls (via #1456); this just makes the installed
dependencies actually able to resolve `aws login` credentials.

## Impact

- **`aws login` (IAM Identity Provider / DPoP):** now works — awscrt
present.
- **`aws sso login` (classic Identity Center):** unaffected (already
worked).
- **static keys (`~/.aws/credentials`):** unaffected.
- Bumping the boto3 floor only affects the optional `[bedrock]` extra;
bedrock users benefit from a current boto3 regardless.

## Testing

Dependency-only change. `uv lock` resolves cleanly (257 packages, awscrt
0.29.2, boto3 1.42.38). No runtime code path altered, so existing
bedrock tests are unaffected.

## Checklist

- [x] Self-review performed
- [x] No new warnings
- [x] Linting passes

## Additional Notes

Focused on the dependency gap only. ARN routing / named-profile wiring /
docs are handled in #1456; pricing in #1485.
@github-actions github-actions Bot added the "status: ci failing" label (Required or reported CI checks are failing) Jun 29, 2026

@JerrettDavis JerrettDavis left a comment

Collaborator

Requesting changes for the stale fallback-model tests. The pricing table changes match Anthropic's current model docs/pricing direction for Opus 4.x, Sonnet 4.x, and Haiku 4.5, but CI is still red because tests/test_provider_model_fallback.py expects the old Opus fallback rate:

  • TestAnthropicModelFallback.test_pricing_for_known_models still asserts $15/$75.
  • TestAnthropicModelFallback.test_cost_estimation_for_new_models still expects the corresponding $22.5 estimate; the updated rates correctly produce $7.5 for that fixture.

Please update those fallback-model tests to pin the new values too, so the provider-specific tests and the generic fallback tests agree.
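
The $22.5 to $7.5 change can be reproduced with simple arithmetic. The fixture's actual token counts are not shown in this thread, so the 1M input / 100K output figures below are hypothetical, chosen only because they reproduce both numbers; `estimate_cost` is likewise an illustrative helper, not the project's function.

```python
def estimate_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Cost in dollars, given rates in $ per million tokens (no caching)."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical fixture: 1M input tokens, 100K output tokens.
old = estimate_cost(1_000_000, 100_000, input_rate=15.0, output_rate=75.0)
new = estimate_cost(1_000_000, 100_000, input_rate=5.0, output_rate=25.0)

print(old)  # 22.5  (15 + 7.5)
print(new)  # 7.5   (5 + 2.5)
```

Because both the input and output rates dropped by the same factor of three, any cache-free estimate drops by three as well, which is consistent with 22.5 / 3 = 7.5 regardless of the fixture's real token counts.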

KennethWKZ added a commit to KennethWKZ/headroom that referenced this pull request Jul 3, 2026
Opus 4.8, 4.7, and 4.6 were stored at the deprecated Opus tier
($15 input / $75 output / $1.50 cache-read), but all three have
shipped at the current Opus tier since 4.5: $5/MTok input,
$25/MTok output, and a $0.50/MTok cache read (0.1x input).
Verified against the live anthropic.com/pricing table.

Replaces the fragile parity-with-4.7 assertion with explicit
absolute-value checks across 4.6-4.8 so the rate is pinned rather
than silently inherited.

Refs headroomlabs-ai#1485
KennethWKZ added a commit to KennethWKZ/headroom that referenced this pull request Jul 3, 2026
Extend the fallback model tables to cover the newer Sonnet releases
and correct the last stale Opus entry, all per the current
anthropic.com/pricing page (verified 2026-06-27):

- claude-sonnet-4-6: $3/$15/$0.30, 1M context window (ships in the
  long-context pricing tier — the sonnet pattern default would
  otherwise report 200K)
- claude-sonnet-4-5: $3/$15/$0.30, 200K context
- claude-opus-4-5-20251101: corrected from the deprecated $15/$75
  Opus tier to the current $5/$25/$0.50 tier shared by 4.5-4.8

Adds Sonnet context-limit and tier-pricing test coverage; extends
the Opus tier test to include 4.5.

Refs headroomlabs-ai#1485
KennethWKZ added a commit to KennethWKZ/headroom that referenced this pull request Jul 3, 2026
claude-haiku-4-5-20251001 was carrying Haiku 3.5 rates
($0.80/$4/$0.08); the current anthropic.com/pricing page lists
Haiku 4.5 at $1/MTok input, $5/MTok output, cache read $0.10
(0.1x input). Adds a Haiku tier test.

Refs headroomlabs-ai#1485
@KennethWKZ

Contributor Author

Superseded by #1767 — rebased onto latest main, squashed to one commit, extended with Claude 5 family (Fable 5, Opus 4.8, Sonnet 5) and the requested fallback-test alignment.

@KennethWKZ KennethWKZ closed this Jul 3, 2026
@KennethWKZ KennethWKZ deleted the feat/anthropic-opus-4-8-pricing branch July 3, 2026 16:28
chopratejas pushed a commit that referenced this pull request Jul 3, 2026
…1767)

## Summary

Adds Claude 5 generation metadata and aligns the Anthropic fallback
pricing / context-limit tables in `headroom/providers/anthropic.py` with
current [Anthropic
pricing](https://platform.claude.com/docs/en/about-claude/pricing)
(verified 2026-07-04). Several Claude 4.x entries carried stale rates,
the new Claude 5 models (Fable 5, Opus 4.8, Sonnet 5) had no metadata,
and the generic fallback tests disagreed with the provider tests.

Supersedes #1485 (rebased onto latest `main`, squashed to one commit,
extended with Sonnet 5 / Fable 5 and the requested fallback-test fixes).

## Changes

All values `$ / MTok`; `cached_input` = prompt-cache read = 0.1× input.

| Tier | Model | Before | After | Context |
|---|---|---|---|---|
| Fable | `claude-fable-5` | — *(new)* | $10 / $50 / $1.00 | **1M** |
| Opus | `claude-opus-4-8` | — *(new)* | $5 / $25 / $0.50 | **1M** |
| Opus | `claude-opus-4-7` | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 1M |
| Opus | `claude-opus-4-6` | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 1M |
| Opus | `claude-opus-4-5-20251101` | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 200K |
| Sonnet | `claude-sonnet-5` | — *(new)* | $3 / $15 / $0.30 | **1M** |
| Sonnet | `claude-sonnet-4-6` | — *(new)* | $3 / $15 / $0.30 | **1M** |
| Sonnet | `claude-sonnet-4-5` | — *(new)* | $3 / $15 / $0.30 | 200K |
| Haiku | `claude-haiku-4-5-20251001` | $0.80 / $4 / $0.08 *(3.5 rates)* | $1 / $5 / $0.10 | 200K |

`claude-sonnet-4-20250514` and all Claude 3.x / 3.5.x entries were
already correct — left unchanged.

The **1M-context** entries (Fable 5, Opus 4.8, Sonnet 5, Sonnet 4.6) are
functional, not cosmetic: they ship in the long-context tier, and
without explicit entries the `sonnet` / `opus` pattern defaults would
report 200K.

Sonnet 5 is pinned to the **standard** Sonnet tier ($3 / $15 / $0.30);
Anthropic's introductory rate ($2 / $10 through Aug 31 2026) is
intentionally not encoded to avoid a time-dependent fixture.
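
To show what a time-dependent fixture would look like, here is a hypothetical sketch of encoding the introductory rate. None of these names exist in the change; the only sourced values are the $2 / $10 introductory input and output rates, the Aug 31 2026 end date, and the $3 / $15 standard rates.

```python
from datetime import date

STANDARD = (3.0, 15.0)  # (input, output) in $ / MTok
INTRO = (2.0, 10.0)     # introductory (input, output) in $ / MTok
INTRO_ENDS = date(2026, 8, 31)


def sonnet_5_rate(today: date):
    """Hypothetical date-aware rate: the answer depends on when it is called."""
    return INTRO if today <= INTRO_ENDS else STANDARD


# A test pinned to either tuple passes or fails depending on the calendar.
print(sonnet_5_rate(date(2026, 8, 31)))  # (2.0, 10.0)
print(sonnet_5_rate(date(2026, 9, 1)))   # (3.0, 15.0)
```

A test that asserted the introductory values against `date.today()` would start failing on Sep 1 2026 without any code change, which is the kind of fixture the standard-tier pin avoids.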

## Fallback-model tests (addresses review on #1485)

`_PATTERN_DEFAULTS["opus"]` is aligned to the current Opus tier ($5 /
$25 / $0.50) so the generic fallback suite and the provider-specific
suite agree:

- `test_pricing_for_known_models` — Opus 4.5 pins $5 / $25 / $0.50
- `test_pattern_based_inference_opus` — unknown-opus fallback now $5 /
$25
- `test_cost_estimation_for_new_models` — fixture estimate corrected
$22.5 → $7.5
- `test_pattern_based_inference_sonnet` — retargeted to
`claude-sonnet-6-*` (the old `claude-sonnet-5-*` probe now
prefix-matches the real `claude-sonnet-5` key)
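
The retargeted Sonnet probe can be understood with a small sketch of prefix matching. The lookup order below (exact key, then longest prefix, then pattern default) and the dated model names are assumptions for illustration; the change only states that the old `claude-sonnet-5-*` probe now prefix-matches the real `claude-sonnet-5` key.

```python
# Hypothetical slice of the pricing-table keys after this change.
PRICING_KEYS = {"claude-sonnet-4-5", "claude-sonnet-4-6", "claude-sonnet-5"}


def resolve(model, keys):
    """Assumed lookup order: exact key, then longest prefix, then pattern default."""
    if model in keys:
        return "exact", model
    prefix_hits = [key for key in keys if model.startswith(key)]
    if prefix_hits:
        return "prefix", max(prefix_hits, key=len)
    return "pattern", None


old_keys = PRICING_KEYS - {"claude-sonnet-5"}
print(resolve("claude-sonnet-5-20260101", old_keys))      # ('pattern', None)
print(resolve("claude-sonnet-5-20260101", PRICING_KEYS))  # ('prefix', 'claude-sonnet-5')
print(resolve("claude-sonnet-6-20270101", PRICING_KEYS))  # ('pattern', None)
```

Once `claude-sonnet-5` is a real key, a `claude-sonnet-5-*` name no longer exercises the pattern fallback, so a test of that fallback needs a name that matches no explicit key, such as a `claude-sonnet-6-*` probe.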

New provider coverage:

- `test_get_context_limit_claude_5_family` — Fable 5 / Opus 4.8 / Sonnet
5 all 1M
- `test_pricing_claude_5_family` — exact rate table for the 3 new models

Full suite: **48 passed**.

## Source

https://platform.claude.com/docs/en/about-claude/pricing

Labels

status: ci failing (Required or reported CI checks are failing)
status: needs author action (Pull request body or readiness checklist still needs author updates)


2 participants