fix(anthropic): align Claude 4.x fallback pricing & context limits to current rates#1485
fix(anthropic): align Claude 4.x fallback pricing & context limits to current rates#1485KennethWKZ wants to merge 4 commits into
Conversation
Register the newest Opus tier (claude-opus-4-8) in ANTHROPIC_CONTEXT_LIMITS (1M) and ANTHROPIC_PRICING (Opus tier: $15/$75/$1.50 per Mtok), matching the claude-opus-4-7 entry it sits beside. Without this, the proxy cannot size the context window or estimate cost for the model. Add context-limit and pricing-parity tests alongside the existing 4-7 ones.
PR governanceThis PR does not yet satisfy the required template fields:
Please update the PR body, or move the PR back to draft while it is still in progress. |
…s, aws login deps Layer the team's tooling on top of mhaitana's feat/bedrock-application-inference-profile-arn-support (ARN converse routing, named-profile wiring to the LiteLLM completion calls, au. region prefix, regression tests, and the Claude Code + Bedrock guide that resolves the review blockers): - headroom/providers/anthropic.py: register claude-opus-4-8 context limit (1M) + Opus-tier pricing (from headroomlabs-ai#1485). - pyproject.toml: bedrock extra = boto3>=1.41.0 + botocore[crt]>=1.41.0. aws login (IAM Identity Provider / console-login, DPoP) requires boto3 1.41+ with the AWS Common Runtime (CRT) per AWS docs; include bedrock in [all]. - docker-compose.bedrock.yml: Bedrock overlay; passes --bedrock-profile explicitly so the named profile reaches LiteLLM completions (not just startup discovery); mounts host ~/.aws read-only for the login token cache. - docker-compose.provider.yml: generic Anthropic-compatible upstream overlay (GLM/z.ai, DeepSeek). - docs/bedrock-cli-setup.md: CLI (non-Docker) Bedrock guide; documents CLAUDE_CODE_USE_BEDROCK=0 to prevent Claude Code bypassing the proxy. - .env.example: documented overlay vars.
JerrettDavis
left a comment
There was a problem hiding this comment.
Thanks for adding the new model metadata. The context limit matches the current Anthropic docs, but the pricing is stale for Opus 4.8: Anthropic's current pricing page lists Claude Opus 4.8 at $5 / MTok input and $25 / MTok output, with cached input at the corresponding cache-write/read rates, not the older $15/$75 Opus tier used here. The PR body and parity test also say it should match 4.7, which is exactly the part that needs updating now.
Please update ANTHROPIC_PRICING["claude-opus-4-8"] and the test expectation to the current Opus 4.8 values. Reference: https://docs.anthropic.com/en/docs/about-claude/pricing
Opus 4.8, 4.7, and 4.6 were stored at the deprecated Opus tier ($15 input / $75 output / $1.50 cache-read), but all three have shipped at the current Opus tier since 4.5: $5/MTok input, $25/MTok output, and a $0.50/MTok cache read (0.1x input). Verified against the live anthropic.com/pricing table. Replaces the fragile parity-with-4.7 assertion with explicit absolute-value checks across 4.6-4.8 so the rate is pinned rather than silently inherited. Refs headroomlabs-ai#1485
|
This now aligns the entire Claude 4.x line in the fallback pricing / context-limit tables to current rates: Opus tier — $5 / $25 / $0.50 cache-read (was the deprecated $15 / $75 / $1.50 Opus 4.1 tier):
Sonnet tier — $3 / $15 / $0.30 cache-read:
Haiku tier — $1 / $5 / $0.10 cache-read:
Context windows: added Sonnet 4.6 = 1M (long-context tier); Opus 4.5–4.8 already correct. Tests: replaced the original fragile parity-with-4.7 assertion with parametrized absolute-value tier checks (Opus 4.5–4.8, Sonnet 4–4.6, Haiku 4.5) plus a Sonnet 4.6 1M-context test. Left intentionally unchanged: _PATTERN_DEFAULTS (the opus/sonnet/haiku regex fallbacks). They're conservative heuristic estimates for unknown models, not model-specific entries — happy to retune them too if you want unknown-model estimates to track current tiers. Happy to split into focused PRs if you'd rather keep the review surface tighter. |
Opus 4.8, 4.7, and 4.6 were stored at the deprecated Opus tier ($15 input / $75 output / $1.50 cache-read), but all three have shipped at the current Opus tier since 4.5: $5/MTok input, $25/MTok output, and a $0.50/MTok cache read (0.1x input). Verified against the live anthropic.com/pricing table. Replaces the fragile parity-with-4.7 assertion with explicit absolute-value checks across 4.6-4.8 so the rate is pinned rather than silently inherited. Refs headroomlabs-ai#1485
Extend the fallback model tables to cover the newer Sonnet releases and correct the last stale Opus entry, all per the current anthropic.com/pricing page (verified 2026-06-27): - claude-sonnet-4-6: $3/$15/$0.30, 1M context window (ships in the long-context pricing tier — the sonnet pattern default would otherwise report 200K) - claude-sonnet-4-5: $3/$15/$0.30, 200K context - claude-opus-4-5-20251101: corrected from the deprecated $15/$75 Opus tier to the current $5/$25/$0.50 tier shared by 4.5-4.8 Adds Sonnet context-limit and tier-pricing test coverage; extends the Opus tier test to include 4.5. Refs headroomlabs-ai#1485
Extend the fallback model tables to cover the newer Sonnet releases and correct the last stale Opus entry, all per the current anthropic.com/pricing page (verified 2026-06-27): - claude-sonnet-4-6: $3/$15/$0.30, 1M context window (ships in the long-context pricing tier — the sonnet pattern default would otherwise report 200K) - claude-sonnet-4-5: $3/$15/$0.30, 200K context - claude-opus-4-5-20251101: corrected from the deprecated $15/$75 Opus tier to the current $5/$25/$0.50 tier shared by 4.5-4.8 Adds Sonnet context-limit and tier-pricing test coverage; extends the Opus tier test to include 4.5. Refs headroomlabs-ai#1485
claude-haiku-4-5-20251001 was carrying Haiku 3.5 rates ($0.80/$4/$0.08); the current anthropic.com/pricing page lists Haiku 4.5 at $1/MTok input, $5/MTok output, cache read $0.10 (0.1x input). Adds a Haiku tier test. Refs headroomlabs-ai#1485
claude-haiku-4-5-20251001 was carrying Haiku 3.5 rates ($0.80/$4/$0.08); the current anthropic.com/pricing page lists Haiku 4.5 at $1/MTok input, $5/MTok output, cache read $0.10 (0.1x input). Adds a Haiku tier test. Refs headroomlabs-ai#1485
|
Consolidated update — supersedes my earlier piecemeal comments. All rates verified against the live anthropic.com/pricing table on 2026-06-27. This now aligns the entire Claude 4.x line in the fallback pricing / context-limit tables to current rates: Opus tier —
Sonnet tier —
Haiku tier —
Context windows: added Sonnet 4.6 = 1M (long-context tier); Opus 4.5–4.8 already correct. Tests: replaced the original fragile parity-with-4.7 assertion with parametrized absolute-value tier checks (Opus 4.5–4.8, Sonnet 4–4.6, Haiku 4.5) plus a Sonnet 4.6 1M-context test. Left intentionally unchanged: Happy to split into focused PRs if you would rather keep the review surface tighter. |
## Description `pip install headroom-ai[bedrock]` cannot serve users who authenticate with `aws login` (IAM Identity Provider / console-login, DPoP). Resolving those credentials requires the AWS Common Runtime (CRT); without `awscrt`, botocore raises `MissingDependencyException`. The AWS docs state the requirement as: **"Boto3 version 1.41.0 or later with AWS Common Runtime (CRT)"** — i.e. both a modern boto3 floor and CRT (installed via the `[crt]` extra). ## Type of Change - [x] Bug fix (non-breaking) ## Changes Made - `pyproject.toml` `bedrock` extra: bump `boto3>=1.28.0` → `boto3>=1.41.0`, add `botocore[crt]>=1.41.0` (installs `awscrt`). - `uv.lock`: regenerated — adds `awscrt`, resolves `boto3` to 1.42.x. No code changes — the bedrock backend already passes `aws_profile_name` through to the LiteLLM calls (via #1456); this just makes the installed dependencies actually able to resolve `aws login` credentials. ## Impact - **`aws login` (IAM Identity Provider / DPoP):** now works — awscrt present. - **`aws sso login` (classic Identity Center):** unaffected (already worked). - **static keys (`~/.aws/credentials`):** unaffected. - Bumping the boto3 floor only affects the optional `[bedrock]` extra; bedrock users benefit from a current boto3 regardless. ## Testing Dependency-only change. `uv lock` resolves cleanly (257 packages, awscrt 0.29.2, boto3 1.42.38). No runtime code path altered, so existing bedrock tests are unaffected. ## Checklist - [x] Self-review performed - [x] No new warnings - [x] Linting passes ## Additional Notes Focused on the dependency gap only. ARN routing / named-profile wiring / docs are handled in #1456; pricing in #1485.
JerrettDavis
left a comment
There was a problem hiding this comment.
Requesting changes for the stale fallback-model tests. The pricing table changes match Anthropic's current model docs/pricing direction for Opus 4.x, Sonnet 4.x, and Haiku 4.5, but CI is still red because tests/test_provider_model_fallback.py expects the old Opus fallback rate:
TestAnthropicModelFallback.test_pricing_for_known_modelsstill asserts$15/$75.TestAnthropicModelFallback.test_cost_estimation_for_new_modelsstill expects the corresponding$22.5estimate; the updated rates correctly produce$7.5for that fixture.
Please update those fallback-model tests to pin the new values too, so the provider-specific tests and the generic fallback tests agree.
Opus 4.8, 4.7, and 4.6 were stored at the deprecated Opus tier ($15 input / $75 output / $1.50 cache-read), but all three have shipped at the current Opus tier since 4.5: $5/MTok input, $25/MTok output, and a $0.50/MTok cache read (0.1x input). Verified against the live anthropic.com/pricing table. Replaces the fragile parity-with-4.7 assertion with explicit absolute-value checks across 4.6-4.8 so the rate is pinned rather than silently inherited. Refs headroomlabs-ai#1485
Extend the fallback model tables to cover the newer Sonnet releases and correct the last stale Opus entry, all per the current anthropic.com/pricing page (verified 2026-06-27): - claude-sonnet-4-6: $3/$15/$0.30, 1M context window (ships in the long-context pricing tier — the sonnet pattern default would otherwise report 200K) - claude-sonnet-4-5: $3/$15/$0.30, 200K context - claude-opus-4-5-20251101: corrected from the deprecated $15/$75 Opus tier to the current $5/$25/$0.50 tier shared by 4.5-4.8 Adds Sonnet context-limit and tier-pricing test coverage; extends the Opus tier test to include 4.5. Refs headroomlabs-ai#1485
claude-haiku-4-5-20251001 was carrying Haiku 3.5 rates ($0.80/$4/$0.08); the current anthropic.com/pricing page lists Haiku 4.5 at $1/MTok input, $5/MTok output, cache read $0.10 (0.1x input). Adds a Haiku tier test. Refs headroomlabs-ai#1485
Opus 4.8, 4.7, and 4.6 were stored at the deprecated Opus tier ($15 input / $75 output / $1.50 cache-read), but all three have shipped at the current Opus tier since 4.5: $5/MTok input, $25/MTok output, and a $0.50/MTok cache read (0.1x input). Verified against the live anthropic.com/pricing table. Replaces the fragile parity-with-4.7 assertion with explicit absolute-value checks across 4.6-4.8 so the rate is pinned rather than silently inherited. Refs headroomlabs-ai#1485
Extend the fallback model tables to cover the newer Sonnet releases and correct the last stale Opus entry, all per the current anthropic.com/pricing page (verified 2026-06-27): - claude-sonnet-4-6: $3/$15/$0.30, 1M context window (ships in the long-context pricing tier — the sonnet pattern default would otherwise report 200K) - claude-sonnet-4-5: $3/$15/$0.30, 200K context - claude-opus-4-5-20251101: corrected from the deprecated $15/$75 Opus tier to the current $5/$25/$0.50 tier shared by 4.5-4.8 Adds Sonnet context-limit and tier-pricing test coverage; extends the Opus tier test to include 4.5. Refs headroomlabs-ai#1485
claude-haiku-4-5-20251001 was carrying Haiku 3.5 rates ($0.80/$4/$0.08); the current anthropic.com/pricing page lists Haiku 4.5 at $1/MTok input, $5/MTok output, cache read $0.10 (0.1x input). Adds a Haiku tier test. Refs headroomlabs-ai#1485
|
Superseded by #1767 — rebased onto latest |
…1767) ## Summary Adds Claude 5 generation metadata and aligns the Anthropic fallback pricing / context-limit tables in `headroom/providers/anthropic.py` with current [Anthropic pricing](https://platform.claude.com/docs/en/about-claude/pricing) (verified 2026-07-04). Several Claude 4.x entries carried stale rates, the new Claude 5 models (Fable 5, Opus 4.8, Sonnet 5) had no metadata, and the generic fallback tests disagreed with the provider tests. Supersedes #1485 (rebased onto latest `main`, squashed to one commit, extended with Sonnet 5 / Fable 5 and the requested fallback-test fixes). ## Changes All values `$ / MTok`; `cached_input` = prompt-cache read = 0.1× input. | Tier | Model | Before | After | Context | |---|---|---|---|---| | Fable | `claude-fable-5` | — *(new)* | $10 / $50 / $1.00 | **1M** | | Opus | `claude-opus-4-8` | — *(new)* | $5 / $25 / $0.50 | **1M** | | Opus | `claude-opus-4-7` | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 1M | | Opus | `claude-opus-4-6` | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 1M | | Opus | `claude-opus-4-5-20251101` | $15 / $75 / $1.50 | $5 / $25 / $0.50 | 200K | | Sonnet | `claude-sonnet-5` | — *(new)* | $3 / $15 / $0.30 | **1M** | | Sonnet | `claude-sonnet-4-6` | — *(new)* | $3 / $15 / $0.30 | **1M** | | Sonnet | `claude-sonnet-4-5` | — *(new)* | $3 / $15 / $0.30 | 200K | | Haiku | `claude-haiku-4-5-20251001` | $0.80 / $4 / $0.08 *(3.5 rates)* | $1 / $5 / $0.10 | 200K | `claude-sonnet-4-20250514` and all Claude 3.x / 3.5.x entries were already correct — left unchanged. The **1M-context** entries (Fable 5, Opus 4.8, Sonnet 5, Sonnet 4.6) are functional, not cosmetic: they ship in the long-context tier, and without explicit entries the `sonnet` / `opus` pattern defaults would report 200K. Sonnet 5 is pinned to the **standard** Sonnet tier ($3 / $15 / $0.30); Anthropic's introductory rate ($2 / $10 through Aug 31 2026) is intentionally not encoded to avoid a time-dependent fixture. ## Fallback-model tests (addresses review on #1485) `_PATTERN_DEFAULTS["opus"]` is aligned to the current Opus tier ($5 / $25 / $0.50) so the generic fallback suite and the provider-specific suite agree: - `test_pricing_for_known_models` — Opus 4.5 pins $5 / $25 / $0.50 - `test_pattern_based_inference_opus` — unknown-opus fallback now $5 / $25 - `test_cost_estimation_for_new_models` — fixture estimate corrected $22.5 → $7.5 - `test_pattern_based_inference_sonnet` — retargeted to `claude-sonnet-6-*` (the old `claude-sonnet-5-*` probe now prefix-matches the real `claude-sonnet-5` key) New provider coverage: - `test_get_context_limit_claude_5_family` — Fable 5 / Opus 4.8 / Sonnet 5 all 1M - `test_pricing_claude_5_family` — exact rate table for the 3 new models Full suite: **48 passed**. ## Source https://platform.claude.com/docs/en/about-claude/pricing
Summary
Aligns the Anthropic fallback pricing and context-limit tables in
headroom/providers/anthropic.pywith the current anthropic.com/pricing rates (verified 2026-06-27). Several Claude 4.x entries were carrying stale or incorrect rates, and the newer Sonnet 4.5 / 4.6 models had no fallback metadata at all.Changes
All values
$ / MTok;cached_input= prompt-cache read = 0.1× input.claude-opus-4-8claude-opus-4-7claude-opus-4-6claude-opus-4-5-20251101claude-sonnet-4-6claude-sonnet-4-5claude-haiku-4-5-20251001claude-sonnet-4-20250514and all Claude 3.x / 3.5.x entries were already correct — left unchanged.The Sonnet 4.6 entry is functional, not cosmetic: it ships in the long-context pricing tier (1M window), and without an explicit entry the
sonnetpattern default would report 200K.Tests
Replaced the original fragile parity-with-Opus-4.7 assertion with parametrized absolute-value tier checks:
test_current_opus_tier_pricing— Opus 4.5–4.8test_current_sonnet_tier_pricing— Sonnet 4 / 4.5 / 4.6test_current_haiku_tier_pricing— Haiku 4.5test_get_context_limit_claude_sonnet_4_6— pins the 1M context windowFull suite: 24 passed, ruff clean.
Notes
_PATTERN_DEFAULTS(theopus/sonnet/haikuregex fallbacks for unknown models) are intentionally left unchanged — they are conservative heuristics, not model-specific entries. Happy to retune in a follow-up.