Skip to content

neon: sync catalog with live AI Gateway probe (add 12 models, remove gpt-5-5)#3019

Open
andrelandgraf wants to merge 3 commits into
anomalyco:devfrom
andrelandgraf:neon-catalog-sync-2026-07
Open

neon: sync catalog with live AI Gateway probe (add 12 models, remove gpt-5-5)#3019
andrelandgraf wants to merge 3 commits into
anomalyco:devfrom
andrelandgraf:neon-catalog-sync-2026-07

Conversation

@andrelandgraf

Copy link
Copy Markdown
Contributor

Summary

Updates the Neon provider to match what the Neon AI Gateway actually serves today. Neon's gateway is powered by Databricks Foundation Model APIs, so I took the current Databricks supported-models list (docs updated 2026-07-02) as the candidate set and probed every model live against a real Neon AI Gateway branch (us-east-2), then live-checked capabilities. The current Neon entries here were missing 11 working models and listed one (gpt-5-5) that the gateway rejects.

Neon catalog id = Databricks pay-per-token endpoint name minus the databricks- prefix (e.g. claude-sonnet-4-6 → underlying global.anthropic.claude-sonnet-4-6).

Added (12) — all verified working via live calls

Proprietary (pass-through provider list pricing), via base_model:

  • gpt-5-3-codex, gpt-5-2-codex, gpt-5-1-codex-max, gpt-5-1-codex-mini (OpenAI Responses API)
  • claude-opus-4-8
  • gemini-3-5-flash

Open-weight (priced from Databricks pay-per-token DBU rate × $0.070/DBU):

  • llama-4-maverick — $0.50/$1.50
  • meta-llama-3-3-70b-instruct — $0.50/$1.50
  • meta-llama-3-1-8b-instruct — $0.15/$0.45
  • qwen35-122b-a10b — $0.22/$2.20
  • qwen3-next-80b-a3b-instruct — $0.15/$1.20
  • gemma-3-12b — $0.15/$0.50

Removed (1)

  • gpt-5-5 — the gateway returns 400 unknown model "gpt-5-5" (any prefix / dialect). Not currently served by Neon.

How this was verified

For each candidate I sent real requests to the gateway and recorded the result:

  • works? — minimal completion (chat-completions dialect, or the OpenAI Responses dialect for the codex models). All 12 above returned 200.
  • image input (attachment) — sent a generated 32×32 PNG data-URI + "what color?". 200 ⇒ supported; 400 does not support image ⇒ not.
  • tool calling (tool_call) — sent a get_weather function tool and checked for an emitted tool call. All 12 support tool calling.

Capability corrections applied so the entries match the gateway (not just the canonical base model):

  • qwen35-122b-a10b: canonical Qwen3.5 is multimodal, but Databricks/Neon serve it text-only (live: no image input). Overridden to attachment = false, modalities.input = ["text"], and limit.output = 8000 per the Databricks doc.
  • meta-llama-3-3-70b-instruct: overridden to attachment = false (text-only on the gateway; live-confirmed).

Notes / open questions for maintainers

  • The databricks:generate script currently IGNOREs llama / qwen / gemma prefixes, so the Databricks provider omits them. These do work on Neon, so I've added them here with Databricks DBU-derived pricing — happy to drop them if you'd prefer Neon to mirror that ignore-list.
  • Open-weight pricing uses the published Databricks pay-per-token DBU rates × $0.070/DBU (the rate that reproduces the headline GPT-OSS-120B $0.15/$0.60). The pre-existing gpt-oss-120b/gpt-oss-20b Neon entries appear to use OpenRouter pricing instead (e.g. gpt-oss-120b = $0.072/$0.28 rather than Databricks' $0.15/$0.60) — left unchanged here, but flagging the inconsistency.
  • Also verified not available on Neon and intentionally not added: claude-sonnet-5, claude-fable-5, meta-llama-3-1-405b-instruct (unknown model), the Gemini image-output models, and all embedding models (the gateway exposes no /embeddings route).

Test plan

  • bun run validate-equivalent: parsed each new TOML, resolved base_model merges, and confirmed required fields (name, attachment, reasoning, tool_call, open_weights, release_date, last_updated, cost.{input,output}, limit.{context,output}, modalities.{input,output}). All 36 Neon models valid.
  • CI schema validation (GitHub Action).

Verified every Databricks Foundation Model API endpoint against a live Neon
AI Gateway branch (us-east-2). Adds 12 models confirmed working (with
live-checked image-input + tool-calling capabilities) and removes gpt-5-5,
which the gateway rejects as an unknown model.
The two inline models (no base_model to inherit from) were missing the
schema-required `description` field, failing CI validation.
Qwen3.5 122B inherits reasoning=true, so the schema requires reasoning_options.
Mirrors the canonical alibaba entry (toggle + budget_tokens).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant