neon: sync catalog with live AI Gateway probe (add 12 models, remove gpt-5-5) #3019
Open
andrelandgraf wants to merge 3 commits into
Verified every Databricks Foundation Model API endpoint against a live Neon AI Gateway branch (us-east-2). Adds 12 models confirmed working (with live-checked image-input + tool-calling capabilities) and removes gpt-5-5, which the gateway rejects as an unknown model.
The two inline models (no base_model to inherit from) were missing the schema-required `description` field, failing CI validation.
Qwen3.5 122B inherits reasoning=true, so the schema requires reasoning_options. Mirrors the canonical alibaba entry (toggle + budget_tokens).
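The two schema rules behind these commit fixes can be sketched as a small check. This is a minimal illustration, not the repository's actual validator: it assumes the entry has already been merged with its `base_model`, the function name `schema_errors` is hypothetical, and only the presence of `reasoning_options` is checked because its exact shape is not shown here.

```python
def schema_errors(resolved: dict) -> list[str]:
    """Return schema problems for an entry already merged with its base model.

    Hypothetical helper: only the two rules named in the commit messages
    are modelled, and only key presence is checked.
    """
    errors = []
    # Rule 1: `description` is required. Inline models have no base_model
    # to inherit it from, so they must set it themselves.
    if "description" not in resolved:
        errors.append("missing required field 'description'")
    # Rule 2: reasoning=true (set directly or inherited) requires
    # reasoning_options.
    if resolved.get("reasoning") is True and "reasoning_options" not in resolved:
        errors.append("reasoning=true requires 'reasoning_options'")
    return errors


# Placeholder entries for illustration only; not real catalog data.
inline_without_description = {"name": "Inline Model", "reasoning": False}
reasoning_without_options = {"description": "d", "reasoning": True}
fixed = {
    "description": "d",
    "reasoning": True,
    "reasoning_options": {"placeholder": True},  # shape is an assumption
}

print(schema_errors(inline_without_description))
print(schema_errors(reasoning_without_options))
print(schema_errors(fixed))
```

The first two calls each report one error and the last reports none, which mirrors the before and after state of the two fixes.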
This was referenced Jul 3, 2026
Summary
Updates the Neon provider to match what the Neon AI Gateway actually serves today. Neon's gateway is powered by Databricks Foundation Model APIs, so I took the current Databricks supported-models list (docs updated 2026-07-02) as the candidate set, probed every model live against a real Neon AI Gateway branch (`us-east-2`), and then live-checked capabilities. The current Neon entries were missing 12 working models and listed one (`gpt-5-5`) that the gateway rejects.
Neon catalog id = Databricks pay-per-token endpoint name minus the `databricks-` prefix (e.g. `claude-sonnet-4-6` → underlying `global.anthropic.claude-sonnet-4-6`).

Added (12): all verified working via live calls
Proprietary (pass-through provider list pricing), via `base_model`:
- `gpt-5-3-codex`, `gpt-5-2-codex`, `gpt-5-1-codex-max`, `gpt-5-1-codex-mini` (OpenAI Responses API)
- `claude-opus-4-8`
- `gemini-3-5-flash`
Open-weight (priced from Databricks pay-per-token DBU rate × $0.070/DBU):
- `llama-4-maverick`: $0.50/$1.50
- `meta-llama-3-3-70b-instruct`: $0.50/$1.50
- `meta-llama-3-1-8b-instruct`: $0.15/$0.45
- `qwen35-122b-a10b`: $0.22/$2.20
- `qwen3-next-80b-a3b-instruct`: $0.15/$1.20
- `gemma-3-12b`: $0.15/$0.50

Removed (1)
- `gpt-5-5`: the gateway returns `400 unknown model "gpt-5-5"` (any prefix / dialect). Not currently served by Neon.

How this was verified
For each candidate I sent real requests to the gateway and recorded the result:
- Availability: live request returned `200`.
- Image input (`attachment`): sent a generated 32×32 PNG data-URI + "what color?". `200` ⇒ supported; `400 does not support image` ⇒ not supported.
- Tool calling (`tool_call`): sent a `get_weather` function tool and checked for an emitted tool call. All 12 support tool calling.

Capability corrections applied so the entries match the gateway (not just the canonical base model):
- `qwen35-122b-a10b`: canonical Qwen3.5 is multimodal, but Databricks/Neon serve it text-only (live: no image input). Overridden to `attachment = false`, `modalities.input = ["text"]`, and `limit.output = 8000` per the Databricks doc.
- `meta-llama-3-3-70b-instruct`: overridden to `attachment = false` (text-only on the gateway; live-confirmed).

Notes / open questions for maintainers
- The `databricks:generate` script currently `IGNORE`s `llama`/`qwen`/`gemma` prefixes, so the Databricks provider omits them. These do work on Neon, so I've added them here with Databricks DBU-derived pricing. Happy to drop them if you'd prefer Neon to mirror that ignore-list.
- The `gpt-oss-120b`/`gpt-oss-20b` Neon entries appear to use OpenRouter pricing instead (e.g. `gpt-oss-120b` = $0.072/$0.28 rather than Databricks' $0.15/$0.60). Left unchanged here, but flagging the inconsistency.
- Not added: `claude-sonnet-5`, `claude-fable-5`, `meta-llama-3-1-405b-instruct` (unknown model), the Gemini image-output models, and all embedding models (the gateway exposes no `/embeddings` route).

Test plan
- `bun run validate`-equivalent: parsed each new TOML, resolved `base_model` merges, and confirmed required fields (`name`, `attachment`, `reasoning`, `tool_call`, `open_weights`, `release_date`, `last_updated`, `cost.{input,output}`, `limit.{context,output}`, `modalities.{input,output}`). All 36 Neon models valid.
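For reviewers who want to reproduce the check, the validation described in the test plan can be approximated as below. This is a rough sketch, not the repository's validator: it works on already-parsed TOML tables (for example from Python's `tomllib.loads`), assumes `base_model` resolution is a deep merge where the entry's own keys win, and uses hypothetical names (`merge`, `resolve`, `missing_fields`, `example/base`) with placeholder values rather than real catalog data.

```python
from copy import deepcopy

# Required fields as listed in the test plan; dotted paths are nested tables.
REQUIRED = [
    "name", "attachment", "reasoning", "tool_call", "open_weights",
    "release_date", "last_updated",
    "cost.input", "cost.output",
    "limit.context", "limit.output",
    "modalities.input", "modalities.output",
]


def merge(base: dict, override: dict) -> dict:
    """Deep-merge override onto base (assumed semantics: the entry wins)."""
    out = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = deepcopy(value)
    return out


def resolve(entry: dict, catalog: dict) -> dict:
    """Merge an entry onto its base_model, if it names one."""
    base_id = entry.get("base_model")
    if base_id is None:
        return deepcopy(entry)
    return merge(catalog[base_id], entry)


def missing_fields(model: dict) -> list[str]:
    """Return the dotted paths from REQUIRED that are absent in model."""
    missing = []
    for path in REQUIRED:
        node = model
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(path)
                break
            node = node[part]
    return missing


# Hypothetical base entry; every value is a placeholder.
CATALOG = {
    "example/base": {
        "name": "Example Base",
        "attachment": True,
        "reasoning": False,
        "tool_call": True,
        "open_weights": True,
        "release_date": "2000-01-01",
        "last_updated": "2000-01-01",
        "cost": {"input": 1.0, "output": 2.0},
        "limit": {"context": 1000, "output": 100},
        "modalities": {"input": ["text", "image"], "output": ["text"]},
    }
}

# Gateway-specific overrides in the spirit of the qwen35-122b-a10b correction.
ENTRY = {
    "base_model": "example/base",
    "attachment": False,
    "limit": {"output": 8000},
    "modalities": {"input": ["text"]},
}

resolved = resolve(ENTRY, CATALOG)
print(missing_fields(resolved))  # []
print(resolved["limit"])         # {'context': 1000, 'output': 8000}
```

The deep merge lets an entry override a single nested key such as `limit.output` while still inheriting `limit.context` from the base. Whether the real merge behaves this way for nested tables is an assumption.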