neon: route GPT-5 via Responses API + mark image output #3021
Open
andrelandgraf wants to merge 4 commits into
Verified every Databricks Foundation Model API endpoint against a live Neon AI Gateway branch (us-east-2). Adds 12 models confirmed working (with live-checked image-input + tool-calling capabilities) and removes gpt-5-5, which the gateway rejects as an unknown model.
The two inline models (no base_model to inherit from) were missing the schema-required `description` field, failing CI validation.
Qwen3.5 122B inherits reasoning=true, so the schema requires reasoning_options. Mirrors the canonical alibaba entry (toggle + budget_tokens).
The 12 GPT-5 models are served on Neon's OpenAI Responses route (/ai-gateway/openai/v1), not the mlflow chat-completions default — the codex variants are Responses-only (chat/completions returns 400). Add a per-model [provider] override (shape=responses, openai/v1 api, @ai-sdk/openai) so per-model-aware consumers route correctly, and mark modalities.output with "image" since all 12 support the Responses image_generation built-in tool (verified live). The provider default stays mlflow for the other models.
Summary
Stacked on #3019 (branched off `neon-catalog-sync-2026-07`). Once #3019 merges, this PR's diff reduces to the 12 GPT‑5 model files below.

Two related corrections to the Neon provider's GPT‑5 models, both verified live against the gateway:

1. Route GPT‑5 via the Responses API, not mlflow chat-completions. The Neon provider's default `api` is `…/ai-gateway/mlflow/v1`, but the GPT‑5 models are served on the OpenAI Responses route (`…/ai-gateway/openai/v1`). The codex variants are Responses-only: mlflow returns `400 "model … is not available on the chat_completions endpoint"`. So the current entries silently break for consumers that use the provider default. Each of the 12 GPT‑5 models now carries a per-model `[provider]` override (shape `responses`, the `openai/v1` API, `@ai-sdk/openai`). This is the same per-model override pattern Amazon Bedrock uses to point `openai.gpt-5.5` at a `/openai/v1` Responses endpoint. The provider-level default stays `mlflow/v1` for the rest of the catalog (Claude/Gemini/Llama/Qwen), which works on chat-completions.

2. Mark `modalities.output` with `"image"`. All 12 GPT‑5 models generate images via the Responses `image_generation` built-in tool, verified live (each returned a real JPEG, ~33–38 KB). Now that they're pinned to the Responses route where that tool works, `output: ["text","image"]` is accurate. This also gives the Neon docs an "Image" category via the standard `modalities.output.includes("image")` filter (the same convention OpenRouter uses for its GPT-image models).

Nothing changes for the non-GPT‑5 models.
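The two changes above can be sketched from a consumer's point of view. This is only an illustration under assumptions, not the models.dev implementation: the `ProviderConfig` and `ModelEntry` types, the `npm` field name, the `"completions"` shape label, the `hypothetical-chat-model` id, and the input modalities shown are all made up for the example. The `…` in the URLs is kept elided, as in the description.

```typescript
// Hypothetical types: field names are assumptions for illustration only.
interface ProviderConfig {
  api: string;
  shape: string;
  npm?: string;
}

interface ModelEntry {
  id: string;
  provider?: Partial<ProviderConfig>; // per-model [provider] override, if present
  modalities: { input: string[]; output: string[] };
}

// Provider-level default: the mlflow chat-completions route (host elided as in the PR).
const providerDefault: ProviderConfig = {
  api: "…/ai-gateway/mlflow/v1",
  shape: "completions", // assumed label for the chat-completions shape
};

// A per-model-aware consumer merges the override over the provider default.
function resolveProvider(model: ModelEntry, fallback: ProviderConfig): ProviderConfig {
  return { ...fallback, ...(model.provider ?? {}) };
}

// One GPT-5 entry with the override, and one made-up entry without it.
const codex: ModelEntry = {
  id: "gpt-5-3-codex",
  provider: { api: "…/ai-gateway/openai/v1", shape: "responses", npm: "@ai-sdk/openai" },
  modalities: { input: ["text", "image"], output: ["text", "image"] }, // input is illustrative
};

const chatModel: ModelEntry = {
  id: "hypothetical-chat-model",
  modalities: { input: ["text"], output: ["text"] },
};

const models: ModelEntry[] = [codex, chatModel];

// The "Image" category filter mentioned above.
const imageModels: string[] = models
  .filter((m) => m.modalities.output.includes("image"))
  .map((m) => m.id);

console.log(resolveProvider(codex, providerDefault).shape);
console.log(resolveProvider(chatModel, providerDefault).api);
console.log(imageModels);
```

A consumer that skips the merge step and always uses `providerDefault` would send the codex model to the mlflow route, which is the failure described in item 1.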
Models changed (12)
`gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-5-1`, `gpt-5-2`, `gpt-5-4`, `gpt-5-4-mini`, `gpt-5-4-nano`, `gpt-5-3-codex`, `gpt-5-2-codex`, `gpt-5-1-codex-max`, `gpt-5-1-codex-mini`

Notes
- `modalities.input` is preserved per model (some GPT‑5 models add `pdf`); only `output` gains `image`.
- Consumer support for per-model `[provider]` overrides varies. models.dev/opencode honor them; some flat-registry consumers (e.g. Mastra's models.dev gateway) currently route by provider id and ignore per-model overrides. I'm opening an upstream Mastra PR to make per-model `shape`/`api` overrides work generally.

Test plan
- `base_model` merge validation: all 36 Neon models valid; the 12 GPT‑5 models resolve to `modalities.output ⊇ [image]` and `provider.shape = responses`.
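The check in the test plan could look roughly like the sketch below. It assumes the `base_model` merge has already produced one resolved record per model; the `Resolved` type and the `checkGpt5` helper are hypothetical names and are not part of the repository's validation tooling.

```typescript
// The 12 model ids listed under "Models changed".
const GPT5_IDS: string[] = [
  "gpt-5", "gpt-5-mini", "gpt-5-nano", "gpt-5-1", "gpt-5-2", "gpt-5-4",
  "gpt-5-4-mini", "gpt-5-4-nano", "gpt-5-3-codex", "gpt-5-2-codex",
  "gpt-5-1-codex-max", "gpt-5-1-codex-mini",
];

// Hypothetical shape of a model after base_model merging.
interface Resolved {
  id: string;
  output: string[]; // resolved modalities.output
  shape?: string;   // resolved provider.shape, if any
}

// Returns one message per violated expectation; an empty array means all checks pass.
function checkGpt5(resolved: Resolved[]): string[] {
  const byId = new Map<string, Resolved>();
  for (const m of resolved) byId.set(m.id, m);

  const problems: string[] = [];
  for (const id of GPT5_IDS) {
    const m = byId.get(id);
    if (!m) {
      problems.push(`${id}: missing from resolved catalog`);
      continue;
    }
    if (!m.output.includes("image")) {
      problems.push(`${id}: modalities.output lacks "image"`);
    }
    if (m.shape !== "responses") {
      problems.push(`${id}: provider.shape is ${m.shape ?? "unset"}`);
    }
  }
  return problems;
}

// Example: a catalog where every GPT-5 model resolves as expected.
const sample: Resolved[] = GPT5_IDS.map((id) => ({
  id,
  output: ["text", "image"],
  shape: "responses",
}));
console.log(checkGpt5(sample).length);
```

Returning a list of messages rather than throwing on the first failure makes it easier to see every model that resolves incorrectly in one run.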