Skip to content

Studio: heal DiffusionGemma tool calls into structured tool_calls#6851

Open
oobabooga wants to merge 2 commits into
unslothai:mainfrom
oobabooga:studio-diffusion-api-compat
Open

Studio: heal DiffusionGemma tool calls into structured tool_calls#6851
oobabooga wants to merge 2 commits into
unslothai:mainfrom
oobabooga:studio-diffusion-api-compat

Conversation

@oobabooga

@oobabooga oobabooga commented Jul 3, 2026

Copy link
Copy Markdown
Member

DiffusionGemma tool calls come back as raw text instead of a structured tool_calls array. A client that declares a tool gets <|tool_call>call:get_weather{...}<tool_call|> in the message content with finish_reason: stop, so OpenAI-v1 clients like pi can't see the call.

Closes #6732

(Thinking already works; only tool calls were affected.)

Problem

supports_tools is off for diffusion, which keeps the agentic tool loop from running (it would drop the canvas frames that drive the visualization). But that same flag also switched off the client-tool passthrough, the path that turns the model's text tool call into a structured tool_calls.

So the tool call reached the client as raw text. The parser already understands DiffusionGemma's <|tool_call> format (from #6801); it just wasn't being reached.

Fix

Split the flag in two: supports_tools still gates the agentic loop (stays off for diffusion), and a new supports_tool_passthrough gates only the client-tool passthrough (on for diffusion).

Non-diffusion models are unaffected, since both flags return the same value there.

Verification

Live on unsloth/diffusiongemma-26B-A4B-it-GGUF with a declared tool: OpenAI (streaming and non-streaming) and Anthropic /v1/messages now return proper tool_calls / tool_use, and the diffusion visualization still streams. The existing passthrough and route suites pass unchanged.

Note

This is the Studio half. The model only emits schema-correct arguments once the shim (unslothai/unsloth-zoo#864) and the visual server (llama.cpp #24423) forward the tool definitions too, so it's a no-op until a client sends tools.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new property supports_tool_passthrough to the llama_cpp backend, which returns the underlying _supports_tools value. This property is then used in inference.py to determine tool passthrough and client tool support, falling back to supports_tools if the property is not present. This change ensures that client tool loops can bypass restrictions placed on supports_tools for specific models like DiffusionGemma. There are no review comments to address, and the changes look correct.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@oobabooga

Copy link
Copy Markdown
Member Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] diffusiongemma not OpenAI-v1 compatible

1 participant