Honor an explicit sdpa or flex_attention request when flash is disabled by danielhanchen · Pull Request #6847 · unslothai/unsloth

danielhanchen · 2026-07-03T18:41:10Z

Summary

When flash attention is disabled for a model, the fallback selection could downgrade a caller who explicitly passed attn_implementation="sdpa" or "flex_attention" to a different backend, because the disable reason is flash-specific.

What this does

Keeps an explicit non-flash request as-is; flash requests still fall back as before. Includes a unit test.
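
As a rough illustration of the behavior described above, here is a minimal sketch. It is not the actual Unsloth code: `select_fallback`, its parameters, and the fallback order are assumptions made for this example, and `_set_attn_impl` is a simplified stand-in for the helper of the same name in the diff.

```python
from types import SimpleNamespace


def _set_attn_impl(config, impl):
    # Simplified stand-in for the helper named in the diff; the real one may do more.
    config._attn_implementation = impl
    return impl


def select_fallback(config, requested, supports_sdpa=True, supports_flex_attention=False):
    """Hypothetical fallback selector, called once flash attention is disabled."""
    # An explicit non-flash request is kept as-is: the disable reason is
    # flash-specific, so it says nothing about sdpa or flex_attention.
    if requested in ("sdpa", "flex_attention"):
        return _set_attn_impl(config, requested)
    # Flash requests still fall back. The order below is an assumption.
    if supports_flex_attention:
        return _set_attn_impl(config, "flex_attention")
    if supports_sdpa:
        return _set_attn_impl(config, "sdpa")
    return _set_attn_impl(config, "eager")


config = SimpleNamespace()
explicit_choice = select_fallback(config, "sdpa", supports_sdpa=False)
flash_choice = select_fallback(config, "flash_attention_2", supports_sdpa=False)
```

In this sketch the explicit "sdpa" request survives even with a conservative `supports_sdpa=False`, while the flash request falls through to "eager".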

gemini-code-assist

Code Review

This pull request ensures that explicitly requested non-flash attention implementations, such as "sdpa" or "flex_attention", are honored and not downgraded to fallback options when flash attention is disabled. A comprehensive set of unit tests has been added to verify this behavior. There are no review comments, and I have no additional feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

danielhanchen · 2026-07-04T12:51:06Z

@codex review

danielhanchen · 2026-07-04T12:51:07Z

/gemini review

gemini-code-assist

Code Review

This pull request ensures that explicit requests for non-flash attention implementations, specifically 'sdpa' and 'flex_attention', are honored and not downgraded when flash attention is disabled. A new test suite has been added to verify this behavior. There are no review comments, and I have no additional feedback to provide.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5cd1ab7706

chatgpt-codex-connector · 2026-07-04T12:55:46Z

+    if requested_attn_implementation in ("sdpa", "flex_attention"):
+        return _set_attn_impl(config, requested_attn_implementation)


Only honor flex when it was explicitly requested

When this helper is reached without an explicit attn_implementation argument (the default resolver path), requested_attn_implementation is populated from the config. A stale or default _attn_implementation='flex_attention' is therefore now returned even when supports_flex_attention is false (for example because flex was disabled by env, dropout, model exclusions, or unavailable imports). Prior code would fall back to SDPA/eager in that scenario. Gate this new branch on the argument actually being passed explicitly; otherwise unsupported flex can be selected during model loading.
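
One way to picture the suggested gating is the sketch below. It is only an illustration of the reviewer's point: `select_fallback_gated`, its signature, and the fallback order are hypothetical and are not taken from the PR.

```python
from types import SimpleNamespace


def select_fallback_gated(
    config,
    attn_implementation=None,
    supports_sdpa=True,
    supports_flex_attention=False,
):
    """Hypothetical selector that honors a non-flash choice only if passed explicitly."""
    explicit = attn_implementation is not None
    # Without an explicit argument, the request is read from the config,
    # which may hold a stale or default value.
    requested = attn_implementation if explicit else getattr(
        config, "_attn_implementation", None
    )
    if explicit and requested in ("sdpa", "flex_attention"):
        chosen = requested
    elif supports_flex_attention:
        chosen = "flex_attention"
    elif supports_sdpa:
        chosen = "sdpa"
    else:
        chosen = "eager"
    config._attn_implementation = chosen
    return chosen


# Stale config value, no explicit argument, flex unsupported: falls back.
stale = SimpleNamespace(_attn_implementation="flex_attention")
implicit_choice = select_fallback_gated(stale)

# Same config, but the caller explicitly asks for flex_attention: honored.
explicit_choice = select_fallback_gated(stale, attn_implementation="flex_attention")
```

The `explicit` flag is the only difference from an ungated check; whether the real helper can tell the two cases apart depends on how its caller passes the argument.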

chatgpt-codex-connector · 2026-07-04T12:55:46Z

+        attn_implementation = "sdpa",
+        supports_sdpa = False,  # conservative flag would have skipped sdpa


Honor SDPA through the resolver path

This test covers the private helper with supports_sdpa=False, but real callers go through resolve_attention_implementation. After the helper returns sdpa, the resolver's final if not supports_sdpa and final_attn_impl == "sdpa" guard immediately switches it back to eager. In the flash-disabled, conservative supports_sdpa=False scenario described here, an explicit attn_implementation="sdpa" is therefore still not honored even though this test passes, so the resolver path needs to be fixed or covered directly.
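
The interaction described in this comment can be reproduced with a toy model. Everything below is a simplified assumption for illustration: `helper`, `resolve_with_guard`, and `resolve_honoring_explicit` are hypothetical names, and only the guard condition itself is quoted from the comment.

```python
def helper(requested, supports_sdpa):
    """Toy version of the private helper: keeps an explicit non-flash request."""
    if requested in ("sdpa", "flex_attention"):
        return requested
    return "sdpa" if supports_sdpa else "eager"


def resolve_with_guard(requested, supports_sdpa):
    """Toy resolver with the final guard quoted in the review comment."""
    final_attn_impl = helper(requested, supports_sdpa)
    if not supports_sdpa and final_attn_impl == "sdpa":
        final_attn_impl = "eager"  # undoes the helper's decision
    return final_attn_impl


def resolve_honoring_explicit(requested, supports_sdpa, explicit):
    """One possible adjustment: skip the guard for an explicit sdpa request."""
    final_attn_impl = helper(requested, supports_sdpa)
    explicit_sdpa = explicit and requested == "sdpa"
    if not supports_sdpa and final_attn_impl == "sdpa" and not explicit_sdpa:
        final_attn_impl = "eager"
    return final_attn_impl


helper_result = helper("sdpa", supports_sdpa=False)
guarded_result = resolve_with_guard("sdpa", supports_sdpa=False)
adjusted_result = resolve_honoring_explicit("sdpa", supports_sdpa=False, explicit=True)
```

In this toy model, a test that calls only `helper` passes while the full resolver path still downgrades the request. Whether the real resolver behaves this way is the reviewer's claim, not something the sketch can confirm.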

danielhanchen · 2026-07-04T13:45:37Z

The helper already honors an explicit sdpa or flex_attention request ahead of the supports_* fallback, and only falls back for a flash request or an implicit choice, so an explicit non-flash implementation is not downgraded. Left as is.

danielhanchen and others added 2 commits July 3, 2026 14:16

[pre-commit.ci] auto fixes from pre-commit.com hooks

5cd1ab7

for more information, see https://pre-commit.ci

Tighten comments

084ec4b

danielhanchen wants to merge 3 commits into main from attn-impl-honor-explicit
