Auto Xet to HTTP download fallback in from_pretrained; share Studio's fallback via unsloth_zoo by danielhanchen · Pull Request #6638 · unslothai/unsloth

danielhanchen · 2026-06-24T11:39:58Z

Summary

Hugging Face Xet (hf_xet) is the fast default download transport, but a Xet transfer can hang on a blob with no progress and no exception, and a blocked native Xet thread cannot be killed in-process. Unsloth Studio already recovers from this; the shared logic now lives in unsloth_zoo.hf_xet_fallback (see unslothai/unsloth-zoo#829). This PR puts it to use in two places.

Depends on unslothai/unsloth-zoo#829 (the unsloth_zoo.hf_xet_fallback / unsloth_zoo.hf_cache_state helper). Merge that first; until it lands, the unsloth-main change is a no-op (the helper import fails and the load proceeds normally).

unsloth main

FastModel / FastVisionModel / FastLanguageModel.from_pretrained now warm the repo in a killable subprocess before the in-process weight load. A new maybe_prefetch_hf_snapshot (in models/_utils.py) calls unsloth_zoo.hf_xet_fallback.snapshot_download_with_xet_fallback, which keeps Xet primary and, on a no-progress stall, kills the child, makes the partial safe for HTTP, and respawns once with HF_HUB_DISABLE_XET=1. Because the snapshot is then fully cached, the existing from_pretrained is a cache hit and cannot hang on Xet.

It is best-effort and guarded:

local paths, local_files_only, and offline (HF_HUB_OFFLINE / TRANSFORMERS_OFFLINE) are skipped;
the vLLM fast_inference path is skipped (it has its own downloader);
an older unsloth_zoo without the helper is a no-op (the load proceeds unchanged);
a fully cached repo resolves in-process via local_files_only with no subprocess;
only a both-transports-stalled DownloadStallError is raised (a clear network error instead of a silent hang); any other hiccup falls through to the normal load.

The call sits inside the existing HF_HUB_ENABLE_HF_TRANSFER window in vision.py and llama.py.

Studio

studio/backend/utils/hf_xet_fallback.py is now a thin shim over unsloth_zoo.hf_xet_fallback. It re-exports the shared API (start_watchdog, get_hf_download_state, child_should_disable_xet, hf_hub_download_with_xet_fallback, snapshot_download_with_xet_fallback, DownloadStallError) and injects Studio's marker-aware prepare_cache_for_transport on the HTTP retry, so the download manager keeps its .transport marker semantics. The local copy of the watchdog + spawn-child download + Xet to HTTP retry is removed (the module drops ~450 lines). Call sites (core/inference/llama_cpp.py, core/training/worker.py) and the orchestrator's DownloadStallError import are unchanged.

The hub's hf_cache_state.py keeps its stdlib-only primitives (the hub stays import-light: its unit tests stub pydantic/fastapi and must not pull torch/transformers). Only the substantial fallback orchestration is shared.

Tests

studio/backend/tests/test_hf_xet_fallback.py is slimmed to the shim-specific behavior (re-export surface, and that the shim injects Studio's prepare_cache_for_transport on the HTTP retry); the full watchdog/transport matrix is tested in unsloth_zoo.
The GGUF and training integration tests (test_gguf_xet_fallback_integration.py, test_training_xet_fallback.py) are unchanged and pass via the shim.
The hub suite stays green and ML-free (hub/tests: 109 passed, ~0.5s).

End-to-end (against unsloth_zoo#829): forcing the Xet attempt to stall on a tiny real repo fires the watchdog after the timeout, kills the child, and completes the download over HTTP, returning a valid snapshot.

… via unsloth_zoo Hugging Face Xet downloads can hang on a blob with no progress and no exception, and a blocked native Xet thread cannot be killed in-process. Studio already recovers from this; the shared logic now lives in unsloth_zoo.hf_xet_fallback. unsloth main: FastModel/FastVisionModel/FastLanguageModel.from_pretrained now warm the repo in a killable subprocess (maybe_prefetch_hf_snapshot -> snapshot_download_with_xet_fallback) before the in-process load, so a stalled Xet transfer auto-recovers over HTTP and the subsequent load is a cache hit that cannot hang. Best-effort and guarded: local paths, offline/local_files_only, and the vLLM fast_inference path are skipped; an older unsloth_zoo without the helper is a no-op. Studio: utils/hf_xet_fallback.py is now a thin shim over unsloth_zoo.hf_xet_fallback that injects Studio's marker-aware prepare_cache_for_transport, replacing its local copy of the watchdog + spawn-child download + Xet->HTTP retry. Call sites and the orchestrator's DownloadStallError import are unchanged. The hf_xet_fallback unit test is slimmed to the shim-specific behavior (the full matrix is tested in unsloth_zoo); the GGUF and training integration tests are unchanged.

chatgpt-codex-connector · 2026-06-24T11:40:04Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

for more information, see https://pre-commit.ci

gemini-code-assist

Code Review

This pull request refactors the Xet-to-HTTP fallback mechanism by delegating the core watchdog and download logic to a shared implementation in unsloth_zoo, while maintaining a Studio-specific shim that injects marker-aware cache preparation. Additionally, it introduces a pre-download step (maybe_prefetch_hf_snapshot) in a killable subprocess for Llama and Vision models to prevent in-process hangs on stalled Xet transfers. Feedback focuses on improving robustness: first, by wrapping the cache preparation call in a try-except block to prevent unexpected failures from crashing the fallback process; and second, by properly expanding user paths and checking for local path indicators to avoid false-positive download warnings for local model directories.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-24T11:41:52Z

+def _studio_prepare_for_http(repo_type: str, repo_id: str) -> None:
+    """Make the partial safe for an HTTP resume using Studio's marker-aware purge,
+    so the download manager's ``.transport`` marker accounting stays consistent
+    (vs the generic delete-incompletes default in unsloth_zoo)."""
+    from hub.utils.download_registry import prepare_cache_for_transport
+    prepare_cache_for_transport(repo_type, repo_id, "http")



If prepare_cache_for_transport raises an exception (for example, due to permission issues or missing directories), it will propagate and crash the entire fallback download process, preventing the HTTP retry from starting.

To ensure robustness, wrap this call in a try...except block and log any failure as a debug message. Avoid using broad, silent exception handlers like except Exception: pass. If importing the logger fails, catch the specific ModuleNotFoundError and verify the module name to avoid suppressing unrelated import errors.

def _studio_prepare_for_http(repo_type: str, repo_id: str) -> None: """Make the partial safe for an HTTP resume using Studio's marker-aware purge, so the download manager's ``.transport`` marker accounting stays consistent (vs the generic delete-incompletes default in unsloth_zoo).""" try: from hub.utils.download_registry import prepare_cache_for_transport prepare_cache_for_transport(repo_type, repo_id, "http") except Exception as e: try: from loggers import get_logger get_logger(__name__).debug("prepare_cache_for_transport failed for %s: %s", repo_id, e) except ModuleNotFoundError as err: if err.name != "loggers": raise

References

Avoid using broad, silent exception handlers like except Exception: pass. Instead, log the exception, even if at a debug level, to aid in future debugging.

When catching an ImportError for an optional dependency, prefer catching the more specific ModuleNotFoundError and check the module name to avoid suppressing unrelated import errors.

gemini-code-assist · 2026-06-24T11:41:52Z

+    if not isinstance(model_name, str) or not model_name:
+        return
+    # A local directory / file path has nothing to download.
+    if os.path.isdir(model_name) or os.path.exists(model_name):
+        return


Checking os.path.exists(model_name) directly will fail to detect local paths that start with ~ (since it is not expanded) or non-existent local paths (e.g., ./my_model or absolute paths). This will cause the prefetcher to attempt a download from the Hugging Face Hub, resulting in confusing false-positive warnings.

We should expand user paths using os.path.expanduser and check if the path starts with typical local path indicators (like ., /, ~, \\, or is an absolute path) to return early.

Suggested change

if not isinstance(model_name, str) or not model_name:

return

# A local directory / file path has nothing to download.

if os.path.isdir(model_name) or os.path.exists(model_name):

return

if not isinstance(model_name, str) or not model_name:

return

# A local directory / file path has nothing to download.

expanded_model_name = os.path.expanduser(model_name)

if os.path.isdir(expanded_model_name) or os.path.exists(expanded_model_name):

return

if expanded_model_name.startswith((".", "/", "~")) or "\\\\" in expanded_model_name or os.path.isabs(expanded_model_name):

return

…une prefetch - Studio shim: guard the unsloth_zoo.hf_xet_fallback import and raise an actionable RuntimeError if the installed unsloth_zoo predates the shared helper, instead of a bare ModuleNotFoundError at Studio startup. - Studio shim: wrap snapshot_download_with_xet_fallback too (not just the single-file path) so its HTTP retry uses Studio's marker-aware prepare_cache_for_transport. - unsloth main: maybe_prefetch_hf_snapshot now passes a conservative ignore_patterns so prewarming a mixed-format repo does not pull ONNX/TF/Flax/CoreML/GGUF/training state the Transformers load never reads (ignore list, not an allowlist, so no needed file is dropped). - Add a Studio shim test for the snapshot prep injection.

…et-fallback-wire

for more information, see https://pre-commit.ci

Warm the adapter repo with the Xet->HTTP stall fallback before PeftModel.from_pretrained at both adapter-load sites in loader.py, so a stalled Xet transfer of a saved LoRA cannot hang the in-process load the same way it can for a base model. Thread subfolder through maybe_prefetch_hf_snapshot and drop the checkpoint-*/* ignore when the caller loads from a checkpoint-* subfolder, so prewarming never skips the exact files the load needs.

…nsloth into hf-xet-fallback-wire

for more information, see https://pre-commit.ci

danielhanchen · 2026-06-24T23:26:12Z

@codex review

danielhanchen · 2026-06-24T23:26:13Z

/gemini review

gemini-code-assist · 2026-06-24T23:26:15Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ec9c6e9e5b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-24T23:37:05Z

-        ),
-        daemon = True,
+try:
+    from unsloth_zoo.hf_xet_fallback import (


Bump the zoo dependency before importing the new helper

In environments that upgrade this package while keeping an already-installed unsloth_zoo that still satisfies the current metadata floor (unsloth_zoo>=2026.6.7), importing this Studio shim now raises before any worker can start because unsloth_zoo.hf_xet_fallback is required at module import. The main Unsloth path treats the missing helper as a no-op, but Studio paths that import the watchdog/GGUF fallback fail immediately; either raise the dependency floor to the first zoo release containing this module or keep a compatible local fallback.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-24T23:37:05Z

+            model_name,
+            token = token,
+            revision = revision,
+            cache_dir = cache_dir,
+            ignore_patterns = ignore_patterns,


Limit prefetches to the requested subfolder

When callers load from a specific subfolder, this still snapshots the whole repository; the subfolder argument only affects whether checkpoint-*/* is ignored. For repos that publish multiple checkpoints or variants in sibling subfolders, the normal Transformers load would fetch only the selected subfolder, but the new warmup can force unrelated multi-GB files to download first and exhaust time or disk before the actual load starts. Restrict the snapshot with subfolder-aware patterns, or skip the prefetch for subfolder loads.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-24T23:37:05Z

+                revision = revision,
+                cache_dir = kwargs.get("cache_dir"),
+                local_files_only = local_files_only,
+                fast_inference = fast_inference,


Warm PEFT adapters when fast inference is enabled

When a LoRA/PEFT repo is loaded with fast_inference=True, passing that flag here makes maybe_prefetch_hf_snapshot return early because it assumes vLLM will do the download. In this branch the adapter is still fetched by the following PeftModel.from_pretrained(...) call in the current process, so a stalled Xet transfer for the adapter can still hang instead of using the new HTTP fallback. The base-model vLLM path should stay skipped, but the adapter warmup should not.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-24T23:37:06Z

+                old_model_name,
+                token = token,
+                revision = revision,
+                cache_dir = kwargs.get("cache_dir"),


Reuse the warmed cache for adapter loads

When a caller supplies a custom cache_dir for a LoRA/PEFT load, this prefetch warms the adapter into that non-default cache, but the following PeftModel.from_pretrained(...) call does not receive cache_dir, so it looks in the default HF cache and can start a second in-process Xet download. In that configuration the new fallback does not protect the actual adapter load and also doubles the download/storage; pass the same cache directory to PEFT or prefetch where PEFT will read.

Useful? React with 👍 / 👎.

…eview) - _studio_prepare_for_http: wrap the marker-aware purge so a purge failure (locked file, missing dir) is logged at debug rather than aborting the HTTP retry; the shared orchestrator already guards it, this is belt-and-braces. - maybe_prefetch_hf_snapshot: expand ~ before the on-disk check and treat an absolute / ~ / explicitly-relative / Windows path that is not on disk yet as local, so it is not misread as a Hub repo id and does not emit a confusing pre-download warning.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f6e137c6f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-25T00:02:29Z

 import huggingface_hub
-from huggingface_hub import constants as hf_constants

+import unsloth_zoo.hf_xet_fallback as shared


Make shim tests independent of missing unsloth_zoo

In the backend CI job I checked, the dependency step installs studio/backend/requirements/studio.txt plus explicit extras/torch/transformers (.github/workflows/studio-backend-ci.yml lines 64-80), but it does not install requirements/base.txt where unsloth-zoo is listed. This module-level import therefore fails test collection with ModuleNotFoundError: No module named 'unsloth_zoo' in that workflow before any tests can run; either add the dependency to that job or stub/import-skip the shared module in this test.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T00:02:29Z

+    from unsloth_zoo.hf_xet_fallback import (
+        DEFAULT_GRACE_PERIOD,
+        DEFAULT_HEARTBEAT_INTERVAL,
+        DEFAULT_STALL_TIMEOUT,
+        DownloadStallError,
+        child_should_disable_xet,
+        get_hf_download_state,
+        start_watchdog,


Defer zoo import until after training patches

The training worker imports child_should_disable_xet at the very start of run_training_process (core/training/worker.py line 2063), but this top-level shim import now imports unsloth_zoo immediately. That happens before the Windows ROCm torchao/runtime patches whose ordering comment explicitly says they must run before importing transformers/unsloth_zoo (core/training/worker.py lines 2326-2331), so Windows ROCm training workers can hit the same import-time failures those patches are meant to prevent. Keep the early child_should_disable_xet path stdlib-only or move the zoo import until after the worker setup.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T00:02:29Z

+    "*.h5",
+    "*.msgpack",


Do not skip requested TF or Flax weights

When callers pass from_tf=True or from_flax=True, those kwargs are forwarded to the later Transformers from_pretrained call, which then needs tf_model.h5 or flax_model.msgpack from the repo. The new warmup ignores those files unconditionally, so it can report success while leaving the actual requested weights uncached; the following in-process load then starts the Xet download that this fallback was supposed to make killable. Make these ignores conditional on the load format, or pass enough context into the prefetch helper to avoid dropping requested weights.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T00:02:29Z

+        maybe_prefetch_hf_snapshot(
+            model_name,
+            token = token,
+            revision = revision,
+            cache_dir = kwargs.get("cache_dir"),


Move direct Llama warmup after config validation

For direct FastLlamaModel.from_pretrained(...) calls, this now snapshots the whole remote repo before the AutoConfig/architecture checks below run. If the repo is unsupported by the installed Transformers version or is not a Llama-compatible model, the old path failed after fetching only config metadata, but this path can download multi-GB weights first and then raise the same validation error; move the warmup until after the config/model-class validation succeeds.

Useful? React with 👍 / 👎.

danielhanchen · 2026-06-25T00:40:16Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f6e137c6f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-25T00:55:06Z

-        ),
-        daemon = True,
+try:
+    from unsloth_zoo.hf_xet_fallback import (


Bump unsloth_zoo before importing shared fallback

The package metadata still allows unsloth_zoo>=2026.6.7 (unchanged in pyproject.toml), but this new top-level import now requires the newer unsloth_zoo.hf_xet_fallback module. In an environment that satisfies the declared dependency with the older zoo package, importing Studio's utils.hf_xet_fallback raises during startup (for example through the orchestrator/llama_cpp imports), whereas the previous local implementation worked; please bump/pin the required zoo version or keep a compatibility fallback.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T00:55:06Z

+                revision = revision,
+                cache_dir = kwargs.get("cache_dir"),
+                local_files_only = local_files_only,
+                fast_inference = fast_inference,


Warm adapters despite fast_inference

When loading a PEFT adapter with fast_inference=True, this passes the flag into maybe_prefetch_hf_snapshot, whose fast-inference guard returns immediately because vLLM handles base-model downloads. The next call still invokes PeftModel.from_pretrained for the adapter in-process, so vLLM/LoRA loads get no killable predownload and can still hang on a stalled Xet adapter transfer; pass False for adapter prewarming or make the skip apply only to base-model loads.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T00:55:06Z

+                old_model_name,
+                token = token,
+                revision = revision,
+                cache_dir = kwargs.get("cache_dir"),


Warm adapters in the cache PEFT will read

When callers provide a custom cache_dir, this predownloads the adapter into that cache, but the PeftModel.from_pretrained call below is made without the same cache_dir (and without forwarding **kwargs). PEFT therefore looks in the default Hub cache and can still perform the in-process Xet download, while the new prewarm only adds an extra download; pass the same cache directory through to PEFT or avoid warming a different cache.

Useful? React with 👍 / 👎.

danielhanchen · 2026-06-25T07:51:04Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f6e137c6f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-25T08:02:45Z

-        ),
-        daemon = True,
+try:
+    from unsloth_zoo.hf_xet_fallback import (


Bump unsloth_zoo minimum for the Studio shim

In environments that upgrade unsloth without also upgrading unsloth_zoo, this import now runs at Studio startup and raises if the installed unsloth_zoo still satisfies the current package floor but predates the new unsloth_zoo.hf_xet_fallback module. The main prefetch path gracefully no-ops when the helper is missing, but Studio becomes unstartable under an allowed dependency set; please bump the unsloth_zoo requirement to the first version containing this helper or keep a local fallback.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T08:02:45Z

+                old_model_name,
+                token = token,
+                revision = revision,
+                cache_dir = kwargs.get("cache_dir"),


Warm the same PEFT cache that the adapter load reads

When a PEFT load is called with cache_dir=..., this prefetch writes the adapter snapshot to that custom cache, but the following PeftModel.from_pretrained(...) call does not pass cache_dir (same pattern in FastModel below), so PEFT still resolves the adapter from the default HF cache and may perform the Xet download in-process. In that context the fallback is ineffective and the adapter can be downloaded twice; pass the same cache settings into PEFT or prefetch the cache it actually uses.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T08:02:45Z

+        )
+    ]
+    try:
+        snapshot_download_with_xet_fallback(


Restrict snapshot prefetch to the files Transformers will load

For repos that publish duplicate weight formats/variants (for example both model.safetensors and pytorch_model.bin shards), this whole-snapshot call downloads every non-ignored file before Transformers has selected the actual checkpoint. A normal from_pretrained would only fetch the chosen weights, so these repos can now spend tens of GB of extra network/disk and fail with disk-full before the real load; derive allow patterns from the selected config/index or otherwise skip unused weight sets.

Useful? React with 👍 / 👎.

force_download=True previously bypassed the killable prefetch: the warm-cache short-circuit no-oped, then Transformers re-fetched in-process over the Xet path this avoids. Thread force_download through maybe_prefetch_hf_snapshot and every call site so a forced reload also runs in the killable child.

…w 3) Studio's shim hard-imported unsloth_zoo.hf_xet_fallback at startup, so a Studio install whose unsloth_zoo predates the shared helper crashed at boot. Instead of raising, fall back to thin stubs: plain HF snapshot/file downloads with the no-progress watchdog disabled (the same best-effort posture core Unsloth uses in from_pretrained), a DownloadStallError mirror, and no-op watchdog/state helpers. Studio boots and downloads work; automatic Xet -> HTTP recovery returns as soon as unsloth_zoo is upgraded. These are thin stubs, not a second copy of the orchestration. Verified against the installed older unsloth_zoo and added a degraded-path test (5 passed).

for more information, see https://pre-commit.ci

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed16544cf8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-25T08:38:49Z

+            revision = revision,
+            cache_dir = cache_dir,
+            ignore_patterns = ignore_patterns,
+            force_download = force_download,


Skip prefetches when force_download is requested

When callers pass force_download=True, this warmup also forces a download, but the original force_download kwarg is still forwarded to the later Transformers/PEFT from_pretrained call, so that load bypasses the warmed cache and starts a second in-process Hub transfer. In that scenario the new fallback does not protect the actual load from a stalled Xet download and also doubles network/disk work; skip the prefetch for forced downloads or consume/clear the flag before the final load.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T08:38:49Z

+        )
+    ]
+    try:
+        snapshot_download_with_xet_fallback(


Forward proxy settings into the prefetch

In environments that require callers to pass proxies through from_pretrained(..., proxies=...), the later Transformers load would use that proxy, but this prefetch path calls the Hub snapshot helper without any of the caller's proxy settings. If the direct connection stalls rather than failing immediately, DownloadStallError is raised here before the proxied normal load can run, so proxy-only users can fail on the new warmup despite having supplied valid download settings; forward the relevant Hub kwargs or skip the prefetch when they are present.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-25T08:38:49Z

+        # to HTTP on a no-progress stall, so the in-process load below is a cache
+        # hit and cannot hang on a stalled Xet transfer.
+        maybe_prefetch_hf_snapshot(
+            model_name,


Warm separate tokenizer repositories

When tokenizer_name is set to a different remote repo, this warmup only snapshots model_name, but the tokenizer is loaded later via load_correct_tokenizer(tokenizer_name=...) in the same process. In that configuration the model weights are protected by the Xet fallback, while the tokenizer repo can still start an in-process Hub/Xet download and hang; prefetch the tokenizer repo as well when it is remote and distinct from the model repo.

Useful? React with 👍 / 👎.

…raded shim (review 4) - maybe_prefetch_hf_snapshot now returns whether it warmed the cache, and every call site clears force_download after a successful warm. force_download=True previously warmed the cache in the killable child and then re-forced the same download in-process over Xet, defeating the protection. - Add the prefetch guard to the diffusion FastModel path (diffusion.py), which loaded weights via model_cls.from_pretrained without the Xet->HTTP recovery the vision/language paths get. - Degraded Studio shim (older unsloth_zoo) now honors cancel_event: it raises before starting and after finishing the plain HF download instead of dropping cancellation via **kwargs. - Prewarm skips redundant *.bin when the repo also ships safetensors (best-effort model_info), so a dual-format repo is not double-downloaded before Transformers picks safetensors. Threaded use_safetensors through the call sites. - Studio shim test imports the shared helper optionally so the degraded-path test runs on an unsloth_zoo without it; added a degraded cancellation test.

…nsloth into hf-xet-fallback-wire # Conflicts: # studio/backend/utils/hf_xet_fallback.py

…alidation Two paths let a real in-process weight download run without the Xet stall fallback, or ran the prefetch before it could fail locally: - A num_labels classification load takes the AutoModelForSequenceClassification branch (an in-process download) even under fast_inference=True, but the llama prefetch skipped the warm whenever fast_inference was set, so those weights fetched over un-killable Xet. Gate the skip on num_labels is None, so only a real vLLM-owned load bypasses the warm. - The FastBaseModel (vision / FastModel) prefetch ran before the mutually- exclusive load_in_4bit/8bit/16bit validation, so an invalid combination downloaded a multi-GB snapshot before failing. SentenceTransformer already moved that check ahead of its prefetch; the vision loader cannot, because check_and_disable_bitsandbytes_loading only resolves the effective flags after the config fetch (moving the check earlier would false-reject a valid load of an already-quantized model). Move the base + tokenizer prefetch to run after the validation instead, so an invalid call fails without downloading. The moved block is unchanged and still precedes the in-process load and the vLLM fast_inference_setup remap. Adds static AST regression tests asserting the vision prefetch follows the load-mode validation and the llama prefetch's fast_inference skip is gated on num_labels.

danielhanchen · 2026-07-01T10:16:32Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a89ec0da0d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-07-01T10:29:35Z

+            model_name,
+            token = token,
+            revision = revision,
+            cache_dir = kwargs.get("cache_folder") or os.environ.get("SENTENCE_TRANSFORMERS_HOME"),


Use the FastModel cache for ST fallback prefetch

When FastSentenceTransformer falls through to the FastModel.from_pretrained path below, that weight load does not consume SENTENCE_TRANSFORMERS_HOME or cache_folder as a Hugging Face cache_dir (it is invoked later without a cache_dir argument). With cache_folder/SENTENCE_TRANSFORMERS_HOME set, this prefetch warms the SentenceTransformers cache, but the actual FastModel weight load uses the default HF cache, so it misses the warm snapshot and can still perform the unprotected in-process Xet download the change is meant to avoid.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-07-01T10:29:35Z

+    except Exception:
+        return False  # older unsloth_zoo without the helper: load normally


Require a zoo build that contains the fallback helper

In environments that satisfy the current declared dependency (unsloth_zoo>=2026.6.7) but predate the new unsloth_zoo.hf_xet_fallback module, this broad import fallback silently returns False, so every new maybe_prefetch_hf_snapshot call becomes a no-op and the protected Xet→HTTP pre-download never runs. Unless the package minimum is bumped to the first zoo release containing this helper, normal installs can keep hitting the same unkillable in-process Xet downloads this change is meant to avoid.

Useful? React with 👍 / 👎.

The custom SentenceTransformer fallback path prefetches into the cache the ST load uses (cache_folder, else SENTENCE_TRANSFORMERS_HOME), and the module-file loads were already resolved to that same cache. But the model WEIGHTS load runs through FastModel.from_pretrained, which resolves its cache from the HF cache_dir and never reads cache_folder / SENTENCE_TRANSFORMERS_HOME. With either set, the weights loaded from the default HF cache and missed the warm, so they still ran the unprotected in-process Xet download the prefetch is meant to avoid. Resolve the same cache into kwargs["cache_dir"] before the FastModel load (when a custom ST cache is in play and the caller passed no explicit HF cache_dir, which wins), so the weights hit the warm. A None resolution leaves the default HF cache, matching the prefetch. Adds a static AST regression test asserting from_pretrained resolves the ST cache into kwargs["cache_dir"] before the FastModel.from_pretrained call.

danielhanchen · 2026-07-01T11:07:37Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4f5f8da7b5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-07-01T11:20:11Z

+            model_name,
+            token = token,
+            revision = revision,
+            cache_dir = kwargs.get("cache_folder") or os.environ.get("SENTENCE_TRANSFORMERS_HOME"),


Honor explicit HF cache_dir for ST prefetches

When FastSentenceTransformer.from_pretrained(..., cache_dir=...) falls through to the FastModel path, this prefetch warms cache_folder/SENTENCE_TRANSFORMERS_HOME (or the default cache) and ignores the explicit HF cache_dir. Later the fallback weight load preserves and forwards kwargs["cache_dir"], so the warmed snapshot is missed and the weights can still be downloaded in-process over Xet from the custom cache. Include kwargs.get("cache_dir") in this cache resolution so the warm and load target the same cache.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-07-01T11:20:12Z

+                    _filename_has_variant(sibling.rfilename, variant)
+                    if variant
+                    else _is_canonical_model_weight_safetensors(sibling.rfilename)


Keep variant .bin unless the safetensors name is loadable

For variant loads, this treats any in-scope safetensors whose basename contains .<variant>. or .<variant>- as proving the .bin weights are redundant. Transformers variant loading looks for the canonical model.<variant>.safetensors/index/shard names, so a repo with an unrelated sidecar such as consolidated.fp16.safetensors plus the actual pytorch_model.fp16.bin will add *.bin to ignore_patterns; the prefetch then skips the only loadable weights and from_pretrained falls back to an unprotected in-process Xet download.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-07-01T11:20:12Z

+        _warm_tokenizer_repo = (
+            isinstance(_tokenizer_repo, str)
+            and bool(_tokenizer_repo)
+            and _tokenizer_repo != model_name
+        )


Warm the vLLM path tokenizer after remapping

When FastVisionModel runs with fast_inference=True and no separate tokenizer_name, _vllm_owns_weights skips the base snapshot warm, but this condition also skips the tokenizer warm because _tokenizer_repo == model_name. Later fast_inference_setup may remap model_name, then the code sets tokenizer_name = model_name and calls AutoProcessor/AutoTokenizer in-process, so processor/tokenizer files can still fetch over unprotected Xet. Warm the final tokenizer repo after remap (or include fast_inference in this condition) so the cache hit is guaranteed.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-07-01T11:20:12Z

+            use_safetensors = kwargs.get("use_safetensors"),
+            # Diffusion variants (variant="fp16") are common: forward it so the warm never drops a
+            # variant .bin for a non-variant safetensors.
+            variant = kwargs.get("variant"),


Forward the diffusion variant to the real load

When FastDiffusionModel.from_pretrained(..., variant="fp16") is used, the prefetch warms the variant-specific files, but load_kwargs never includes variant, so model_cls.from_pretrained still asks Transformers for the default weight variant. That misses the warmed cache (and can load the wrong precision or fetch the default weights in-process over Xet), so the same variant value needs to be forwarded to the actual load.

Useful? React with 👍 / 👎.

…ess Xet fetch Four cases where the prewarm and the real load targeted different files or different caches, so the load still ran an unprotected in-process Xet download: - ST prefetch ignored an explicit HF cache_dir. FastSentenceTransformer resolves the prefetch (and fallback module-load) cache from cache_folder / SENTENCE_TRANSFORMERS_HOME, but the FastModel fallback weight load forwards kwargs["cache_dir"]. A caller passing cache_dir warmed one cache and read another, missing the warm. The three ST cache resolutions now honor kwargs.get("cache_dir") first. - Variant .bin dropped for a non-canonical safetensors sidecar. For a variant load the prewarm dropped *.bin as soon as ANY variant-infixed safetensors was shipped, so a repo with consolidated.fp16.safetensors (a sidecar the load does not read) plus the real pytorch_model.fp16.bin had its only loadable weights dropped from the warm. The check now requires a CANONICAL model variant name (model.<variant>.safetensors, either shard infix, or the variant index) via _is_canonical_variant_model_weight_safetensors, matching the strictness already used for the no-variant path. - vLLM path never warmed the tokenizer. With fast_inference and no separate tokenizer_name, the base warm is skipped (vLLM owns the weights) and the tokenizer warm was deferred because fast_inference_setup may remap model_name. The final tokenizer repo was then loaded in-process (AutoProcessor / AutoTokenizer) with no warm. Warm the final tokenizer repo (tokenizer-only) after the remap on the vLLM path. - Diffusion variant not forwarded to the load. FastDiffusionModel forwarded variant to the prefetch but not to model_cls.from_pretrained, so the pipeline asked for the default weight variant -- missing the warmed variant weights (wrong precision, or a default weight a variant-only repo may not ship, fetched in-process). load_kwargs now forwards variant. Adds regression tests for each (a non-canonical sidecar keeps its .bin, the ST cache resolutions honor cache_dir, the vLLM tokenizer warm runs after the remap, and the diffusion load forwards variant).

danielhanchen · 2026-07-01T12:10:59Z

All four are real and fixed in e883f71:

Honor explicit HF cache_dir for ST prefetches (P2): the three ST cache resolutions (the prefetch and the fallback module loads) now resolve kwargs.get("cache_dir") before cache_folder / SENTENCE_TRANSFORMERS_HOME, so a caller passing cache_dir warms and reads the same cache. The FastModel fallback load already forwards kwargs["cache_dir"].
Keep variant .bin unless the safetensors name is loadable (P2): the variant safetensors detector now requires a CANONICAL model variant name (model.<variant>.safetensors, either shard infix, or model.safetensors.index.<variant>.json) via _is_canonical_variant_model_weight_safetensors, so a non-canonical sidecar like consolidated.fp16.safetensors no longer causes the real pytorch_model.fp16.bin to be dropped from the warm. This matches the strictness already used on the no-variant path.
Warm the vLLM path tokenizer after remapping (P2): on the fast_inference path the base warm is skipped and the tokenizer warm was deferred (the comment notes fast_inference_setup may remap model_name). The final tokenizer repo is now warmed (tokenizer-only) after the remap, so the in-process AutoProcessor / AutoTokenizer load is a cache hit.
Forward the diffusion variant to the real load (P2): FastDiffusionModel now forwards variant to model_cls.from_pretrained (not only to the prefetch), so it reads the warmed variant weights instead of asking for the default variant.

Regression tests added for each (35 -> 42 in the prefetch-scope suite).

for more information, see https://pre-commit.ci

chatgpt-codex-connector · 2026-07-01T12:11:33Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

chatgpt-codex-connector · 2026-07-01T12:12:15Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

chatgpt-codex-connector · 2026-07-01T12:13:52Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

danielhanchen · 2026-07-01T12:14:27Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 08ba451756

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-07-01T12:28:58Z

+            cache_dir = kwargs.get("cache_dir")
+            or kwargs.get("cache_folder")
+            or os.environ.get("SENTENCE_TRANSFORMERS_HOME"),


Align ST prefetch cache with native loads

When the caller passes cache_dir but not cache_folder, this prefetch warms that explicit HF cache, but the native SentenceTransformer(...) paths do not read it: the inference branch only copies cache_folder into st_kwargs, and the fast-encoder branch passes cache_folder = kwargs.get("cache_folder"). In those contexts the actual load falls back to the default/ST cache, missing the warmed snapshot, so large repos can be downloaded twice and the real weight fetch still happens in-process without the Xet fallback this change is meant to provide. Either don't prefer cache_dir for native ST loads or translate it into the cache_folder passed to SentenceTransformer.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-07-01T12:28:58Z

+        # Resolve the cache the load uses: an explicit HF cache_dir wins (the FastModel fallback load
+        # forwards it), else cache_folder, else SENTENCE_TRANSFORMERS_HOME (which ST honors when
+        # cache_folder is unset), else the default HF cache -- a wrong-cache warm would be missed.
+        maybe_prefetch_hf_snapshot(


Avoid whole-snapshot warm for root-only ST fallback

When FastSentenceTransformer is used on a plain encoder repo with no modules.json, the code later falls back to FastModel.from_pretrained, which reads only root weights. This unconditional prefetch leaves weights_at_root at its default False, so _prefetch_ignore_patterns treats the repo as multi-component: mixed safetensors/bin root repos keep both formats and arbitrary subdir weights are not excluded. In that fallback path this can pull multi-GB weights the actual load never reads before the real load starts; determine modules.json first or pass weights_at_root=True for the no-modules case.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-07-01T12:28:58Z

+        # Pre-download the confirmed diffusion repo (Xet -> HTTP on a stall) so the weight load is a cache
+        # hit. subfolder is NOT forwarded: the pipeline loads the whole repo root (every component
+        # subfolder), so narrowing to one would leave unet/, vae/, text_encoder/ to in-process Xet.
+        maybe_prefetch_hf_snapshot(


Mark diffusion prefetch as root-weight load

FastDiffusionModel resolves a Transformers model class and later calls model_cls.from_pretrained(...) on the repo root, not a diffusers-style multi-component pipeline. Leaving this warm at the default weights_at_root=False sends _prefetch_ignore_patterns down the whole-snapshot path, so even use_safetensors=True or safetensors-preferred mixed repos keep redundant .bin weights and nested weight dirs eligible for download. For mixed-format diffusion checkpoints this can pull a second multi-GB weight set that the real load never reads; pass weights_at_root=True for this prefetch.

Useful? React with 👍 / 👎.

danielhanchen · 2026-07-01T12:45:11Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 08ba451756

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-07-01T12:56:34Z

+            cache_dir = kwargs.get("cache_dir")
+            or kwargs.get("cache_folder")
+            or os.environ.get("SENTENCE_TRANSFORMERS_HOME"),


Warm the cache SentenceTransformer will read

When a caller passes cache_dir to FastSentenceTransformer.from_pretrained and the model takes the for_inference or fast-encoder SentenceTransformer(...) branch, this prefetch writes the snapshot into that HF cache, but the later native SentenceTransformer loads only forward cache_folder (or nothing), not cache_dir. In that scenario the real load misses the warmed snapshot and can still perform an in-process Hub/Xet download, defeating the stall fallback and duplicating the transfer; either map cache_dir to cache_folder for those branches or avoid preferring it for this prefetch.

Useful? React with 👍 / 👎.

…ansformer loads The ST prefetch warms the cache resolved as cache_dir -> cache_folder -> SENTENCE_TRANSFORMERS_HOME (an explicit HF cache_dir wins). But the for_inference and fast-encoder branches then construct a native SentenceTransformer, which takes cache_folder, not cache_dir: for_inference copied only a caller-supplied cache_folder into st_kwargs, and the fast-encoder passed cache_folder = kwargs.get("cache_folder"). So a caller passing cache_dir warmed one cache and the native load read another (cache_folder / SENTENCE_TRANSFORMERS_HOME / the default HF cache), missing the warm and starting an unprotected in-process Hub/Xet download that defeats the stall fallback and duplicates the transfer. Both native loads now set cache_folder = kwargs.get("cache_dir") or kwargs.get("cache_folder"): the explicit HF cache_dir wins, else the caller's cache_folder, else None so ST honors SENTENCE_TRANSFORMERS_HOME -- matching the prefetch's resolution so the native load hits the warmed cache. The FastModel fallback branch already resolves cache_dir (unchanged). Adds a static regression guard.

for more information, see https://pre-commit.ci

danielhanchen · 2026-07-02T05:23:25Z

Real and fixed in 6d4ce26.

Warm the cache SentenceTransformer will read (P2): the ST prefetch resolves its cache as cache_dir -> cache_folder -> SENTENCE_TRANSFORMERS_HOME (an explicit HF cache_dir wins), but the for_inference and fast-encoder branches then construct a native SentenceTransformer, which takes cache_folder, not cache_dir: for_inference copied only a caller-supplied cache_folder into its kwargs, and the fast-encoder passed cache_folder = kwargs.get("cache_folder"). So a caller passing cache_dir warmed one cache and the native load read another, missing the warm and starting an unprotected in-process Hub/Xet download. Both native loads now set cache_folder = kwargs.get("cache_dir") or kwargs.get("cache_folder") (cache_dir wins, else cache_folder, else None so ST honors SENTENCE_TRANSFORMERS_HOME), matching the prefetch's resolution. The FastModel fallback branch already resolves cache_dir (unchanged). Static regression guard added.

danielhanchen · 2026-07-02T05:24:30Z

@codex review

danielhanchen · 2026-07-02T05:24:36Z

/gemini review

chatgpt-codex-connector · 2026-07-02T05:30:58Z

Codex Review: Didn't find any major issues. Keep them coming!

Reviewed commit: 7a6fc8d89c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

gemini-code-assist · 2026-07-02T05:35:45Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

danielhanchen · 2026-07-02T05:44:12Z

/gemini review

gemini-code-assist · 2026-07-02T05:46:46Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Condense PR-added comments/docstrings across the loader prefetch wiring (_utils.py, vision/llama/loader/diffusion/sentence_transformer.py, tokenizer_utils.py), the prefetch-scope tests, and the Studio Xet->HTTP shim (backend utils + tests, model-selector tsx). Comments/docstrings/JSDoc only, no code changed (AST + TS-compiler verified). Preserves the prefetch-placement and Studio marker-injection rationale tersely. Prefetch-scope tests 43 passed; Studio backend tests 18 passed.

chatgpt-codex-connector · 2026-07-03T12:28:15Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

From a consolidated review pass (12x reviewer.py personas + 3 forks). The SentenceTransformer prefetch now forwards force_download and clears it after a successful warm, matching the llama / vision / diffusion sites. Before, a force_download=True ST load warmed without forcing, so a stale cache could be reused or the refresh could fall to an unguarded in-process Hub/Xet download. Generalises the ST prefetch-wiring AST test to accept the return-captured call. Prefetch suite 43 passed. Not taken: the llama / vision base-load prefetch omitting revision is correct -- revision is a named from_pretrained parameter that never reaches the in-process AutoModel load (which resolves model_name on its default branch), so the warm matches the load; forwarding it would warm the wrong branch. The degraded Studio shim dropping local_files_only is not production-reachable (its GGUF callers never pass it, and the public snapshot wrapper is test-only).

chatgpt-codex-connector · 2026-07-03T13:41:55Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

danielhanchen requested a review from Datta0 as a code owner June 24, 2026 11:39

[pre-commit.ci] auto fixes from pre-commit.com hooks

18d277c

for more information, see https://pre-commit.ci

gemini-code-assist Bot reviewed Jun 24, 2026

View reviewed changes

danielhanchen and others added 8 commits June 24, 2026 11:59

Merge branch 'main' of https://github.com/unslothai/unsloth into hf-x…

c59d19d

…et-fallback-wire

Merge commit '18d277c90' into hf-xet-fallback-wire

ca77f43

[pre-commit.ci] auto fixes from pre-commit.com hooks

4d4e45a

for more information, see https://pre-commit.ci

Merge remote-tracking branch 'origin/main' into hf-xet-fallback-wire

e30d868

Merge branch 'hf-xet-fallback-wire' of https://github.com/unslothai/u…

81c953b

…nsloth into hf-xet-fallback-wire

[pre-commit.ci] auto fixes from pre-commit.com hooks

ec9c6e9

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Jun 24, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Jun 25, 2026

View reviewed changes

danielhanchen and others added 4 commits June 25, 2026 08:18

Merge remote-tracking branch 'origin/main' into hf-xet-fallback-wire

d29a46b

[pre-commit.ci] auto fixes from pre-commit.com hooks

ed16544

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Jun 25, 2026

View reviewed changes

danielhanchen added 2 commits June 25, 2026 08:54

Merge branch 'hf-xet-fallback-wire' of https://github.com/unslothai/u…

d321f98

…nsloth into hf-xet-fallback-wire # Conflicts: # studio/backend/utils/hf_xet_fallback.py

danielhanchen added 2 commits July 1, 2026 09:56

Tighten frontend comments (comment-only, no logic change)

3f750f7

chatgpt-codex-connector Bot reviewed Jul 1, 2026

View reviewed changes

[pre-commit.ci] auto fixes from pre-commit.com hooks

08ba451

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Jul 1, 2026

View reviewed changes

danielhanchen and others added 2 commits July 2, 2026 05:22

[pre-commit.ci] auto fixes from pre-commit.com hooks

7a6fc8d

for more information, see https://pre-commit.ci

		except Exception:
		return False # older unsloth_zoo without the helper: load normally

Uh oh!

Uh oh!

Conversation

danielhanchen commented Jun 24, 2026

Summary

unsloth main

Studio

Tests

Uh oh!

chatgpt-codex-connector Bot commented Jun 24, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

danielhanchen commented Jun 24, 2026

Uh oh!

danielhanchen commented Jun 24, 2026

Uh oh!

gemini-code-assist Bot commented Jun 24, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

danielhanchen commented Jun 25, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

danielhanchen commented Jun 25, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!