Pull requests: unslothai/unsloth-zoo
Forward declared tools to the DiffusionGemma visual server
#864
opened Jul 3, 2026 by
oobabooga
Member
Add a grouped bnb 4-bit training forward for gpt-oss experts
#862
opened Jul 3, 2026 by
danielhanchen
Member
Add a banded sliding-window SDPA fast path for Gemma-4 style layers
#861
opened Jul 3, 2026 by
danielhanchen
Member
Route gemma-4 sliding-window layers through FlashAttention-2
#860
opened Jul 3, 2026 by
danielhanchen
Member
Keep DeepSeek-V4 hyper-connection mixers eager to stop backward inf
#859
opened Jul 3, 2026 by
danielhanchen
Member
Optional gate gradient identity for the grouped MoE combine
#858
opened Jul 3, 2026 by
danielhanchen
Member
Cap the fused cross entropy chunk target so chunking stays active on large GPUs
#857
opened Jul 3, 2026 by
danielhanchen
Member
Load prequantized bnb 4-bit MoE expert checkpoints under transformers v5
#856
opened Jul 3, 2026 by
danielhanchen
Member
Fix MLX notebook generate input/output compatibility
#855
opened Jul 3, 2026 by
Lyxot
Contributor
feat(mlx): support distributed inference loading
#854
opened Jul 3, 2026 by
Lyxot
Contributor
dataset_utils: add mask_out_tokens to train_on_responses_only (fixes unslothai/unsloth#6695)
#852
opened Jul 3, 2026 by
pjordanandrsn
mlx: fixes for bnb dequant, report_to, early stopping, and NEFTune
#851
opened Jul 3, 2026 by
oobabooga
Member
feat(mlx): load GPTQ/AWQ pre-quantized checkpoints on Apple Silicon
#848
opened Jul 2, 2026 by
BardiaKoopah
Contributor
fix(gguf): preserve Qwen3.5/3.6 MTP config so nextn tensors convert
#847
opened Jul 2, 2026 by
LeoBorcherding
Contributor
2 tasks done
fix(mlx): normalize legacy tokenizer sidecars for VLM loads
#846
opened Jul 2, 2026 by
Lyxot
Contributor
MLX: clear error when mlx-lm/mlx-vlm is too old for a QK-norm arch (gemma4/qwen3_5)
#845
opened Jul 2, 2026 by
danielhanchen
Member
Optional recompute-in-backward for the v5 grouped MoE path
#838
opened Jun 28, 2026 by
danielhanchen
Member
Grouped-GEMM MoE forward for transformers<5 ModuleList experts
#837
opened Jun 28, 2026 by
danielhanchen
Member
dataset_utils: add mask_thinking_tokens to mask the </think> token
#833
opened Jun 26, 2026 by
Sushankthatipally
Feat(mlx): Add ORPO (loss_type='orpo') for text models to MLXTrainer
#830
opened Jun 24, 2026 by
BardiaKoopah
Contributor
Add shared Xet to HTTP stall fallback (hf_xet_fallback, hf_cache_state)
#829
opened Jun 24, 2026 by
danielhanchen
Member
Support DoRA (use_dora=True) in the safetensors LoRA merge
#828
opened Jun 24, 2026 by
danielhanchen
Member