52 changes: 52 additions & 0 deletions .github/workflows/pr5351-cpu-inference-macos.yml
@@ -0,0 +1,52 @@
# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved.
#
# PR-5351 CPU-inference cross-OS lane: macOS (Apple Silicon).
# Same as the Ubuntu lane but on macos-14. llama-cpp-python builds
# with Metal autodetect disabled to stay on the CPU code path so the
# result mirrors a non-GPU Mac.

name: PR-5351 CPU inference macOS

on:
  push:
    branches: [pr-5351-cross-os-validation]
    paths:
      - 'studio/backend/core/chat/**'
      - 'tests/studio/test_cpu_inference_on_extracted_document.py'
      - '.github/workflows/pr5351-cpu-inference-macos.yml'
  workflow_dispatch:

concurrency:
  group: pr5351-cpu-inference-macos-${{ github.ref }}
  cancel-in-progress: true

jobs:
  cpu-inference:
    runs-on: macos-14
    timeout-minutes: 40
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install backend + llama-cpp-python (CPU build)
        run: |
          python -m pip install --upgrade pip
          pip install -r studio/backend/requirements/studio.txt
          pip install \
            python-multipart aiofiles sqlalchemy cryptography \
            pyyaml jinja2 mammoth pymupdf pymupdf4llm pytest pytest-asyncio \
            pytest-timeout huggingface_hub requests numpy
          # Disable Metal so the lane stays CPU-only; mirrors a no-GPU Mac.
          CMAKE_ARGS="-DGGML_METAL=OFF -DGGML_ACCELERATE=OFF -DGGML_NATIVE=OFF" \
            pip install --upgrade --quiet llama-cpp-python

      - name: CPU inference on extracted document
        env:
          PR5351_LLAMA_THREADS: '3'
        run: |
          python -m pytest -q tests/studio/test_cpu_inference_on_extracted_document.py -s --tb=short

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
Comment on lines +26 to +52
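A minimal sketch of one way to address the CodeQL warning above, assuming this lane only needs to check out the repository and run tests (an assumption; nothing in the diff shows the workflow pushing commits or otherwise writing through the GITHUB_TOKEN). The `contents: read` value is the minimal starting point the warning itself names; placing the block at workflow level, just before `jobs`, is a choice made here for illustration, not something the PR contains:

```yaml
# Hypothetical addition to .github/workflows/pr5351-cpu-inference-macos.yml.
# Limits the GITHUB_TOKEN to read-only access to repository contents,
# the minimal starting point suggested by the CodeQL warning.
permissions:
  contents: read

jobs:
  cpu-inference:
    runs-on: macos-14
    timeout-minutes: 40
    # Steps unchanged from the diff above.
```

Because the block sits at workflow level, it applies to every job in the file. If a later step needed a wider scope, the same key could instead be set on that individual job.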
53 changes: 53 additions & 0 deletions .github/workflows/pr5351-cpu-inference-ubuntu.yml
@@ -0,0 +1,53 @@
# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved.
#
# PR-5351 CPU-inference cross-OS lane: Ubuntu.
# Builds llama-cpp-python from source for CPU, downloads a 0.5B GGUF
# from HF, extracts a synthetic PDF via the PR's document extractor,
# and asserts the model answers a ground-truth question. Proves
# end-to-end document-attach -> extract -> inference works on a CPU
# runner with no GPU.

name: PR-5351 CPU inference Ubuntu

on:
  push:
    branches: [pr-5351-cross-os-validation]
    paths:
      - 'studio/backend/core/chat/**'
      - 'tests/studio/test_cpu_inference_on_extracted_document.py'
      - '.github/workflows/pr5351-cpu-inference-ubuntu.yml'
  workflow_dispatch:

concurrency:
  group: pr5351-cpu-inference-ubuntu-${{ github.ref }}
  cancel-in-progress: true

jobs:
  cpu-inference:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install backend + llama-cpp-python (CPU build)
        run: |
          python -m pip install --upgrade pip
          pip install -r studio/backend/requirements/studio.txt
          pip install \
            python-multipart aiofiles sqlalchemy cryptography \
            pyyaml jinja2 mammoth pymupdf pymupdf4llm pytest pytest-asyncio \
            pytest-timeout huggingface_hub requests numpy
          # CPU wheel ships pre-built on Linux; falls back to source if needed.
          CMAKE_ARGS="-DGGML_NATIVE=OFF" pip install --upgrade --quiet llama-cpp-python

      - name: CPU inference on extracted document
        env:
          PR5351_LLAMA_THREADS: '4'
        run: |
          python -m pytest -q tests/studio/test_cpu_inference_on_extracted_document.py -s --tb=short

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
Comment on lines +28 to +53
49 changes: 49 additions & 0 deletions .github/workflows/pr5351-cpu-inference-windows.yml
@@ -0,0 +1,49 @@
# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved.
#
# PR-5351 CPU-inference cross-OS lane: Windows.
# llama-cpp-python wheels exist for Windows; if pip falls back to
# source, MSVC is preinstalled on windows-latest. CPU-only.

name: PR-5351 CPU inference Windows

on:
  push:
    branches: [pr-5351-cross-os-validation]
    paths:
      - 'studio/backend/core/chat/**'
      - 'tests/studio/test_cpu_inference_on_extracted_document.py'
      - '.github/workflows/pr5351-cpu-inference-windows.yml'
  workflow_dispatch:

concurrency:
  group: pr5351-cpu-inference-windows-${{ github.ref }}
  cancel-in-progress: true

jobs:
  cpu-inference:
    runs-on: windows-latest
    timeout-minutes: 40
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install backend + llama-cpp-python (CPU build)
        shell: pwsh
        run: |
          python -m pip install --upgrade pip
          pip install -r studio/backend/requirements/studio.txt
          pip install python-multipart aiofiles sqlalchemy cryptography pyyaml jinja2 mammoth pymupdf pymupdf4llm pytest pytest-asyncio pytest-timeout huggingface_hub requests numpy
          $env:CMAKE_ARGS = "-DGGML_NATIVE=OFF"
          pip install --upgrade --quiet llama-cpp-python

      - name: CPU inference on extracted document
        shell: pwsh
        env:
          PR5351_LLAMA_THREADS: '4'
        run: |
          python -m pytest -q tests/studio/test_cpu_inference_on_extracted_document.py -s --tb=short

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
Comment on lines +25 to +49
60 changes: 60 additions & 0 deletions .github/workflows/pr5351-macos.yml
@@ -0,0 +1,60 @@
# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved.
#
# PR-5351 cross-OS validation: macOS lane.
# macos-14 (arm64). Validates the multiprocessing `spawn` path that
# differs from Linux's default `fork`, the MLX detection branch in
# core/chat/vlm_capability.py, and Safari/WebKit-relevant filesystem
# behaviour. CPU-only; CUDA spoof auto-engages via tests/conftest.py.

name: PR-5351 macOS

on:
  push:
    branches: [pr-5351-cross-os-validation]
    paths:
      - 'studio/backend/**'
      - 'tests/studio/**'
      - 'tests/conftest.py'
      - '.github/workflows/pr5351-macos.yml'
  workflow_dispatch:

concurrency:
  group: pr5351-macos-${{ github.ref }}
  cancel-in-progress: true

jobs:
  pytest:
    runs-on: macos-14
    timeout-minutes: 25
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install backend test dependencies (CPU only)
        run: |
          python -m pip install --upgrade pip
          pip install -r studio/backend/requirements/studio.txt
          pip install \
            python-multipart aiofiles sqlalchemy cryptography \
            pyyaml jinja2 mammoth unpdf requests \
            'numpy<3' pytest pytest-asyncio httpx
          pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
          pip install 'transformers>=4.51,<5.5'

      - name: PR-5351 document tests (macOS spawn semantics)
        working-directory: studio/backend
        env:
          # macOS's default start method is spawn; exercise the same
          # config users see in production.
          UNSLOTH_STUDIO_EXTRACT_CONCURRENCY: '2'
        run: |
          python -m pytest -q tests/test_chat_document_extraction.py tests/test_chat_document_routes.py tests/test_inference_worker.py tests/test_vision_cache.py tests/test_anthropic_messages.py tests/test_openai_tool_passthrough.py tests/test_models_get_model_config_case_resolution.py --tb=short

      - name: PR-5351 regression tests + cancel timing
        run: |
          python -m pytest -q tests/studio/test_extractor_semaphore_leak.py tests/studio/test_html_independent_of_inference.py tests/studio/test_gguf_singleton_shared.py tests/studio/test_stream_cancel_registration_timing.py --tb=short

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
Comment on lines +28 to +60
57 changes: 57 additions & 0 deletions .github/workflows/pr5351-ubuntu.yml
@@ -0,0 +1,57 @@
# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved.
#
# PR-5351 cross-OS validation: Ubuntu lane.
# Runs the document-extraction tests, the cancellation-timing structural
# test, and the three regression tests added in the fix commit against
# Python 3.11 on ubuntu-latest. CPU-only; the existing tests/conftest.py
# auto-installs the CUDA spoof so unsloth/unsloth_zoo device probes
# return "cuda".

name: PR-5351 Ubuntu

on:
  push:
    branches: [pr-5351-cross-os-validation]
    paths:
      - 'studio/backend/**'
      - 'tests/studio/**'
      - 'tests/conftest.py'
      - '.github/workflows/pr5351-ubuntu.yml'
  workflow_dispatch:

concurrency:
  group: pr5351-ubuntu-${{ github.ref }}
  cancel-in-progress: true

jobs:
  pytest:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install backend test dependencies (CPU only)
        run: |
          python -m pip install --upgrade pip
          pip install -r studio/backend/requirements/studio.txt
          pip install \
            python-multipart aiofiles sqlalchemy cryptography \
            pyyaml jinja2 mammoth unpdf requests \
            'numpy<3' pytest pytest-asyncio httpx
          pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
          pip install 'transformers>=4.51,<5.5'

      - name: PR-5351 document tests
        working-directory: studio/backend
        run: |
          python -m pytest -q tests/test_chat_document_extraction.py tests/test_chat_document_routes.py tests/test_inference_worker.py tests/test_vision_cache.py tests/test_anthropic_messages.py tests/test_openai_tool_passthrough.py tests/test_models_get_model_config_case_resolution.py --tb=short

      - name: PR-5351 regression tests + cancel timing
        run: |
          python -m pytest -q tests/studio/test_extractor_semaphore_leak.py tests/studio/test_html_independent_of_inference.py tests/studio/test_gguf_singleton_shared.py tests/studio/test_stream_cancel_registration_timing.py --tb=short

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
Comment on lines +29 to +57
59 changes: 59 additions & 0 deletions .github/workflows/pr5351-windows.yml
@@ -0,0 +1,59 @@
# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved.
#
# PR-5351 cross-OS validation: Windows lane.
# windows-latest. Validates the multiprocessing `spawn` path
# (mandatory on Windows), path normalisation, and EAGAIN-style
# Process construction failures under load (the exact bug class the
# semaphore-leak fix protects against). CPU-only; CUDA spoof
# auto-engages via tests/conftest.py.

name: PR-5351 Windows

on:
  push:
    branches: [pr-5351-cross-os-validation]
    paths:
      - 'studio/backend/**'
      - 'tests/studio/**'
      - 'tests/conftest.py'
      - '.github/workflows/pr5351-windows.yml'
  workflow_dispatch:

concurrency:
  group: pr5351-windows-${{ github.ref }}
  cancel-in-progress: true

jobs:
  pytest:
    runs-on: windows-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install backend test dependencies (CPU only)
        shell: pwsh
        run: |
          python -m pip install --upgrade pip
          pip install -r studio/backend/requirements/studio.txt
          pip install python-multipart aiofiles sqlalchemy cryptography pyyaml jinja2 mammoth unpdf requests "numpy<3" pytest pytest-asyncio httpx
          pip install --index-url https://download.pytorch.org/whl/cpu "torch>=2.4,<2.11"
          pip install "transformers>=4.51,<5.5"

      - name: PR-5351 document tests (Windows spawn semantics)
        working-directory: studio/backend
        shell: pwsh
        env:
          UNSLOTH_STUDIO_EXTRACT_CONCURRENCY: '2'
        run: |
          python -m pytest -q tests/test_chat_document_extraction.py tests/test_chat_document_routes.py tests/test_inference_worker.py tests/test_vision_cache.py tests/test_anthropic_messages.py tests/test_openai_tool_passthrough.py tests/test_models_get_model_config_case_resolution.py --tb=short

      - name: PR-5351 regression tests + cancel timing
        shell: pwsh
        run: |
          python -m pytest -q tests/studio/test_extractor_semaphore_leak.py tests/studio/test_html_independent_of_inference.py tests/studio/test_gguf_singleton_shared.py tests/studio/test_stream_cancel_registration_timing.py --tb=short

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
Comment on lines +29 to +59