fix(core): load ONNX Runtime dynamically so headroom._core imports on non-AVX2 x86-64#1715
Open
Parideboy wants to merge 1 commit into
… non-AVX2 x86-64

fastembed's ort-download-binaries-rustls-tls feature statically links Microsoft's prebuilt ONNX Runtime into the extension on non-Windows targets. That prebuilt x86_64 binary requires AVX2 and its code runs as soon as the module loads, so import headroom._core died with SIGILL on pre-AVX2 CPUs before the runtime AVX2 guard from headroomlabs-ai#1162 could intervene.

Build with ort-load-dynamic on every platform (Windows already did, for DirectML link-lib reasons), collapsing the two identical target blocks into one dependency. ORT is now only dlopen'd at first use, where the AVX2 guard falls back to the non-ONNX detection tiers.

To keep Magika/fastembed working out of the box, extend the Windows-only ORT_DYLIB_PATH auto-pin in headroom/_ort.py to all platforms: it now resolves the pip onnxruntime package's shared library (.dll/.so/.dylib), whose CPU wheels use runtime dispatch and run on pre-AVX2 machines too.

Fixes headroomlabs-ai#1278

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
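For anyone checking whether a machine falls into the affected group, here is a minimal Python sketch that reads the CPU flags on Linux. It is not part of this PR: the helper names `cpu_flags` and `has_avx2` are hypothetical, and it assumes `/proc/cpuinfo` exposes a `flags` line, as it does on x86 Linux.

```python
from __future__ import annotations

from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (for example on a non-x86 kernel).
    return set()


def has_avx2(cpuinfo_text: str) -> bool:
    """True if the CPU described by the text advertises AVX2."""
    return "avx2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("AVX2 supported:", has_avx2(path.read_text()))
    else:
        print("No /proc/cpuinfo here; this check is Linux-only.")
```

A `False` result on an x86-64 machine suggests the CPU is in the group that hit the import-time SIGILL.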
Contributor
PR governance
This PR follows the template and is marked ready for human review.
JerrettDavis (Collaborator) approved these changes on Jul 2, 2026 and left a comment:
Code review looks good. The change removes the static ONNX Runtime download path from the Rust extension, keeps the Windows dynamic-load behavior, and extends the import-time ORT_DYLIB_PATH pinning in a way that is covered by Linux/macOS/Windows-oriented tests. I did not find a correctness blocker in the diff.
One process note before merge: the current check rollup is not fully green because several test jobs are marked cancelled rather than successful. I am treating that as CI state, not a requested code change.
Description
`import headroom._core` dies with SIGILL (Illegal instruction) on x86-64 CPUs without AVX2 (Pentium N4200, Celeron N4500, AMD FX 8350; all reported on the issue). The repo sets no `RUSTFLAGS`/`target-cpu` anywhere, so first-party Rust code is baseline x86-64; the AVX2 code comes from Microsoft's prebuilt ONNX Runtime, statically linked into the extension by fastembed's `ort-download-binaries-rustls-tls` feature on non-Windows targets. Because it is statically linked, its code is mapped and initialized when the extension module loads, before the runtime AVX2 guard from #1162 can run, which is why that fix helped Magika init but not the import-time crash.

Fix, mirroring what Windows already does for its own reasons (DirectML link libs): build with `ort-load-dynamic` on every platform, so ONNX Runtime is only `dlopen`'d at first use, where the #1162 AVX2 guard falls back to the non-ONNX detection tiers on unsupported CPUs. Since both target blocks became identical, they are collapsed into one platform-independent `fastembed` dependency.

To keep Magika/fastembed working out of the box on Linux/macOS, the existing `ORT_DYLIB_PATH` auto-pin (`headroom/_ort.py`, previously Windows-only) now resolves the pip `onnxruntime` package's shared library on all platforms (`onnxruntime.dll` / `libonnxruntime.so*` / `libonnxruntime*.dylib`). The pip `onnxruntime` CPU wheels use runtime CPU dispatch, so they also work on pre-AVX2 machines: non-AVX2 users get working ML detection instead of a crash. Without the `onnxruntime` package, ML detection degrades gracefully to the non-ONNX tiers exactly as it already does on Windows.

Fixes #1278
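The cross-platform auto-pin described above can be sketched roughly as follows. This is an illustration, not the code in `headroom/_ort.py`: the function names are hypothetical, the `capi` subdirectory is an assumption about where the pip `onnxruntime` wheel keeps its shared library, and leaving an already-set `ORT_DYLIB_PATH` untouched is likewise an assumed policy. Only the three filename patterns come from the description.

```python
from __future__ import annotations

import importlib.util
import os
import sys
from collections.abc import MutableMapping
from pathlib import Path


def shared_library_patterns(platform: str) -> list[str]:
    """Glob patterns for the ONNX Runtime shared library on each platform."""
    if platform.startswith("win"):
        return ["onnxruntime.dll"]
    if platform == "darwin":
        return ["libonnxruntime*.dylib"]
    return ["libonnxruntime.so*"]  # Linux wheels ship a versioned .so


def onnxruntime_package_dir() -> Path | None:
    """Locate the installed onnxruntime package without importing it."""
    spec = importlib.util.find_spec("onnxruntime")
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def find_dylib(package_dir: Path, platform: str = sys.platform) -> Path | None:
    """Return the first matching shared library, or None if there is none."""
    capi = package_dir / "capi"  # assumed wheel layout
    for pattern in shared_library_patterns(platform):
        matches = sorted(capi.glob(pattern))
        if matches:
            return matches[0]
    return None


def pin_ort_dylib(
    package_dir: Path,
    env: MutableMapping[str, str] = os.environ,
    platform: str = sys.platform,
) -> str | None:
    """Set ORT_DYLIB_PATH if it is unset and a library can be found."""
    if env.get("ORT_DYLIB_PATH"):
        return env["ORT_DYLIB_PATH"]  # assumed: an explicit setting wins
    found = find_dylib(package_dir, platform)
    if found is None:
        return None  # caller falls back to the non-ONNX tiers
    env["ORT_DYLIB_PATH"] = str(found)
    return str(found)
```

Passing `platform` and `env` as parameters keeps the sketch testable on any host, in the same spirit as the monkeypatched-platform tests listed under the changes.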
Type of Change
Changes Made
- `crates/headroom-core/Cargo.toml`: replaced the per-target `fastembed` blocks (`ort-download-binaries-rustls-tls` on non-Windows, `ort-load-dynamic` on Windows) with a single platform-independent dependency on `ort-load-dynamic`, with a comment documenting both the DirectML and the AVX2/#1278 rationale.
- `Cargo.lock`: regenerated; `ort-sys` drops its static-download dependencies (`hmac-sha256`, `lzma-rust2`, `ureq`); no version bumps.
- `headroom/_ort.py`: `ORT_DYLIB_PATH` auto-pin extended from Windows-only to all platforms via a small `_find_dylib` helper that resolves the platform's shared-library name inside the pip `onnxruntime` package.
- `tests/test_transforms/test_ort_dylib.py`: replaced the obsolete `test_noop_on_non_windows` with Linux (versioned `.so`) and macOS (`.dylib`) pin tests; module docstring updated.
- `docs/content/docs/configuration.mdx`: `ORT_DYLIB_PATH` row updated from Windows-only wording to the cross-platform behavior.
Testing
- `pytest`
- `ruff check .`
- `mypy headroom`
Test Output
Real Behavior Proof
- `upstream/main` (`9fbd47b`).
- `cargo check -p headroom-core` after the feature switch; inspected the `Cargo.lock` diff; rebuilt and ran `python -c "import headroom; from headroom._core import detect_content_type; print(detect_content_type('hello world'))"`; ran the ort-pin test suite with monkeypatched `linux`/`darwin` platforms.
- `ort-load-dynamic`; the lockfile shows `ort-sys` no longer pulls the binary-download machinery (`hmac-sha256`, `lzma-rust2`, `ureq` removed), confirming the statically-linked prebuilt ORT is gone; import + content detection works with `ORT_DYLIB_PATH` auto-pinned to the pip onnxruntime library; all 8 pin tests pass including the new Linux/macOS branches.
Review Readiness