Support loading original InternVL2 checkpoints natively #47047

Open
malakazlan wants to merge 1 commit into huggingface:main from malakazlan:internvl2-native-loading
malakazlan commented Jul 3, 2026

What does this PR do?

The original OpenGVLab/InternVL2-* checkpoints ship the bespoke internvl_chat
remote-code layout. Under Transformers v5 (meta-device init), that custom code
calls .item() during construction and crashes, so the checkpoints can no longer
be loaded via AutoModel.from_pretrained(..., trust_remote_code=True). This is the
blocker tracked in vllm-project/vllm#38425.

This PR makes the original checkpoints load into the native
InternVLForConditionalGeneration, without remote code:

  • Config alias — register the internvl_chat model_type as an alias of the
    native InternVLConfig (mirrors the existing gpt-sw3 -> GPT2Config alias), so
    AutoConfig/AutoModel resolve the local implementation when trust_remote_code
    is not passed (per @hmellor's guidance in the linked issue).
  • Config normalization — map the original internvl_chat config
    (llm_config / intern_vit_6b vision_config / select_layer) onto
    InternVLConfig in from_dict.
  • Load-time weight conversion — rename the original weight layout onto the
    native names and split the fused vision attn.qkv into q/k/v via Chunk,
    reusing the mapping already established in convert_internvl_weights_to_hf.py.
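
The load-time conversion described in the last bullet can be illustrated with a minimal, dependency-free sketch. It is not the PR's implementation: the rename rules, the target key names (`vision_tower`, `text_model`, `attention.q_proj`) and the helper names are hypothetical, and nested Python lists stand in for tensors. It only shows the two ideas involved: renaming keys by pattern, and cutting a fused `attn.qkv` parameter into three equal chunks along its first dimension.

```python
import re

# Hypothetical prefix renames, for illustration only. The real mapping lives in
# convert_internvl_weights_to_hf.py and is not reproduced here.
RENAME_RULES = [
    (re.compile(r"^vision_model\.encoder\.layers\."), "vision_tower.layers."),
    (re.compile(r"^language_model\."), "text_model."),
]

# Matches a fused vision attention parameter such as "<prefix>.attn.qkv.weight".
QKV_PATTERN = re.compile(r"^(.*)\.attn\.qkv\.(weight|bias)$")


def rename_key(key):
    """Apply every rename rule in order to a single checkpoint key."""
    for pattern, replacement in RENAME_RULES:
        key = pattern.sub(replacement, key)
    return key


def chunk_rows(rows, parts=3):
    """Split a list into `parts` equal, contiguous chunks along the first dimension."""
    if len(rows) % parts != 0:
        raise ValueError(f"cannot split {len(rows)} rows into {parts} equal chunks")
    size = len(rows) // parts
    return [rows[i * size:(i + 1) * size] for i in range(parts)]


def convert_state_dict(original):
    """Rename keys and split each fused qkv entry into separate q/k/v entries."""
    converted = {}
    for key, value in original.items():
        match = QKV_PATTERN.match(key)
        if match is None:
            converted[rename_key(key)] = value
            continue
        prefix, kind = match.groups()
        for name, part in zip(("q_proj", "k_proj", "v_proj"), chunk_rows(value)):
            converted[rename_key(f"{prefix}.attention.{name}.{kind}")] = part
    return converted


if __name__ == "__main__":
    toy = {
        "vision_model.encoder.layers.0.attn.qkv.bias": [1, 2, 3, 4, 5, 6],
        "language_model.lm_head.weight": [[9]],
    }
    for new_key, new_value in convert_state_dict(toy).items():
        print(new_key, new_value)
```

Splitting into contiguous thirds is only valid when the fused parameter stores q, k and v back to back, which is what a `Chunk`-style split assumes. An interleaved layout, such as the InternLM2 `wqkv` case mentioned under Notes, would need a different split.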

Test plan

Built Transformers from source (v5) and verified on OpenGVLab/InternVL2-1B:

  • AutoConfig.from_pretrained(..., trust_remote_code=False) returns a native
    InternVLConfig (Qwen2 text + InternVL vision sub-configs).
  • from_pretrained loads with no missing / unexpected / mismatched weights.
  • Each converted tensor (incl. the split q/k/v) is bit-exact vs. the original.
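
As a rough illustration of the last two checks, the sketch below compares key sets and tests a q/k/v split for bitwise equality against the fused original. It is not the test added in this PR: the helper names are hypothetical and flat Python lists of floats stand in for tensors. Values are packed as float64 bytes so the comparison is truly bitwise (for example, `-0.0` and `0.0` are treated as different).

```python
import struct


def to_bytes(values):
    """Pack a flat list of floats as little-endian float64 for bitwise comparison."""
    return struct.pack(f"<{len(values)}d", *values)


def report_key_mismatches(expected_keys, loaded_keys):
    """Return the keys the model expects but did not get, and vice versa."""
    expected, loaded = set(expected_keys), set(loaded_keys)
    return {
        "missing": sorted(expected - loaded),
        "unexpected": sorted(loaded - expected),
    }


def split_is_bit_exact(fused, q, k, v):
    """True if q, k and v concatenated reproduce the fused parameter bit for bit."""
    return to_bytes(q) + to_bytes(k) + to_bytes(v) == to_bytes(fused)


if __name__ == "__main__":
    fused = [0.1, -0.0, 2.5, 3.0, 4.0, 5.0]
    third = len(fused) // 3
    q, k, v = fused[:third], fused[third:2 * third], fused[2 * third:]
    print(split_is_bit_exact(fused, q, k, v))
    print(report_key_mismatches(["q", "k", "v"], ["q", "k", "v"]))
```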

Added tests/models/internvl/test_modeling_internvl.py::InternVLOriginalCheckpointTest
(passes, CPU-only).

Notes

  • InternVL2-2B (InternLM2 backbone, interleaved wqkv split) is a planned follow-up.
  • Loading emits the generic “model of type internvl_chat -> internvl” info
    warning; happy to silence it for this alias if preferred.
  • AI assistance was used; I have reviewed every line and can defend the change.

cc @hmellor

Before submitting

The original OpenGVLab/InternVL2-* checkpoints ship the bespoke internvl_chat
remote-code layout, which is incompatible with Transformers v5 meta-device
initialization (the custom code calls .item() during construction). Load them
into the native InternVLForConditionalGeneration without trust_remote_code:

- Register the internvl_chat model_type as an alias of the native InternVLConfig
  (mirrors the existing gpt-sw3 -> GPT2Config alias), so AutoConfig/AutoModel
  resolve the local implementation when trust_remote_code is not passed.
- Normalize the original internvl_chat config (llm_config / intern_vit_6b
  vision_config / select_layer) onto InternVLConfig in from_dict.
- Add a load-time weight conversion that renames the original weight layout onto
  the native names and splits the fused vision attn.qkv into q/k/v via Chunk.
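
The config normalization step in the list above can be pictured with a small sketch over plain dictionaries. This is not the code in the PR: the function name is hypothetical, and the target field names (`text_config`, `vision_feature_layer`) and the target `model_type` value are assumptions about the native config that should be checked against `InternVLConfig` itself. Only the source field names (`llm_config`, `vision_config`, `select_layer`) come from the description.

```python
def normalize_internvl_chat_config(raw):
    """Map an original internvl_chat config dict onto native-style field names.

    Target names are assumptions made for illustration, not confirmed API.
    """
    if raw.get("model_type") != "internvl_chat":
        # Already native (or unrelated): return an untouched copy.
        return dict(raw)

    renamed = {"llm_config", "select_layer", "model_type"}
    normalized = {key: value for key, value in raw.items() if key not in renamed}
    normalized["model_type"] = "internvl"

    if "llm_config" in raw:
        # Assumed native name for the language-model sub-config.
        normalized["text_config"] = dict(raw["llm_config"])
    if "select_layer" in raw:
        # Assumed native name for the vision layer whose features are used.
        normalized["vision_feature_layer"] = raw["select_layer"]
    return normalized


if __name__ == "__main__":
    original = {
        "model_type": "internvl_chat",
        "llm_config": {"model_type": "qwen2"},
        "vision_config": {"model_type": "intern_vit_6b"},
        "select_layer": -1,
    }
    print(normalize_internvl_chat_config(original))
```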

Verified on OpenGVLab/InternVL2-1B: loads with no missing/unexpected/mismatched
weights and each converted tensor is bit-exact against the original checkpoint.

Relates to vllm-project/vllm#38425. InternVL2-2B (InternLM2 wqkv split) to follow.

Signed-off-by: malakazlan <azlanmalikai@gmail.com>
github-actions Bot commented Jul 3, 2026

Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, internvl

github-actions Bot commented Jul 3, 2026

Contributor

CI recap

Dashboard: View test results in Grafana
Latest run: 28684427653:2
Result: failure | Jobs: 15 | Tests: 170,843 | Failures: 2 | Duration: 24h 55m
