Fix GenerationConfig continuous batching serialization by VectorPeak · Pull Request #47038 · huggingface/transformers

VectorPeak · 2026-07-03T09:46:12Z

What does this PR do?

This PR fixes a GenerationConfig serialization round-trip loss for ContinuousBatchingConfig.

What Problem This Solves

GenerationConfig.save_pretrained() persists generation settings by writing the result of GenerationConfig.to_json_string() into generation_config.json. During that JSON conversion, nested dataclass values are passed through convert_dataclass_to_dict(), which currently serializes dataclasses by calling to_dict() when that method exists.

The problem is that ContinuousBatchingConfig is a dataclass but does not define to_dict(). Because convert_dataclass_to_dict() has no fallback return for dataclasses without to_dict(), the continuous batching config can silently fall through as None during JSON conversion. In practice, a configured continuous batching block can therefore be persisted as JSON null.
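The fall-through can be reproduced without transformers. The sketch below is a simplified stand-in, not the library's code: `ToyBatchingConfig` and `buggy_convert` are hypothetical names that only mirror the shape of the helper described above, where the dataclass branch returns a value only when `to_dict()` exists.

```python
import json
from dataclasses import dataclass, is_dataclass


@dataclass
class ToyBatchingConfig:
    """Hypothetical stand-in for a dataclass that defines no to_dict()."""

    block_size: int = 128


def buggy_convert(obj):
    """Simplified model of a converter with no fallback for plain dataclasses."""
    if isinstance(obj, dict):
        return {key: buggy_convert(value) for key, value in obj.items()}
    elif is_dataclass(obj):
        if hasattr(obj, "to_dict"):
            return obj.to_dict()
        # No return on this path: a dataclass without to_dict() yields None.
    else:
        return obj


payload = buggy_convert({"continuous_batching_config": ToyBatchingConfig()})
serialized = json.dumps(payload)
print(serialized)  # {"continuous_batching_config": null}
```

Python functions return None when no branch returns a value, and `json.dumps` maps None to `null`, which is why the loss is silent rather than an error.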

A user or service can construct a generation config like this:

GenerationConfig(
    continuous_batching_config=ContinuousBatchingConfig(
        block_size=128,
        default_compile_level=2,
        varlen_compile_config=CompileConfig(dynamic=True),
        decode_compile_config=CompileConfig(mode="default"),
    )
).save_pretrained(...)

Before this fix, the saved generation_config.json could contain:

{
  "continuous_batching_config": null
}

That means the saved config no longer carries the actual continuous batching parameters, including values such as block_size, default_compile_level, and the nested varlen_compile_config / decode_compile_config settings. After a normal save_pretrained() -> from_pretrained() round trip, the loaded GenerationConfig has lost the continuous batching configuration instead of reconstructing it.

This matters for serving and inference setups that rely on saved generation configs: a configuration that was valid in memory can become incomplete after being saved and reloaded, so behavior can drift from the original runtime settings without an explicit error.

Change

This PR fixes both directions of the GenerationConfig round trip: writing ContinuousBatchingConfig into JSON, and restoring it back into typed config objects when the generation config is loaded again.

For the save / serialization path:

- Add ContinuousBatchingConfig.to_dict() so convert_dataclass_to_dict() has an explicit structured representation to use instead of falling through to None.
- Serialize the top-level continuous batching fields from the dataclass state, preserving user-provided values such as block_size, default_compile_level, max_cached_graphs, and other continuous batching knobs.
- Delegate nested varlen_compile_config and decode_compile_config serialization to CompileConfig.to_dict() when those fields are present, so nested compile settings keep the same serialization behavior as standalone CompileConfig objects.
- Preserve the existing CompileConfig.to_dict() filtering behavior, including not leaking internal implementation fields such as _compile_all_devices into the saved JSON.
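The delegation pattern in the save path can be sketched with stand-in classes. Everything here is hypothetical and only illustrates the idea: `ToyCompileConfig`, `ToyBatchingConfig`, and the `_internal_flag` field are invented for this example and are not the transformers definitions.

```python
import copy
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyCompileConfig:
    """Hypothetical stand-in for a compile config with one internal field."""

    dynamic: Optional[bool] = None
    mode: Optional[str] = None
    _internal_flag: bool = False  # illustrative internal field, never saved

    def to_dict(self) -> dict[str, Any]:
        # Drop the internal field so it never reaches the saved JSON.
        state = copy.deepcopy(self.__dict__)
        return {key: value for key, value in state.items() if key != "_internal_flag"}


@dataclass
class ToyBatchingConfig:
    """Hypothetical stand-in for a config that nests two compile configs."""

    block_size: int = 256
    varlen_compile_config: Optional[ToyCompileConfig] = None
    decode_compile_config: Optional[ToyCompileConfig] = None

    def to_dict(self) -> dict[str, Any]:
        output = copy.deepcopy(self.__dict__)
        # Delegate nested configs to their own to_dict() so filtering is reused.
        for key in ("varlen_compile_config", "decode_compile_config"):
            if output[key] is not None:
                output[key] = getattr(self, key).to_dict()
        return output


config = ToyBatchingConfig(block_size=128, varlen_compile_config=ToyCompileConfig(dynamic=True))
as_dict = config.to_dict()
```

Delegating to the nested object's own `to_dict()` keeps one place responsible for deciding which fields are internal, so the nested and standalone forms cannot drift apart.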

For the load / deserialization path:

- Convert a saved continuous_batching_config dictionary back into a ContinuousBatchingConfig inside GenerationConfig.__init__, matching the existing pattern used by other nested generation config objects.
- Convert nested varlen_compile_config and decode_compile_config dictionaries back into CompileConfig during ContinuousBatchingConfig.__post_init__, so callers get typed config objects after GenerationConfig.from_pretrained() instead of raw dictionaries.
- Keep the reconstruction local to continuous batching config handling rather than changing the generic dataclass serializer, which limits the behavioral surface of the fix.
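The load-side reconstruction can be illustrated the same way. This is a minimal sketch under assumptions: the class names are hypothetical stand-ins, and the real `__post_init__` may perform additional validation that is not shown in this PR text.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ToyCompileConfig:
    """Hypothetical stand-in for a nested compile config."""

    dynamic: Optional[bool] = None
    mode: Optional[str] = None


@dataclass
class ToyBatchingConfig:
    """Hypothetical stand-in that accepts typed objects or raw dicts."""

    block_size: int = 256
    varlen_compile_config: Union[ToyCompileConfig, dict, None] = None
    decode_compile_config: Union[ToyCompileConfig, dict, None] = None

    def __post_init__(self):
        # Promote raw dicts (as loaded from JSON) back to typed objects.
        for name in ("varlen_compile_config", "decode_compile_config"):
            value = getattr(self, name)
            if isinstance(value, dict):
                setattr(self, name, ToyCompileConfig(**value))


# Shape of the data after json.load(): nested configs are plain dicts.
loaded = {
    "block_size": 128,
    "varlen_compile_config": {"dynamic": True},
    "decode_compile_config": None,
}
restored = ToyBatchingConfig(**loaded)
```

Doing the promotion in `__post_init__` means it applies however the object is constructed, so callers that pass dicts directly get the same typed result as callers that load from disk.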

For coverage:

- Add a regression test that saves a GenerationConfig containing ContinuousBatchingConfig, reloads it with GenerationConfig.from_pretrained(), and verifies that the continuous batching fields survive the round trip.
- Include nested CompileConfig values in the test to cover the deeper round-trip path, not just the top-level ContinuousBatchingConfig object.
- Assert that nested compile configs are restored as CompileConfig instances and that internal compile-only fields are not emitted through to_dict().
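The shape of such a round-trip check can be sketched on stand-in classes. This is not the regression test added by the PR: the classes and the `round_trip` helper are hypothetical, and the file is written with the standard `json` module rather than through `save_pretrained()`.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ToyCompileConfig:
    """Hypothetical stand-in for a nested compile config."""

    dynamic: Optional[bool] = None


@dataclass
class ToyBatchingConfig:
    """Hypothetical stand-in for the continuous batching config."""

    block_size: int = 256
    varlen_compile_config: Optional[ToyCompileConfig] = None

    def __post_init__(self):
        # Rebuild the nested config when it arrives as a raw dict.
        if isinstance(self.varlen_compile_config, dict):
            self.varlen_compile_config = ToyCompileConfig(**self.varlen_compile_config)


def round_trip(config: ToyBatchingConfig) -> ToyBatchingConfig:
    """Write the config to a temporary JSON file and rebuild it from disk."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "generation_config.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"continuous_batching_config": asdict(config)}, f)
        with open(path, encoding="utf-8") as f:
            saved = json.load(f)
    # The block must not have been persisted as JSON null.
    assert saved["continuous_batching_config"] is not None
    return ToyBatchingConfig(**saved["continuous_batching_config"])


original = ToyBatchingConfig(block_size=128, varlen_compile_config=ToyCompileConfig(dynamic=True))
reloaded = round_trip(original)
assert reloaded == original
assert isinstance(reloaded.varlen_compile_config, ToyCompileConfig)
```

Comparing the reloaded object to the original with dataclass equality covers every field at once, while the explicit `isinstance` check guards against the nested config coming back as a raw dict.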

This keeps the fix scoped to continuous batching generation config serialization. It does not alter continuous batching scheduling, generation execution, compile defaults, or unrelated generation config fields; it only makes the saved configuration faithfully represent the object that was already present in memory.

Evidence

Local behavior proof after the patch:

contains null: False
contains block_size: True
ContinuousBatchingConfig
CompileConfig
False

The final False verifies that _compile_all_devices is not leaked through nested CompileConfig.to_dict() serialization.

Possible call chain / impact

User / service saves generation config
  -> GenerationConfig.save_pretrained(...)
  -> GenerationConfig.to_json_string(...)
  -> convert_dataclass_to_dict(continuous_batching_config)
  -> ContinuousBatchingConfig parameters are preserved instead of becoming null

User / service reloads generation config
  -> GenerationConfig.from_pretrained(...)
  -> GenerationConfig.__init__(...)
  -> continuous_batching_config dict is restored to ContinuousBatchingConfig
  -> nested compile config dicts are restored to CompileConfig

This PR only changes serialization/deserialization of ContinuousBatchingConfig. It does not change continuous batching runtime scheduling, generation behavior, compile defaults, or unrelated generation config fields.

Code Agent Policy

I confirm that this is not a pure code agent PR.

AI assistance was used for diagnosis, patch drafting, validation planning, and PR text preparation. The human submitter should review all changed lines and understand the diff before checking this box.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline and the Pull Request checks?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- Coordination issue: GenerationConfig serializes ContinuousBatchingConfig as null #47039.
Did you make sure to update the documentation with your changes according to the guidelines?
- No docs update needed; this fixes config round-trip behavior and adds a regression test.
Did you write any new necessary tests?

Validation run locally:

python -m pytest tests/generation/test_configuration_utils.py::GenerationConfigSerializationTest::test_serialize_generation_continuous_batching_config -q
python -m pytest tests/generation/test_configuration_utils.py::GenerationConfigSerializationTest::test_serialize_generation_watermarking_config -q
python -m ruff format --check src/transformers/generation/configuration_utils.py tests/generation/test_configuration_utils.py
python -m ruff check src/transformers/generation/configuration_utils.py tests/generation/test_configuration_utils.py
git diff --check

Limitations:

make fix-repo

was not run because make is unavailable in the current Windows PowerShell environment.

python utils/tests_fetcher.py --diff_with_last_commit

failed with an IndexError after selecting files from the previous commit rather than the current uncommitted diff. The default tests_fetcher.py invocation completed, but reported no changed files before the patch was committed.

Who can review?

Continuous batching / generation reviewers from the template are likely relevant after tests pass and the coordination issue has maintainer feedback: @remi-or, @ArthurZucker, @McPatate.

VectorPeak · 2026-07-03T12:02:42Z

CI note: the remaining red check appears to be a self-hosted runner/container infrastructure failure rather than a test failure from this PR.

The CI recap reports 68,570 tests with 0 failures. The failing job is tests_non_model [shard 6/8], and it failed during Initialize containers before checkout, dependency installation, or test execution. The relevant log lines are:

Pod ... is unhealthy with phase status Failed
TypeError: Cannot read properties of null (reading 'jobPod')
Executing the custom container implementation failed. Please contact your self hosted runner administrator.

I do not have permission to rerun the failed upstream job from the fork side.

zucchini-nlp · 2026-07-03T12:14:08Z

+    def to_dict(self) -> dict[str, Any]:
+        """Serializes this instance to a Python dictionary."""
+        output = copy.deepcopy(self.__dict__)
+        if self.varlen_compile_config is not None:
+            output["varlen_compile_config"] = self.varlen_compile_config.to_dict()
+        if self.decode_compile_config is not None:
+            output["decode_compile_config"] = self.decode_compile_config.to_dict()
+        return output


Rather than manually doing it per key, let's force all dataclasses to resolve as dicts. I copied this from a few lines above:

def convert_dataclass_to_dict(obj):
    if isinstance(obj, dict):
        return {key: convert_dataclass_to_dict(value) for key, value in obj.items()}
    elif is_dataclass(obj):
        # Some of our dataclasses have a custom `to_dict()` method, and we prefer it
        if hasattr(obj, "to_dict"):
            return obj.to_dict()
    else:
        return obj

VectorPeak · 2026-07-03

Thanks for the suggestion! I updated the patch to use a generic dataclass fallback in convert_dataclass_to_dict() and removed the per-key ContinuousBatchingConfig.to_dict() handling.

My initial intent with the manual keys was to avoid widening the behavioral surface, but this shared fallback is cleaner and matches the direction you suggested while still preferring custom to_dict() implementations when present.
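The updated patch itself is not shown in this thread. As a rough sketch of what a generic fallback of this kind could look like, assuming it walks dataclass fields recursively, the following uses only the standard `dataclasses` module. The function name `convert_with_fallback` and the `Plain` / `Custom` classes are hypothetical.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any


def convert_with_fallback(obj: Any) -> Any:
    """Resolve dicts and dataclass instances into plain JSON-friendly dicts."""
    if isinstance(obj, dict):
        return {key: convert_with_fallback(value) for key, value in obj.items()}
    # is_dataclass() is also true for dataclass types, so exclude classes.
    if is_dataclass(obj) and not isinstance(obj, type):
        if hasattr(obj, "to_dict"):
            # A custom to_dict() takes precedence when it exists.
            return obj.to_dict()
        # Generic fallback: recurse field by field so that nested custom
        # to_dict() methods are still honored.
        return {f.name: convert_with_fallback(getattr(obj, f.name)) for f in fields(obj)}
    return obj


@dataclass
class Custom:
    """Hypothetical dataclass that hides a field via a custom to_dict()."""

    mode: str = "default"
    _hidden: bool = True

    def to_dict(self) -> dict:
        return {"mode": self.mode}


@dataclass
class Plain:
    """Hypothetical dataclass with no to_dict() of its own."""

    block_size: int = 128
    nested: Any = None


result = convert_with_fallback({"cfg": Plain(nested=Custom())})
print(result)  # {'cfg': {'block_size': 128, 'nested': {'mode': 'default'}}}
```

The sketch recurses over `fields()` instead of calling `dataclasses.asdict()` because `asdict()` converts nested dataclasses itself and would bypass a nested custom `to_dict()`, which could leak fields that the custom method filters out.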

Validation rerun locally:

python -m pytest tests/generation/test_configuration_utils.py::GenerationConfigSerializationTest::test_serialize_generation_continuous_batching_config -q
python -m pytest tests/generation/test_configuration_utils.py::GenerationConfigSerializationTest::test_serialize_generation_watermarking_config -q
python -m ruff format --check src/transformers/generation/configuration_utils.py tests/generation/test_configuration_utils.py
python -m ruff check src/transformers/generation/configuration_utils.py tests/generation/test_configuration_utils.py
git diff --check

zucchini-nlp · 2026-07-03

cc @remi-or, to confirm whether saving the continuous batching config is intended.

github-actions · 2026-07-04T09:55:06Z

CI recap

Dashboard: View test results in Grafana
Latest run: 28701913722:2
Result: failure | Jobs: 15 | Tests: 171,151 | Failures: 6 | Duration: 23h 35m

Fix continuous batching config serialization

e0172b5

VectorPeak mentioned this pull request Jul 3, 2026

GenerationConfig serializes ContinuousBatchingConfig as null #47039

Open

VectorPeak marked this pull request as ready for review July 3, 2026 09:52

Merge branch 'main' into fix-continuous-batching-config-serialization

3cafc9c

zucchini-nlp reviewed Jul 3, 2026

View reviewed changes

VectorPeak and others added 2 commits July 3, 2026 20:22

Use generic dataclass serialization fallback

dac3f93

Merge branch 'main' into fix-continuous-batching-config-serialization

599f658


VectorPeak wants to merge 4 commits into huggingface:main from VectorPeak:fix-continuous-batching-config-serialization
