feat(mm): add Qwen Image single-file checkpoint loader with fp8 support #9253

Open
Pfannkuchensack wants to merge 2 commits into invoke-ai:main from Pfannkuchensack:feat/qwen-image-checkpoint-loader
@Pfannkuchensack
Collaborator

Summary

Adds Main_Checkpoint_QwenImage_Config and QwenImageCheckpointModel so that single-file safetensors checkpoints (e.g. Qwen-Image-Edit 2511 fp8_scaled from Civitai) can be imported. ComfyUI-style fp8 weights are dequantized to bf16 at load time; the existing default_settings.fp8_storage toggle then optionally re-casts to fp8 for VRAM savings.
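The dequantization step can be illustrated with a minimal pure-Python sketch. It assumes the checkpoint stores each weight as float8 E4M3FN bytes alongside a single per-tensor scale; that layout, and the function names below, are illustrative assumptions rather than the loader's actual code, which would operate on torch tensors and cast the result to bf16.

```python
import math


def decode_fp8_e4m3fn(byte: int) -> float:
    """Decode one float8 E4M3FN value (1 sign, 4 exponent, 3 mantissa bits)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    # E4M3FN has no infinities; the all-ones pattern is NaN.
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan
    # Subnormals: no implicit leading 1, fixed exponent of 1 - bias (bias = 7).
    if exponent == 0:
        return sign * (mantissa / 8.0) * 2.0 ** -6
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


def dequantize_scaled(raw: bytes, scale: float) -> list[float]:
    """Hypothetical helper: expand fp8 bytes and apply the per-tensor scale."""
    return [decode_fp8_e4m3fn(b) * scale for b in raw]
```

Here `dequantize_scaled(bytes([0x38, 0x40]), 0.5)` expands the stored values 1.0 and 2.0 and rescales them to 0.5 and 1.0.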

Also wires _apply_fp8_layerwise_casting into the Qwen Image diffusers loader so the fp8 storage option works for both the diffusers and single-file checkpoint formats. GGUF stays untouched as it carries its own quantization.

Shared variant inference (marker tensor → filename heuristic) and transformer architecture auto-detection are extracted into module-level helpers so the GGUF and checkpoint loaders stay in sync.
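The inference order described above (explicit override, then marker tensor, then filename heuristic, then default) can be sketched as follows. The function name, the signature and the exact marker key are assumptions made for illustration; the real helper in this PR may differ.

```python
from typing import Iterable, Optional

# Assumed name of the marker tensor carried by edit checkpoints.
EDIT_MARKER = "__index_timestep_zero__"


def infer_variant(
    state_dict_keys: Iterable[str],
    filename: str,
    override: Optional[str] = None,
) -> str:
    """Hypothetical sketch: return "edit" or "generate" for a checkpoint."""
    # An explicit choice from the import options always wins.
    if override is not None:
        return override
    # The marker tensor is checked first, since it does not depend on naming.
    if any(EDIT_MARKER in key for key in state_dict_keys):
        return "edit"
    # Fall back to a case-insensitive filename heuristic.
    if "edit" in filename.lower():
        return "edit"
    return "generate"
```

Keeping this logic in one module-level function is what lets the GGUF and checkpoint loaders share it without drifting apart.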

Related Issues / Discussions

Qwen-Image-Edit 2511 fp8_scaled.

QA Instructions

  1. Install a Qwen Image single-file safetensors checkpoint via the Model Manager.
    Suggested test files:
    • ComfyUI-style fp8_scaled: e.g. Qwen-Image-Edit 2511 fp8 from Civitai
    • Plain bf16/fp16 safetensors (if available)
  2. Confirm the model is detected as Main / QwenImage / Checkpoint (not Diffusers, not GGUFQuantized) and that the variant (edit vs generate) is inferred correctly:
    • Filename containing "edit" (case-insensitive) → edit
    • State dict containing __index_timestep_zero__ → edit
    • Otherwise → generate
    • Explicit override in import options must win.
  3. Generate an image with the imported model on the Qwen graph — should run end-to-end and produce a sensible image. For an Edit variant, verify the reference image actually conditions the output (dual modulation works).
  4. Toggle FP8 Storage in the model's default settings and re-generate:
    • Log line FP8 layerwise casting enabled for <model> ... should appear.
    • VRAM usage of the transformer should drop ~50%; output should remain visually equivalent.
    • Repeat the same toggle test for a diffusers-format Qwen Image model (previously fp8_storage was a no-op there).
  5. Regression check — re-import a GGUF Qwen Image model and a diffusers folder Qwen Image model; both must still load and infer correctly (loader helpers were extracted, behavior should be identical).
  6. Run the test suite:
    uv run --extra cuda pytest tests/backend/model_manager/configs/ tests/backend/model_manager/load/test_load_default_fp8.py
    All 61 tests should pass (7 new checkpoint variant-detection tests + 2 format-discrimination tests).
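The ~50% VRAM drop expected in step 4 follows from the storage width alone: bf16 uses 2 bytes per parameter and fp8 uses 1. A small worked sketch, using a hypothetical parameter count rather than the real model size:

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_gib(num_params: int, dtype: str) -> float:
    """Approximate weight storage in GiB, ignoring activations and overhead."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Hypothetical transformer size, chosen only to make the arithmetic easy.
num_params = 10 * 2**30
bf16_gib = weight_gib(num_params, "bf16")
fp8_gib = weight_gib(num_params, "fp8")
saving = 1 - fp8_gib / bf16_gib
```

This covers weights only. Measured savings can be lower if some layers are kept in higher precision or if activations account for much of the usage.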

Merge Plan

Standard merge — no DB schema changes, no migrations needed. The new config class registers in the discriminator union but only matches files that are explicitly Qwen Image single-file checkpoints (not GGUF, not diffusers), so it cannot accidentally re-classify existing models.
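A rough sketch of the kind of format discrimination this relies on, assuming the three formats can be told apart by filesystem shape alone. The function and its return labels are hypothetical; the actual config classes presumably also inspect the state dict to confirm that the file is a Qwen Image model.

```python
from pathlib import Path


def classify_qwen_image_source(path: Path) -> str:
    """Hypothetical sketch: map a model path to a format label."""
    if path.is_dir():
        # Diffusers pipelines ship a model_index.json at the folder root.
        if (path / "model_index.json").is_file():
            return "diffusers"
        return "unknown"
    suffix = path.suffix.lower()
    if suffix == ".gguf":
        return "gguf_quantized"
    if suffix == ".safetensors":
        return "checkpoint"
    return "unknown"
```

Because each branch is mutually exclusive, a GGUF file or a diffusers folder can never fall through to the checkpoint label.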

I have not tested this yet; I need to make some space for it first.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration — n/a, backend only
  • Documentation added / updated (if applicable) — n/a, no user-facing config changes
  • Updated What's New copy (if doing a release after this PR)

github-actions bot added the python, backend, and python-tests labels on May 30, 2026.