Qwen3 Text Encoder (Nunchaku INT4)
tonera/Qwen3-text-Nunchaku is an INT4-quantized Qwen3 text encoder for the FLUX.2 klein family. It can be used as a drop-in replacement for the pipeline's text_encoder.
Note: As of 2026-04-09, the Nunchaku PR adding this support has not yet been merged into the official main branch. To try it early, pull and merge the code from nunchaku-ai/nunchaku#927.
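Until the PR is merged, one way to build against it locally looks like the following sketch. The repository URL is inferred from the nunchaku-ai/nunchaku slug above, and the install command is an assumption; follow the project's own build instructions if they differ.

```shell
# Sketch: check out and merge the unmerged PR locally.
# Repo URL assumed from the nunchaku-ai/nunchaku slug in the note above.
git clone https://github.com/nunchaku-ai/nunchaku.git
cd nunchaku
# GitHub exposes every PR under refs/pull/<id>/head; fetch PR #927 into a local branch.
git fetch origin pull/927/head:pr-927
git merge pr-927
# Install step is an assumption; use the project's documented build command if different.
pip install -e .
```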
Usage
```python
import torch
from diffusers import Flux2KleinPipeline
from nunchaku import NunchakuQwenEncoderModel

# Load the INT4-quantized Qwen3 text encoder
text_encoder = NunchakuQwenEncoderModel.from_pretrained(
    "tonera/Qwen3-text-Nunchaku/svdq-int4-Qwen3-text-Nunchaku.safetensors",
    device="cuda",
    torch_dtype=torch.bfloat16,
)

# Plug it into the FLUX.2 klein pipeline in place of the default text encoder
pipeline = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-9B",
    text_encoder=text_encoder,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipeline(
    "A cat holding a sign that says hello world",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("flux2-klein-qwen3-text.png")
```
If you also use a Nunchaku-quantized Transformer, you can keep passing it through the usual transformer= argument of Flux2KleinPipeline.
Quantization quality
The following metrics compare the quantized encoder's final hidden states (hidden_states_last, recorded in data.json) against the original model's outputs:
| Metric | Value |
|---|---|
| Cosine similarity | 0.995844 |
| Relative L2 | 0.091183 |
| Max absolute error | 6.539062 |
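For reference, metrics of this kind can be computed from two hidden-state tensors as in the following sketch (NumPy-based for illustration; the function name is hypothetical and this is not the actual evaluation script):

```python
import numpy as np

def quantization_metrics(ref, quant):
    """Compare a quantized encoder's hidden states against a reference.

    ref, quant: arrays of identical shape, e.g. [batch, seq_len, hidden].
    Returns cosine similarity, relative L2 error, and max absolute error.
    """
    ref = np.asarray(ref, dtype=np.float64).ravel()
    quant = np.asarray(quant, dtype=np.float64).ravel()
    cosine = float(np.dot(ref, quant) / (np.linalg.norm(ref) * np.linalg.norm(quant)))
    rel_l2 = float(np.linalg.norm(quant - ref) / np.linalg.norm(ref))
    max_abs = float(np.max(np.abs(quant - ref)))
    return {"cosine": cosine, "rel_l2": rel_l2, "max_abs": max_abs}
```

A cosine similarity near 1.0 together with a small relative L2, as in the table above, indicates the INT4 encoder tracks the full-precision hidden states closely.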
Base model: black-forest-labs/FLUX.2-klein-9B