Qwen-Image-Edit is Qwen's latest 20B image-editing model; a normal deployment needs roughly 60 GB of VRAM or more. The DiffSynth library, updated by the ModelScope community, supports running Qwen-Image-Edit with minimal VRAM. I tested this myself: 4 GB of VRAM is enough to run the Qwen-Image-Edit model.
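A quick back-of-the-envelope check on those numbers (a sketch: the 20B parameter count is from above; the byte widths are the standard 2 bytes per parameter for bf16 and 1 byte for FP8):

```python
# Back-of-the-envelope weight sizes for a 20B-parameter model.
def model_size_gib(num_params: float, bytes_per_param: float) -> float:
    """Raw weight size in GiB (weights only; ignores activations and overhead)."""
    return num_params * bytes_per_param / 1024**3

params = 20e9                         # ~20B parameters, as stated above
bf16_gib = model_size_gib(params, 2)  # bf16: 2 bytes per parameter
fp8_gib = model_size_gib(params, 1)   # FP8 (e4m3): 1 byte per parameter

print(f"bf16 weights: {bf16_gib:.1f} GiB")  # ~37.3 GiB
print(f"fp8 weights:  {fp8_gib:.1f} GiB")   # ~18.6 GiB
```

Even in FP8, the diffusion transformer's weights alone far exceed 4 GB, so quantization by itself cannot explain the 4 GB figure; the offloading described in section 3, which keeps only the currently-executing module in VRAM, is what brings the footprint down.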
1) Download the model
Download: Qwen-Image-Edit
2) Download the library
The star of this post is the DiffSynth library, which can be downloaded from GitHub:
github.com/modelscope/DiffSynth-Studio
3) VRAM management
DiffSynth-Studio provides fine-grained VRAM management for the Qwen-Image models, allowing inference on low-VRAM devices. The code below enables offloading, which moves some modules to system RAM on devices with limited VRAM. FP8 quantization is also supported.
# Imports per DiffSynth-Studio's Qwen-Image examples
import torch
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig

root = "E:/Models/Qwen/Qwen-Image-Edit"
# The transformer ships as 9 safetensors shards, the text encoder as 4
path1 = [f"{root}/transformer/diffusion_pytorch_model-{i:05d}-of-00009.safetensors" for i in range(1, 10)]
path2 = [f"{root}/text_encoder/model-{i:05d}-of-00004.safetensors" for i in range(1, 5)]
path3 = f"{root}/vae/diffusion_pytorch_model.safetensors"
path5 = f"{root}/processor/"

pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        # offload_device="cpu" parks each module in system RAM while it is idle;
        # offload_dtype=torch.float8_e4m3fn stores the offloaded copy in FP8
        ModelConfig(path=path1, model_id="Qwen/Qwen-Image-Edit", offload_device="cpu", offload_dtype=torch.float8_e4m3fn),
        ModelConfig(path=path2, model_id="Qwen/Qwen-Image-Edit", offload_device="cpu", offload_dtype=torch.float8_e4m3fn),
        ModelConfig(path=path3, model_id="Qwen/Qwen-Image-Edit", offload_device="cpu", offload_dtype=torch.float8_e4m3fn),
    ],
    tokenizer_config=None,  # Qwen-Image-Edit uses the processor below instead of a tokenizer
    processor_config=ModelConfig(path=path5, model_id="Qwen/Qwen-Image-Edit"),
)
pipe.enable_vram_management(vram_limit=0)

# Editing call (argument names follow the DiffSynth-Studio Qwen-Image examples; check your version)
from PIL import Image
input_image = Image.open("input.png").convert("RGB")
image = pipe(prompt="...", edit_image=input_image, seed=0)
image.save("output.png")
Once VRAM management is enabled, the framework automatically chooses a management strategy based on the free VRAM on the device. The enable_vram_management function provides the following parameter for manual control over the strategy:
- vram_limit: VRAM usage cap in GB; by default it uses the remaining free VRAM on the device. Note that this is not a hard limit: if the configured budget is too small for inference but enough VRAM is actually available, inference still proceeds while minimizing VRAM usage. Setting it to 0 yields the theoretical minimum VRAM footprint.
With vram_limit = 0: measured VRAM usage was 3.8 GB, and generating one image took about 31 minutes.
My GPU is an RTX 4090; with vram_limit = 23.5, generating one image took about 8 minutes.
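As a conceptual illustration of what a vram_limit-style budget does (a sketch of the general idea, not DiffSynth's actual algorithm; the module sizes below are made up for the example):

```python
# Conceptual sketch of budget-driven placement (NOT DiffSynth's real code):
# keep modules resident on the GPU until the budget is spent; everything
# else lives on the CPU and is streamed to the GPU only while it runs.
def plan_placement(module_sizes_gb: dict, vram_limit_gb: float) -> dict:
    """Return {module_name: "cuda" | "cpu"} under a greedy VRAM budget."""
    plan, used = {}, 0.0
    # Largest modules first: keeping them resident saves the most transfers.
    for name, size in sorted(module_sizes_gb.items(), key=lambda kv: -kv[1]):
        if used + size <= vram_limit_gb:
            plan[name] = "cuda"
            used += size
        else:
            plan[name] = "cpu"  # offloaded; streamed in on demand
    return plan

modules = {"transformer": 19.0, "text_encoder": 8.0, "vae": 0.3}
print(plan_placement(modules, vram_limit_gb=0))     # everything offloaded
print(plan_placement(modules, vram_limit_gb=23.5))  # transformer stays resident
```

With a budget of 0 everything is streamed, which matches the observed behavior above: minimal VRAM (3.8 GB, mostly activations and the module currently executing) at the cost of much slower generation; a 23.5 GB budget keeps the big transformer resident, cutting the per-image time sharply.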
Test samples:
prompt = "Generate a 3D picture of Hello Kitty"

prompt = "Have this Hello Kitty hold a colorful palette and brushes, standing in front of an easel, painting"

prompt = """Convert to Studio Ghibli style, and change the clothes to a black T-shirt with the text "Qwen" printed on it."""

