
Stop! These models are not for txt2img inference!
Don't put them in your stable-diffusion-webui/models directory and expect to make images!
So what are these?
These are new ModelScope-based models for txt2video, optimized to produce 16:9 video compositions. They were trained on 9,923 video clips and 29,769 tagged frames at 24 fps, 1024x576 resolution.
Note that these are the bigger brothers of the https://civitai.com/models/96454/zeroscope-v2-576w-txt2video models. The XL models use 15.3GB of VRAM when rendering 30 frames at 1024x576.
Where do they go?
Drop them in the stable-diffusion-webui\models\ModelScope\t2v folder.
After downloading, it's imperative that you rename the .pt model so it has a .pth extension.
The files must be named open_clip_pytorch_model.bin and text2video_pytorch_model.pth.
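The placement and rename steps above can be sketched as a short shell snippet. The webui location and the assumption that the downloads sit in the current folder are both hypothetical; adjust the paths to your own install:

```shell
# Assumed install location -- change WEBUI to wherever your webui lives.
WEBUI="${WEBUI:-$HOME/stable-diffusion-webui}"
T2V="$WEBUI/models/ModelScope/t2v"

# Create the target folder the extension looks in.
mkdir -p "$T2V"

# Move and rename the downloaded checkpoints to the exact file names
# the extension expects. Guards skip the moves if the downloads are
# not in the current folder.
if [ -f zeroscopeV2XL_v10.pt ]; then
  mv zeroscopeV2XL_v10.pt "$T2V/text2video_pytorch_model.pth"
fi
if [ -f zeroscopeV2XL_v10.bin ]; then
  mv zeroscopeV2XL_v10.bin "$T2V/open_clip_pytorch_model.bin"
fi
```

Note that both files keep these exact names; the extension will not find the model if the .pt extension is left unchanged.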
Who made them? Original Source?
https://huggingface.co/cerspense/zeroscope_v2_XL
What else do I need?
These models are specifically for use with the txt2video Auto1111 WebUI extension.
Name: zeroscopeV2XL_v10.bin
Size (KB): 1926219
Type: Model
Pickle Scan Result: Success
Pickle Scan Message: No Pickle imports
Virus Scan Result: Success
Name: zeroscopeV2XL_v10.pt
Size (KB): 2756809
Type: Model
Pickle Scan Result: Success
Pickle Scan Message: No Pickle imports
Virus Scan Result: Success