![[Experimental] 8GB VRAM Tik Tok Dance Workflow (AnimateLCM, Depth Controlnet, LoRA) Version v1.0 (ID: 967949)](https://image.1111down.com/xG1nkqKTMzGDvpLrqFT7WA/a9db8e1e-3382-4346-be3b-e58750839276/width=450/35170891.jpeg)
Introduction
This is a highly experimental workflow for generating dance videos on 8GB of VRAM. It requires you to tinker with the relative strengths of the LoRA and the ControlNet. It also requires a LoRA trained on only one attire, and that attire has to roughly match the driving video for the workflow to work well.
This was inspired by work from Reddit user specific_virus8061, who made a music video on an 8GB VRAM GPU. I noticed morphing in the video, which is a common limitation of AnimateDiff with a 16-frame context window. I tried various methods to get around this, and this workflow is the outcome.
Link to Reddit Post: https://www.reddit.com/r/StableDiffusion/comments/1fsz1dp/potato_vram_ai_music_video/
Who is it for?
People who have 8GB VRAM and do not mind tinkering with a workflow to get the most out of their hardware.
Who is it not for?
- People who are looking for a one-click workflow.
- People who have the VRAM to run a proper solution like MimicMotion.
Workflow
The 1st part of the workflow uses fixed latent batch seed behaviour with a depth ControlNet and a character LoRA to generate images. You use the Image Generation Group to generate the individual frames, which will be saved as latents in the output/dance folder.
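The fixed seed behaviour is what keeps the character stable from frame to frame. Here is a minimal sketch of the idea in PyTorch; the shape and seed are illustrative placeholders, not values from the workflow:

```python
import torch

# With "fixed" seed behaviour, every frame starts from identical initial
# noise, so only the depth ControlNet signal (the pose) changes between
# frames. Shape and seed below are placeholders.
def initial_noise(shape, seed):
    generator = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=generator)

noise_frame_1 = initial_noise((1, 4, 64, 64), seed=42)
noise_frame_2 = initial_noise((1, 4, 64, 64), seed=42)
assert torch.equal(noise_frame_1, noise_frame_2)  # same starting point
```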
The 2nd part of the workflow puts the images through an AnimateLCM pass to create a video. Copy these latents to the input folder and refresh ComfyUI. Disable the Image Generation Group and activate the Video Generation Group. You can now select the latents in the LoadLatent nodes. You can add more LoadLatent and LatentBatch nodes as needed for the length of your video.
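If you want to script the copy step instead of doing it by hand, something like the sketch below works. It assumes a default ComfyUI folder layout and that the latents were saved as .latent files into output/dance; the torch lines only illustrate what LatentBatch does conceptually (concatenation along the frame/batch dimension) and are not part of the workflow:

```python
import shutil
from pathlib import Path

import torch

# Assumes a default ComfyUI install; adjust the paths to your setup.
comfy = Path("ComfyUI")
for latent_file in sorted((comfy / "output" / "dance").glob("*.latent")):
    shutil.copy(latent_file, comfy / "input" / latent_file.name)

# Conceptually, each LatentBatch node concatenates two latent batches along
# the batch (frame) dimension, which is how the video gets longer:
batch_a = torch.zeros(12, 4, 64, 64)  # 12 frames of 512x512 latents
batch_b = torch.zeros(12, 4, 64, 64)
combined = torch.cat([batch_a, batch_b], dim=0)  # 24 frames
```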
LoRA
Please use LoRAs that were trained on only one specific attire. You can try the LoRAs by cyberAngel; each of them is typically trained on a single attire.
https://civitai.com/user/cyberAngel_/models?baseModels=SD+1.5
VRAM
VRAM usage is controlled by the Meta Batch node and the two Batch VAE Decode nodes. The settings below have been tested to work well (see the sketch after the list). Please leave a comment if these settings do not work for you.
- 8GB VRAM: Meta Batch: 12, VAE Decode: 2
- 12GB VRAM: Meta Batch: 24, VAE Decode: 16
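The reason the VAE Decode batch size bounds peak VRAM is that frames are decoded a few at a time rather than all at once. A rough sketch of the idea, where decode_fn is a hypothetical stand-in for the VAE decoder rather than a real ComfyUI API:

```python
import torch

def batched_vae_decode(decode_fn, latents, chunk_size):
    # decode_fn is a hypothetical stand-in for the VAE decoder; decoding
    # chunk_size frames at a time caps peak VRAM at roughly one chunk's
    # worth of activations instead of the whole video's.
    frames = [decode_fn(latents[i:i + chunk_size])
              for i in range(0, latents.shape[0], chunk_size)]
    return torch.cat(frames, dim=0)

# e.g. the 8GB setting decodes 2 frames at a time:
# images = batched_vae_decode(vae.decode, all_latents, chunk_size=2)
```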
Evaluation of the Results
This is by no means a perfect workflow: the hands, collar, tie, buttons, and background all have issues to be fixed. I am releasing this so that the community with lower VRAM can have fun and see how far they can take the concept.
Models Needed
- SD1.5 LCM: https://civitai.com/models/81458?modelVersionId=256668
- AnimateLCM_sd15_t2v.ckpt (https://huggingface.co/wangfuyun/AnimateLCM)
- Install using Manager:
  - depth_anything_v2_vitl.pth
  - control_v11f1p_sd15_depth_fp16.safetensors
Custom Nodes Needed
Install missing custom nodes using manager.
- ComfyUI's ControlNet Auxiliary Preprocessors
- ComfyUI Frame Interpolation
- ComfyUI-Advanced-ControlNet
- AnimateDiff Evolved
- ComfyUI-VideoHelperSuite
- rgthree's ComfyUI Nodes
- KJNodes for ComfyUI
- Crystools
Description:
Initial Release
Name: Experimental8GBVRAMTikTok_v10.zip
Size (KB): 7
Type: Archive
Pickle Scan Result: Success
Pickle Scan Message: No Pickle imports
Virus Scan Result: Success