
!Disclaimer! lol - This may end up being frustrating. This requires a bit of patience and understanding to get working well. But, on the plus side, there's tons of techniques and unique node usage to be borrowed and harvested and if you getting working as well as I have, it's really neat lol.
This is to be considered experimental as it relies on consistent output from LLMs and this one relies on multiple LLM automation sections. This requires larger LLMs than typically used. These types of workflows and nodes that use the LLMs also rely on detailed profiles designed for tasks and detailed prompt instructions for the LLM nodes as well as examples. Any LLM node is a your mileage may vary type of node. Base on LLM used, Prompt, Profile used, and etc.
So this one is touch and go and very depend on your LLM. It takes two images, one as the story teller, the other as the reference for the story. It's a work in progress but fun so far. The Prompt Schedule is the touch and go part. Darwin is pretty good about formatting it correctly, but not always. Your LLM mileage may vary lol.
Notable things in this workflow.
You can use a video instead of an image, I was testing any node and used it to duplicate an image 19 times and trick wav2lip into thinking it was a video lol. You can actually use a video, or use the single image with any node.
Number of frames is determined by the length of the generated audio, I have it sent to the batch to int node, and then use the resulting int as the size of the batch.
Concatenate image node - Decided to see if video would work, It did, which was awesome lol. Currently requires both videos have the same number of frames. Working on a solution so I can either interpolate the second video or reduce it's frame rate.
Prompt Schedule - So, getting Darwin to mostly do this correctly was a bit of a challenge, it still doesn't get the concept of extending past 300 frames in a logical manner and remembering not to add a period or comma at the end of the final prompt. But it does well enough for now lol. Better models will likely give you less issues.
Geeky Ghost LCM is a custom 1.5 model of mine merged with Photon LCM, any LCM model works, choose one you like or merge your own lol.
Darwin (The model I use) has vision, so you'll need a Vision model for image descriptions. Not all have vision. Llava and some experimental Llama3 models out there for the most part. A few others, but those mainly.
描述:
训练词语:
名称: geekyGhostStory_v10.zip
大小 (KB): 8
类型: Archive
Pickle 扫描结果: Success
Pickle 扫描信息: No Pickle imports
病毒扫描结果: Success