Workflow to generate image descriptions on Apple Silicon Mac, version v1.0 (ID: 1202130)

About

This workflow uses multiple image-to-text tools and an LLM to produce the final image descriptions for a batch of images in a folder, and writes each description to a corresponding .txt file.

This is especially helpful when captioning or describing NSFW images for LoRA training or fine-tuning, hence the choice of the following image-to-text tools:

Florence2, WD1.4 tagger
JoyCaption alpha 2
huihui-ai/Qwen2-VL-7B-Instruct-abliterated

The final composition of the image description is done by an LLM via the Ollama node, which I would say is one of the easiest ways to use a local LLM.

You will get amazing results with a large uncensored model such as huihui-ai/Llama-3.3-70B-Instruct-abliterated.

(Abliterated/uncensored models should be used for both Qwen2-VL and the LLM to achieve the best results, NSFW or not.)
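
As a rough illustration of the final composition step, the sketch below merges the outputs of the image-to-text tools into one prompt and sends it to a local Ollama server. This is not the code of the ComfyUI Ollama node: the helper names `build_prompt` and `describe`, the prompt layout, and the model tag you pass in are all assumptions made for illustration. It also assumes Ollama is listening on its default port 11434 and exposes its `/api/generate` endpoint.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed setup).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(captions: dict[str, str], instruction: str) -> str:
    """Combine the per-tool captions into a single prompt for the LLM.

    `captions` maps a tool name (e.g. "Florence2") to that tool's output.
    The bracketed layout is an illustrative choice, not the workflow's own.
    """
    parts = [instruction.strip(), ""]
    for tool, text in captions.items():
        parts.append(f"[{tool}]")
        parts.append(text.strip())
        parts.append("")
    return "\n".join(parts).rstrip() + "\n"


def describe(captions: dict[str, str], instruction: str, model: str) -> str:
    """Ask a local Ollama model to write one description from all captions."""
    payload = json.dumps(
        {
            "model": model,  # whatever model tag you have pulled into Ollama
            "prompt": build_prompt(captions, instruction),
            "stream": False,  # return one JSON object instead of a stream
        }
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"].strip()
```

Calling `describe(captions, instruction, model)` with the tag of a model already pulled into Ollama returns the composed description as a string. Only `build_prompt` runs without a server.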

Installation

Install the missing nodes using ComfyUI Manager, except for the ComfyUI_Qwen2-VL-Instruct and Comfyui_JC2 nodes.

ComfyUI_Qwen2-VL-Instruct

This workflow requires the Qwen2-VL-Instruct node from this fork:

https://github.com/edwios/ComfyUI_Qwen2-VL-Instruct

This fork makes two major changes: it accepts image input in the same way as the other VLM tools, and it uses the Mac GPU (mps) with Python 3.12 and PyTorch up to and including 2.6.
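
To check whether PyTorch can actually see the Mac GPU before running the workflow, a small sketch like the following may help. `pick_device` is a hypothetical helper and not part of the fork. It relies only on PyTorch's `torch.backends.mps.is_available()` and `torch.cuda.is_available()` calls, and falls back to the CPU when PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "mps" when the Apple GPU backend is usable, else "cuda" or "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    if torch.backends.mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(pick_device())
```

On an Apple Silicon Mac with a suitable PyTorch build this should report the mps device. Any other result suggests the nodes would run on a different device.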

Comfyui_JC2

You may also want to use this ComfyUI_JC2 fork to utilise the Mac GPU for JoyCaption Alpha 2.

How to use

Everything you need to interact with this workflow is on the far left.

The simplest way to start is to enter the path to the directory containing the images. Each result is written to the same directory, under the same name as its image but with the .txt extension.

Optionally, you can do the following:
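
The naming rule can be sketched as follows: each caption file sits next to its image and differs only in the extension. The helper names and the set of image extensions are assumptions for illustration and are not taken from the workflow itself.

```python
from pathlib import Path

# Assumed set of image extensions; the workflow's own filter may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def caption_path(image_path: Path) -> Path:
    """Return the .txt path that sits next to the given image."""
    return image_path.with_suffix(".txt")


def list_images(folder: Path) -> list[Path]:
    """List the image files directly inside `folder`, sorted by path."""
    return sorted(
        p
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

For example, `caption_path(Path("set/img_001.png"))` gives `set/img_001.txt`, the file a LoRA trainer would typically expect to find beside the image.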

Change the VLM prompt so that Qwen2-VL focuses on a specific aspect of the image or images.
Change the LLM prompt, for example to get better reasoning or to have it write the descriptions in a SFW way (use at least a 70B instruct model for this). [No, this is NOT the same as using a 'safe' model.]

Credits

Credit goes to everyone who contributed to making ComfyUI and all these nodes available to all of us.

Special thanks to ComfyUI_Qwen2-VL-Instruct, ComfyUI_JC2, ComfyUI-WD14-Tagger, ComfyUI-Ollama, ComfyUI-Florence2 and Ollama for making these amazing machine learning models available on mps, or at least for not forcing them into an Nvidia-only solution.

Description:

V1.0

Trained words:

Name: workflowToGenerateImage_v10.zip

Size (KB): 1644

Type: Archive

Pickle scan result: Success

Pickle scan message: No Pickle imports

Virus scan result: Success

Original link: https://1111down.com/1174654.html (please credit the source when reposting)
