
I took the GPT-4 captioned dataset made by steffangund (https://civitai.com/models/281464?modelVersionId=322747) and duplicated the entire zip file. I then re-tagged every photo in the duplicate set using WD14 tagging conventions and trained the LoRA on both sets, so the same images are described both with short one- and two-word descriptors and with more verbose phrases. (I also added a number of screen grabs from the movies Teen Spirit and The Neon Demon.)
The result, hopefully, is a more responsive model that follows your txt-to-img prompt more closely, however you phrase it (so long as it's in English).
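For anyone who wants to try the same two-caption setup, here is a rough sketch of how the two sets could be merged into one training folder. It is not the exact script used for this model. It assumes each set is a folder of images with a same-named `.txt` caption file next to each image; the folder names, the `_verbose` / `_tags` suffixes, and the `merge_caption_sets` helper are all made up for illustration.

```python
from pathlib import Path
import shutil

# Image types to pick up; this list is an assumption, adjust to your data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def merge_caption_sets(verbose_dir, tags_dir, out_dir):
    """Copy every captioned image from both sets into out_dir.

    Each image ends up in out_dir twice: once with its verbose caption
    and once with its tag-style caption. Returns the number of
    image/caption pairs written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = 0
    for style, src in (("verbose", Path(verbose_dir)), ("tags", Path(tags_dir))):
        for image in sorted(src.iterdir()):
            if image.suffix.lower() not in IMAGE_EXTS:
                continue
            caption = image.with_suffix(".txt")
            if not caption.exists():
                continue  # skip images that have no caption file
            # Suffix the name so the two copies do not overwrite each other.
            stem = f"{image.stem}_{style}"
            shutil.copy2(image, out / f"{stem}{image.suffix}")
            shutil.copy2(caption, out / f"{stem}.txt")
            written += 1
    return written
```

Calling `merge_caption_sets("gpt4_set", "wd14_set", "train")` (hypothetical folder names) would give a single folder where every image appears once per caption style, which is roughly what training on both sets amounts to.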
Description:
Trigger words: elfa
Name: 441651_training_data.zip
Size (KB): 59046
Type: Training Data
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success
Name: Elle_Fanning.safetensors
Size (KB): 41698
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success