![Carry Key [Flux/XL/1.5] (r557), version Carry Key v1.1 (ID: 578041)](https://image.1111down.com/xG1nkqKTMzGDvpLrqFT7WA/5e7ad157-7b3c-4a3a-b975-fa5e636d5cfe/width=450/16217307.jpeg)
Flux:
A Flux dev version of the SDXL images. Please check the version description for more info, and I've included the preview wildcards and training kohya-config as well.
The captioning and multi-concept training techniques used for the SDXL model did not work at all with Flux: it takes the concepts and bleeds them together miserably into a single Frankenstein cosplay. To combat this, I took advantage of Flux's love of long prompts and descriptive details.
Overview:
- Model consists of five of her cosplay sets: Aerith, Belle, Gwen, Nami, and Zelda.
- Masked training was not used, and it did not seem needed.
- Regularization was not used, as it seemed to require significantly more training.
- LoRA trained using the sd3-flux.1 branch of the Kohya_ss scripts.
- LoRA trained on Flux-Dev2Pro for use with Flux.1-Dev (fp8/fp16).
The training was tested using the MGAS comparison method to find the best epoch. The introduction article can be found here.
Usage:
- Intended for use with Flux.1-Dev. Performs OK on other realistic-focused models.
- Size: 1024,1024, Steps: 20, Sampler: Euler-Simple, CFG/Guidance scale: 1.0/3.0, Strength: 1.0 (Deis-Beta and UniPc/Dpm_2/Dpm++2M-SGM_Uniform work OK too)
- [Optional] Adetailer. Was NOT used for preview images.
- All cosplays are tied together by the common tag, {cae7skey}, which consists of two rare tokens on either side of a number.
- Add the following tags in your prompts for the individual cosplays. Flux prefers more detailed prompts using natural language, so please see the preview images for examples of how to incorporate the tags this way:
  - Aerith - {Aerith_\(Final Fantasy\) cosplay, light brown hair, sidelocks, single long braid, cropped red denim jacket, long pink dress, violet undershirt}
  - Belle - {Belle_\(Beauty and the Beast\) cosplay, dark brown hair, brown eyes, ponytail, blue ribbon, white blouse, blue dress, white apron}
  - Gwen - {Gwen_\(Spider-Gwen\) cosplay, blue eyes, blonde hair, short asymmetric cut, eyebrow stud piercing, white shirt, navy tie, black cardigan, blue plaid skirt}
  - Nami - {Nami_\(One Piece\) cosplay, short hair, vibrant orange hair, brown eyes, sheer pink shawl, blue vest, traditional white dress, blue floral designs}
  - Zelda - {Zelda_\(Legend of Zelda\) cosplay, glittering makeup, blue eyes, long hair, blonde hair, crown braids, white fur-trimmed cloak, gold emblem, blue ribbon}
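The tag lists above can be stitched into a natural-language prompt programmatically. Here is a minimal sketch, assuming the curly braces in the description are only delimiters and not part of the prompt itself. The `build_prompt` helper and its sentence template are hypothetical and not taken from the preview images, so treat the wording as a starting point only.

```python
# Common tag and per-cosplay tags copied from the Flux usage list.
# Assumption: the surrounding braces are delimiters, not prompt text.
COMMON_TAG = "cae7skey"

COSPLAY_TAGS = {
    "Aerith": [r"Aerith_\(Final Fantasy\) cosplay", "light brown hair", "sidelocks",
               "single long braid", "cropped red denim jacket", "long pink dress",
               "violet undershirt"],
    "Belle": [r"Belle_\(Beauty and the Beast\) cosplay", "dark brown hair",
              "brown eyes", "ponytail", "blue ribbon", "white blouse",
              "blue dress", "white apron"],
    "Gwen": [r"Gwen_\(Spider-Gwen\) cosplay", "blue eyes", "blonde hair",
             "short asymmetric cut", "eyebrow stud piercing", "white shirt",
             "navy tie", "black cardigan", "blue plaid skirt"],
    "Nami": [r"Nami_\(One Piece\) cosplay", "short hair", "vibrant orange hair",
             "brown eyes", "sheer pink shawl", "blue vest",
             "traditional white dress", "blue floral designs"],
    "Zelda": [r"Zelda_\(Legend of Zelda\) cosplay", "glittering makeup", "blue eyes",
              "long hair", "blonde hair", "crown braids",
              "white fur-trimmed cloak", "gold emblem", "blue ribbon"],
}


def build_prompt(cosplay: str, scene: str) -> str:
    """Hypothetical helper: combine the common tag, one cosplay's tags,
    and a free-form scene description into a single sentence."""
    tags = COSPLAY_TAGS[cosplay]
    head, details = tags[0], tags[1:]
    return f"A photo of {COMMON_TAG} in {head}, with {', '.join(details)}, {scene}."


if __name__ == "__main__":
    print(build_prompt("Belle", "standing in a sunlit library"))
```

Swapping the template for longer, more descriptive sentences is probably worthwhile, since the notes above say Flux responds better to detailed natural language than to bare tag lists.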
Future Outlook:
The goal of these three versions was to learn how to train and caption locally (✔️), learn how to effectively do multi-concept training for cosplays (✔️), and eventually create a cosplay-focused nobody (non-celeb) checkpoint/LoRA that can be used on the Civitai generator (⭕).
==========================================================
SDXL:
A significantly better SDXL LoRA of my favorite cosplay girl. Feel free to use it with credit given to ya boi.
Overview:
- Model consists of five of her cosplay sets: Aerith, Belle, Gwen, Nami, and Zelda.
- Masked training was used for more flexible backgrounds.
- Regularization set was captioned with GPT-4o for greater control of details.
- Original Dreambooth trained using Constant AdaFactor with Min_SNR_Gamma loss.
Extensively tested using the MGAS comparison method (here) during both training and image generation. An example regularization prompt can be found in the VDC article (here).
Usage:
- Trained on, and works best with, RealVisXL V4.0, but does work well enough with the most popular realistic XL checkpoints.
- Size: 1024,1024, Steps: 30, Sampler: DPM++2S-a Karras, CFG scale: 7, Strength: 1.0
- [Optional] Adetailer always helps, with denoise around 0.4
- All cosplays are tied together by the common tag, {cae7skey}, which consists of two rare tokens on either side of a number.
- Add the following tags in your prompts (see preview images for examples) for the individual cosplays:
  - Aerith - {asar4acos, long brown hair, blue eyes, cropped red jacket, pink dress, black necklace with a flower pendant}
  - Belle - {bnha5icos, brown hair, ponytail, blue bow, brown eyes, white blouse, long blue skirt, white apron}
  - Gwen - {gera6cpec, blonde hair, asymmetric hair, blue eyes, eyebrow piercing, white collared shirt, black sweater with blue tie, blue plaid skirt}
  - Nami - {nlwx8cpac, orange hair, short hair, brown eyes, blue and white top, pink shawl, long white skirt, gold necklace with triangular designs, blue upper arm tattoo, gold armlets}
  - Zelda - {zawa9clos, blonde hair, crown braid, pointy ears, blue eyes, white fur coat, white outfit, gold embroidery, golden buckles, brown leather gloves}
==========================================================
SD1.5:
First-and-a-half attempt at making a SD1.5 LoRA using my favorite cosplay girl, Carry Key.
Use it as much as you want, so long as you give ya boi reaper (lowercase 'r') a shout-out, and feel free to share/merge as well. <|:D
¡Disclaimer!
- The original dataset has been significantly pruned (96 --> 64) in order to meet Civitai's guidelines. The dataset is now 3:1 face shots (with sweaters, jackets, and collared shirts abounding) against the remaining medium close shots. The model was retrained on this.
- Also, due to my process for generating the dataset images, textures can end up being messed up; skin textures, in particular, can look a little "grainy" when looked at closely, and upscaling is advised to mitigate this (see also recipe below).
- Too many dataset images had blue or green eyes, partly because she wears colored contacts in various photoshoots. Some bright eye colors can come out "fried", or oversaturated. To mitigate this, try weighted prompting, e.g. (blue eyes:0.5); a synonym that wasn't trained on, e.g. azure eyes; or appending the word 'contacts', e.g. blue eye contacts.
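The three eye-color workarounds in the last bullet can be generated side by side for quick A/B testing. This is a minimal sketch: `eye_color_variants` is a hypothetical helper, and the `(text:weight)` form simply mirrors the weighted-prompt example given above.

```python
from typing import List, Optional


def eye_color_variants(color: str, weight: float = 0.5,
                       synonym: Optional[str] = None) -> List[str]:
    """Hypothetical helper: return prompt fragments for the three
    mitigation ideas (down-weighting, untrained synonym, 'contacts')."""
    variants = [f"({color} eyes:{weight})"]    # weighted prompting
    if synonym:
        variants.append(f"{synonym} eyes")     # synonym that wasn't trained on
    variants.append(f"{color} eye contacts")   # append the word 'contacts'
    return variants


if __name__ == "__main__":
    for fragment in eye_color_variants("blue", synonym="azure"):
        print(fragment)
```

Each fragment is meant to replace the plain eye-color tag in an otherwise identical prompt, so the results can be compared on the same seed.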
Preview Resources:
Preview images used this LoRA (strength: 0.15-0.25) and this embedding:
-- the LoRA also helps with the texture issue mentioned above
-- the embed helps with hands, but tends to mess up the face
-- this new hand embed has minimal effect on the overall style and faces
DB Model Training:
> The full Dreambooth model was trained using the excellent YouTube tutorial, and OneTrainer parameter config file, of our resident Maestro, SECourses: here
> I am in generative art for the long haul, so any positive criticism is welcome; negative criticism, couched in some funny snark, quip, or remark, is welcome too. /('.').7
> The ComfyUI workflows should be embedded in each PNG, via the SD Prompt Reader node, for the easiest of copy-pasta shenanigans. Generated images were alright, but each preview image has been upscaled (see recipe below). (>'.')>
Upscale Recipe:
Here is the rough recipe to replicate the preview images, if not using my workflows:
- LoRA extracted from a Dreambooth model trained on HyperRealism_v3; use this checkpoint for best results. RealisticVision and StockPhoto work too, mostly.
- First Pass Generation: {Size: 768,768, Steps: 25, Sampler: DPM++2M Karras, CFG scale: 7}
  -- Strength: 0.7 - 1.0; be careful about over-frying the image when >1
- Second Pass Fix: Must use some kind of High-Res fix in order to mitigate the aforementioned texture issue. Here is what I did:
  (i) SD Ultimate Upscaler {x2.0, 10-15 steps, 8-10 CFG, DPM++2M Karras, 0.3 denoise}
  -- Usually need to use a half-tile seam-fix method at 0.3-0.5 denoise
  (ii) SUPIR Upscaler {x1.0, 10 steps, 1.5 cfg scale start; 2.0 end, 5 churn, 1 eta, 0.9 control scale start; 0.95 end, 5-10 restore CFG, RestoreDPMPP2M sampler}
  -- Use SDXL_lightning model with 10 steps, normal SDXL with 40 steps
  -- The first upscale pass, (i), is usually good enough on its own, unless you are trying to show off for preview images
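For a quick sanity check on output resolution, the scale factors in the recipe above can be chained. This is a minimal sketch: the `PASSES` list and `final_size` function are hypothetical names, and only the x2.0 and x1.0 factors and the 768,768 starting size come from the recipe.

```python
# (pass name, scale factor) pairs taken from the second-pass recipe.
PASSES = [
    ("SD Ultimate Upscaler", 2.0),
    ("SUPIR Upscaler", 1.0),  # x1.0: refines detail without changing size
]


def final_size(width: int, height: int, passes=PASSES):
    """Apply each pass's scale factor in order and return (width, height)."""
    for _name, scale in passes:
        width, height = round(width * scale), round(height * scale)
    return width, height


if __name__ == "__main__":
    print(final_size(768, 768))
```

Since the SUPIR pass runs at x1.0, skipping it leaves the resolution unchanged, which fits the note that pass (i) is usually good enough on its own.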
Future Outlooks:
I worked hard on this one, and had a lot of fun too, but it is far from perfect and far from final. For now, this version is primarily for her face.
For one, it doesn't handle her various character cosplays very well. You can try to prompt for Nami, Nico Robin, Tifa Lockhart, Aerith Gainsborough, Gwen Stacy, Princess Zelda, and more, but I had mixed results with this, at best.
The only one that seemed to work very well was Jynx \(League of Legends\). In future versions I will work diligently to figure out how to more reliably prompt for her cosplays.
<(-_o).\,,/
==========================================================
¡Legal & Risk!
It is prohibited to use this model for any commercial purposes, any illegal scenarios/activities, or in any way which violates Civitai's terms of service. Please generate responsibly! :D
Description:
Re-release: Version 1.1 - Pruned dataset to meet site guidelines.
Original Dreambooth model trained on 64 images. This dataset was then regularized by a 1:1 balanced set of 5200 ground-truth images of real women, prepared by our resident Maestro, SECourses (see his Patreon). This set was trained for 50 epochs in batches of 1.
- Total trained steps (?): 64 x 2 x 50 / 1 = 6400
- Extracted LoRA Dim/Alpha: 128/64
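The step count in the list above can be reproduced with a short calculation. This is a minimal sketch, assuming the factor of 2 comes from the 1:1 regularization set (one regularization image seen per training image). The `total_steps` function and its `reg_ratio` parameter are hypothetical names, not taken from any trainer's API.

```python
def total_steps(num_images: int, epochs: int, batch_size: int,
                reg_ratio: int = 1) -> int:
    """Estimate optimizer steps for a run with regularization images.

    Assumption: each training image is paired with `reg_ratio`
    regularization images per epoch, so samples per epoch are
    num_images * (1 + reg_ratio).
    """
    samples_per_epoch = num_images * (1 + reg_ratio)
    return samples_per_epoch * epochs // batch_size


if __name__ == "__main__":
    # 64 images, 1:1 regularization, 50 epochs, batch size 1
    print(total_steps(64, epochs=50, batch_size=1))
```

With the values stated in the description (64 images, 50 epochs, batch size 1), this matches the 6400 figure.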
I removed the majority of text watermarks from the images. I tested for the preview images in huge batches of 20, without negatively prompting for text, etc., and rarely encountered one.
Check the preview images for prompting ideas. The 'ohwx' tag is a rare token, which helps with training on established models but makes this hard to mix with other models that used the same token. I recommend using ((ohwx)) as the first tag to get a better face. Please see SECourses' linked tutorial in the description for a more detailed explanation of rare tokens.
Trigger words: ohwx
Name: carryKey_r557_v1-1.safetensors
Size (KB): 194821
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success