
I'm uploading some of my "preliminary test LoCons" on this page. Not all of them, but some of them -- specifically, I'm uploading the ones that I think are fun, interesting, and at least somewhat "user friendly."
Now, if you plan on trying these LoCons, please realize...
These "preliminary test LoCons" aren't really "betas."
They aren't even "alphas."
They are not in any specific stage of development -- and in some sense, they are complete/completed, because they are exactly what they need to be.
Each of these LoCons is cooked with 100% of a "candidate dataset," after all duplicates and "near-duplicates" have been removed with DupeGuru at maximum strength.
There is no tagging, other than tagging each image with a universal trigger word (because my end goal is always to train LoCons with Text Encoder training).
Testing these "preliminary LoCons" reveals A LOT about how difficult it will be to make a reasonably flexible LoCon that unlocks the true potential of the dataset... AND it reveals if the goals of the LoCon are even POSSIBLE!
For example, you could take a look at my Hellsing Ultimate LoCon -- it's one that I put quite a lot of work into, and I put that work into it BECAUSE the preliminary test LoCon revealed that 1) the basic style goals were achieved, and therefore the dataset itself was/is appropriate, and 2) "overfit" and "underfit" features seemed reasonably addressable through careful tagging, repeats, training settings, etc.
On the other hand, I've made a preliminary test model for "Mononoke" (2007), the spinoff of "Samurai Horror Tales" with its interesting ukiyo-e art style. That preliminary test model revealed that 1) my basic training settings caused many features to become overfit within four epochs, and 2) the "cracking paint" style didn't even come through. So I know that Mononoke is going to be a style LoCon that requires me to tag things very carefully, test different training settings, and so forth.
So, again, I am uploading the ones that seem somewhat useful, interesting, and stylistically accurate. But they are what they are -- no more, no less, and no different! No refunds!
(If you do want some LoRAs/LoCons that have had some more thought/effort put into them, maybe check out https://civitai.com/user/SweetRammaJamma!)
___
SOME INFORMATION ABOUT MY TESTING, AND ABOUT THE SAMPLE IMAGES:
My default test checkpoint for Pony is ColorburstXL. Unless otherwise indicated, all of the sample images are generated using ColorburstXL. Using other checkpoints might alter the style to your tastes -- for example, I find Reponygine to be quite pleasant while keeping the basic style of each LoCon.
I'm uploading image generation data with the sample images, so please take note of the presence OR ABSENCE of tags like "score_9," "source_anime," etc. -- some of the LoCons work just fine WITHOUT those tags, or with those tags weighted (down:0.8).
When I DO test with score/source tags (which is most of the time), it's almost always "primarytriggerword, score_9, score_8_up, score_7_up, score_6_up, source_anime"
Some of these preliminary test models are surprisingly "no fuss"... But some are a little bit more overfit, unpredictable, or just plain "squirrely," and require more careful/specific prompting to get decent results (especially in the negative prompt). So, please look in each LoCon's section below for any helpful prompting tips (and if you run into any issues/solutions, please let me know in the comments -- if nothing else, your comments will help me troubleshoot things for future LoCons).
Note that my testing uses wildcards, and many of my prompts are total nonsense. My negative prompts often contain some nonsense, too. So don't assume that my prompts are wise/ideal -- use common sense, and refer to the checkpoint makers for general best practices.
___
General note: Use "solo" in the positive prompt unless you want to risk main characters popping up all over the place. As far as I know, this isn't easily avoidable when making "trigger word only" LoRAs/LoCons of this sort. But "solo" seems to work perfectly fine.
___
AND NOW, COMMENTS ON SOME SPECIFIC LOCONS:
___
#1: Space Adventure Cobra (1982) -- LoCon for Pony, added May 6th, 2024
Trigger word is SPZCZB (which is SPAce COBra, with each vowel replaced by a Z)
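The trigger-word scheme used throughout this page can be sketched in a few lines of Python. This is only an illustration: the function name is made up, the six-letter abbreviation (like SPACOB) is still picked by hand, and treating Y as a vowel is an assumption based on MOBPSY becoming MZBPSZ.

```python
def make_trigger_word(abbreviation: str) -> str:
    """Replace every vowel in a hand-picked abbreviation with 'Z'.

    Y is treated as a vowel here (assumption, based on MOBPSY -> MZBPSZ).
    """
    vowels = set("AEIOUY")
    return "".join("Z" if ch in vowels else ch for ch in abbreviation.upper())


print(make_trigger_word("SPACOB"))  # SPZCZB
print(make_trigger_word("MOBPSY"))  # MZBPSZ
```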
In terms of "user friendliness," I'd rate it a solid B.
Cobra (the male main character), plus a major female character, have red/orange clothing, and this can bleed into some of the generations. "Orange shirt," "red shirt," "compression shirt" (etc) seem sufficient to address the clothing bleed.
Generations tend to include forward-swept hair, too. I'm not sure if that can be avoided through negative prompting. But it doesn't bug me because both Cobra and a major female character have that sort of hair. I'm not even sure how I'd tag that for the full/final Space Cobra LoCon... but at least it's "authentic."
I couldn't recreate the gold "skeleton robot" villain (Crystal Boy) using just basic tagging... but the output was still a helluva lot cooler than without the LoCon, so I'm pretty happy with that.
Be prepared for a "vacuum-sealed boobs" look. That's accurate to the source material. Feature, not a bug! Have you not seen Space Adventure Cobra?! You gotta at least watch the movie!
Overall, it should be relatively easy to make a more complete and flexible Space Adventure Cobra LoCon.
___
#2: Mob Psycho -- LoCon for Pony, added May 6th, 2024
Trigger word is MZBPSZ (which is MOB PSYcho, with each vowel replaced by a Z)
This one turned out to be quite cooperative/user friendly. I don't have specific negative prompting advice, because I didn't run into any headaches! But YMMV.
Omitting Pony's score/source tags works well (at least with ColorburstXL). Check the sample images.
___
#3: Berserk (TV, 1997) -- LoCon for Pony, added May 7th, 2024
Trigger word is BZRSTV (which is BERSerk TV, with each vowel replaced by a Z)
No real surprises, here. The sample images will show you what you can expect to get when prompting ordinary humans. Casca's dark skin will often show up when prompting "1girls," but if you want to avoid that entirely, "dark skin" in the negative prompt is sufficient.
I didn't try it, but you can probably pretty easily avoid any muscular females by putting stuff like "muscular" in the negative prompt.
Prepare for some of Griffith's features to sometimes show through when you're prompting 1girls. Because... you know... Griffith looks like "a 1girl." Even WDtagger agrees (sometimes)! Same with Judeau (though to less of an extent than with Griffith).
I haven't yet tried generating any monsters/demons, but there might be some interesting results, there. All of the monsters/demons are in the dataset, after all! (But getting specific likenesses of the members of God Hand is probably impossible. That's the kind of thing that would have to wait for a final version)
___
#4: Mobile Suit Gundam (1979) -- LoCon for Pony, added May 7th, 2024
Trigger word is GZNDTV (which is GUNDam TV, with each vowel replaced by a Z)
With this LoCon at 1.0 strength, your initial images are going to be blurry. Much like with my Guyver LoCon, this is a function of the data that's available for the kinds of classic anime that aired at 480i resolution on CRT televisions! At some point, I'm going to investigate doing my own upscaling -- there may be ways to improve quality while staying faithful to the source material.
During image generation, my own way of dealing with this is just upscaling, upscaling, upscaling, along with facedetailing. Even just a Lanczos resize of 1.33x and a resampling at 5 steps/0.20 denoise can de-blur things quite a bit, while keeping things maximally "authentic."
You may have luck trying "RealESRGAN_x4Plus Anime 6B" or a similar "model upscaler." I don't generally play with those, because they aren't really tests of the LoCon itself.
Another thing you could do differently from me is lowering the LoCon's strength (duh!), putting the trigger word at the end of your prompt and lowering (the strength:0.1), or trying negative embeddings. (Or you could even try sharpening things in Photoshop. I did not do any sort of post-processing to the sample images, though -- the sample images are extremely upscaled, but otherwise "raw")
Also, maybe try testing it with something other than ColorburstXL! :)
Other than the blurriness, the sample images are reasonable representations of what you'll find during image generation. You may want to put things like "helmet" and "military uniform" in the negative prompt unless you're specifically trying to generate those.
___
#5: Heavy Metal L-Gaim (1986) -- LoCon for Pony, added May 8th, 2024
Trigger word is HMLGZZ (which is Heavy Metal L-GAIm, with each vowel replaced by a Z)
Similar to the Gundam 1979 LoCon, this LoCon suffers from the source material being blurry... except it's probably even worse. Because Gundam is an immensely profitable, internationally-known series, and pretty much nobody cares about Heavy Metal L-Gaim unless they want to check out "what that Five Star Stories guy (Mamoru Nagano) was working on before he started writing Five Star Stories." And basically, I don't think anyone has bothered to preserve and/or remaster Heavy Metal L-Gaim.
Which is a shame. Because it's as good as any 1980s television anime, and Yoshiyuki Tomino worked on it, and yet it's refreshingly Not Gundam(TM).
Now, are YOU going to have fun with this LoCon? Get some laughs out of it? Find it interesting? I have no frickin' idea. The LoCon does produce characters who look like they're from Heavy Metal L-Gaim... but that means that some dudes look like girls, and some chicks look like boys, and... THE 1980S WERE A WEIRD TIME, AND IT'S NOT MY FAULT THAT HEAVY METAL L-GAIM PUT THE WEIRDNESS OF THE 1980S INTO A SCI-FI ANIME, OKAY?!
On your end, if you want to deal with image blurriness, you'll have to attempt the same basic suggestions that I made for the Gundam 1979 LoCon.
On my end, this preliminary test LoCon confirms for the Nth time that yeah, "SDXL" and "low-resolution source material" don't mix too well. But I like Heavy Metal L-Gaim enough that I might try doing some pre-processing on the dataset in the future.
UPDATE, May 11th, 2024: I finally found a model upscaler that seems to work better than Lanczos resizing, while still being fairly faithful to the classic anime appearance. At some point, I will probably try retraining the Heavy Metal L-Gaim LoCon with upscaled images... but maybe it's not necessary, if the generated images can just be upscaled with the same model upscaler! Feel free to test: 2x AniScale-2-ESRGAN - OpenModelDB
___
#6: Armored Trooper VOTOMS (1983) -- LoCon for Pony, added May 9th, 2024
Trigger word is VZTZMS (which is VOTOMS, with each vowel replaced by a Z)
The LoCon does achieve the basic style of the 1983 VOTOMS television show. I'm sure it's not to the average person's tastes, but it could be good for a laugh, if you're a VOTOMS fan.
Also, for laughs, I copied a Hatsune Miku prompt from the AAM XL checkpoint page, and... Wow, the various "quality" words (like specifically, the "highres, 4k, 8k, intricate detail, cinematic lighting, amazing quality, amazing shading, soft lighting, Detailed Illustration, anime style, wallpaper" part) did seem to take the "bite" out of the VOTOMS style. Which could be a good thing or a bad thing, depending on your goals. And it might be worth trying on the Gundam 1979 and Heavy Metal L-Gaim LoCons. I was wondering how to make these LoCons more versatile, but I guess it's not as hard as I assumed!
(Please note: I have no idea how Pony actually interprets tags like "amazing quality." I simply copied another prompt wholesale, plugged it in, and didn't test any specifics)
___
#7: Cowboy Bebop (1998) -- LoCon for Pony, added May 10th, 2024
Trigger word is CZWBZP (which is COWboy BeBOP, or "COWBOP," with each vowel replaced by a Z)
It produces nice-looking pictures. Kind of refreshing after testing the LoCons for old (blurry) mecha anime.
Based on my testing: if you leave any "blanks" (like if you don't add a specific hair style), those blanks will get filled in by features of Spike or Faye. But, negative prompting is not always necessary -- if you look at the sample images, "twintails" was enough to get a picture of a lanky dude without Spike's afro.
That said, I would recommend having "suit, formal, necktie, yellow shirt" in your negative prompt, unless you're trying to get those features.
Overall, the LoCon is a little bit "squirrely," but not too bad. If you find the results a bit too stubborn/uncontrollable, then you could try lowering the strength of (the trigger word:0.5) or the strength of <the entire LoCon:0.9> -- all of my sample images were generated with the LoCon at 1.0 strength.
___
#8: Five Star Stories (1989) -- LoCon for Pony, added May 11th, 2024
Trigger word is FZVSTZ (which is FIVe STAr, with each vowel replaced by a Z)
Image quality is good, and the unique style shines through.
"1girls" default to looking like the Fatima... and Ladios Sopp. "Braided ponytail" in negatives will help you if Sopp shows up. "Forehead jewel" does seem to get rid of the Fatima forehead jewels. When trying to generate black hair, I had to add a couple of other colors ("green hair" and "black hair") to the negative prompt to get actual black hair. But these things were all manageable, and they were only required in some situations.
"Shoulder pads" and/or "shoulder armor" did not reliably eliminate the weird... shoulder thingies... that the Fatima wear. But specifying other clothing (like "white t-shirt") seemed sufficient to eliminate the weird... shoulder thingies.
"1boys" tend to look like "that other Mortar Headd pilot." Which is fine with me. Because Sopp looks like "a 1girl."
___
#9: Cutie Honey (1973) -- LoCon for Pony, added May 21st, 2024
Trigger word is CZTHZN (which is CUTie HONey, with each vowel replaced by a Z).
This one is pretty boring.
Prepare to see Honey everywhere.
Prepare for boring and repetitive backgrounds.
Yeah, this LoCon is "faithful." Faithfully boring! Even for people (like me) who like anime from this era... this LoCon, in its current form, simply isn't "versatile"!
BUT...
This is the first time that I upscaled low-resolution images to get them up to an SDXL-friendly resolution, and that turned out pretty well! So in that sense, this Cutie Honey LoCon is a refreshing kind of success.
I will admit that I did not try cooking the Cutie Honey LoCon with original "non-upscaled" images. So, even though I'm happy with the results, the results are not all that scientific. So, I'll probably be repeating this experiment with Heavy Metal L-Gaim, because I've already cooked a preliminary test LoCon for Heavy Metal L-Gaim using the "non-upscaled" images.
This is the upscaler I used ("2x_AniScale2_ESRGAN_i16_110K.pth"): https://openmodeldb.info/models/2x-AniScale-2-ESRGAN
___
#10: Devilman TV (1972) -- LoCon for Pony, added May 21st, 2024
This is one that I baked a while ago. I actually hate the trigger word, but it's not like I'm gonna re-bake it for that.
Trigger word is DZVZRG (which is DEVilman ORiGinal, with each vowel replaced by a Z)
Also, many (but not all) of the images with Devilman were tagged with MCDevil. This turned out to work pretty well, considering how low effort it was -- see sample images.
Because this is an older LoCon of mine, the training settings were a little different, but the changes probably didn't cause any practical difference.
I am tired, at the moment. May have more comments later. The Devilman TV show is pretty funny...!
_________
THE SETTINGS I USE FOR TRAINING THESE PRELIMINARY TEST LOCONS (all based on my 4090 desktop card):
Keep in mind that I'm not necessarily recommending these settings. I've never read a "technical paper" on LLM/StableDiffusion topics, and I'm simply not savvy with computer science. My settings were developed through trial and error, based on my own observations and tastes. So, for example, I couldn't tell you what exactly my Conv values are contributing during actual image generation.
Also keep in mind that for my complete/final LoCons (such as my Hellsing LoCon), I use different settings -- I max out the Dim/Conv at Batch Size 4, which has turned out to be 116 Dim (58 Alpha), 88 Conv (44 Alpha). WHEREAS FOR MY PRELIMINARY TEST LOCONS, I USE:
Derrian_Distro's LoRA Easy Training Scripts https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
"plain/old LoCon" type (not "LoCon per the LyCORIS package"... because "plain LoCon" supports Scale Weight Norms for SDXL, and "LyCORIS LoCon" doesn't -- although this may have changed /been updated?)
Batch Size = 8
Dim = 64, Alpha = 32
Conv = 32, Alpha = 16
Gradient Checkpointing = ON
Scale Weight Norms = 2
Network/Neuron Dropout = 0.33
Weight Decay = 0.5
Min SNR Gamma = 5
Optimizer = PRODIGY
Learning Rate Scheduler = Constant
d_coef = 2
(and of course, "learning_rate = 1.0")
betas = "0.9,0.99"
use_bias_correction = "True"
decouple = "True"
safeguard_warmup = "True"
Training precision = bf16
Save precision = fp16
Pyramid Noise Offset (iterations = 6, discount = 0.3)
Resolution = 1024 (with buckets on, standard 64 pixel increments)
no_half_vae = true
xformers = true
cache_latents_to_disk = true (this is very frigging important for me, cuz I use such large datasets)
... And then, for the preliminary test LoCons, I use up to 10000(!!!) images. If there are more than 10000 images available (!!!!!!!), I'll pluck out 10000 at random. My method for this is pretty crude -- I use Bulk Rename Utility (CTRL+8 to randomize all files in the folder, then prepend numbers to the files with padding=6, then move 10000 of the files, then remove the prepended numbers). There's probably a smarter way to do this. If you want to help me out by providing a Python script to do it, let me know :P
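A minimal Python sketch of that random pick, using only the standard library. The function name, the folder names in the example call, and the list of image extensions are all assumptions for illustration, not part of the actual workflow. It moves the chosen files directly, so no renaming or prepended numbers are needed.

```python
import random
import shutil
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def sample_images(source_dir, dest_dir, count=10000, seed=None):
    """Move up to `count` randomly chosen images from source_dir to dest_dir.

    Returns the number of files actually moved.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Only top-level image files are considered; subfolders are ignored.
    images = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )

    # A seed makes the pick repeatable; leave it as None for a fresh pick.
    rng = random.Random(seed)
    chosen = rng.sample(images, min(count, len(images)))

    for path in chosen:
        shutil.move(str(path), str(dest / path.name))
    return len(chosen)


# Example call (folder names are hypothetical):
# sample_images("candidate_dataset", "training_subset", count=10000)
```

Note that this sketch only moves the image files themselves. If the dataset has caption/tag files sitting next to the images, those would need to be moved separately.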
I cook for 9 epochs. Doing more might "firm things up a bit," but my experience is that the overall style will DEFINITELY come through after just a few epochs, and any overfitting/underfitting will become evident by 9 epochs.
The choice of 9 epochs for these preliminary cooks didn't come out of nowhere -- it's based on the Scale Weight Norms data that I have from cooking A LOT of LoCons (at Scale Weight Norms = 2). And my observation is that, if there are 2000+ images, 9 epochs will result in Keys Scaled = 350+, and Average Key Norm ~1.25. When I'm cooking my final LoCons, I will cook for 20 to 24 epochs (!!!), and Keys Scaled will end up somewhere between ~350 and ~550, and Average Key Norm will end up at about 1.6.
Also, one reason I chose Scale Weight Norms = 2 as my default setting is that, over the 20+ epochs on my final LoCons, the Keys Scaled will sometimes reverse -- as in, maybe the Keys Scaled will get up to 500, but then on the next epoch, the Keys Scaled will go back down to 450. And I noticed from my records that epoch 9 to epoch 10 is the first time that I've ever seen Keys Scaled go back down. Of course, "Keys Scaled" goes up and down EVERY STEP, but the point is, THAT occasional phenomenon of "Keys Scaled POSSIBLY going back DOWN on epoch 10" is why I decided "fuck it, 9 epochs is probably enough."
And 9 epochs has turned out to be good enough for my purposes.
And, an important part is: For SDXL, at resolution=1024, for 10000(!!!) images, the cook time for 9 epochs is under 11 hours. Which means things can run overnight, when the weather is cooler. (If you're renting a 4090 remotely, that would be less of a concern, ha ha ha ha HA HA HA)
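As a rough back-of-the-envelope check on that cook time, here is the arithmetic in Python. It assumes one repeat per image and ignores any extra partial batches that resolution bucketing might add, so the real step count could be slightly higher.

```python
import math

images = 10_000
epochs = 9
batch_size = 8
max_hours = 11

# Steps per epoch, assuming 1 repeat per image and no bucketing overhead.
steps_per_epoch = math.ceil(images / batch_size)

# Total optimizer steps over the whole cook.
total_steps = steps_per_epoch * epochs

# Slowest average step time that still finishes within 11 hours.
max_seconds_per_step = max_hours * 3600 / total_steps

print(steps_per_epoch)                 # 1250
print(total_steps)                     # 11250
print(round(max_seconds_per_step, 2))  # 3.52
```

So "under 11 hours" works out to an average of under about 3.5 seconds per batch-of-8 step under those assumptions.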
Obviously, everything is dependent upon dataset. But I'm at least keeping some variables controlled (training parameters), and keeping changes minimal (i.e. I only change the dataset), and the results/records are all "written down"... so that's science, right? :)
_______
SOME GENERAL INFORMATION ABOUT ALL OF MY LORAS/LOCONS:
To generate the sample images, I don't use outside LoRAs. As in, I only use the checkpoints plus my own LoRA/LoCon. I don't use embeddings, either. So the sample images are very "pure/simple."
I use ComfyUI only, so my image generation methods might be different from what's typical with Automatic1111 or CivitAI's online generator. Notably, I can't inpaint in the STRAIGHTFORWARD/POWERFUL/EASY Automatic1111 way! If things get too weird, I usually just "roll the dice" again! But chances are good that inpainting will be an asset to you, especially for things like getting eyes that are 100% stylistically correct.
99% of my testing is with Euler A, at 25-30 steps, CFG 6-7, with the LoRA/LoCon at strength 1.0, and with the trigger word at the very start of the positive prompt. I test both latent upscaling (1.25x to 1.5x, 0.60-0.65 denoise, ~12 steps) and "regular" upscaling. I've tested face detailers, too. All of it works appropriately, but some typical "light troubleshooting" will be required if you're going for a specific output.
I do upscale my sample images quite a bit. Usually, it's latent upscaling first, then an additional pass with "regular" upscaling at about 1.5x. I usually resize with boring old Lanczos, and not "model upscalers" (such as AnimeSharp4X), because Lanczos resizing ends up more faithful to the original anime's style -- but really, I do whatever works.
In regards to upscaling and YOU: Don't be surprised if something like my Guyver OVA LoCon gives you gritty/grainy initial results. Experimentation is required. And it varies from LoRA to LoRA -- for example, FLCL is a nice-looking and somewhat modern anime, so it doesn't have the same problems as the Guyver OVAs.
Where you put the trigger word does matter, and (the trigger word's weight:1.2) does affect things. So to decrease the "faithfulness/literalness" of the style, put the trigger word all the way at the end, and (decrease its weight:0.5). Of course, you can play with the weight of <the actual LoRA/LoCon:0.8>, but I tried to ensure that things work reasonably at 100% strength. (If my LoRAs/LoCons are totally unusable at 1.0 strength, I simply don't release them! I find another way to re-train them!)
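To make the placement/weight idea concrete, here is a small Python sketch that assembles a positive prompt string. The helper names are hypothetical, and it only builds text using the common (word:weight) emphasis syntax; it doesn't talk to ComfyUI or any other generator.

```python
def weighted(text, weight=1.0):
    """Wrap text as (text:weight); leave it bare when the weight is 1.0."""
    return text if weight == 1.0 else f"({text}:{weight:g})"


def build_prompt(trigger, tags, trigger_weight=1.0, trigger_last=False):
    """Join the trigger word and tags, with the trigger first or last."""
    parts = list(tags)
    token = weighted(trigger, trigger_weight)
    if trigger_last:
        parts.append(token)
    else:
        parts.insert(0, token)
    return ", ".join(parts)


PONY_TAGS = ["score_9", "score_8_up", "score_7_up", "score_6_up", "source_anime"]

# Trigger word first, at full weight (most "faithful/literal" style):
print(build_prompt("CZWBZP", PONY_TAGS + ["solo"]))

# Trigger word last, at reduced weight (looser style):
print(build_prompt("CZWBZP", PONY_TAGS + ["solo"], trigger_weight=0.5, trigger_last=True))
```

The first call prints the trigger word at the very start of the prompt; the second moves it to the end as (CZWBZP:0.5).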
For both Pony and Animagine, I try "simple background, gradient background" in the negative prompt. This does seem to help with the typical boring backgrounds of SDXL anime models, and it might help more than usual in cases where a LoRA/LoCon is based on an anime that has many boring/simple backgrounds (FLCL comes to mind).
In general, my prompts are uncreative. I just use wildcards -- which is partly why, if you look at the image generation data, you'll see some "dead weight" words that don't seem to be doing anything. Also, I'm very lazy, and I don't even bother removing tags like "worst quality, low quality" when switching from AnimagineXL to Pony. So, I gave you my actual prompts in the image generation data... but don't assume that my prompts are worth imitating.
___
GENERAL INFORMATION FOR MY PONY LOCONS:
My Pony testing was all done with pretty standard quality tags in the positive prompt, in the form of "primary trigger word, score_9, score_8_up, score_7_up, score_6_up, source_anime," so I can guarantee that those work, but feel free to experiment!
For checkpoints to combine with this Pony LoCon at 1.0 strength, I recommend starting with the following (which are in the "Recommended Resources" at the bottom of the page):
1) ColorburstXL (Pony), because it produces "faithful style" while still being more consistent and aesthetically pleasing than the base Pony model. Then, maybe try...
2) Luminaverse (Pony) will get you something more "modern" while still being quite faithful to the LoCon.
3) Reponygine (Pony) will get you something quite "modern and pleasant," while still retaining good core style features (like eyes and hair details) and character likenesses.
4) "Prototype" (Pony). I don't quite know how to describe the look yet, but it works well.
5) AnimeConfettiComrade (Pony), IF you want a MORE "literal" interpretation of the LoCon, while still being more consistent than the base Pony model.
You CAN use this Pony LoCon with the base Pony model, but the style might be overpowering, and you may have to lower the LoCon's strength a fair bit below 1.0, and/or move the trigger word to the end of the positive prompt, and/or reduce the strength of (the trigger word:0.1) -- which are all the same ways that you'd adjust the strength with other checkpoints, of course. Adjust to your taste, always :) And, there are many more Pony-based checkpoints that I haven't even tried, so if you find one that works well with this LoCon, please let us all know!
_
Yes, my filenames are long. It's because I personally use LoRA filenames to quickly compare different versions of my LoRAs while testing things. I recommend renaming the file to your tastes! Though it would make sense to leave the primary trigger word in the filename, right? (It's up to you!)
Yes, my Pony LoCons are extraordinarily large. But I made them for myself, and, to quote my genius friend, "it works for me and what I do." If you try these Pony LoCons, and you've made some SDXL style LoRAs/LoCons that you like, I would be happy to discuss training settings with you.
__
No warranty, express or implied.
I place no special restrictions on what you may do with these .safetensors files. Likewise, I am not responsible for anything that you do with them.
Description:
Preliminary test LoCon for Pony.
Trigger words: FZVSTZ
Name: SDXL-FZVSTZ-V6-ND33-dco2-WD05-SWN2-MSG5-8B64R32A32C16A-1024-LOCON-PONY-E9-000009.safetensors
Size (KB): 475371
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success