
LIPS (Long Indecently Pierced and Sagging) is a checkpoint (i.e. a fine-tuned model) trained on a personal image collection I've been building since 2016.
I had to train a model on my own dataset because the existing models that had what I wanted could only be mixed/merged on older base versions. SDXL is either too new or still undertrained by the community.
Fun fact: No one said "LGTM SHIPPY IT"
(see Luminaverse XL for reference)
Story time: My inspirations for training this model are AutismMix for understanding the concepts I like, Luminaverse XL for having a good base style, and Animagine XL v3.1 for understanding overall composition (but unfortunately missing fine details). I attempted merging them, along with 30+ other trained models, but some just don't merge well, and once I got down to block-level merging I realized that training my own would probably be more efficient.
This is my first attempt that isn't fully broken or almost identical to the base model. The training took about 80 GPU hours, so if there is a follow-up version, I'll have to say goodbye to my PC for 3+ days.
The main (first) round of training was performed on a dataset of 8997 images with Danbooru-style captioning, to make sure the trained model understands my prompts. A second fine-tuning round then used only the "favorited" images to repair the U-Net after the harsh first round.
The captioning was slightly changed to fit the base model's tokens:
# equivalent danbooru search → token included in caption
score:150.. → masterpiece
score:100..150 → best quality
score:75..100 → great quality
score:25..75 → medium quality
score:0..25 → normal quality
score:-5..0 → low quality
score:..-5 → worst quality
rating:g → safe
rating:s → sensitive
rating:q → nsfw
rating:e → nsfw, explicit
date:2021-01-01.. → newest
date:2018-01-01..2021-01-01 → recent
date:2015-01-01..2018-01-01 → mid
date:2011-01-01..2015-01-01 → early
date:..2011-01-01 → oldest
No other metatags (like absurdres) were included except for the ones above.
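The score-to-token mapping above can be sketched as a small function. Note that the Danbooru `a..b` ranges overlap at their endpoints, so the boundary behavior here (a score of exactly 150 mapping to masterpiece, and so on) is my assumption, not something the mapping table pins down.

```python
def quality_token(score: int) -> str:
    """Map a Danbooru score to the quality token used in the captions.

    Boundary handling for overlapping range endpoints (e.g. 150,
    which appears in both "masterpiece" and "best quality" ranges)
    is assumed: ties go to the higher-quality token.
    """
    if score >= 150:
        return "masterpiece"
    if score >= 100:
        return "best quality"
    if score >= 75:
        return "great quality"
    if score >= 25:
        return "medium quality"
    if score >= 0:
        return "normal quality"
    if score >= -5:
        return "low quality"
    return "worst quality"
```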
The captions were ordered by category, in this order:
artist, copyright, character, general
The tags within each category were ordered as well; in particular, the first tag in the general category is always 1girl or 1boy.
Most captions contained 1 artist, 1-2 copyrights, 1 character, and 10-30 general tags.
Note: the training data contains about 1700 non-1girl/non-1boy images!
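The caption ordering above can be illustrated with a short sketch. The tag values are made up for illustration, and the alphabetical sort of the remaining general tags is my assumption (it matches the sorted-prompt example given later):

```python
# Tags that are forced to the front of the general group.
SUBJECT_TAGS = {"1girl", "1boy"}

def build_caption(artist, copyrights, characters, general):
    """Join tag groups in the documented order:
    artist, copyright, character, general,
    with 1girl/1boy first within the general group."""
    subject = [t for t in general if t in SUBJECT_TAGS]
    # Alphabetical ordering of the rest is assumed, not documented.
    rest = sorted(t for t in general if t not in SUBJECT_TAGS)
    return ", ".join([artist, *copyrights, *characters, *subject, *rest])

caption = build_caption(
    "some_artist",            # hypothetical artist tag
    ["some_copyright"],       # hypothetical copyright tag
    ["some_character"],       # hypothetical character tag
    ["solo", "1girl", "breasts", "long hair"],
)
# caption: "some_artist, some_copyright, some_character, 1girl, breasts, long hair, solo"
```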
Based on Jaccard-indexing the caption tags, at least one of the following (positive) tags should be included at inference time to get the best results:
nipple piercing, cleft of venus, long labia, clitoris, uncensored, female pubic hair, exhibitionism, anus, public indecency, clitoral hood, pussy, female masturbation, completely nude, spread pussy, urethra, object insertion, sagging breasts, pussy juice, presenting, breasts apart, anal, peeing
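For reference, the Jaccard index used here is just intersection-over-union on tag sets; a minimal sketch, with the example tag sets being my own illustration:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two tag sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0  # avoid division by zero for two empty sets
    return len(a & b) / len(a | b)

# Hypothetical prompt vs. caption comparison:
prompt = {"1girl", "solo", "nipple piercing", "sagging breasts"}
caption = {"1girl", "solo", "sagging breasts", "long hair"}
score = jaccard(prompt, caption)  # 3 shared tags / 5 total tags = 0.6
```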
In my limited testing, the model performs quite well with just worst quality in the negative prompt. I usually put 1girl, solo, breasts in my positive prompt. Note that I sort the general/meta tags, so the final prompt looks something like 1girl, something, something, breasts, something, solo.
I'd recommend starting with a single negative tag and then combining tags to get the desired results. It's worth exploring other metatags too, like: derivative work, off-topic, redrawn, 3d, render
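The prompt layout described above (a subject tag first, then the remaining general tags sorted alphabetically) can be shown in a couple of lines; the specific tags are examples, not a recommended recipe:

```python
# Hypothetical general tags, sorted to match the described layout.
general = ["solo", "breasts", "sagging breasts", "nipple piercing"]
positive = ", ".join(["1girl", *sorted(general)])
# positive: "1girl, breasts, nipple piercing, sagging breasts, solo"
negative = "worst quality"  # often sufficient on its own, per the text
```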
Training Settings
The training settings can be found in each version's description.
For visibility, here are the settings for v1.0:
Main training settings:
- 7+7+12 epochs (the first two training sessions crashed after 8 epochs; the training was resumed from epoch 7 twice)
- 1 repeat per epoch
- AdamW8bit optimizer
- Learning rates: U-Net=1e-5; TextEnc=2e-6
- Scheduler: Cosine with (restarts and) warmup for 10% of the steps
- Train batch size of 4
- No noise offset
- 8997 images
- Base model: Luminaverse XL v1.0 + FixVAE v2
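As a rough sanity check on what those numbers imply, the step counts for the main run work out as follows. This assumes one optimizer step per batch with no gradient accumulation (my assumption), and counts all 7+7+12 = 26 epochs of compute, including the epochs redone after the crashes:

```python
import math

images = 8997
batch_size = 4
epochs = 7 + 7 + 12  # 26 epochs of compute across the two resumes

steps_per_epoch = math.ceil(images / batch_size)  # 2250
total_steps = steps_per_epoch * epochs            # 58500
warmup_steps = int(total_steps * 0.10)            # 5850 (10% warmup)
```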
Secondary training settings:
- 25 epochs without crashing
- 1 repeat per epoch
- AdamW8bit optimizer
- Learning rates: U-Net=1e-6; TextEnc=0 (but no caching)
- Scheduler: Constant with no warmup
- Train batch size of 4
- Noise offset: 0.0357
- 2317 images (a subset of the previous 8997)
- Base model: The output of the main training
Based on Luminaverse XL v1.0
I intended this model to be NSFW-biased, but with more testing I'm finding that even if I suggest an NSFW context like "uncensored" or "body exploration" or anything similar, it often fails to undress the subjects. It sometimes works, but not always. Tags that heavily imply nudity (like "nipples" or "pubic hair") work most of the time.
License
This model follows the same licensing as Luminaverse XL and, by extension, Animagine XL 3.0: the Fair AI Public License 1.0-SD, compatible with Stable Diffusion models. Here are the key points, as taken from the original model's page:
- Modification Sharing: If you modify this model, you must share both your changes and the original license.
- Source Code Accessibility: If your modified version is network-accessible, provide a way (like a download link) for others to get the source code. This applies to derived models too.
- Distribution Terms: Any distribution must be under this license or another with similar rules.
- Compliance: Non-compliance must be fixed within 30 days to avoid license termination, emphasizing transparency and adherence to open-source values.
Description:
v1.1 codename "SG" training settings:
- 11 epochs
- 10 repeats per epoch
- AdamW8bit optimizer
- Learning rates: U-Net=8e-6; TextEnc=0 (with caching)
- Scheduler: Cosine with (restarts and) warmup for 10% of the steps
- Train batch size of 4
- Noise offset: 0.0357
- 140 images (with breast tags dropped out of captions)
- Base model: LIPS v1.0
Trained words:
Name: lips_v11.safetensors
Size (KB): 6775430
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success