
LIPS (Long Indecently Pierced and Sagging) is a checkpoint (i.e. a fine-tuned model) trained on a personal image collection I've been building since 2016.
I had to train a model on my own dataset because the existing models that had what I wanted could only be mixed/merged on older base versions. SDXL is either too new or still undertrained by the community.
Fun fact: No one said "LGTM SHIPPY IT"
(see Luminaverse XL for reference)
Story time: My inspirations for training this model are AutismMix for understanding the concepts I like, Luminaverse XL for having a good base style, and Animagine XL v3.1 for understanding overall composition (but unfortunately missing fine details). I attempted merging them, along with 30+ other trained models, but some just don't merge well, and once I got down to block-level merging I realized that training my own would probably be more efficient.
This is my first attempt that isn't fully broken or almost identical to the base model. The training took about 80 GPU hours, so if there is a follow-up version, I'll have to say goodbye to my PC for 3+ days.
The main (first) round of training was performed on a dataset of 8997 images with Danbooru-style captioning, to make sure the trained model understands my prompts. A second fine-tuning round then used only the "favorited" images to repair the U-Net after the harsh first round.
The captioning was slightly changed to fit the base model's tokens:
# equivalent danbooru search → token included in caption
score:150.. → masterpiece
score:100..150 → best quality
score:75..100 → great quality
score:25..75 → medium quality
score:0..25 → normal quality
score:-5..0 → low quality
score:..-5 → worst quality
rating:g → safe
rating:s → sensitive
rating:q → nsfw
rating:e → nsfw, explicit
date:2021-01-01.. → newest
date:2018-01-01..2021-01-01 → recent
date:2015-01-01..2018-01-01 → mid
date:2011-01-01..2015-01-01 → early
date:..2011-01-01 → oldest
No other metatags (like absurdres) were included except for the ones above.
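The score-to-token mapping above can be sketched as a small function. Note that the Danbooru `a..b` ranges overlap at their endpoints, so the boundary behavior here (a score of exactly 150 mapping to masterpiece, and so on) is my assumption, not something the mapping table pins down.

```python
def quality_token(score: int) -> str:
    """Map a Danbooru score to the quality token used in the captions.

    Boundary handling for overlapping range endpoints (e.g. 150,
    which appears in both "masterpiece" and "best quality" ranges)
    is assumed: ties go to the higher-quality token.
    """
    if score >= 150:
        return "masterpiece"
    if score >= 100:
        return "best quality"
    if score >= 75:
        return "great quality"
    if score >= 25:
        return "medium quality"
    if score >= 0:
        return "normal quality"
    if score >= -5:
        return "low quality"
    return "worst quality"
```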
The captions were ordered by category, in this order:
artist, copyright, character, general
The tags within each category were ordered as well; in particular, the first tag in the general category is always 1girl or 1boy.
Most captions contained 1 artist, 1-2 copyrights, 1 character, and 10-30 general tags.
Note: the training data contains about 1700 non-1girl/non-1boy images!
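The caption ordering above can be illustrated with a short sketch. The tag values are made up for illustration, and the alphabetical sort of the remaining general tags is my assumption (it matches the sorted-prompt example given later):

```python
# Tags that are forced to the front of the general group.
SUBJECT_TAGS = {"1girl", "1boy"}

def build_caption(artist, copyrights, characters, general):
    """Join tag groups in the documented order:
    artist, copyright, character, general,
    with 1girl/1boy first within the general group."""
    subject = [t for t in general if t in SUBJECT_TAGS]
    # Alphabetical ordering of the rest is assumed, not documented.
    rest = sorted(t for t in general if t not in SUBJECT_TAGS)
    return ", ".join([artist, *copyrights, *characters, *subject, *rest])

caption = build_caption(
    "some_artist",            # hypothetical artist tag
    ["some_copyright"],       # hypothetical copyright tag
    ["some_character"],       # hypothetical character tag
    ["solo", "1girl", "breasts", "long hair"],
)
# caption: "some_artist, some_copyright, some_character, 1girl, breasts, long hair, solo"
```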
Based on Jaccard-indexing the caption tags, at least one of the following (positive) tags should be included at inference time to get the best results:
nipple piercing, cleft of venus, long labia, clitoris, uncensored, female pubic hair, exhibitionism, anus, public indecency, clitoral hood, pussy, female masturbation, completely nude, spread pussy, urethra, object insertion, sagging breasts, pussy juice, presenting, breasts apart, anal, peeing
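For reference, the Jaccard index used here is just intersection-over-union on tag sets; a minimal sketch, with the example tag sets being my own illustration:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two tag sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0  # avoid division by zero for two empty sets
    return len(a & b) / len(a | b)

# Hypothetical prompt vs. caption comparison:
prompt = {"1girl", "solo", "nipple piercing", "sagging breasts"}
caption = {"1girl", "solo", "sagging breasts", "long hair"}
score = jaccard(prompt, caption)  # 3 shared tags / 5 total tags = 0.6
```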
In my limited testing, the model performs quite well with just worst quality in the negative prompt. I usually put 1girl, solo, breasts in my positive prompt. Note that I sort the general/meta tags, so the final prompt looks something like 1girl, something, something, breasts, something, solo.
I'd recommend starting with a single negative tag and then combining tags to get the desired results. It's worth exploring other metatags too, like: derivative work, off-topic, redrawn, 3d, render
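The prompt layout described above (a subject tag first, then the remaining general tags sorted alphabetically) can be shown in a couple of lines; the specific tags are examples, not a recommended recipe:

```python
# Hypothetical general tags, sorted to match the described layout.
general = ["solo", "breasts", "sagging breasts", "nipple piercing"]
positive = ", ".join(["1girl", *sorted(general)])
# positive: "1girl, breasts, nipple piercing, sagging breasts, solo"
negative = "worst quality"  # often sufficient on its own, per the text
```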
Training Settings
The training settings can be found in each version's description.
For visibility, here are the settings for v1.0:
Main training settings:
- 7+7+12 epochs (the first two training sessions crashed after 8 epochs; the training was resumed from epoch 7 twice)
- 1 repeat per epoch
- AdamW8bit optimizer
- Learning rates: U-Net=1e-5; TextEnc=2e-6
- Scheduler: Cosine with (restarts and) warmup for 10% of the steps
- Train batch size of 4
- No noise offset
- 8997 images
- Base model: Luminaverse XL v1.0 + FixVAE v2
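As a rough sanity check on what those numbers imply, the step counts for the main run work out as follows. This assumes one optimizer step per batch with no gradient accumulation (my assumption), and counts all 7+7+12 = 26 epochs of compute, including the epochs redone after the crashes:

```python
import math

images = 8997
batch_size = 4
epochs = 7 + 7 + 12  # 26 epochs of compute across the two resumes

steps_per_epoch = math.ceil(images / batch_size)  # 2250
total_steps = steps_per_epoch * epochs            # 58500
warmup_steps = int(total_steps * 0.10)            # 5850 (10% warmup)
```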
Secondary training settings:
- 25 epochs without crashing
- 1 repeat per epoch
- AdamW8bit optimizer
- Learning rates: U-Net=1e-6; TextEnc=0 (but no caching)
- Scheduler: Constant with no warmup
- Train batch size of 4
- Noise offset: 0.0357
- 2317 images (a subset of the previous 8997)
- Base model: The output of the main training
Based on Luminaverse XL v1.0
I intended this model to be NSFW-biased, but with more testing I'm finding that even if I suggest an NSFW context like "uncensored" or "body exploration" or anything similar, it often fails to undress the subjects. It sometimes works, but not always. Tags that heavily imply nudity (like "nipples" or "pubic hair") work most of the time.
License
This model follows the same licensing as Luminaverse XL and, by extension, Animagine XL 3.0: the Fair AI Public License 1.0-SD, compatible with Stable Diffusion models. Here are the key points, as taken from the original model's page:
- Modification Sharing: If you modify this model, you must share both your changes and the original license.
- Source Code Accessibility: If your modified version is network-accessible, provide a way (like a download link) for others to get the source code. This applies to derived models too.
- Distribution Terms: Any distribution must be under this license or another with similar rules.
- Compliance: Non-compliance must be fixed within 30 days to avoid license termination, emphasizing transparency and adherence to open-source values.
Description:
v1.1 codename "SG" training settings:
- 11 epochs
- 10 repeats per epoch
- AdamW8bit optimizer
- Learning rates: U-Net=8e-6; TextEnc=0 (with caching)
- Scheduler: Cosine with (restarts and) warmup for 10% of the steps
- Train batch size of 4
- Noise offset: 0.0357
- 140 images (with breast tags dropped out of captions)
- Base model: LIPS v1.0
Trained words:
Name: lips_v11.safetensors
Size (KB): 6775430
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success