Style of Jet Set Radio, version v1.0 (ID: 1019255)

Overview

This LoRA, based on character artwork for Jet Set Radio, was a bit of an experiment. I didn't expect it to work at all, given that its dataset violated all the rules I typically follow when making Flux LoRAs.

- Only 26 images were used (for me, it's heresy to even think about making a LoRA with fewer than 200 images!)

- All of the images were just character artwork on a white background (this lack of diversity was painful, and I did not even apply any kind of augmentation!)

- Minimal captioning, just "JSR style" (as an avid JoyCaption enjoyer, this felt like an insult!)

Still, it turned out well. It has a weird, funky vibe, but I like it. Although it obviously doesn't capture Jet Set Radio's in-game graphics style (because I only used artwork), the style is recognizable and unique.

It mostly prefers short prompts and may sometimes revert to realistic backgrounds, especially when photo-related elements are mentioned or the prompt is too detailed. This likely occurs because the dataset included only images with a white background rather than actual environments. And well, occasionally, this creates a cool effect, making the character look like a comic book hero in a real-world setting.

Also, if no background is specified in the prompt, it may default to a plain white background. It tends to include people even if none are specified in the prompt, so it's better to explicitly describe what should appear in the foreground.
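The prompting advice above can be sketched as a small helper that keeps prompts short while always naming the foreground and, when wanted, the background. This is only an illustration: `build_prompt` is a hypothetical name, and using "JSR style" as a trigger phrase is an assumption based on the training caption.

```python
def build_prompt(subject: str, background: str = "", trigger: str = "JSR style") -> str:
    """Compose a short prompt: trigger phrase, explicit foreground, optional background."""
    parts = [trigger, subject.strip()]
    if background.strip():
        # Naming a background avoids the fallback to a plain white one.
        parts.append(f"{background.strip()} in the background")
    return ", ".join(parts)


print(build_prompt("a vintage boombox sitting on a skateboard",
                   "a neon-lit city street at night"))
```

Leaving `background` empty produces the shortest prompt, which is where the plain white background is most likely to show up.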

I plan to address these issues by retraining the model with a synthetic dataset, bootstrapped from this LoRA itself.

Usage

Gallery images were generated with the following settings:

Model: flux1-dev (fp8e4m3fn)

Text Encoder: t5xxl_fp16

Sampler: euler

Scheduler: normal (24 steps)

Flux Guidance: 4

LoRA Strength: 1
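For anyone who prefers a script, the settings above might translate to the Hugging Face diffusers library roughly as follows. This is a sketch under assumptions, not the setup used for the gallery: it loads the model in bf16 rather than fp8, relies on the pipeline's default flow-matching Euler scheduler as the nearest match for "euler", and the `generate` function and the LoRA file path are hypothetical.

```python
# Gallery settings mapped to diffusers argument names (the mapping is an assumption).
SETTINGS = {
    "num_inference_steps": 24,  # "24 steps"
    "guidance_scale": 4.0,      # "Flux Guidance: 4"
}


def generate(prompt: str, lora_path: str):
    """Render one image. Needs torch, diffusers, and a GPU with enough memory."""
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    # Loaded LoRA weights are applied at full strength (1.0) by default.
    pipe.load_lora_weights(lora_path)
    pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use
    return pipe(prompt, **SETTINGS).images[0]
```

A call would look like `generate("JSR style, a skater girl with headphones", "jsr_style.safetensors").save("out.png")`, where the file name is a placeholder for wherever the downloaded LoRA is stored. The FLUX.1-dev weights are gated on Hugging Face, so the license has to be accepted there first.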

Training

The LoRA was fine-tuned on an RTX 3090 using AI Toolkit with the following hyperparameters:

Rank/alpha: 8/1

Optimizer: prodigy

Steps: 2800

Batch size: 1

Learning rate: 1

Learning rate scheduler: cosine

Decouple: true

Use bias correction: true

Betas: (0.9, 0.99)

Weight decay: 0.01

D-coefficient: 0.9

Noise offset: 0.1

Resolution: (512, 768, 1024)
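A couple of these numbers are easier to read once worked out. The sketch below computes the standard LoRA scaling factor (alpha divided by rank) and the cosine learning rate multiplier over the 2800 steps. It assumes a plain cosine decay to zero with no warmup, which may differ from what AI Toolkit actually does, and the keyword names in `PRODIGY_KWARGS` are my mapping of the labels above onto the `Prodigy` constructor from the prodigyopt package, not a copy of the training config.

```python
import math

RANK, ALPHA = 8, 1
TOTAL_STEPS = 2800
BASE_LR = 1.0  # with Prodigy, this multiplies the optimizer's own step-size estimate

# Standard LoRA scaling: the low-rank update is multiplied by alpha / rank.
lora_scale = ALPHA / RANK


def cosine_factor(step: int, total: int = TOTAL_STEPS) -> float:
    """Cosine decay from 1 to 0 with no warmup (an assumption about the schedule)."""
    return 0.5 * (1.0 + math.cos(math.pi * step / total))


# Assumed mapping of the listed settings to prodigyopt's Prodigy arguments.
PRODIGY_KWARGS = {
    "lr": BASE_LR,
    "betas": (0.9, 0.99),
    "weight_decay": 0.01,
    "decouple": True,
    "use_bias_correction": True,
    "d_coef": 0.9,
}

print("LoRA scale:", lora_scale)
for step in (0, 700, 1400, 2100, 2800):
    print(step, round(BASE_LR * cosine_factor(step), 3))
```

With rank 8 and alpha 1, the LoRA update is scaled by 0.125, so the learned weights have to grow comparatively large to have an effect. A learning rate of 1 is the usual setting for Prodigy, since the optimizer estimates the effective step size itself.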