
This version has a watermark in the bottom right. Part of the experiment was comparing captioning the watermark against leaving it uncaptioned. This version, with the watermark captioned, is the better of the two.
This is part of my research and an extension of my wooly flux work. Through subsequent testing I've come close to a "one size fits all" solution to a couple of major use cases in training.
This time it was trained using TensorArt's online training tool.
This LoRA was trained on 100 images, 30 of which were captioned using my own specialized captioning format, like so:
precursor legacy mode, cel-shaded textures, warm lighting, era 2000s fantasy, mechanic attire, duo, tech-savvy style, rating sfw, early 2000s video game graphics, keira inside samos' hut speaking with daxter in his ottsel form behind her, the entrance to the hut is visible behind her as she leans forwards and exposes her cleavage while daxter playfully leans towards the door behind her, soft indoor lighting filter.
--
The remaining 70 images included 5 images per main character from the game, plus a few group shots and specific objects. They were captioned with tags like so:
Daxter, Ottsel
A selection of scenery images was also included, tagged like so:
Sentinel Beach, day, tree, nature, scenery, rock, lantern, item chests,
--
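To sanity-check a mix like the 30/70 split above, the two caption styles can be counted with a small script. This is only a rough sketch: the function names `caption_kind` and `caption_mix` are made up for illustration, and the rule used to spot a structured caption (it has a "rating ..." field and ends with a "... filter" phrase) is an assumption based on the one example shown here, not something the training tool enforces. The structured sample is abbreviated.

```python
from collections import Counter


def caption_kind(caption: str) -> str:
    """Guess whether a caption is 'structured' or plain 'tags'.

    Assumption: structured captions contain a 'rating ...' field
    and their last comma-separated part ends with 'filter'.
    """
    text = caption.strip().rstrip(".")
    parts = [p.strip().lower() for p in text.split(",") if p.strip()]
    has_rating = any(p.startswith("rating ") for p in parts)
    has_filter = bool(parts) and parts[-1].endswith("filter")
    return "structured" if has_rating and has_filter else "tags"


def caption_mix(captions):
    """Count how many captions fall into each style."""
    return Counter(caption_kind(c) for c in captions)


samples = [
    # Abbreviated version of the structured example
    "precursor legacy mode, cel-shaded textures, warm lighting, "
    "era 2000s fantasy, mechanic attire, duo, tech-savvy style, "
    "rating sfw, early 2000s video game graphics, "
    "keira inside samos' hut speaking with daxter in his ottsel "
    "form behind her, soft indoor lighting filter.",
    "Daxter, Ottsel",
    "Sentinel Beach, day, tree, nature, scenery, rock, lantern, item chests,",
]
counts = caption_mix(samples)
print(dict(counts))
```

Running it over the full caption set should report roughly 30 structured and 70 tag captions if the dataset matches the mix described here.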
I'm still working on it, but this seems to be close to a good mix if you want to get started making your own style LoRA.
The captioning style in particular is my baby:
Understanding Caption Structure and Its Role
The model is trained to interpret visual prompts through structured captions that define styles, characters, and scene elements in a precise manner. Each section of the caption contributes specific information to guide the model’s output, ensuring accurate and cohesive results.
Mode:
- What it does: Specifies the artistic medium or style in which the image should be rendered (e.g., oil painting, digital, 3D render).
- Purpose: Sets the overall aesthetic of the output by determining how colors, textures, and shapes will be represented.
Additional Tags:
- What they do: Describe techniques used within the chosen mode (e.g., smooth gradients, bold outlines).
- Purpose: Refines the artistic approach, allowing for customization of brushwork, shading, or texture application.
Era:
- What it does: Defines a specific time period or artistic movement (e.g., 1600s Baroque, 2020s Cyberpunk).
- Purpose: Establishes the visual atmosphere by referencing historical or futuristic styles, influencing character design, architecture, and mood.
Fashion Style:
- What it does: Describes the clothing or costume worn by subjects (e.g., streetwear, medieval armor).
- Purpose: Helps in constructing the appearance and identity of characters by focusing on attire, reflecting the theme or setting.
Subject Count:
- What it does: Specifies the number of characters or subjects in the scene (e.g., solo, duo).
- Purpose: Controls the composition and dynamics of the scene, indicating if it’s focused on a single subject or involves interactions between multiple characters.
Unique Style Identifier:
- What it does: Identifies a distinctive visual theme or style that sets the image apart (e.g., whimsical fantasy, futuristic warrior).
- Purpose: Adds a signature element to the scene, guiding the model towards a specific mood or creative vision.
Rating:
- What it does: Indicates the content rating (e.g., Rating SFW, Rating NSFW).
- Purpose: Ensures the generated image adheres to appropriate standards based on intended usage, either safe-for-work or otherwise.
Prompt:
- What it does: Describes the scene itself, broken down into detailed visual elements (e.g., “A character standing in a neon-lit city, wielding a plasma sword”).
- Purpose: Provides the core description of what should be generated, focusing on characters, objects, and their interactions within the scene.
Filter:
- What it does: Defines a visual effect to apply to the final image (e.g., soft light filter, sepia tone).
- Purpose: Alters the appearance of the output by adding specific visual treatments, such as changes in color balance, contrast, or atmosphere.
How It Works Together:
Each part of the caption plays a distinct role in guiding the model. The mode sets the foundation for the art style, while the tags and era help fine-tune the specifics of the scene. Fashion style and subject count shape the subjects, while the unique style identifier ensures a clear and cohesive theme. Finally, the prompt and filters add narrative and finishing touches, creating a well-rounded, detailed output based on the desired visual direction.
This structured approach ensures flexibility and precision in generating artwork, allowing for a wide range of creative possibilities.
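The field order can be sketched as a small caption builder. This is a minimal illustration, not the tool used for the actual training set: the `StructuredCaption` class and its field names are hypothetical, and placing the "early 2000s video game graphics" trigger phrase at the start of the prompt is an assumption, since the field list does not say where it belongs. The prompt is abbreviated from the example caption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StructuredCaption:
    """One caption, with fields in the order described above (hypothetical helper)."""
    mode: str
    additional_tags: List[str]
    era: str
    fashion_style: str
    subject_count: str
    style_identifier: str
    rating: str
    prompt: str
    filter_effect: str

    def build(self) -> str:
        # Join the fields in a fixed order, skipping any that are empty.
        parts = [
            self.mode,
            *self.additional_tags,
            self.era,
            self.fashion_style,
            self.subject_count,
            self.style_identifier,
            self.rating,
            self.prompt,
            self.filter_effect,
        ]
        cleaned = [p.strip() for p in parts if p and p.strip()]
        return ", ".join(cleaned) + "."


example = StructuredCaption(
    mode="precursor legacy mode",
    additional_tags=["cel-shaded textures", "warm lighting"],
    era="era 2000s fantasy",
    fashion_style="mechanic attire",
    subject_count="duo",
    style_identifier="tech-savvy style",
    rating="rating sfw",
    # Assumption: the trigger phrase leads the prompt field.
    prompt=(
        "early 2000s video game graphics, keira inside samos' hut "
        "speaking with daxter in his ottsel form behind her"
    ),
    filter_effect="soft indoor lighting filter",
)
caption = example.build()
print(caption)
```

Keeping the order fixed in one place makes it harder to accidentally shuffle fields between captions, which matters if the model is meant to learn what each position means.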
Description:
Trigger words: Precursor Legacy Mode,Early 2000s Video Game Graphics,Samos, the sage,Kiera,Jak,Daxter, Ottsel,Maia Acheron, the sorceress,Gol Acheron, the sage
Name: lora.TA_trained5.safetensors
Size (KB): 598402
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success