SciStyle version v1.0 (ID: 230090)

SciStyle

v1 of SciStyle is a test model for a new image captioning pipeline I've been working on. The model was trained on a subset of 1k images of various styles/mediums. Surprised by the results for a model trained on only 1k images, I decided to release it here. The full model is currently being worked on.

For more info on the image captioning pipeline, refer to my Discord thread linked below.


Questions/Feedback/Updates?

Visit my thread on the Unstable Diffusion Discord


Info

S&D

Base Model: Stable Diffusion v1.5

Type: Experimental Fine-tune

Clip: 1

Medium: Multi-medium

Caption Style: Natural Language + Booru Style

Dataset Size: Subset, 4k images out of 25k images + DnD dataset

Training Resolution: 768x768

Difference from v1: More fantasy-focused, with additional training on a DnD dataset.


V1

Base Model: Stable Diffusion v1.5

Type: Experimental Fine-tune

Clip: 1

Medium: Multi-medium

Caption Style: Natural Language + Booru Style

Dataset Size: Subset, 1k images out of 25k images

Training Resolution: 768x768


V2

Base Model: Stable Diffusion v1.5

Type: Experimental Fine-tune

Clip: 1

Medium: Multi-medium

Caption Style: Natural Language + Booru Style

Dataset Size: Subset, 6.5k images out of 25k images

Training Resolution: 768x768

Difference from v1: More species from various Sci-fi and fantasy universes.


Features

  1. Multi-medium: Capable of generating images in multiple art mediums; simply include the medium in the prompt.

  2. Natural Language & Booru: Accepts both natural language prompts and booru style prompts.

  3. Extra Detail: Understands subtle details often skipped by SD models, such as the number of objects/subjects in a scene, background information, color information for various parts of the image, atmosphere, etc. (See my Discord thread above for more info on how this is achieved.)

  4. Flexible: Can easily be merged with other SD1.5 checkpoints/LoRAs.


Usage

Special Tokens:

  • SciStyle can be used as a class token at the beginning of the prompt, but it is not necessary.

  • Tags for various art mediums: place one at the start of the prompt, e.g., "a comic book illustration of" or "90s anime screencap of", or simply add the medium towards the end of the prompt, e.g., "comic book illustration" or "photorealistic". These are just examples of tag placement. Feel free to experiment with other mediums.
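
The two tag placements above can be sketched as a small prompt builder. This is only an illustration: the function name build_prompt and its parameters are hypothetical and not part of the model or any UI, and the comma-joined layout is an assumption about how the pieces are combined.

```python
def build_prompt(subject, medium_prefix=None, medium_suffix=None, use_class_token=False):
    """Assemble a prompt using the tag placements described above.

    All names here are hypothetical helpers for illustration only.
    """
    parts = []
    if use_class_token:
        # Optional class token at the very beginning of the prompt.
        parts.append("SciStyle")
    # Leading placement: e.g. "a comic book illustration of <subject>".
    body = f"{medium_prefix} {subject}" if medium_prefix else subject
    parts.append(body)
    if medium_suffix:
        # Trailing placement: e.g. "<subject>, photorealistic".
        parts.append(medium_suffix)
    return ", ".join(parts)


print(build_prompt("a dragon over a city",
                   medium_prefix="a comic book illustration of"))
print(build_prompt("a dragon over a city",
                   medium_suffix="photorealistic",
                   use_class_token=True))
```

Either placement can be tried with the same subject to compare how strongly the medium is picked up.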


Recommended Settings

Sampler/Solver:

  • Euler a

    • Steps: 20 - 32

    • CFG: 6 - 7.5

  • DPM++ SDE Karras

    • Steps: 30 - 40

    • CFG: 6 - 8.5

  • DPM++ 2M SDE Karras

    • Steps: 50+

    • CFG: 7 - 8

These are just recommendations.
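
For anyone scripting generations, the ranges above can be kept in a small lookup table. The sketch below is just one way to write them down; the names RECOMMENDED and in_recommended_range are hypothetical, and "50+" is encoded as having no upper bound.

```python
# Recommended ranges copied from the list above.
# None means "no upper bound" (used for the "50+" entry).
RECOMMENDED = {
    "Euler a": {"steps": (20, 32), "cfg": (6.0, 7.5)},
    "DPM++ SDE Karras": {"steps": (30, 40), "cfg": (6.0, 8.5)},
    "DPM++ 2M SDE Karras": {"steps": (50, None), "cfg": (7.0, 8.0)},
}


def in_recommended_range(sampler, steps, cfg):
    """Return True if steps and CFG fall inside the listed ranges."""
    rec = RECOMMENDED[sampler]
    min_steps, max_steps = rec["steps"]
    min_cfg, max_cfg = rec["cfg"]
    steps_ok = steps >= min_steps and (max_steps is None or steps <= max_steps)
    cfg_ok = min_cfg <= cfg <= max_cfg
    return steps_ok and cfg_ok


print(in_recommended_range("Euler a", 25, 7.0))
print(in_recommended_range("DPM++ 2M SDE Karras", 80, 7.5))
```

Values outside these ranges are not errors; the check only flags when a run departs from the suggestions.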

Hires Fix

Settings for all ESRGAN models:

  • Upscale by

    • 1.5 if resolution is > 512x768

    • Don't exceed 2.0 (unless you have a beefy rig)

  • Denoise Strength

    • 0.25 - 0.35

  • Hires Steps

    • If sampling steps > 60,

      • hires steps = half of sampling steps

    • Otherwise, leave at 0
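
The Hires Fix rules above can be written out as a couple of helper functions. This is a rough sketch under stated assumptions: "> 512x768" is read as a pixel-count comparison, 2.0 is used as the ceiling for smaller base resolutions, and odd step counts round down. The function names are hypothetical.

```python
# Denoise strength range listed above.
DENOISE_RANGE = (0.25, 0.35)


def max_upscale(width, height):
    """Suggested upscale factor ceiling for a given base resolution."""
    # Assumption: "> 512x768" is compared by total pixel count.
    if width * height > 512 * 768:
        return 1.5
    # Assumption: otherwise fall back to the stated 2.0 limit.
    return 2.0


def hires_steps(sampling_steps):
    """Half of the sampling steps when above 60, otherwise 0."""
    # Assumption: integer division, so odd counts round down.
    return sampling_steps // 2 if sampling_steps > 60 else 0


print(max_upscale(768, 768), hires_steps(80))
print(max_upscale(512, 768), hires_steps(30))
```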

Extensions

ADetailer
Download here

Neutral Prompt

Download here

Read each repo's description for usage guides.

Negative Embeddings

Only needed if you want to recreate one of the sample images. Personally, I would avoid negative embeddings and instead use a simple negative prompt, then add or remove tokens for each new idea. I only use them to speed up inference during sample generation. That being said, other negative embeddings such as EasyNegative are also fine to use with this model.


Check out my other models

SDXL

SD1.5

LoRA

Description:

Initial Release
Subset

Trigger words: SciStyle

Name: scistyle_v10.safetensors

Size (KB): 2440046

Type: Model

Pickle scan result: Success

Pickle scan message: No Pickle imports

Virus scan result: Success
