Stable Diffusion 3.5 Medium版本Workflow (ID: 1006708) 综合资源合集综合资源合集

Please see our Quickstart Guide to Stable Diffusion 3.5 for all the latest info!

Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer with improvements (MMDiT-x) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

Please note: This model is released under the Stability Community License. Visit Stability AI to learn or contact us for commercial licensing details.

Model Description

Developed by: Stability AI
Model type: MMDiT-X text-to-image generative model
Model Description: This model generates images based on text prompts. It is a Multimodal Diffusion Transformer (https://arxiv.org/abs/2403.03206) with improvements that use three fixed, pretrained text encoders, with QK-normalization to improve training stability, and dual attention blocks in the first 12 transformer layers.

License

Community License: Free for research, non-commercial, and commercial use for organizations or individuals with less than $1M in total annual revenue. More details can be found in the Community License Agreement. Read more at https://stability.ai/license.
For individuals and organizations with annual revenue above $1M: please contact us to get an Enterprise License.

Implementation Details

MMDiT-X: Introduces self-attention modules in the first 13 layers of the transformer, enhancing multi-resolution generation and overall image coherence.
QK Normalization: Implements the QK normalization technique to improve training Stability.
Mixed-Resolution Training:
- Progressive training stages: 256 → 512 → 768 → 1024 → 1440 resolution
- The final stage included mixed-scale image training to boost multi-resolution generation performance
- Extended positional embedding space to 384x384 (latent) at lower resolution stages
- Employed random crop augmentation on positional embeddings to enhance transformer layer robustness across the entire range of mixed resolutions and aspect ratios. For example, given a 64x64 latent image, we add a randomly cropped 64x64 embedding from the 192x192 embedding space during training as the input to the x stream.

These enhancements collectively contribute to the model's improved performance in multi-resolution image generation, coherence, and adaptability across various text-to-image tasks.

Text Encoders：
- CLIPs: OpenCLIP-ViT/G, CLIP-ViT/L, context length 77 tokens
- T5: T5-xxl, context length 77/256 tokens at different stages of training
Training Data and Strategy:

This model was trained on a wide variety of data, including synthetic data and filtered publicly available data.

For more technical details of the original MMDiT architecture, please refer to the Research paper.

Usage & Limitations

While this model can handle long prompts, you may observe artifacts on the edge of generations when T5 tokens go over 256. Pay attention to the token limits when using this model in your workflow, and shortern prompts if artifacts becomes too obvious.
The medium model has a different training data distribution than the large model, so it may not respond to the same prompt similarly.
We recommended to sample with Skip Layer Guidance for better struture and anatomy coherency.

描述:

Official Stability AI ComfyUI workflow for SD 3.5 Medium

训练词语:

名称: stableDiffusion35_workflow_trainingData.zip

大小 (KB): 2

类型: Training Data

Pickle 扫描结果: Success

Pickle 扫描信息: No Pickle imports

病毒扫描结果: Success

Stable Diffusion 3.5 Medium

资源下载

下载价格VIP专享

仅限VIP下载升级VIP

犹豫不决让我们错失一次又一次机会！！！

原文链接：https://1111down.com/1125257.html，转载请注明出处

Stable Diffusion 3.5 Medium版本Workflow (ID: 1006708)

Model Description

License

Implementation Details

Usage & Limitations

描述:

Official Stability AI ComfyUI workflow for SD 3.5 Medium

训练词语:

在线客服

升级VIP

全屏浏览

夜间模式

繁简切换

返回顶部

Stable Diffusion 3.5 Medium版本Workflow (ID: 1006708)

Model Description

License

Implementation Details

Usage & Limitations

描述: Official Stability AI ComfyUI workflow for SD 3.5 Medium

训练词语:

猜你喜欢

Kesha Lora版本V1 (ID: 1298881)

Iron Patriot版本v1.0 (ID: 1291511)

Dark Ishihara版本V1 (ID: 1303876)

Female dancer posing SDXL版本V1 (ID: 1249895)

ybqy版本V1 (ID: 1314652)

Halina Pawlowská版本v1.0 (ID: 1294642)

在线客服

升级VIP

全屏浏览

夜间模式

繁简切换

返回顶部

社交账号快速登录

社交账号快速登录

描述:

Official Stability AI ComfyUI workflow for SD 3.5 Medium