
NOTE: This model has its own VAE, which is baked into the model. For best results, please ensure that the selected VAE in automatic1111 is set to "Automatic". If you've never poked around in the VAE settings, this will be the default.
NextPhoto is the result of a whole lot of training, data curation, and block merging. The model is designed exclusively for generating photorealistic images, and as such it cannot generate non-photo images (even if prompted to do so). For more details about version 3.0, check out the "About this version" section.
All sample images were generated using the ESRGAN_4x upscaling model at 2x upscaling, with 0.45 denoising strength. I'm not gonna upload a 32-bit model, as the v3 model was trained using 16-bit precision, so it would literally just be a waste of space.
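To put a rough number on the "waste of space" point, here is a back-of-the-envelope sketch. It assumes the listed file (nextphoto_v20.safetensors, 2486412 KB) stores essentially all of its weights in 16-bit precision and ignores file headers, so treat the result as an estimate only. The variable names are made up for illustration.

```python
# Rough estimate only: assumes nearly all tensors in the listed file are 16-bit.
FP16_SIZE_KB = 2486412  # size listed for nextphoto_v20.safetensors
BYTES_PER_FP16 = 2
BYTES_PER_FP32 = 4

# Storing the same weights in 32-bit doubles the bytes per value.
fp32_size_kb = FP16_SIZE_KB * BYTES_PER_FP32 // BYTES_PER_FP16
extra_gib = (fp32_size_kb - FP16_SIZE_KB) / (1024 * 1024)

print(f"fp16: {FP16_SIZE_KB} KB, estimated fp32: {fp32_size_kb} KB")
print(f"extra space with no added precision: about {extra_gib:.2f} GiB")
```

Since the weights were trained in 16-bit, the extra bits in a 32-bit copy would carry no additional information.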
Usage Guide
- (Highly recommended) The negative prompt is quite important for photorealism, but you rarely need to change it to get great results. I'd recommend the following negative prompt as a base: (worst quality:0.8), cartoon, halftone print, burlap, (cinematic:1.2), (verybadimagenegative_v1.3:0.3), (surreal:0.8), (modernism:0.8), (art deco:0.8), (art nouveau:0.8)
  - This prompt uses the verybadimagenegative_v1.3 textual inversion embedding. You'll need to download it separately.
  - Place the downloaded file into the "embeddings" folder of the SD WebUI root directory, then restart Stable Diffusion.
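The weighted terms in the recommended negative prompt use the WebUI's `(text:weight)` emphasis syntax. As a minimal sketch, the base prompt could be assembled from a list so the weights are easy to tweak. `NEGATIVE_TERMS` and `build_prompt` are hypothetical names for illustration, not part of the WebUI.

```python
# Terms and weights copied from the recommended base negative prompt.
# A weight of None means the term is used without explicit emphasis.
NEGATIVE_TERMS = [
    ("worst quality", 0.8),
    ("cartoon", None),
    ("halftone print", None),
    ("burlap", None),
    ("cinematic", 1.2),
    ("verybadimagenegative_v1.3", 0.3),  # textual inversion embedding token
    ("surreal", 0.8),
    ("modernism", 0.8),
    ("art deco", 0.8),
    ("art nouveau", 0.8),
]


def build_prompt(terms):
    """Join (term, weight) pairs into a comma-separated prompt string."""
    parts = []
    for term, weight in terms:
        if weight is None:
            parts.append(term)
        else:
            parts.append(f"({term}:{weight})")
    return ", ".join(parts)


negative_prompt = build_prompt(NEGATIVE_TERMS)
print(negative_prompt)
```

Lowering the weight on the embedding entry (or dropping that tuple entirely) is then a one-line change.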
- Positive Prompts: You don't need to think too hard about the positive prompt - the model works quite well with simple ones.
  - Examples:
    - A well-lit photograph of woman at the train station
    - A perfect well-lit medium photograph of an old married couple sitting on their porch
    - A poorly lit photograph of a man walking on the trail at night
  - For more examples of positive prompts, look at the sample photos for the model.
- Upscaling: This model will still generate photorealistic images without upscaling, but upscaling is strongly recommended for photorealism. You'll need to use the ESRGAN_4x upscaling model (not R-ESRGAN) in the hires fix section for decent results. Set the denoising strength anywhere from 0.3 to 0.5 for best results, and the upscale amount to 2. I normally set my denoising strength to 0.5 or 0.45.
- Sampler: I use DPM++ 2M Karras, and generally don't stray from it. While the other samplers can still produce good results, DPM++ 2M Karras is the most consistent in my experience with this model.
- For further improvements:
  - Reduce your CFG scale: The default classifier-free guidance (CFG) scale of 7 works well, but occasionally this can be too high. Reduce the CFG scale until you like the results - I generally bottom out at 4.0, as anything lower than that and the negative prompt starts getting ignored. Increasing the CFG scale past 7 or 8 will result in more "dramatized" photos (not in a good way), but will also make the model follow the prompts more closely, so balance as needed. High CFG scales can work well for specific situations, but lower CFG scales work well quite consistently.
  - Avoid excess LoRA and Textual Inversion use: As v2 and v3 of this model are custom trained and not purely block merged, LoRAs or Textual Inversions may not work as well as they do in other models. Based on my experience, you can still get good results with them, but I'd recommend treading lightly: take an additive approach where you add LoRAs or inversions selectively when needed.
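The settings in this guide can be collected into a single request body. The sketch below assumes the WebUI is started with its API enabled and that its txt2img endpoint accepts the field names shown (`sampler_name`, `enable_hr`, `hr_upscaler`, `hr_scale`, `denoising_strength`, `cfg_scale`). Check the API docs page of your own install before relying on them, since sampler and upscaler strings must match what your version lists. `build_payload` and `cfg_sweep` are hypothetical helpers, and the code only builds the payloads without sending anything.

```python
import json

# Base negative prompt recommended in the usage guide.
NEGATIVE_PROMPT = (
    "(worst quality:0.8), cartoon, halftone print, burlap, (cinematic:1.2), "
    "(verybadimagenegative_v1.3:0.3), (surreal:0.8), (modernism:0.8), "
    "(art deco:0.8), (art nouveau:0.8)"
)


def build_payload(prompt, cfg_scale=7.0, denoising_strength=0.45):
    """Build a txt2img-style request body (field names are assumptions)."""
    return {
        "prompt": prompt,
        "negative_prompt": NEGATIVE_PROMPT,
        "sampler_name": "DPM++ 2M Karras",
        "cfg_scale": cfg_scale,
        "enable_hr": True,             # hires fix
        "hr_upscaler": "ESRGAN_4x",    # not R-ESRGAN
        "hr_scale": 2,
        "denoising_strength": denoising_strength,
    }


def cfg_sweep(start=7.0, stop=4.0, step=1.0):
    """CFG values from the default down to the suggested floor of 4.0."""
    values = []
    value = start
    while value >= stop:
        values.append(value)
        value -= step
    return values


payloads = [
    build_payload("A well-lit photograph of woman at the train station", cfg_scale=c)
    for c in cfg_sweep()
]
print(json.dumps(payloads[0], indent=2))
```

Generating one image per payload with a fixed seed is one way to compare CFG values side by side and pick the lowest one that still respects the negative prompt.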
Description:
I've trained the original NextPhoto model against a custom curated set of high quality photographs, then block merged against itself to improve results. The results are the following:
- Significantly improved photorealism: The training was extremely effective at improving the realism of the model. Skin texture is improved, subject integration into the background is improved, lighting is improved.
- Better NSFW support/moderate NSFW bias: This part wasn't actually intentional. I included a decent amount of NSFW into the training data to improve the skin textures, and as a result the model is better at the human body (though not hardcore). This also means that the model tends to default to NSFW in some situations - you'll probably need to add some stuff to the negative prompt to avoid this, or explicitly specify clothing in the positive prompt.
- Minor feature overfitting: Some features are somewhat overfit - specifically some faces. This doesn't pose too much of a problem, as specifying more detail about the face can mitigate this (ethnicity, age, shape, emotion, etc.), but it's something to keep in mind. I'm already working on v3.0, which should resolve this, but I figured I should release v2.0 as it's such a major bump in quality.
- Better non-human results: Non-human prompts are also improved - the model was trained with a roughly even mix of human/non-human images, so environments, macro shots, etc., are improved.
- Lower negative prompt importance: The new model is more attuned to generating good results out of the box - even with no negative prompt at all. That being said, I do still recommend the same negative prompt as before (though with slightly lower emphasis - or removal - of the negative textual embeddings).
- Different scheduler recommendation: When upscaling, the model performs better using Euler a instead of DPM++ 2M Karras. I've tested all the upscalers, and ESRGAN_4x still works the best (at 0.35 to 0.5 denoising strength), but when used with DPM++ 2M Karras, the results are oversharpened. Using Euler a can mitigate this. DPM adaptive also yields good results (but is much slower), and the other schedulers tend to be too blurry when upscaling.
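The v2.0 sampler advice above boils down to a small decision, sketched here. Treating DPM++ 2M Karras as the choice when not upscaling is an assumption carried over from the usage guide, and `pick_sampler` and `clamp_denoising` are hypothetical helper names.

```python
def pick_sampler(upscaling):
    """Euler a when upscaling (avoids oversharpening); otherwise the usual pick."""
    return "Euler a" if upscaling else "DPM++ 2M Karras"


def clamp_denoising(value, low=0.35, high=0.5):
    """Keep the denoising strength inside the 0.35 to 0.5 range quoted above."""
    return max(low, min(high, value))


settings = {
    "sampler": pick_sampler(upscaling=True),
    "upscaler": "ESRGAN_4x",
    "denoising_strength": clamp_denoising(0.6),
}
print(settings)
```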
Trained words: A perfect photo, A poorly lit photo, A perfect closeup photo
Name: nextphoto_v20.safetensors
Size (KB): 2486412
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success