
Thanks to sp00ns' guide, "Training a Custom Adetailer Model" on Civitai, I created a custom foot detection model using YOLOv8x.
The foot model that sp00ns provided was helpful, but I wanted to try making my own.
Version 1.0:
I first tried using Autodistill and Grounded SAM to label the 1,000 images automatically, but that partially failed: it also registered hands as feet. (I also hate Colab, since I can't get work done there without it ending the job prematurely.)
So I painstakingly labeled every image by hand using RectLabel on my Mac, then spent about 8 hours training the YOLO model on my PC.
Though I'd planned for 500 epochs, training ended early, and the best weights turned out to be from epoch 93.
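A run like this can be sketched with the Ultralytics Python API. This is only a sketch under assumptions: the dataset file name `foot.yaml`, the image size, and the `patience` value are hypothetical and not necessarily the settings used for this model. `patience` is the setting that lets training stop early once validation results stop improving.

```python
def train_foot_model(data_yaml="foot.yaml", epochs=500, patience=50):
    """Fine-tune YOLOv8x on a foot dataset (hypothetical settings)."""
    # Imported lazily so this sketch can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8x.pt")  # pretrained YOLOv8x detection weights
    # epochs is only an upper bound; patience stops the run once validation
    # metrics have not improved for that many epochs.
    return model.train(data=data_yaml, epochs=epochs, patience=patience, imgsz=640)

# Example (needs ultralytics and a dataset YAML; this can take hours):
# train_foot_model("foot.yaml")
```

Ultralytics saves the best-scoring checkpoint as `weights/best.pt` inside the run directory, which is the kind of file that ends up being used with ADetailer.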
I included a lot of my own generated images as well as some stock images, covering anime, 3D models, and realistic images; male and female subjects; varying skin tones; and various kinds of footwear as well as bare feet. That said, there are some things it still cannot handle well, such as unconventional poses (like images rotated by 90 degrees) and images where the feet are the subject of the composition. My guess is that, because the vast majority of the training images show feet taking up only a small percentage of the canvas, too little of the training covered close-ups. On the other hand, my intent was to use this model to refine feet that would otherwise be neglected, as in full-body shots where the feet take up a tiny fraction of the canvas.
In short, this version is very good with feet in standing poses, especially in full-body shots, but it can struggle with feet outside that range.
Version 2.0:
I noticed that I had mislabeled my training and validation folders for version 1, so my training folder was actually my validation folder and vice versa. I renamed them, but simply doing that and assuming training would need no more than 100 epochs (as with version 1) led to other issues: it started detecting whole bodies as feet. So that was 3 hours down the drain. I then set the epoch limit to 200, moved a lot of the old validation images into the training folder, and added around 160 new images (again labeling every one manually in RectLabel). This time, after 12 hours, training determined that epoch 148 was the best, so that is what this version is.
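A swapped train/validation split like this is easy to catch with a quick count before training. The sketch below assumes a hypothetical dataset layout with `train/images` and `val/images` subfolders (the folder names actually used are not stated here) and flags the case where the validation set is larger than the training set.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}

def count_images(folder):
    """Count image files directly inside folder (0 if it does not exist)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)

def check_split(dataset_root):
    """Return (n_train, n_val, looks_swapped) for a train/val layout."""
    root = Path(dataset_root)
    n_train = count_images(root / "train" / "images")
    n_val = count_images(root / "val" / "images")
    # A validation set larger than the training set usually means the
    # folders were swapped or mislabeled.
    return n_train, n_val, n_val > n_train

if __name__ == "__main__":
    n_train, n_val, swapped = check_split("dataset")  # hypothetical path
    print(f"train={n_train} val={n_val}")
    if swapped:
        print("Warning: validation set is larger than training set; check the folder names.")
```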
From what I've tested, it detects feet in various configurations far better than v1.0, with few issues: it can detect soles, it can detect feet rotated by 90 degrees, and it can mostly detect feet in unconventional poses, depending on the pose.
One issue I've noticed, however, is that it sometimes detects hands, knees, or other objects as feet, albeit at a lower confidence than actual feet. If you see this happening, I'd recommend raising the Detection model confidence threshold in the ADetailer Detection settings to at least 0.5.
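To illustrate what the confidence threshold does: each detection carries a score between 0 and 1, and anything below the threshold is discarded before inpainting. The sketch below is not ADetailer's actual code; the `Detection` type and the scores are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # what the region actually shows (for illustration only)
    confidence: float  # detector score in [0, 1]

def filter_by_confidence(detections, threshold=0.5):
    """Keep only detections scoring at or above the threshold."""
    return [d for d in detections if d.confidence >= threshold]

# Made-up scores: false positives tend to score lower than real feet.
detections = [
    Detection("foot", 0.86),
    Detection("foot", 0.71),
    Detection("hand", 0.38),
    Detection("knee", 0.32),
]

kept = filter_by_confidence(detections, threshold=0.5)
print([d.label for d in kept])  # ['foot', 'foot']
```

The trade-off is that a higher threshold can also drop real feet that the model is less sure about, so it is worth raising it only as far as needed.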
For images where the feet take up most of the canvas, it sometimes detects them, sometimes partially detects them, and sometimes detects one foot but not the other. Arguably, this model wasn't designed for such images, even though some were included in the training dataset, because its job is to let ADetailer crop the canvas down to the target (the feet) so that a lot of image generation goes into refining or modifying them. If the feet are already the focus of the image, taking up 50% or more of the canvas, there is little left to gain from refining them this way. You can still use it for that purpose if you like, but it might cause more problems than it solves, depending on how you use it.
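The point about large feet can be made concrete by measuring how much of the canvas a detection box covers. This is a sketch, not part of ADetailer; the function names are hypothetical, and the 50% cutoff simply mirrors the figure mentioned above.

```python
def box_area_fraction(box, image_width, image_height):
    """Fraction of the canvas covered by box = (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    # Clamp to the canvas so boxes that spill over the edge are not overcounted.
    x1, x2 = max(0, min(x1, image_width)), max(0, min(x2, image_width))
    y1, y2 = max(0, min(y1, image_height)), max(0, min(y2, image_height))
    box_area = max(0, x2 - x1) * max(0, y2 - y1)
    return box_area / (image_width * image_height)

def worth_refining(box, image_width, image_height, max_fraction=0.5):
    """True when the target is small enough to gain from a cropped redraw."""
    return box_area_fraction(box, image_width, image_height) < max_fraction

# Full-body shot: small feet on a 1024x1024 canvas.
print(worth_refining((400, 900, 528, 1000), 1024, 1024))  # True
# Close-up: the box covers most of the canvas.
print(worth_refining((100, 100, 1000, 900), 1024, 1024))  # False
```

ADetailer's detection settings also appear to include mask min/max area ratio options, which may serve a similar purpose without any scripting.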
Installation:
Simply move the file into the ~\stable-diffusion-webui\models\adetailer folder and restart the webui. It should also work in ComfyUI, but I haven't tested it there. Of course, you'll need the ADetailer extension for Automatic 1111, or its equivalent in ComfyUI, for any of this to work.
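The install step can also be scripted. A minimal sketch, assuming hypothetical locations for the downloaded file and the webui folder (adjust both to your setup):

```python
import shutil
from pathlib import Path

def install_adetailer_model(model_file, webui_root):
    """Copy a .pt model into <webui_root>/models/adetailer and return the new path."""
    model_file = Path(model_file)
    target_dir = Path(webui_root) / "models" / "adetailer"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    return Path(shutil.copy2(model_file, target_dir))

# Example (hypothetical paths):
# install_adetailer_model(
#     Path.home() / "Downloads" / "adetailerFootYolov8x_v20.pt",
#     Path.home() / "stable-diffusion-webui",
# )
```

The webui still needs a restart afterwards so the new model shows up in the ADetailer model list.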
Tip: You can increase the ADetailer model count in Automatic 1111 by going to: Settings>ADetailer>Max models.
Note: Civitai doesn't seem to have a category for ADetailer models, so I'm listing this as a checkpoint even though it isn't one. The pruned/full and precision settings were picked arbitrarily and don't mean anything here.
Also note that Stable Diffusion these days seems to do feet well, at least in portrait aspect ratios, so I had a hard time coming up with a good use case for portrait images. Instead, I used the model to paint Tharja's toenails in the example. This model should be especially useful for landscape aspect ratios like the ones I normally use, where the feet tend to be quite low quality.
Description:
This version was created because I noticed that my training images were mistakenly labelled as my validation images, and vice versa. Initially, I simply switched the names around and trained for 100 epochs (since version 1 peaked at epoch 93). However, that led to problems: it detected not only feet but also whole bodies as though they were feet, so I tried something different.
For this version, I moved a lot of the old validation images into the training set and set 200 epochs, and training determined that epoch 148 was the best. Though this version has its own problems here and there, such as occasionally detecting hands and knees as feet, it now seems to detect feet better than v1.0 and the interim version.
If you find that it's falsely detecting hands, knees, or other objects as feet, I'd recommend raising the Detection model confidence threshold setting in ADetailer to at least 0.5 and/or using advanced prompt/ControlNet methods.
This model isn't perfect, but it should be an improvement over v1.0 for those who want a foot detection model that works beyond simple standing/sitting poses.
Note:
This site's being very buggy today, so to tell this upload apart from the buggy v2.0 entries, I'm temporarily calling it v2.2 so I can see what I'm working with (there are way too many duplicates, and they won't go away when I delete them). If those other versions go away, or this one uploads successfully, I'll rename it back to v2.0.
Trained words:
Name: adetailerFootYolov8x_v20.pt
Size (KB): 133508
Type: Model
Pickle scan result: Success
Pickle scan message: No Pickle imports
Virus scan result: Success