I've dabbled with ComfyUI. It allows for some good workflows and options, but it does take a lot of setup work (and I'm not sure I've got mine set up correctly).
So I'd suggest starting with Automatic1111 to get a feel for how it all works.
In terms of base models:
SD 1.5 - a lot of resources, LoRAs, etc. available, but it can have issues with hands, etc. if not prompted correctly (it seems to rely a lot on negative prompting). Max res: 512x512 before upscaling
SD 2.0 - not tried it, and I don't think the community got behind it. I believe max res is still 512x512
SDXL - a more mature model with better overall image quality (e.g. better handling of hands, etc.) and good community support. It requires more resources than 1.5, but max res is 1024x1024
SD 3 - very new, and the community seems to be turning against it, as the publicly released model seems to have problems. In theory it should be more efficient than SDXL while producing better images. Max res: 1024x1024
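If it helps, here's a rough Python sketch of how you might turn the resolutions listed above into a width/height for a non-square image. It isn't tied to any particular UI: the names NATIVE_RES and pick_size are made up for illustration, the SD 2.0 figure is just my belief from above, and rounding to a multiple of 8 is an assumption about what the tools will accept.

```python
# Resolution figures taken from the list above (hypothetical lookup table).
# Treat them as good starting points, not hard limits.
NATIVE_RES = {
    "sd15": 512,
    "sd20": 512,   # unverified, see the note in the list
    "sdxl": 1024,
    "sd3": 1024,
}


def pick_size(model: str, aspect: float = 1.0, multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) with roughly the same pixel count as the
    model's square resolution, at the requested width/height aspect ratio.

    Both sides are rounded to the nearest multiple of `multiple`
    (assumed here to be 8).
    """
    base = NATIVE_RES[model]
    area = base * base

    # Solve width * height = area with width / height = aspect.
    width = (area * aspect) ** 0.5
    height = width / aspect

    def snap(value: float) -> int:
        # Round to the nearest allowed step, never below one step.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print(pick_size("sd15"))          # square image for SD 1.5
    print(pick_size("sdxl", 16 / 9))  # widescreen image for SDXL
```

The idea is to keep the total pixel count close to what the model is comfortable with and only change the shape, then upscale afterwards if you need something bigger.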
The style of prompting varies quite a bit between SD 1.5, XL and 3, so a good prompt for SD 1.5 will not necessarily produce a good output in XL (and vice versa).
Not a definitive guide, but hopefully it gives you enough info to get started.
If anyone happens to know any checkpoints/models that are good for creating images of Transformers (Optimus Prime, Megatron, etc.), please point me in the right direction - I've found DALL-E and a few other online systems to be reasonably good at them, but Stable Diffusion just doesn't manage it (though I've not yet explored LoRAs for that).