How fast are Stable Diffusion and GPT on an M1, in your experience? I tried running a GPT-like model on regular x86 desktop hardware and it was horrible, basically unusable.
SD, particularly run from here, takes a few seconds to a minute or two to generate a full image on an M1 Mac mini, depending on other load (I'm usually doing other stuff on that machine). With this particular setup, GPT responses come back in a few seconds.
There's a lot of tweaking you can do; in particular, if you have a GPU, you can configure a llama model to offload work onto it, as in the sketch below.
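For example, here's a minimal sketch using llama-cpp-python (assuming it's installed with GPU support, e.g. CUDA or Metal; the model path and layer count are just illustrations):

    # Sketch: offload llama layers to the GPU via llama-cpp-python.
    # Requires a build with GPU support (pip install llama-cpp-python
    # with the appropriate backend flags). Model path is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/llama-7b.Q4_K_M.gguf",  # any local GGUF model file
        n_gpu_layers=-1,  # -1 offloads all layers; use a smaller number for partial offload
    )

    out = llm("Q: What is the capital of France? A:", max_tokens=32)
    print(out["choices"][0]["text"])

If the model doesn't fit entirely in VRAM, lowering n_gpu_layers lets you split the work between GPU and CPU.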