122 sats \ 1 reply \ @mrsu 15 Jun \ parent \ on: Making my local LLM voice assistant faster and more scalable with RAG tech
Yes. It's getting much easier. You can spin up a backend now without much technical knowledge. You don't even need a GPU (although one speeds things up significantly).
If you're interested, look into Ollama or LM Studio. Both expose local APIs for interfacing with the LLMs, and there are plenty of clients you can install and point at that endpoint.
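To give a feel for how simple the API side is, here's a minimal sketch of hitting Ollama's local `/api/generate` endpoint with nothing but the Python standard library. It assumes `ollama serve` is running on the default port (11434) and that you've already pulled a model; the model name `llama3` here is just a placeholder for whatever you have installed.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumes `ollama serve` is running).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt: str, model: str = "llama3") -> str:
    """Build the JSON body Ollama expects for a generate request."""
    return json.dumps({
        "model": model,      # placeholder; use any model you've pulled
        "prompt": prompt,
        "stream": False,     # one JSON object back instead of a token stream
    })

def ask(prompt: str, model: str = "llama3") -> str:
    """Send a prompt to the local Ollama server and return its reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (needs a running server):
# print(ask("Why is the sky blue?"))
```

Any client that speaks this kind of HTTP API can be pointed at the same endpoint, which is why you get so much choice on the front end.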