I like that you can use ChatGPT as a tool, like a spellchecker on steroids, but I'm not happy sending all my private information to a server outside of my control.
A few projects out there try to replicate ChatGPT as a local application that does not send any of your information to servers outside of your device. Everything is done locally.
I tried a few of them, such as privateGPT and gpt4all, and each has its pros and cons. Here I want to show you how to set up the one I found best for most people.
Step 1: Download the installer for your OS here
Step 2: Open the application. It will ask you to download a model; select GPT4All Falcon.
Step 3: Use it.
It only takes a couple of seconds to generate a response.
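If you would rather script this than click through the app, the gpt4all project also ships Python bindings (pip install gpt4all). The snippet below is only a rough sketch, not part of the steps above: the model file name is my assumption and may differ from what your install shows, and as far as I know the package downloads a missing model on first load. The helper names ask and load_falcon are made up for illustration.

```python
def ask(model, prompt, max_tokens=200):
    """Send one prompt to a local model and return the reply without padding.

    `model` is any object with a generate(prompt, max_tokens=...) method,
    which is how the gpt4all Python bindings expose text generation.
    """
    return model.generate(prompt, max_tokens=max_tokens).strip()


def load_falcon(model_file="gpt4all-falcon-newbpe-q4_0.gguf"):
    """Load a model through the gpt4all package.

    ASSUMPTION: the file name is a guess; use the name your install lists.
    The import is local so that ask() still works without the package.
    """
    from gpt4all import GPT4All

    return GPT4All(model_file)
```

Usage would then be something like ask(load_falcon(), "Fix the spelling in this sentence: ..."). Everything still runs on your own machine.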
You can also add your own documents under Settings->Plugins->LocalDocs Plugin. Once you add a folder, click on the database-looking icon and select which folders you want to include in your query.
It works surprisingly well for a local app that runs on a standard computer in real time.
Of course, it's not perfect, but I see it as a nice tool to have.
There was a post about exactly that in combination with Obsidian here recently
reply
Has anyone here experimented and documented the specs of a home AI standalone machine setup?
reply
It really depends on what you want to achieve.
A reasonable laptop can run GPT4All and Stable Diffusion quite well.
Probably the best deal would be a Mac mini with at least an M1 chip. Those are extremely powerful machines, and they are silent. They run both GPT and SD without any issues.
reply
I've been considering an M1 Mac mini as an AI workhorse. The minis in particular are unusually cost-efficient for an Apple machine.
reply
Apple silicon is a game changer.
reply
How quick are stable diffusion and gpt with M1, in your experience? I tried running a gpt-like on some regular x86 desktop hardware and it was horrible, basically unusable.
reply
SD, particularly the version from here, takes anywhere from a few seconds to a minute or two to generate a full image on an M1 Mac mini, depending on other load (I'm usually doing other stuff on it). With this particular setup, GPT should generate a response in a few seconds.
There's a lot of tweaking that can be done. In particular, if you have a GPU, you can configure a llama model to run in parallel on the GPU.
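For the GPU part, here is a rough sketch of what that configuration can look like with the llama-cpp-python package, which is a different tool from the GPT4All app in the post. It assumes the package was built with GPU support, and as far as I know n_gpu_layers=-1 asks it to offload every layer. The helper names gpu_options and load_on_gpu are made up for illustration, and the model path is a placeholder.

```python
def gpu_options(model_path, gpu_layers=-1, context_tokens=2048):
    """Build keyword arguments for llama_cpp.Llama.

    gpu_layers=-1 requests offloading all layers to the GPU; a smaller
    positive number offloads only that many, which helps when VRAM is tight.
    """
    return {
        "model_path": model_path,
        "n_gpu_layers": gpu_layers,
        "n_ctx": context_tokens,
    }


def load_on_gpu(model_path, gpu_layers=-1):
    """Load a llama model with GPU offload.

    ASSUMPTION: llama-cpp-python is installed with a GPU backend.
    The import is local so gpu_options() works without the package.
    """
    from llama_cpp import Llama

    return Llama(**gpu_options(model_path, gpu_layers))
```

If the model does not fit in VRAM, lowering gpu_layers keeps the remaining layers on the CPU.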
reply
Thanks for sharing!
reply
I tried it out. It's extremely slow. I sent a simple "Hello" prompt, and it took about 10 seconds to write a response back. My PC is fairly decent (i7 and RTX 2000 with 16 GB RAM).
reply
Interesting.
I think there's no GPU acceleration at all by default, at least with this setup, so the speed depends entirely on the CPU and the instructions it supports. The more modern the CPU, the better.
I've read that you can configure llama to use the GPU to get much better results.
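If you want to see which of those CPU instructions your machine supports, here is a small sketch for Linux on x86 that reads the flags line from /proc/cpuinfo. The list of flags to look for is my own pick of the vector extensions that CPU inference usually benefits from, not something taken from the app. Other operating systems expose this information differently.

```python
SIMD_FLAGS = ("avx", "avx2", "avx512f", "fma")


def simd_support(cpuinfo_text):
    """Report which of SIMD_FLAGS appear in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 Linux lists CPU features on lines whose key is "flags".
        if key.strip() == "flags":
            flags.update(value.split())
    return {name: name in flags for name in SIMD_FLAGS}


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the raw CPU description (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Calling simd_support(read_cpuinfo()) gives a dict of flag names to True or False. An older CPU without avx2 would be one plausible reason for slow responses.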
reply