
Ah, I hadn't heard of it, which shows how unfamiliar I am with these tools.

147 sats \ 4 replies \ @optimism 2h

It's probably not as fast as the one you linked, but the vector db isn't the bottleneck for me: most text embedding models run super slow on my machine, and I'm not entirely sure why yet.
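If you want to confirm where the time goes, timing the two stages separately makes it obvious. A minimal sketch (the model name is just a common example; assumes sentence-transformers and chromadb are installed):

```python
import time

import chromadb
from sentence_transformers import SentenceTransformer

# Any sentence-transformers checkpoint works the same way; this one is
# just a small, common example.
model = SentenceTransformer("all-MiniLM-L6-v2")
texts = ["some document text to embed"] * 256

t0 = time.perf_counter()
vecs = model.encode(texts)  # the embedding pass, usually the slow part
t1 = time.perf_counter()

collection = chromadb.Client().create_collection("bench")
collection.add(ids=[str(i) for i in range(len(texts))],
               embeddings=vecs.tolist())
t2 = time.perf_counter()

print(f"embed: {t1 - t0:.2f}s  insert: {t2 - t1:.2f}s")
```

On CPU the embed step typically dwarfs the insert, which is consistent with the vector db not being the bottleneck.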

This morning I used Chroma for audio embeddings, with a cheap, old tokenizer, just to see if that actually works (it does, because apparently the maffs don't give a shit whether a float comes from text, pictures, or audio).
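Roughly what that looks like; the embedding function here is a stand-in (random vectors), since Chroma only ever sees floats:

```python
import chromadb
import numpy as np

# Stand-in for a real audio embedding model. Chroma just stores float
# vectors, so it genuinely doesn't care what modality produced them.
def embed_audio(path: str) -> list[float]:
    rng = np.random.default_rng(abs(hash(path)) % 2**32)
    return rng.standard_normal(512).tolist()  # hypothetical 512-dim vector

client = chromadb.Client()
collection = client.create_collection("audio")

files = ["clip1.wav", "clip2.wav", "clip3.wav"]
collection.add(ids=files, embeddings=[embed_audio(f) for f in files])

# Nearest neighbors for a new clip, same as a text query.
hits = collection.query(query_embeddings=[embed_audio("query.wav")],
                        n_results=2)
print(hits["ids"])
```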

147 sats \ 3 replies \ @k00b OP 2h

afaik if you're running the embedding model on a GPU, or quantized on a CPU, it shouldn't be super slow. But I also haven't run much of this stuff locally yet.
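The quantized-on-CPU route usually means llama.cpp with a GGUF model. A sketch assuming llama-cpp-python; the model path is a placeholder for whatever GGUF embedding model you have locally:

```python
from llama_cpp import Llama

# Placeholder path; any local GGUF embedding model works here.
llm = Llama(model_path="./nomic-embed-text-v1.5.Q4_K_M.gguf",
            embedding=True, verbose=False)

out = llm.create_embedding("some document text")
vec = out["data"][0]["embedding"]
print(len(vec))  # embedding dimension
```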

147 sats \ 2 replies \ @optimism 2h

I've been running it on Apple Metal; torch says it's using the NPU, but the Apple part is probably why it's such a mess.
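For what it's worth, torch's `mps` backend runs on the GPU through Metal Performance Shaders, not the Neural Engine; PyTorch doesn't target the ANE at all. Quick way to check what you're actually on:

```python
import torch

print(torch.backends.mps.is_available())  # Metal (GPU) backend present?
print(torch.backends.mps.is_built())      # torch compiled with MPS support?

x = torch.randn(1024, 1024, device="mps")
y = x @ x  # executes on the GPU via Metal, not the NPU
```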

147 sats \ 1 reply \ @k00b OP 1h

We were only scratching the surface when I was in college, but everyone imagined inference would be much cheaper/more efficient than it ended up being.

If bigger=smarter forever, edge inference will always be relatively slow/dumb.

147 sats \ 0 replies \ @optimism 1h

Like with all things, that extrapolation of the upslope fails to consider that fun isn't infinite (I hate this fact of life). So there's a time when bigger=smarter, and a time when the returns diminish on how much smarter you get for your bigger; at that equilibrium, suddenly smarter=smarter.
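The published scaling-law fits say the same thing in math: loss falls as a power law in parameters and data, so each doubling buys less. One published example is the Chinchilla-style form (Hoffmann et al., 2022):

$$
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
$$

where $N$ is parameter count, $D$ is training tokens, $E$ is irreducible loss, and the fitted exponents $\alpha, \beta$ are well under 1, which is exactly the diminishing-returns curve.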

We'll get there.
