I'm getting around 5 tokens/sec with an i7 + 16GB RAM + RTX 2000 running LLaMA 7B. That's not fast enough for me to consider it usable.
This link was posted by birriel 41 minutes ago on HN. It received 64 points and 6 comments.