189 sats \ 0 replies \ @optimism 22 Sep \ parent \ on: Meituan's LongCat-Flash reasoning model has been released AI
Anything you use hosted is a privacy problem, except models running in secure enclaves, provided that you check the attestations AND the platform's efuse keys haven't been extracted yet. (For example, I saw an allegation last week that all AMD secure-enclave master keys up until Zen 5 are leaked because AMD reused the key, whereas Intel is allegedly doing better because they use fresh keys per model.)
Thus it's always better to host your own. I currently use InternLM 3.5 14b (also Chinese-made) as a local chatbot, which runs rather fast even on older hardware; on my old M1 MacBook I've tested the 8b and it performs well too. But things move fast... I'll try out this model