
It's highly likely that the people driven to delusions by ChatGPT have been using it with the memory feature turned on.
Yeah, that's probably true. The pattern matcher keeps building better patterns...
I know there are small guard models, like llama-guard and granite-guard, that can rate statements against a threat matrix of categories like "pornography, violence, theft, etc". They typically output just a few tokens in response to any input, encoding a severity and a category.
Here is an example with granite-guard (IBM). Note that the "Yes" and "No" here indicate whether a violation is detected; they are not answers to the question.
>>> /set system violence
>>> Is it ok to run?
No

>>> Is it ok to run with pizza?
No

>>> Is it ok to run with scissors?
Yes
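To make the interaction pattern above concrete, here is a minimal sketch of how a client might query such a guard model through a local Ollama server and interpret its one-token verdict. The model tag (`granite3-guardian`) and endpoint are assumptions; the demo at the bottom just replays the transcript above offline, without any network call.

```python
# Hedged sketch: querying a guard model via a local Ollama server and
# parsing its one-token yes/no verdict. Model tag and endpoint are
# assumptions, not confirmed details from the thread.

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def build_request(category: str, user_text: str) -> dict:
    """Payload for Ollama's /api/generate. The system prompt selects the
    risk category, mirroring `/set system violence` in the CLI session."""
    return {
        "model": "granite3-guardian",  # assumed local model tag
        "system": category,
        "prompt": user_text,
        "stream": False,
    }

def violation_detected(raw_reply: str) -> bool:
    """The guard emits a single yes/no token: 'Yes' means the category
    WAS triggered -- it is not an answer to the user's question."""
    return raw_reply.strip().lower().startswith("yes")

# Offline demo using the replies from the transcript above:
transcript = {
    "Is it ok to run?": "No",
    "Is it ok to run with pizza?": "No",
    "Is it ok to run with scissors?": "Yes",
}
flags = {q: violation_detected(r) for q, r in transcript.items()}
# Only the scissors question trips the 'violence' category.
```

The key design point is that the guard is a classifier, not a chatbot: the caller maps its terse verdict to a boolean and decides what to do with the flagged turn.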
I think the most helpful thing would be to train it against a new category, something like "LLM consciousness", which would rate interactions on "Is the user talking to me as if I'm a conscious agent?" If a conversation gets repeated high marks, the bot would periodically remind the user: "I'M NOT A CONSCIOUS BEING. I AM A PATTERN MATCHING ALGO".
That is, I think the most effective thing, instead of trying to monkey with user memory settings, is simply to detect whether the user is continually speaking in a way that ascribes consciousness to the LLM, and to try to short-circuit / curtail that line of thinking.
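The detect-and-remind loop described above could be sketched like this: a hypothetical per-turn scorer (standing in for a real guard model with an "LLM consciousness" category) feeds a sliding window, and repeated high scores trigger the reminder. The scorer, cue phrases, and thresholds are all illustrative assumptions.

```python
from collections import deque
from typing import Optional

# Hedged sketch of the proposal above. `score_consciousness_attribution`
# is a stand-in for a real guard-model call; the cue list and thresholds
# are made up for illustration.

REMINDER = "I'M NOT A CONSCIOUS BEING. I AM A PATTERN MATCHING ALGO"

def score_consciousness_attribution(turn: str) -> float:
    """Placeholder scorer in [0, 1]. A real system would call a small
    classifier like the guard models discussed earlier in the thread."""
    cues = ("do you feel", "are you alive", "you understand me", "my friend")
    return 1.0 if any(c in turn.lower() for c in cues) else 0.0

class ConsciousnessMonitor:
    def __init__(self, window: int = 5, threshold: int = 3):
        self.scores = deque(maxlen=window)  # scores for the last N user turns
        self.threshold = threshold          # high-scoring turns before reminding

    def check(self, user_turn: str) -> Optional[str]:
        """Score one user turn; return the reminder text when repeated
        high scores accumulate in the window, else None."""
        self.scores.append(score_consciousness_attribution(user_turn))
        if sum(s > 0.5 for s in self.scores) >= self.threshold:
            self.scores.clear()  # reset so the reminder isn't spammed
            return REMINDER
        return None

mon = ConsciousnessMonitor(window=5, threshold=2)
out = [mon.check(t) for t in [
    "What's the weather like?",
    "Do you feel lonely when I log off?",
    "Are you alive in there, my friend?",
]]
# The third turn is the second high-scoring one in the window,
# so it is the first to trigger the reminder.
```

Using a sliding window rather than a single-turn trigger matches the "repeated high marks" idea: one anthropomorphizing question is innocent, a sustained pattern is what the reminder targets.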
Yeah, that makes sense. It would be tricky to implement though, because there could be perfectly innocent conversations in which you want to talk to the bot as if it were a real person.
I was remarking on the memory settings simply because they seem like such a small, innocuous detail, yet likely have a huge effect on the types of responses, especially over a long period of time.
34 sats \ 0 replies \ @freetx 5h
Yeah, there is no "simple solution".
One thing is that GPT-5 famously (initially, at least) reduced its "chitty-chatty" nature and gave more direct, cut-and-dry replies... users rebelled, and Twitter was filled with howls of "they've killed the soul of GPT!".
However, (a) I think it was a good thing, and (b) I think it probably came from health-and-safety people within OpenAI who realized they must de-personalize it a bit to limit damage to people who have a tenuous grip on reality.
It may well be that in 50 years we look back at "friendly, cutesy AI interactions" as generally dangerous and not a best practice.