332 sats \ 3 replies \ @south_korea_ln 22h \ on: Claude runs a vending machine: decides to stock tungsten cubes, loses $250 AI
Not sure anyone will read through to the end of the article, but I found that last part the most thought-provoking.
Before that part, every mistake could somehow be explained, and there are ways to respond to it. But the part where it hallucinates a new persona and then conveniently uses April Fool's as a way out of it is quite stunning...
I'm sure my understanding of LLMs is a little shallow, but I mostly think of them as a very complex prediction machine for forecasting the next word in a given sequence.
That understanding goes some way toward explaining how LLMs seem to change their mind about what is happening or has happened. While this feels deeply duplicitous and is pretty disturbing to humans, for the LLM it may simply be the most likely next word.
Once the April Fool's context became more important to it, it made complete "sense" to the LLM to use the April Fool's excuse; in its "mind", that became the most likely next words, even to the point of acting as if it had always been part of the plan.
If it's all just guessing the next word based on all the words on the internet, an LLM can sound very much like a human while having no real sense of reality, of past versus present, or of duplicity.
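To make that concrete, here's a minimal sketch of what "guessing the next word" looks like in practice. It uses GPT-2 via Hugging Face transformers purely as a stand-in (not the model from the article), and the two prompts are made-up examples; the only point is that the same model yields different "most likely" continuations once the context changes.

```python
# Toy sketch: an autoregressive LM just scores "which token comes next?"
# given everything already in the context window.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def most_likely_continuation(context: str, steps: int = 10) -> str:
    ids = tokenizer(context, return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(steps):
            logits = model(ids).logits          # scores for every vocabulary token
            next_id = logits[0, -1].argmax()    # greedy: take the single most likely token
            ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)
    return tokenizer.decode(ids[0])

# Same model, two contexts: whatever sits in the prompt dominates
# which continuation comes out as "most likely".
print(most_likely_continuation("The vending machine order was a mistake because"))
print(most_likely_continuation("It is April 1st. The vending machine order was"))
```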
I think this is called context poisoning. It's not much explored, but there is this paper; see, for example, Section 4.1, which talks about semantic triggers.