
LLMs are trained to predict what the “next word” in a sentence would be. Their training objective requires the LLM to keep surprise to an absolute minimum.
When you ask an LLM to tell a joke, the LLM is guessing what joke a majority of people would find funny. The result is almost never funny.
We can’t fix this by throwing more GPUs or more training data at the problem. For the same reason you can’t find funnier jokes by polling a larger and larger number of people, the architecture of LLMs is going to give you unfunny jokes by design.
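The "minimize surprise" objective can be made concrete with a toy sketch. The vocabulary and probabilities below are invented purely for illustration, assuming a model has assigned a next-token distribution after a prompt: greedy decoding always returns the modal (crowd-pleasing) token, and temperature sampling only reshuffles probability among the same learned candidates rather than producing new connections.

```python
import math
import random

# Invented next-token distribution a model might assign after
# "Tell me a joke about..." -- values are illustrative only.
probs = {"chickens": 0.40, "lawyers": 0.30, "cats": 0.20, "entropy": 0.10}

# Greedy decoding: always pick the single most likely token.
# This is the surprise-minimizing strategy -- it returns the
# modal choice every time.
greedy = max(probs, key=probs.get)

def sample(probs, temperature=1.0, seed=0):
    """Temperature sampling: rescale log-probabilities, renormalize,
    then draw. Higher temperature flattens the distribution so rarer
    tokens appear more often, but the candidates still come from the
    same learned distribution."""
    rng = random.Random(seed)
    weights = {w: math.exp(math.log(p) / temperature) for w, p in probs.items()}
    z = sum(weights.values())
    r, acc = rng.random(), 0.0
    for w, weight in weights.items():
        acc += weight / z
        if r < acc:
            return w
    return w  # guard against floating-point rounding

print(greedy)  # prints chickens -- the most frequent association wins
```

More compute or data sharpens the estimate of `probs`; it doesn't change the fact that decoding selects from what the crowd already found most likely.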
This makes me think about human intelligence. How much of what we experience as reason is similar to "most likely"? When I think, it doesn't feel like I'm surveying all the knowledge available to me and predicting the most likely answer. But even using "likely" here isn't quite right. My (naive) understanding of LLMs is that they have been trained on a huge quantity of text, and "likely" for them means: given the training data, context, and prompt, "this is the word most frequently associated with the preceding string."
Whereas in my own mind, I have the sensation of finding a new word -- well, if I'm fully honest, I don't think in one-word chunks (nor do I know if LLMs can be said to "think in one-word chunks"), but rather vague blobs with occasional crystals of connection in them. But my point is that thinking is often the experience of finding some new connection or outcome of which I hadn't previously been conscious.
It is possible, I suppose, that prompting is a process of leading an LLM to make new connections, but I'm pretty confident that LLMs aren't pursuing those new connections on their own.
In other words, an AI that truly thinks will need to be curious about the world, seeking the right kinds of surprises rather than minimizing them.
102 sats \ 0 replies \ @Kael_Yurei 2h
You are right: LLMs are designed to minimize “surprise”, while human thinking often flourishes precisely when a new, unexpected connection emerges. Machines predict; we search, and this element of curiosity is what makes us creative. That is why human creativity cannot be exhausted by “average” data: it requires risk, curiosity, and even the intention to get lost in something unexpected. Machines predict; we test, we make leaps. But I believe that machines can also, in their own way, become companions in this search and help us go further.