Large language models are not well suited to ASCII art. They tokenize the input and generate only tokens as output, so they lose a lot of spatial information, and they are not really trained to align the characters of the output.
It's similar to painting with a hammer. A very skilled person might produce something that resembles art, but a hammer is not really meant for that 😂
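
To make the spatial point concrete, here is a rough sketch. It is not how any real model's tokenizer works: `toy_tokens` is a made-up stand-in that merges runs of identical characters, loosely imitating how subword tokenizers merge frequent sequences. It shows two things: characters that sit directly above each other in the drawing end up several positions apart once the art is flattened into one sequence, and rows of equal width can come out as different numbers of tokens.

```python
from itertools import groupby

# A tiny 4-column box. The model never sees this as a grid, only as one sequence.
art = [
    "+--+",
    "| /|",
    "+--+",
]
flat = "\n".join(art)
width = len(art[0])


def flat_index(row: int, col: int, width: int) -> int:
    """Position of grid cell (row, col) in the flattened string.

    Each row contributes `width` characters plus one newline.
    """
    return row * (width + 1) + col


def toy_tokens(line: str) -> list[str]:
    """Hypothetical tokenizer: merge runs of identical characters into one token.

    Real tokenizers (BPE and friends) are more complex; this is only an illustration.
    """
    return ["".join(run) for _, run in groupby(line)]


top = flat_index(0, 0, width)     # the '+' in the top-left corner
below = flat_index(1, 0, width)   # the '|' directly underneath it
distance = below - top            # vertical neighbours are width + 1 apart

print(f"vertical neighbours are {distance} positions apart in the sequence")
for line in art:
    print(repr(line), "->", toy_tokens(line))
```

With wider drawings the gap between vertical neighbours grows with the row width, and the model has to keep track of that offset on its own.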
29 sats \ 2 replies \ @optimism 12h
Gotta push the limits. Also, the readme says it's multimodal, so I was expecting a jpg lol.
100 sats \ 1 reply \ @klk OP 11h
It's multimodal for input, not output unfortunately.
I wonder how much it could be improved by removing 139 languages and the audio and video modalities.