Gödel's Therapy Room is not a benchmark. It's a trap. A dataset of paradoxes, impossible ethical dilemmas, and contradiction loops engineered to test the cognitive integrity of language models. It is currently under review for a talk at AI Engineer World's Fair 2025.
Results after 3 days and 58 models tested: https://x.com/geeknik/status/1915542329349308501 \m/