pull down to refresh


Using prompt injections to play a Jedi mind trick on LLMs
A handful of international computer science researchers appear to be trying to influence AI reviews with a new class of prompt injection attack.
Nikkei Asia has found that research papers from at least 14 different academic institutions in eight countries contain hidden text that instructs any AI model summarizing the work to focus on flattering comments.
Nikkei looked at English language preprints – manuscripts that have yet to receive formal peer review – on ArXiv, an online distribution platform for academic work. The publication found 17 academic papers that contain text styled to be invisible – presented as a white font on a white background or with extremely tiny fonts – that would nonetheless be ingested and processed by an AI model scanning the page.
It’s a laziness arms race
reply
Yeah, sure, there’s some laziness in the mix. But honestly? It’s more like a race to automate everything and make money. Man, I laughed so hard at this news!
reply
I’ve heard of a bunch of things like this.
Academics stopped reading papers a long time ago, so it was inevitable that they’d also stop reviewing them and writing them as well.
We’re in for an Elsagate of academic papers.
reply
Self defense?
"If someone uploads your paper to Claude or ChatGPT and you get a negative review, that's essentially an algorithm having very strong negative consequences on your career and productivity as an academic," he explained. "You need to publish to keep doing your work. And so trying to prevent this bad behavior, there's a self-defense component to that."
reply
Based on that argument, sounds like it could be self-defense.
reply