Well, this is probably already apparent to many of you, but it helps to see it written down:
With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator.
LLMs enable "fully automated deanonymization attacks that operate on unstructured text at scale. Where previous approaches required predefined feature schemas, careful data alignment, and manual verification, LLMs can extract identity-relevant signals from arbitrary prose, efficiently search over millions of candidate profiles, and reason about whether two accounts belong to the same person."
A framework for scalable deanonymization with LLMs
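The quoted pipeline has three stages: extract identity-relevant signals from prose, search over candidate profiles, and reason about whether two accounts match. Here is a toy sketch of that shape, assuming nothing from the paper's actual implementation: simple keyword overlap stands in for the LLM in both the extraction and matching stages, and all names and posts are invented.

```python
# Hypothetical three-stage sketch: (1) extract signals, (2) search
# candidates, (3) score matches. A real attack would use an LLM for
# stages 1 and 3; keyword overlap is a crude stand-in here.

def extract_signals(posts):
    """Stage 1: pull identity-relevant tokens (stand-in for LLM extraction)."""
    stopwords = {"the", "a", "and", "i", "to", "of", "in", "is", "it"}
    signals = set()
    for post in posts:
        for word in post.lower().split():
            word = word.strip(".,!?")
            if word and word not in stopwords:
                signals.add(word)
    return signals

def match_score(signals_a, signals_b):
    """Stage 3: Jaccard overlap as a stand-in for LLM same-person reasoning."""
    if not signals_a or not signals_b:
        return 0.0
    return len(signals_a & signals_b) / len(signals_a | signals_b)

def rank_candidates(anon_posts, candidate_profiles):
    """Stage 2: search every candidate, ranked by match score."""
    target = extract_signals(anon_posts)
    scored = [
        (name, match_score(target, extract_signals(posts)))
        for name, posts in candidate_profiles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data: one pseudonymous account, three invented candidate profiles.
anon = ["Debugging Erlang at my fintech job in Lisbon again"]
candidates = {
    "alice": ["I work on Erlang systems at a fintech in Lisbon"],
    "bob": ["Gardening tips for tomatoes in cold climates"],
    "carol": ["Rust compiler internals are fascinating"],
}
print(rank_candidates(anon, candidates)[0][0])  # prints "alice"
```

Even this crude version shows why volume matters: every extra post adds signals, and signals compound across a large candidate pool.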
Users who share more content are substantially easier to identify.
The advice they give is this:
Avoid sharing specific details, and adopt a security mindset: if a team of smart investigators were trying to identify you from your posts, could they plausibly figure out who you are? If yes, LLM agents will soon be able to do the same.
But honestly, I don't think it's too helpful. My bigger question is: how likely is it that an LLM can deanonymize me even if I am careful about the specific details I post?
https://twiiit.com/dpaleka/status/2024892671563891130