🤖 AI Summary
Recent research demonstrates that large language models (LLMs) can perform large-scale deanonymization online, reshaping the AI and machine learning community's understanding of privacy in digital spaces. The researchers developed a methodology in which an agent with full internet access accurately re-identifies pseudonymous users, drawn from platforms such as Hacker News and from Anthropic interviews, by analyzing their profiles and conversation history. A task that could take a human investigator hours is streamlined into an automated attack pipeline that extracts identity-relevant features, searches for candidate matches using semantic embeddings, and verifies those matches to minimize false positives.
The study constructed three datasets to evaluate the LLM-based technique, achieving up to 68% recall at 90% precision, far surpassing traditional baselines, which achieved near-zero recall at the same precision. This advance raises serious concerns about the efficacy of current privacy protections for pseudonymous users and suggests that online privacy threat models need reassessment. The findings emphasize that the obscurity once considered a safeguard for user anonymity is increasingly fragile, and they urge the AI/ML community and policymakers to address the implications for digital identity and privacy protection in the age of advanced AI tools.
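"Recall at a fixed precision," the metric quoted above, is computed by sweeping a confidence threshold over the attack's scored predictions and taking the best recall among operating points whose precision stays at or above the floor. A minimal sketch of the metric definition (the paper's datasets and scores are not reproduced here):

```python
def recall_at_precision(scored: list[tuple[float, bool]],
                        total_positives: int,
                        min_precision: float = 0.90) -> float:
    """Highest recall achievable at or above a precision floor.

    scored: (confidence, is_correct) pairs for the attack's predicted matches.
    total_positives: number of ground-truth identifiable users (recall denominator).
    """
    tp = 0
    best_recall = 0.0
    # Sweep the threshold from strictest to loosest by descending confidence.
    for i, (_, correct) in enumerate(sorted(scored, key=lambda p: -p[0]), 1):
        tp += correct
        if tp / i >= min_precision:           # precision at this threshold
            best_recall = max(best_recall, tp / total_positives)
    return best_recall

# Example with 4 identifiable users and one wrong mid-confidence guess:
preds = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
print(recall_at_precision(preds, total_positives=4))  # → 0.5
```

Here a perfect-precision prefix of two correct guesses yields 50% recall; admitting the false positive would drop precision below the 90% floor, so the later correct guess cannot count.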