Trends in LLM-Generated Citations on ArXiv (spylab.ai)

🤖 AI Summary
A recent analysis of arXiv preprints reveals a rising trend of hallucinated citations: references generated by large language models (LLMs) that attribute papers to the wrong authors or link to unrelated work. Though currently estimated at only about 0.025% of references, this figure likely understates the true extent, since it counts only hallucinations that match existing arXiv titles. The rise in such erroneous citations coincides with the emergence of AI research assistants like Deep Research, suggesting growing reliance on automated tools for literature compilation. These fabricated citations can misattribute credit, distort the scholarly record, and propagate misinformation across publications.

The study used a pipeline that analyzes PDFs of arXiv papers published since 2020, combining text extraction tools with heuristic filters and manual verification to separate genuine references from hallucinations. Before 2022, flagged cases were mostly parsing errors or rare title collisions; by mid-2025, systematic fabrication patterns, such as mismatched authors and incorrect links, had become more common.

The findings raise concerns both about the readiness of current LLMs to generate accurate academic citations and about the scholarly rigor of authors who increasingly depend on these tools. Because citation integrity underpins reliable scientific discourse, this emerging citation noise challenges the AI/ML community to develop better verification methods and LLMs with stronger factual grounding to safeguard the quality of research output.
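To make the detection idea concrete, here is a minimal sketch of one such check, not the authors' actual pipeline: look a cited title up in the public arXiv API and flag the reference when the cited authors share no surnames with the real ones. The function names (`arxiv_lookup`, `looks_hallucinated`), the surname-overlap heuristic, and the example inputs are illustrative assumptions.

```python
# Sketch: flag a reference whose title exists on arXiv but whose cited
# authors do not overlap with the paper's real authors (a hallucination
# signal described in the summary above). Heuristics here are simplified.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def arxiv_lookup(title: str) -> list[dict]:
    """Query the public arXiv Atom API for entries matching a title."""
    query = urllib.parse.urlencode({
        "search_query": f'ti:"{title}"',
        "max_results": "5",
    })
    url = f"http://export.arxiv.org/api/query?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        feed = ET.fromstring(resp.read())
    entries = []
    for entry in feed.findall("atom:entry", ATOM_NS):
        entries.append({
            "title": entry.findtext("atom:title", "", ATOM_NS).strip(),
            "authors": [
                a.findtext("atom:name", "", ATOM_NS).strip()
                for a in entry.findall("atom:author", ATOM_NS)
            ],
        })
    return entries


def surname(full_name: str) -> str:
    return full_name.split()[-1].lower()


def looks_hallucinated(cited_title: str, cited_authors: list[str]) -> bool:
    """Return True if the title matches an arXiv paper whose real authors
    share no surname with the cited authors (likely misattribution)."""
    for entry in arxiv_lookup(cited_title):
        if entry["title"].lower() != cited_title.lower():
            continue
        real = {surname(a) for a in entry["authors"]}
        cited = {surname(a) for a in cited_authors}
        return not (real & cited)  # zero overlap is suspicious
    # Title not found on arXiv: outside the scope of this check, matching
    # the summary's caveat that only existing-title hallucinations count.
    return False


if __name__ == "__main__":
    # Hypothetical reference with made-up authors, for demonstration only.
    print(looks_hallucinated(
        "Attention Is All You Need",
        ["J. Doe", "A. N. Other"],
    ))
```

A real pipeline would also need robust reference extraction from PDFs and fuzzy title matching; this sketch only shows the author-mismatch step.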