History LLMs: Models trained exclusively on pre-1913 texts (github.com)

🤖 AI Summary
Researchers from the University of Zurich and the University of Cologne have released a family of large language models (LLMs) trained exclusively on texts published before 1913, designed to serve as windows into the past. Built on the Qwen3 architecture at sizes up to 4 billion parameters, the models are trained on 80 billion tokens of historical data so that they reproduce the perspectives and discourse of their era. Unlike modern LLMs, which suffer from hindsight contamination, these time-locked models cannot draw on information beyond their cutoff date, so their outputs reflect pre-World War I thought, a property aimed at research in the humanities and social sciences.

The significance of the project lies in its potential to probe historical viewpoints on social issues, from gender roles to political tensions, without contemporary framing leaking into the responses. Because the models embody the language and ideas of their training data, they offer a new mode of interaction with historical texts and narratives. The researchers caution, however, that the models reproduce the biases of their sources, including racism and misogyny; they treat this fidelity as necessary for studying historical context rather than as something to be filtered out. The team plans a responsible-access framework for researchers and is seeking input on research priorities and on methods for validating the models' outputs.
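The "time-locked" property comes from a hard data cutoff rather than post-hoc filtering of responses: nothing published in 1913 or later enters the training corpus at all. Below is a minimal sketch of what such a corpus filter amounts to; the `Document` type and its metadata fields are hypothetical stand-ins for illustration, not the project's actual pipeline.

```python
from dataclasses import dataclass

CUTOFF_YEAR = 1913  # texts from this year onward are excluded entirely

@dataclass
class Document:
    title: str
    year: int   # publication year, e.g. from catalog metadata (assumed available)
    text: str

def time_locked_corpus(docs: list[Document], cutoff: int = CUTOFF_YEAR) -> list[Document]:
    """Keep only documents published strictly before the cutoff year."""
    return [d for d in docs if d.year < cutoff]

if __name__ == "__main__":
    docs = [
        Document("On the Origin of Species", 1859, "..."),
        Document("The Great Gatsby", 1925, "..."),  # post-cutoff: dropped
    ]
    kept = time_locked_corpus(docs)
    print([d.title for d in kept])  # ['On the Origin of Species']
```

In practice the hard part is reliable publication metadata: a document with an uncertain date has to be dropped, since even a handful of post-cutoff texts would reintroduce hindsight contamination.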