🤖 AI Summary
The author frames translation as a change-of-basis problem from linear algebra: concepts are abstract vectors, and the words of a language are the basis coordinates used to represent them. Some languages have a basis vector that lines up closely with a concept, so a single word captures it; others must “spread” the same concept across many words. That difference is cosmetic in the math but significant for humans: limited time, cognitive load, and the discrete nature of vocabulary mean translators often perform a PCA-like dimensionality reduction (keeping only the largest coordinates) or introduce inaccuracies when they pick imperfect words. The essay also likens vocabulary limits to quantization: language can’t express infinitesimal shades of meaning, so translations necessarily lose nuance.
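To make the analogy concrete, here is a minimal numpy sketch, an illustration rather than code from the essay: the same concept vector is expressed in a basis whose first axis happens to line up with it (“one word”) and in a generic basis (“many words”); keeping only the largest coordinates, or rounding them to a coarse grid, loses far more in the second case. The helper `truncate`, the toy dimensions, and the quantization grid are assumptions made for the example.

```python
# Toy sketch (not from the essay): one "concept" vector expressed in two
# different orthonormal bases, then truncated PCA-style and quantized.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
concept = rng.normal(size=dim)              # an abstract concept, basis-free
concept /= np.linalg.norm(concept)

# Language A: a basis whose first vector points almost exactly at the concept
# (one word covers it). Built by orthonormalizing [concept, random vectors].
basis_a, _ = np.linalg.qr(np.column_stack([concept, rng.normal(size=(dim, dim - 1))]))

# Language B: a generic random orthonormal basis (the concept is "spread out").
basis_b, _ = np.linalg.qr(rng.normal(size=(dim, dim)))

coords_a = basis_a.T @ concept              # change of basis: coordinates in A
coords_b = basis_b.T @ concept              # coordinates in B

def truncate(coords, k):
    """PCA-like shortcut: keep only the k largest-magnitude coordinates."""
    out = np.zeros_like(coords)
    top = np.argsort(np.abs(coords))[-k:]
    out[top] = coords[top]
    return out

for name, basis, coords in [("A", basis_a, coords_a), ("B", basis_b, coords_b)]:
    approx = basis @ truncate(coords, k=2)  # "translate" with only two words
    print(name, "error after keeping 2 coords:", round(np.linalg.norm(concept - approx), 3))

# Vocabulary-as-quantization: rounding coordinates to a coarse grid loses nuance.
quantized = basis_b @ (np.round(coords_b * 2) / 2)
print("error after quantizing B's coords:", round(np.linalg.norm(concept - quantized), 3))
```

In this sketch language A reconstructs the concept almost perfectly from two coordinates, while language B, whose axes don’t align with the concept, incurs a visible error, which is the essay’s point about compact versus spread-out vocabularies.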
For the AI/ML community this is directly relevant because modern LLMs literally encode language as vectors and perform linear-algebra operations on them. Cross-lingual alignment, embedding dimensionality, and model quantization all affect how well a model preserves culturally specific or tightly clustered concepts. Practical implications: multilingual systems must learn robust basis mappings (better alignment, higher-dimensional or contextualized embeddings), avoid over-aggressive quantization or compression, and use richer strategies (subword vocabularies, glosses, explicit explanations) when fidelity matters. The piece highlights why “untranslatable” phenomena aren’t mystical: they arise from representational choices and resource trade-offs in both humans and machines.
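As a concrete instance of the alignment point, below is a hedged numpy sketch of one standard cross-lingual mapping recipe, orthogonal Procrustes over a small bilingual dictionary; this is a well-known technique from the word-embedding alignment literature, not something the essay itself prescribes, and the toy data and variable names are assumptions.

```python
# Hedged sketch: align two embedding spaces with orthogonal Procrustes.
import numpy as np

rng = np.random.default_rng(1)
n_pairs, dim = 200, 50

# Pretend src[i] and tgt[i] are embeddings of the same dictionary entry in two
# languages. Here tgt is a hidden rotation of src plus noise, standing in for
# "same concepts, different basis".
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
src = rng.normal(size=(n_pairs, dim))
tgt = src @ true_rotation + 0.01 * rng.normal(size=(n_pairs, dim))

# Orthogonal Procrustes: the rotation W minimizing ||src @ W - tgt||_F is
# U @ Vt, where U, S, Vt is the SVD of src.T @ tgt.
u, _, vt = np.linalg.svd(src.T @ tgt)
W = u @ vt

aligned = src @ W
print("mean alignment error:", float(np.linalg.norm(aligned - tgt, axis=1).mean()))
print("recovered the hidden rotation:", np.allclose(W, true_rotation, atol=1e-2))
```

The constraint that W be orthogonal is exactly the change-of-basis framing: the mapping rotates one language’s coordinate system onto the other’s without stretching it, and fidelity then depends on how well the two vocabularies actually cover the same concepts.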