🤖 AI Summary
A hobbyist ran side-by-side, human-evaluated comparisons of locally runnable translation models for English↔Chinese to find a practical on-device choice. Models tested included LibreTranslate (default), Opus-MT, NLLB-600M, NLLB-3.3B, MADLAD-400, and Tower; a native Chinese speaker ranked outputs for fidelity, fluency, and tone. Bottom-line findings: the lightweight models (Opus-MT, LibreTranslate, NLLB-600M) are fast and consumer-friendly but come with trade-offs. LibreTranslate is literal and slightly stilted; Opus-MT reads naturally but can omit or alter meaning, and it only accepts single sentences unless input is pre-split; NLLB-600M tends to be very literal. The heavier models (NLLB-3.3B, MADLAD-400) often produce higher-quality, more faithful renderings. Tower produced the best, most tonally accurate translations but required substantial RAM (~25GB), long runtimes (>10 minutes on CPU), and careful tuning.
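As a concrete illustration of what "locally runnable" means here, this is a minimal sketch of driving the lightweight NLLB-600M model through the Hugging Face transformers translation pipeline; the checkpoint name and FLORES-200 language codes are standard for NLLB-200, but the exact setup is an assumption, not the author's published configuration:

```python
from transformers import pipeline

# facebook/nllb-200-distilled-600M uses FLORES-200 language codes:
# eng_Latn = English, zho_Hans = Simplified Chinese.
translator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="eng_Latn",
    tgt_lang="zho_Hans",
)

out = translator("The weather is lovely today.", max_length=128)
print(out[0]["translation_text"])
```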
Key technical takeaways: per-sample runtimes varied widely (e.g., Opus-MT ~6–8s, NLLB-600M ~8–12s, NLLB-3.3B ~17–31s, MADLAD ~30–90s, Tower hundreds of seconds), and resource needs dictate on-device usability. Practical recommendations: pick by use case. Opus-MT suits conversational output, LibreTranslate suits literal renderings for language learning, NLLB-600M works when strict literalism is acceptable, and NLLB-3.3B/MADLAD are higher-quality compromises if you can spare the extra CPU/RAM. Implementation notes: split input into sentences before feeding Opus-MT (see the sketch below), give Tower a GPU or ample RAM, and expect some tuning to get the best results from LLM-based translators.
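Since Opus-MT only handles one sentence at a time, a hedged sketch of the pre-splitting workaround mentioned above; the regex splitter and the Helsinki-NLP/opus-mt-en-zh checkpoint name are illustrative assumptions, not the author's exact pipeline:

```python
import re
from transformers import pipeline

# Helsinki-NLP/opus-mt-en-zh is the standard English->Chinese Opus-MT checkpoint.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-zh")

def translate_paragraph(text: str) -> str:
    # Naive splitter on terminal punctuation; a real pipeline would use
    # a proper segmenter (e.g. nltk's sent_tokenize or spaCy).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Translate each sentence independently, then rejoin the outputs.
    results = translator(sentences)
    return "".join(r["translation_text"] for r in results)

print(translate_paragraph("The model is small. It runs fine on a laptop CPU."))
```

A proper sentence segmenter matters more than it looks: naive punctuation splitting mishandles abbreviations and quotes, and any split error propagates directly into the translation.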