🤖 AI Summary
An argument aimed at q/kdb+ developers warns against asking LLMs to produce highly terse code: while the q community prizes compact expressions like (2#x)#1,x#0 (and their equivalents in Python), the author shows that terseness hurts LLM reliability. Using examples and dialogue with Claude, they illustrate that compressed one-liners are harder for LLMs to explain and validate, and that forcing terseness sacrifices the main advantage of LLM coding assistants: accurate generation, explanation, debugging, and extension.
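For readers without q at hand, a rough Python rendering of the contrast may help. The identity-matrix idiom (2#x)#1,x#0 is from the article; the two Python translations below are illustrative sketches, not the author's code:

import numpy as np

def identity_terse(n):
    # Direct analogue of the q idiom (2#x)#1,x#0: cycle the
    # pattern [1, 0, ..., 0] (one 1 followed by n zeros) to fill
    # an n-by-n matrix; the wrap-around lands a 1 on each diagonal.
    return np.resize([1] + n * [0], (n, n))

def identity_verbose(n):
    # The spelled-out version an LLM tends to produce naturally.
    matrix = []
    for row_index in range(n):
        row = [0] * n
        row[row_index] = 1
        matrix.append(row)
    return matrix

Both return the n-by-n identity matrix; the terse form relies on the reader knowing the reshape-with-cycling trick, which is exactly the kind of compressed reasoning the article argues LLMs handle poorly.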
The core is an information-theoretic claim: shorter representations pack the same information into fewer tokens, raising average per-token surprisal (and thus perplexity). Because LLMs generate by ranking probable next tokens (and apply top-p pruning), high-perplexity terse forms may be out of reach or error-prone. Empirical evidence with Qwen 2.5 Coder 7B Instruct shows i += 1 has perplexity ≈38.7 versus ≈10.9 for i = i + 1, supporting the theory. Practical takeaway: prefer the more verbose, lower-perplexity code that LLMs naturally produce (and use prompts like "be verbose" or "think step by step"): it spreads cognitive load across tokens, improves generation accuracy, and often outweighs aesthetic preferences for terseness.
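A comparison like the one described can be reproduced along these lines: perplexity here is exp of the mean negative log-likelihood per token under the model. This is a minimal sketch using Hugging Face transformers with the model named in the summary; it is a standard measurement procedure, not the author's published script:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-Coder-7B-Instruct"  # model named in the summary
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16)
model.eval()

def perplexity(text: str) -> float:
    # With labels=input_ids, a causal LM returns the mean
    # cross-entropy loss over next-token predictions;
    # exponentiating that loss gives the perplexity.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

for snippet in ("i += 1", "i = i + 1"):
    print(f"{snippet!r}: perplexity ~ {perplexity(snippet):.1f}")

The intuition matches the summary's numbers: the terse form compresses the same update into fewer, individually less predictable tokens, so its average per-token surprisal (and hence perplexity) is higher.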