🤖 AI Summary
"Right-sized AI" advocates replacing blanket use of huge foundation models (ChatGPT, Gemini, Claude) with smaller, task-specific or on-device models when building web apps. The piece argues that for many production tasks — e.g., extracting merchant and total from a receipt — an OCR step plus a tiny text-classifier or SLM running locally is faster, cheaper, more private, and more reliable than routing everything to a remote LLM. Training large models is a fixed, provider-side cost, but inference costs scale with usage; choosing the right model directly reduces latency, API bills, and environmental impact. The article’s analogy—don’t take a Formula 1 car to fetch fast food—underscores efficiency over raw capability.
Practically, developers should decompose problems into discrete tasks, pick a model matched to each task, and decide where each model should live using a simple client-vs-server rubric (connectivity, data locality, frequency, bandwidth, privacy). Client-side inference is feasible with TF.js, Transformers.js, or ONNX.js, and supports local-first, remote-first, or hybrid deployments (download small models, fall back to APIs). Recommendations: prefer smaller models when they suffice, demand transparency about inference and training costs, place compute near the data to avoid round trips, and reuse models already available on-device. These shifts improve UX, lower costs and emissions, and make AI features scalable and sustainable.
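A minimal sketch of the "local-first, remote fallback" hybrid pattern, assuming Transformers.js for in-browser inference. The small sentiment model under the Xenova namespace is one plausible choice, and the `/api/classify` endpoint is a hypothetical placeholder.

```typescript
// Sketch: try a small on-device model first; fall back to a remote API
// when the download or local inference fails (offline cache miss,
// unsupported device). Endpoint "/api/classify" is hypothetical.
import { pipeline } from "@xenova/transformers";

type Classify = (text: string) => Promise<string>;

async function makeClassifier(): Promise<Classify> {
  try {
    // Local-first: small model is downloaded once, then cached by the browser.
    const local = await pipeline(
      "text-classification",
      "Xenova/distilbert-base-uncased-finetuned-sst-2-english"
    );
    return async (text) => {
      const [result] = (await local(text)) as Array<{ label: string }>;
      return result.label;
    };
  } catch {
    // Remote fallback: route to a hosted model only when local inference
    // is unavailable.
    return async (text) => {
      const res = await fetch("/api/classify", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ text }),
      });
      const { label } = await res.json();
      return label;
    };
  }
}
```

Because the browser caches the downloaded weights, the local path keeps working offline after the first visit, which is exactly the connectivity criterion in the rubric above.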