Show HN: On-Device vs. Cloud LLMs for Agentic Tool Calling in a Real iOS App (subralabs.com)

🤖 AI Summary
A team built an AI concierge feature for an iOS resort directory app and compared two models for agentic tool calling: Apple's on-device Foundation Models (3B parameters) and OpenAI's GPT-OSS 20B served via OpenRouter. The on-device model avoided network latency and kept user data on the device, but it struggled with tasks that required multi-step reasoning and with composing coherent responses in Italian. The cloud model handled these tasks effectively, interpreting tool results accurately and maintaining conversational context. The findings suggest that small on-device models work well for single-step requests, such as simple queries, but lose context and coherence in compound interactions. The cloud model therefore proved necessary for production, though larger on-device models may close the gap in the near future. The team settled on a design that lets users toggle between the two models, so they can prioritize privacy or capability according to their needs.
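The toggle described above could be sketched roughly as follows. This is a minimal, language-agnostic illustration written in Python, not the team's implementation (the app itself is an iOS app). The names `Backend`, `Preference`, `Request`, `choose_backend`, and `may_degrade`, as well as the `expected_tool_calls` estimate, are all hypothetical and appear nowhere in the original post.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    ON_DEVICE = "on_device"  # small local model (3B in the summary)
    CLOUD = "cloud"          # larger hosted model (20B in the summary)


class Preference(Enum):
    PRIVACY = "privacy"        # user prefers data to stay on the device
    CAPABILITY = "capability"  # user prefers the stronger model


@dataclass
class Request:
    text: str
    # Hypothetical estimate of how many tool calls the request needs.
    expected_tool_calls: int


def choose_backend(preference: Preference) -> Backend:
    """Map the user's toggle directly to a backend."""
    if preference is Preference.PRIVACY:
        return Backend.ON_DEVICE
    return Backend.CLOUD


def may_degrade(backend: Backend, request: Request) -> bool:
    """Flag requests the summary says small models handle poorly:
    compound (multi-step) interactions on the on-device backend."""
    return backend is Backend.ON_DEVICE and request.expected_tool_calls > 1


if __name__ == "__main__":
    backend = choose_backend(Preference.PRIVACY)
    request = Request("Book the spa, then reserve dinner", expected_tool_calls=2)
    if may_degrade(backend, request):
        print("On-device model selected; multi-step requests may be less reliable.")
```

Keeping the user's choice authoritative and only flagging risky requests, rather than silently rerouting them to the cloud, is one way to preserve the privacy guarantee that the toggle implies.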