Receipt parsing – A comparison between a small and a medium sized model (blog.yasuflores.me)

🤖 AI Summary
A small experiment compared two models on the task of extracting the total amount from receipt images using an open ExpressExpense dataset. The medium-sized model Moondream2 achieved 92.38% accuracy, while the tiny Granite Docling (≈256M parameters) reached 60.95%. The author shared the full experiment code and examples in a public repo and noted that simple post-processing and prompt tweaks were used to improve outputs.

Significance and implications: Moondream2's high accuracy makes it a strong candidate for production receipt-parsing pipelines, though it occasionally omits the "$" sign depending on receipt layout. Granite Docling is attractive for on-device use where memory and inference latency matter, but it needs extra cleanup (e.g., regexes to strip stray dots) to reach acceptable quality. Both models can likely be improved further with prompt engineering and tailored post-processing, highlighting a common practical trade-off in ML systems: model footprint versus out-of-the-box extraction accuracy. The author recommends Moondream2 for apps where a small error rate (~8%) is acceptable, given the much faster user experience compared with manual entry.
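The post's repo contains the author's actual cleanup code; as a rough illustration of the kind of regex post-processing described (collapsing stray dots, re-attaching a missing "$"), here is a minimal, hypothetical `normalize_total` sketch, not taken from the repo:

```python
import re


def normalize_total(raw: str) -> str | None:
    """Normalize a model's raw 'total' output into a canonical '$X.XX' string.

    Hypothetical post-processing sketch, not the blog author's implementation.
    Returns None if no amount-like number is found.
    """
    # Collapse runs of dots ('12..49' -> '12.49'), a stray-dot artifact the
    # summary attributes to the smaller model's output.
    cleaned = re.sub(r"\.{2,}", ".", raw)
    # Pull out the first number that looks like a money amount, with or without '$'.
    match = re.search(r"\$?\s*(\d{1,6}(?:[.,]\d{2})?)", cleaned)
    if not match:
        return None
    amount = match.group(1).replace(",", ".")
    # Re-attach the '$' sign, which Moondream2 reportedly omits on some layouts.
    return f"${amount}"


if __name__ == "__main__":
    for raw in ["Total: $ 12..49", "TOTAL 8.00.", "amount due: 103,20", "no total here"]:
        print(raw, "->", normalize_total(raw))
```

A cleanup pass like this is cheap relative to model inference, which is why the trade-off between the tiny model plus post-processing and the larger model out of the box is worth measuring rather than assuming.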