Echolocating Through the AGI Reality Distortion Field (medium.com)

0 points 3 hours ago | visit original

🤖 AI Summary

Kobalt Labs reverse-engineered an opaque “413 Request Too Large” failure on Anthropic’s messages.create API by fuzzing payload sizes with a binary search. The team found that the endpoint enforces a limit on total request size that is not clearly documented: a 23.07 MB PDF grew to ~30.76 MB once Base64-encoded and was still accepted, which put the practical cutoff at around that encoded request size. They also hit other server-side limits (529 overloaded) when trying enormous-page PDFs, and found that the API validates that the Base64 payload decodes to the expected file type, so random Base64 won’t bypass the checks.

Technically, the experiment required crafting truly incompressible PDFs, because normal images and multi-page text compress too well. Kobalt used fitz to assemble PDFs, numpy to generate random pixel matrices, and PIL to save PNGs with no compression so that each page meaningfully increased the file size; Base64 encoding then inflated the payload by roughly another third.

Operational fixes followed: adding size_bytes to file models, validating the combined file size before sending, and enforcing limits in the upload UI. The takeaways for AI/ML engineers: undocumented size and content-validation rules can silently break production flows, Base64 bloat matters, and proactive client- and server-side file-size checks plus robust error handling (rate limits, overloads) are essential to maintain reliability.
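The Base64 arithmetic and the pre-send size check mentioned in the summary can be sketched as follows. This is a minimal illustration, not Kobalt's code: the function names, the `overhead_bytes` parameter, and the limit constant are hypothetical, the limit is simply the ~30.76 MB figure from the summary read as decimal megabytes, and it is not an officially documented value.

```python
import math
from typing import Iterable

# Assumption for illustration: the ~30.76 MB cutoff reported in the summary,
# interpreted as decimal megabytes. This is not a documented API limit.
ASSUMED_MAX_REQUEST_BYTES = 30_760_000


def base64_encoded_size(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of `raw_bytes` bytes.

    Base64 maps every 3 input bytes to 4 output characters, so the
    encoded form is about one third larger than the raw file.
    """
    return 4 * math.ceil(raw_bytes / 3)


def fits_in_request(file_sizes: Iterable[int],
                    overhead_bytes: int = 0,
                    limit: int = ASSUMED_MAX_REQUEST_BYTES) -> bool:
    """Check the combined encoded size of all files against the limit.

    `overhead_bytes` is a hypothetical allowance for the rest of the
    request body (prompt text, JSON structure).
    """
    total = sum(base64_encoded_size(n) for n in file_sizes) + overhead_bytes
    return total <= limit
```

Checking the encoded size rather than the on-disk size is the point of the sketch: a file that looks comfortably small on disk can still exceed a request-size limit once encoded.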
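The binary-search probing described in the summary might look roughly like the sketch below. It assumes acceptance is monotonic in payload size (every size up to the cutoff is accepted, every larger size is rejected). `find_max_accepted_size` and `fake_endpoint` are hypothetical names, and `fake_endpoint` is a local stand-in for a real API call so the example runs offline.

```python
from typing import Callable


def find_max_accepted_size(is_accepted: Callable[[int], bool],
                           low: int, high: int) -> int:
    """Return the largest size that `is_accepted` approves.

    Preconditions (assumed, not checked): `is_accepted(low)` is True,
    `is_accepted(high)` is False, and acceptance is monotonic in size.
    """
    # Invariant: `low` is always accepted, `high` is always rejected.
    while high - low > 1:
        mid = (low + high) // 2
        if is_accepted(mid):
            low = mid
        else:
            high = mid
    return low


def fake_endpoint(size_bytes: int) -> bool:
    """Hypothetical stand-in for sending a payload of `size_bytes`.

    The threshold is an assumed value taken from the summary's ~30.76 MB.
    """
    return size_bytes <= 30_760_000


cutoff = find_max_accepted_size(fake_endpoint, low=1, high=100_000_000)
```

Each probe halves the search interval, so locating a cutoff in a 100 MB range takes a few dozen requests rather than millions. Against a real endpoint, transient errors such as the 529 overloaded responses mentioned above would need retries so they are not mistaken for size rejections.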
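Why random pixels were needed can be illustrated with the standard library alone. The sketch below is not the fitz/numpy/PIL pipeline from the article. It uses `os.urandom` as a stand-in for random pixel data and `zlib` (an implementation of DEFLATE, the algorithm PNG uses) to compare how repetitive and random data compress. The helper name `compression_ratio` is hypothetical.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more shrinkage)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive content, standing in for multi-page text or flat images.
repetitive = b"same line of text\n" * 50_000

# Random content of the same length, standing in for random pixel matrices.
random_bytes = os.urandom(len(repetitive))

print(f"repetitive: {compression_ratio(repetitive):.4f}")
print(f"random:     {compression_ratio(random_bytes):.4f}")
```

The repetitive input shrinks to a small fraction of its size, while the random input stays at essentially full size, which is why padding a test file with ordinary content does little to grow the payload.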
