Show HN: Generate Images in Claude.ai (blog.msahli.com)

🤖 AI Summary
A developer demonstrated how to add on-demand image generation to Anthropic’s Claude by wiring a small Python script, generate_image.py, into Claude’s bash-based tool system. The script calls Google’s Gemini image model (gemini-2.5-flash-image) as the primary provider and falls back to a Cloudflare Workers AI endpoint if Gemini is unavailable. Claude runs the script via bash_tool, the script prints a JSON payload (success, filename, filepath, size_bytes/MB, provider_used), Claude copies the resulting file to the outputs directory, and the user gets a download link. This lets a natural-language prompt such as “Generate an image of a sunset” work inside Claude itself.

The script emphasizes robustness and safe integration. It requires either GEMINI_API_KEY or both CLOUDFLARE_WORKER_URL and CLOUDFLARE_API_TOKEN as environment variables, retries transient errors with exponential backoff and jitter, parses Gemini’s response defensively, validates the image bytes with imghdr and the Content-Type header, sanitizes filenames and avoids name collisions, and exposes clear error codes for Claude to surface. The pattern shows how Claude’s tool contract (CLI args → JSON on stdout → file copy) can integrate third-party generative models, and Gemini’s free-tier limits make the Cloudflare fallback useful for reliability and demos. It is a practical blueprint for adding multimodal outputs to conversational LLM workflows.
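The flow described above (retry with backoff, fall back to a second provider, validate the bytes, pick a safe filename, report JSON) can be sketched roughly as follows. This is a minimal illustration, not the author's generate_image.py: the names TransientError, with_backoff, sniff_image_type, safe_filename, and generate are hypothetical, providers are passed in as plain callables rather than real Gemini or Cloudflare clients, and a magic-byte check stands in for imghdr, which was removed from the standard library in Python 3.13.

```python
import json
import random
import re
import time
from pathlib import Path


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. HTTP 429 or 503)."""


def with_backoff(call, max_attempts=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run call(); on TransientError, retry with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide what to do
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # jitter: wait a random time up to the cap


def sniff_image_type(data):
    """Return 'png' or 'jpeg' based on leading magic bytes, else None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return None


def safe_filename(prompt, ext, directory):
    """Derive a sanitized filename from the prompt, adding a suffix on collision."""
    stem = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")[:40] or "image"
    candidate = Path(directory) / f"{stem}.{ext}"
    counter = 1
    while candidate.exists():
        candidate = Path(directory) / f"{stem}_{counter}.{ext}"
        counter += 1
    return candidate


def generate(prompt, providers, out_dir):
    """Try each (name, fetch) provider in order and return a JSON-ready dict."""
    last_error = "no providers configured"
    for name, fetch in providers:
        try:
            data = with_backoff(lambda: fetch(prompt))
        except Exception as exc:  # provider failed for good, move to the next one
            last_error = f"{name}: {exc}"
            continue
        ext = sniff_image_type(data)
        if ext is None:
            last_error = f"{name}: response is not a recognized image"
            continue
        path = safe_filename(prompt, ext, out_dir)
        path.write_bytes(data)
        return {
            "success": True,
            "filename": path.name,
            "filepath": str(path),
            "size_bytes": len(data),
            "provider_used": name,
        }
    return {"success": False, "error": last_error}
```

A wrapper script would then print `json.dumps(generate(prompt, providers, out_dir))` to stdout, so that the calling model only has to parse one JSON object and copy the file named in filepath.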