I built a WhatsApp AI assistant that processes images, voice notes, and PDFs (github.com)

🤖 AI Summary
A developer built an end-to-end WhatsApp travel support agent that combines Retrieval-Augmented Generation (RAG) with Amazon Bedrock agents to handle text and voice messages, ingest PDFs and other unstructured documents, and manage support tickets. The stack, delivered as AWS CDK for Python, is deployed in four stages: an Aurora PostgreSQL cluster that serves as the vector store for the knowledge base, a Bedrock Knowledge Base layer that chunks documents and stores their embeddings, an Amazon Bedrock agent that handles session tracking and natural-language queries, and a WhatsApp frontend wired through API Gateway and Lambda. DynamoDB holds passenger profiles and tickets, S3 stores media, and Amazon Transcribe converts voice notes to text; retrieval and response generation use Titan Embeddings v2 and Anthropic Claude 3 models.

This is significant because it shows a reproducible, production-oriented RAG pattern on AWS: Bedrock agents eliminate much of the custom conversation-state logic, and a relational database (Aurora PostgreSQL) serves as the vector store instead of a specialized vector-only database. Key technical implications include automated PDF ingestion, embedding and indexing inside PostgreSQL, tight integration with AWS services (SSM, IAM, Transcribe, DynamoDB), and built-in ticket escalation workflows.

Caveats: all stacks must be deployed in the same AWS account and region because they share SSM parameters, and expect costs for Bedrock, embeddings, Aurora, Transcribe, and related services. The project provides the CDK code for teams wanting a maintainable, extensible, enterprise-ready RAG assistant.
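To make the staged-deployment idea concrete, here is a minimal CDK (Python) sketch of what the first stage could look like: an Aurora PostgreSQL cluster plus an SSM parameter that later stacks read, which is the dependency behind the same-account/region caveat. This is not the repository's actual code; the construct IDs and the parameter name are hypothetical.

```python
from aws_cdk import Stack, aws_ec2 as ec2, aws_rds as rds, aws_ssm as ssm
from constructs import Construct


class VectorStoreStack(Stack):
    """Hypothetical stage 1: Aurora PostgreSQL cluster used as the vector store."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        vpc = ec2.Vpc(self, "AssistantVpc", max_azs=2)

        cluster = rds.DatabaseCluster(
            self,
            "VectorCluster",
            engine=rds.DatabaseClusterEngine.aurora_postgres(
                version=rds.AuroraPostgresEngineVersion.VER_15_4
            ),
            writer=rds.ClusterInstance.serverless_v2("writer"),
            vpc=vpc,
        )

        # Later stacks (Knowledge Base, agent, WhatsApp frontend) resolve this
        # SSM parameter at deploy time, so every stack has to live in the same
        # AWS account and region.
        ssm.StringParameter(
            self,
            "VectorClusterEndpoint",
            parameter_name="/whatsapp-assistant/aurora-endpoint",
            string_value=cluster.cluster_endpoint.hostname,
        )
```

The remaining stacks would follow the same pattern, each reading the parameters exported by the previous stage.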
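And on the frontend side, the WhatsApp-to-agent hop could look roughly like the Lambda handler below, which reuses the sender's phone number as the Bedrock session ID so the agent, not custom code, carries conversation state. The webhook payload shape follows Meta's WhatsApp Cloud API, and the environment variable names are assumptions, not taken from the project.

```python
import json
import os

import boto3

# bedrock-agent-runtime is the boto3 client for invoking a deployed Bedrock agent.
bedrock_agent = boto3.client("bedrock-agent-runtime")


def handler(event, context):
    body = json.loads(event["body"])
    # Assumed WhatsApp Cloud API webhook shape: messages nested under
    # entry -> changes -> value.
    message = body["entry"][0]["changes"][0]["value"]["messages"][0]
    sender = message["from"]
    text = message.get("text", {}).get("body", "")

    # Using the sender's phone number as the session ID lets Bedrock track the
    # conversation across turns without custom state management.
    response = bedrock_agent.invoke_agent(
        agentId=os.environ["AGENT_ID"],          # hypothetical env vars
        agentAliasId=os.environ["AGENT_ALIAS_ID"],
        sessionId=sender,
        inputText=text,
    )

    # The response is an event stream; concatenate the text chunks.
    reply = "".join(
        part["chunk"]["bytes"].decode("utf-8")
        for part in response["completion"]
        if "chunk" in part
    )
    return {"statusCode": 200, "body": json.dumps({"reply": reply})}
```

Voice notes would take a detour through S3 and Amazon Transcribe before reaching the same invoke_agent call, and image or PDF attachments would be routed to the ingestion path instead.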