AI PB: A Grounded Generative Agent for Personalized Investment Insights (arxiv.org)

🤖 AI Summary
AI PB is a production-scale generative agent built for retail finance that proactively produces grounded, compliant, and personalized investment insights rather than passively answering queries. The system uses a component-based orchestration layer that deterministically routes requests between internal and external LLMs depending on data sensitivity, a hybrid retrieval stack combining OpenSearch with a finance-domain embedding model, and a multi-stage recommendation pipeline that blends rule-based heuristics, sequential behavioral modeling, and contextual bandits for online personalization. Deployed fully on-premises to meet Korean financial regulations, AI PB runs on Docker Swarm with vLLM across 24× NVIDIA H100 GPUs and was validated through human QA and operational metrics. For the AI/ML community, AI PB demonstrates a concrete blueprint for building trustworthy, compliant LLM systems in high-stakes domains. Key technical takeaways include deterministic routing to minimize data leakage, the value of domain-tuned embeddings plus traditional search for grounding, and the practical benefit of combining offline behavioral models with contextual bandits for adaptive recommendations. The work highlights operational lessons (on-prem orchestration, vLLM for throughput) and shows that layered safety and grounding can make generative agents viable in regulated production environments, offering a replicable pattern for other industries that require privacy, auditability, and dynamic personalization.
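To make the deterministic-routing idea concrete, here is a minimal sketch in Python. The summary does not describe the paper's actual routing rules, so everything here is an assumption for illustration: the `Request` and `Target` types, the `route` function, and the `SENSITIVE_PATTERNS` list are all hypothetical. The point is only that the decision is a fixed rule check rather than a model call, so the same input always goes to the same destination and can be audited.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    INTERNAL = "internal"  # on-premises model; sensitive data stays in-house
    EXTERNAL = "external"  # third-party model; only non-sensitive requests


# Hypothetical sensitivity rules (not from the paper). A real deployment
# would rely on a vetted policy list or classifier instead.
SENSITIVE_PATTERNS = (
    re.compile(r"\b\d{3}-\d{2,6}-\d{2,7}\b"),  # account-number-like strings
    re.compile(r"\b(balance|holdings|portfolio|account)\b", re.IGNORECASE),
)


@dataclass(frozen=True)
class Request:
    text: str
    has_customer_context: bool = False  # True if customer data is attached


def route(request: Request) -> Target:
    """Pick a target using fixed rules only: no randomness, no model call."""
    if request.has_customer_context:
        return Target.INTERNAL
    if any(p.search(request.text) for p in SENSITIVE_PATTERNS):
        return Target.INTERNAL
    return Target.EXTERNAL
```

Under these assumed rules, `route(Request("What is an ETF?"))` goes to the external model, while `route(Request("Show my portfolio balance"))` stays internal. Defaulting to the internal model whenever any rule fires is one way to bias the design toward avoiding leakage.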