🤖 AI Summary
A compact open-source simulation (bayesian-agent) demonstrates an autonomous grid-world agent that learns which foods replenish or drain its energy using exact Bayesian inference and Thompson sampling rather than heuristics or neural nets. The agent starts with a neutral prior over each (shape, color) food type: mean μ₀ = 0.0, variance σ₀² = 10.0, and a pseudo-observation count n₀ = 0.1. After each observed energy value x it applies a conjugate Normal-Normal update with observation weight w = 1, where n is the current pseudo-count (initially n₀): σ₁² = σ₀²/(1 + w/n) and μ₁ = (n·μ₀ + w·x)/(n + w). Decisions use Thompson sampling: draw one energy sample from each posterior, subtract a distance-based movement cost, and pick the food with the highest resulting value. The repo includes bayesian_agent.py, environment.py, config.py, and a curses-based main loop you can run locally.
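The update and decision rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in bayesian_agent.py: the names `Belief`, `choose_food`, and `cost_per_step` are hypothetical, the cost value 0.1 is arbitrary, and the sketch assumes the pseudo-count grows by w after each observation, which the summary implies but does not state.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Belief:
    """Normal belief about one food type's energy value (hypothetical container)."""
    mu: float = 0.0    # prior mean (mu0 in the summary)
    var: float = 10.0  # prior variance (sigma0^2 in the summary)
    n: float = 0.1     # pseudo-observation count (n0 in the summary)

    def update(self, x: float, w: float = 1.0) -> None:
        """Fold in one observed energy value x with weight w."""
        # Both formulas use the pseudo-count from *before* this observation.
        self.mu = (self.n * self.mu + w * x) / (self.n + w)
        self.var = self.var / (1.0 + w / self.n)
        # Assumption: the pseudo-count accumulates the observation weight.
        self.n += w

    def sample(self, rng: random.Random) -> float:
        """Draw one plausible energy value from the current belief."""
        return rng.gauss(self.mu, math.sqrt(self.var))


def choose_food(beliefs: dict, distances: dict, rng: random.Random,
                cost_per_step: float = 0.1):
    """Thompson sampling: one draw per food type, minus a movement cost.

    beliefs maps a food type, e.g. a (shape, color) tuple, to its Belief;
    distances maps the same keys to the distance to the nearest such food.
    cost_per_step is an illustrative value, not taken from the repo.
    """
    scores = {
        food: belief.sample(rng) - cost_per_step * distances[food]
        for food, belief in beliefs.items()
    }
    return max(scores, key=scores.get)
```

Because the score is a random draw rather than the posterior mean, a food type with a wide posterior is occasionally sampled high enough to be chosen, which is where the exploration comes from; as its variance shrinks, the choice settles on the mean minus the travel cost.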
This minimal architecture highlights properties of interest to the AI/ML community: exact online updates that cost O(1) per observation and shrink uncertainty monotonically (σ² falls roughly as 1/n after n observations), exploration that arises naturally from posterior variance, and Thompson sampling's known logarithmic regret in bandit settings. The demo surfaces interpretable, sample-efficient behaviors (rapid preference formation, aversion to toxic items, uncertainty-driven exploration) and makes it easy to experiment with priors, movement costs, UCB versus Thompson sampling, multi-agent or non-stationary extensions, and comparisons to RL baselines.
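The monotone shrinkage can be made explicit by unrolling the quoted variance update. The derivation below is a sketch that assumes w = 1 for every observation and that the pseudo-count after j observations is n_j = n₀ + j (an assumption; the summary does not spell out how n evolves). Under those assumptions the per-step factors telescope:

```latex
\begin{align*}
n_k &= n_0 + k, \\
\sigma_k^2 &= \sigma_0^2 \prod_{j=0}^{k-1} \frac{n_j}{n_j + 1}
            = \sigma_0^2 \, \frac{n_0}{n_0 + k}, \\
\sigma_k^2 &= \frac{10.0 \times 0.1}{0.1 + k} = \frac{1}{0.1 + k}
\quad \text{for } \sigma_0^2 = 10.0,\ n_0 = 0.1 .
\end{align*}
```

With the stated defaults the variance drops from 10.0 to about 0.91 after a single observation and then decays as roughly 1/k, so the first encounter with a food type removes most of the prior uncertainty.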