🤖 AI Summary
A recent study investigates how humans normalize reward values during reinforcement learning, addressing a key debate between two competing theories: divisive normalization and range normalization. Divisive normalization, in which subjective value is scaled by the sum of all available rewards, has been the dominant model, inspired by perceptual neuroscience; this work uses a novel experimental design that manipulates both the number of options and the reward range to dissociate the two accounts. Behavioral experiments with 500 participants across eight task variants show that range normalization, which rescales values relative to the contextual extremes (maximum and minimum), explains choice behavior and explicit value ratings better than divisive normalization.
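The two candidate rules are easy to state compactly. The sketch below is illustrative only (the function names and example values are hypothetical, not taken from the study): divisive normalization divides each reward by the sum of contextual rewards, while range normalization rescales each reward between the contextual minimum and maximum. The two rules diverge as options are added or the range is stretched, which is the kind of manipulation the study's design exploits.

```python
# Illustrative sketch of the two normalization rules compared in the study.
# Function names and example values are hypothetical.
import numpy as np

def divisive_normalization(rewards):
    """Scale each reward by the sum of all rewards in the current context."""
    r = np.asarray(rewards, dtype=float)
    return r / r.sum()

def range_normalization(rewards):
    """Rescale each reward between the contextual minimum and maximum."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.min()) / (r.max() - r.min())

context = [10.0, 20.0, 50.0]
print(divisive_normalization(context))  # [0.125, 0.25, 0.625]
print(range_normalization(context))     # [0.0, 0.25, 1.0]
```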
This finding has significant implications for understanding the computational principles of value encoding in the brain. It challenges the widespread assumption that value-based decisions rely on divisive scaling, suggesting instead that humans adaptively represent rewards relative to their local range. This aligns with efficient coding theories and has broader ramifications for modeling decision-making processes in neuroeconomics and cognitive neuroscience. The results also highlight the importance of considering nonlinear weighting in range normalization to better capture subjective valuation mechanisms.
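One common way to introduce the nonlinear weighting mentioned above is to pass the range-normalized value through a power-form distortion with a fitted curvature parameter. The sketch below assumes that specific functional form for illustration; the study's exact parameterization may differ.

```python
# Illustrative sketch: nonlinear weighting applied on top of range normalization.
# The power-form distortion and the parameter name `xi` are assumptions for
# illustration, not necessarily the study's exact model.
import numpy as np

def nonlinear_range_normalization(rewards, xi=0.5):
    """Range-normalize, then apply a power-form distortion of the normalized value."""
    r = np.asarray(rewards, dtype=float)
    v = (r - r.min()) / (r.max() - r.min())      # linear range normalization
    return v**xi / (v**xi + (1.0 - v)**xi)       # xi controls the curvature; xi = 1 recovers the linear rule

print(nonlinear_range_normalization([10.0, 20.0, 50.0]))
```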
By advancing a more accurate model of context-dependent reward processing during learning, this study paves the way for refined neural investigations and cross-species comparisons in reinforcement learning. It provides a crucial stepping stone for both AI researchers designing adaptive learning algorithms and neuroscientists probing the neural basis of decision-making under uncertainty.