A Globally Optimal Alternative to Multi-Layer Perceptrons (www.mdpi.com)

🤖 AI Summary
The paper introduces LReg (Lagrange Regressor), a globally optimal, finite-element–style alternative to multi-layer perceptrons that sidesteps the nonconvex loss landscapes that trap MLPs in local minima. LReg represents functions with Lagrange basis functions on an adaptively refined mesh and uses a mesh-refinement–coarsening (MRC) loop plus a top-k selection scheme to allocate degrees of freedom where the signal is complex (e.g., multi-frequency or high-noise regions).

The authors derive an explicit scaling law and error-bound formula that predict how approximation error decreases with the number of degrees of freedom, validate these bounds on synthetic regression tasks, and show that LReg can solve PDEs, fit vector-valued functions, and handle multi-frequency data that typical MLPs struggle with. Technically, LReg combines a 1D pipeline (local fits, interval counting, mesh updates) with discrete/continuous optimization and closed-form Lagrange basis expressions to guarantee convergence toward a global minimum for the fitted coefficients.

Crucially, LReg also functions as a parameter-efficient fine-tuning (PEFT) module for large pre-trained models (not limited to transformers), cutting trainable parameters drastically (e.g., from hundreds of millions to a few hundred thousand on GPT-2 variants) while maintaining or improving test loss and throughput. For the AI/ML community, this offers a theoretically grounded, interpretable regression primitive and a practical PEFT strategy that can reduce compute and stabilize optimization where MLPs are brittle.
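
To make the core mechanism concrete, below is a minimal 1D sketch in Python, not the authors' implementation: it fits piecewise-linear Lagrange (hat) basis functions on a mesh by least squares, which is convex in the coefficients and therefore globally optimal for a fixed mesh, and then bisects the top-k intervals with the largest residual as a simplified stand-in for the paper's MRC loop (the coarsening step is omitted). Names such as hat_basis and fit_lreg_1d are illustrative assumptions.

```python
# Minimal 1D sketch of the LReg idea (illustrative, not the paper's code):
# a regressor that is linear in its coefficients -- piecewise-linear
# Lagrange ("hat") basis functions on a mesh -- so each fit is a convex
# least-squares problem with a globally optimal solution, plus a top-k
# refinement rule that bisects the worst-fitting intervals.
import numpy as np


def hat_basis(x, nodes):
    """Design matrix of piecewise-linear Lagrange (hat) basis functions.

    Column j is the hat function that equals 1 at nodes[j] and 0 at all
    other nodes; np.interp of an indicator vector gives exactly that.
    """
    B = np.empty((len(x), len(nodes)))
    for j in range(len(nodes)):
        indicator = np.zeros(len(nodes))
        indicator[j] = 1.0
        B[:, j] = np.interp(x, nodes, indicator)
    return B


def fit_lreg_1d(x, y, n_init=8, n_rounds=4, top_k=2):
    """Fit coefficients by least squares, then bisect the top-k worst intervals."""
    nodes = np.linspace(x.min(), x.max(), n_init)
    for r in range(n_rounds):
        B = hat_basis(x, nodes)
        # Linear in the coefficients: lstsq returns the global optimum.
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        if r == n_rounds - 1:
            break
        resid = np.abs(B @ coef - y)
        # Refinement: total residual per interval, bisect the k worst.
        interval_err = [resid[(x >= a) & (x <= b)].sum()
                        for a, b in zip(nodes[:-1], nodes[1:])]
        worst = np.argsort(interval_err)[-top_k:]
        midpoints = [(nodes[i] + nodes[i + 1]) / 2 for i in worst]
        nodes = np.sort(np.concatenate([nodes, midpoints]))
    return nodes, coef


# Usage: a multi-frequency target of the kind the summary says MLPs struggle with.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 500))
y = np.sin(2 * np.pi * x) + 0.3 * np.sin(20 * np.pi * x)
nodes, coef = fit_lreg_1d(x, y)
rmse = np.sqrt(np.mean((hat_basis(x, nodes) @ coef - y) ** 2))
print(f"{len(nodes)} mesh nodes, train RMSE = {rmse:.4f}")
```

The point of the sketch is that, once the mesh is fixed, the only trainable parameters are the nodal coefficients of a linear-in-parameters basis, so each fit is a convex problem with a closed-form solution; adaptivity enters only through the discrete choice of where to place mesh nodes.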