Hope: A post-transformer architecture for general intelligence at low compute (blankline.org)

🤖 AI Summary
The research initiative "Hope" has introduced a post-transformer architecture aimed at achieving general intelligence at low compute. The project has completed four of its seven pre-registered validation stages, reporting 9.2% held-out exact match on novel tasks from the Abstraction and Reasoning Corpus (ARC), roughly double the best published baseline. Its central claim is that transformer architectures, which underpin leading AI models like GPT, compute fundamentally the wrong operations for general intelligence, motivating a new design. The architecture, Hope-1, pairs a discrete-latent organization with a program decoder and a search mechanism, a combination the authors report outperforms transformer baselines on these tasks. Initial experiments showed that a 0.69M-parameter model generalized from training to unseen tasks, suggesting potential for self-improvement without additional training data. Later phases aim to validate the architecture at larger scales, targeting 1B parameters, with the goal of substantial gains in performance per unit of compute. If successful, this could reshape AI architecture design by offering stronger capabilities at a fraction of the computational cost typically associated with transformer models.
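
The summary gives no implementation details, but the "discrete latents plus program decoder plus search" recipe can be sketched generically. Below is a minimal illustrative sketch in Python, not the Hope-1 design: it hashes an ARC-style input grid to a small discrete latent code, then enumerates candidate programs (short sequences of grid operations) and searches for one that reproduces every training example exactly, which is how a held-out exact-match evaluation would score it. All names here (encode_to_latent, PRIMITIVES, search_program) are hypothetical.

    import itertools
    import numpy as np

    # Hypothetical primitive grid operations a "program" can be built from.
    PRIMITIVES = {
        "identity": lambda g: g,
        "flip_h": lambda g: np.fliplr(g),
        "flip_v": lambda g: np.flipud(g),
        "rot90": lambda g: np.rot90(g),
        "transpose": lambda g: g.T,
    }

    def encode_to_latent(grid: np.ndarray, num_codes: int = 16) -> int:
        """Stand-in discrete encoder: map coarse grid statistics to one of
        num_codes latent symbols. A real model would learn this mapping
        (e.g. with a VQ-style codebook); this is an assumption for illustration."""
        stats = (grid.shape[0], grid.shape[1], int(grid.sum()))
        return hash(stats) % num_codes

    def run_program(program: tuple[str, ...], grid: np.ndarray) -> np.ndarray:
        """Apply a sequence of primitive operations to a grid."""
        for name in program:
            grid = PRIMITIVES[name](grid)
        return grid

    def search_program(train_pairs, max_len: int = 2):
        """Enumerate programs up to max_len ops; return the first one that maps
        every training input to its output exactly. In a learned system the
        latent code would order or prune this search rather than brute force."""
        for length in range(1, max_len + 1):
            for program in itertools.product(PRIMITIVES, repeat=length):
                if all(np.array_equal(run_program(program, x), y)
                       for x, y in train_pairs):
                    return program
        return None

    # Toy task: the output is the input flipped left-right.
    train = [(np.array([[1, 0], [0, 2]]), np.array([[0, 1], [2, 0]]))]
    code = encode_to_latent(train[0][0])
    prog = search_program(train)
    print(f"latent code: {code}, found program: {prog}")

The key design point this sketch illustrates is why such a system can generalize at tiny parameter counts: the model only has to select or rank discrete programs, and the exact-match check over training pairs acts as a verifier, so correctness on unseen tasks comes from search rather than from memorized weights.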