🤖 AI Summary
Researchers introduced Brain-IT, a novel pipeline for reconstructing images from fMRI that centers on a Brain Interaction Transformer (BIT). BIT maps individual voxels into shared functional clusters (Voxel-to-Cluster) to produce Brain Tokens, then uses a Cross-Transformer with query tokens to predict localized patch-level image features. Crucially, BIT predicts two complementary feature types: high-level semantic features that steer a diffusion model toward the correct content, and low-level VGG-style structural features that initialize the image's coarse layout. The system uses a two-branch design, with a Low-Level branch that generates a coarse structural prior (refined with a Deep Image Prior) and a Semantic branch that conditions the diffusion model, so information flows directly from voxel clusters to localized image patches.
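The core data flow described above (voxels pooled into shared cluster tokens, then queried by cross-attention to predict patch features) can be sketched in a few lines of numpy. This is a minimal illustrative sketch, not the paper's implementation: all sizes, names, and the fixed round-robin cluster assignment are assumptions, and the single-head attention omits the learned projections a real Cross-Transformer would use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- not taken from the paper.
n_voxels, n_clusters, d = 2000, 64, 32
n_queries = 16  # one query token per image patch (illustrative)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Voxel-to-Cluster: each voxel belongs to a shared functional cluster;
# averaging voxel signals within a cluster yields one Brain Token per cluster.
voxel_signals = rng.normal(size=(n_voxels, d))   # stand-in fMRI voxel features
cluster_of = np.arange(n_voxels) % n_clusters    # toy fixed shared assignment

brain_tokens = np.stack([
    voxel_signals[cluster_of == c].mean(axis=0) for c in range(n_clusters)
])  # (n_clusters, d)

# Cross-attention: query tokens attend over Brain Tokens to predict
# localized patch-level image features (single head, no projections).
queries = rng.normal(size=(n_queries, d))
attn = softmax(queries @ brain_tokens.T / np.sqrt(d))  # (n_queries, n_clusters)
patch_features = attn @ brain_tokens                   # (n_queries, d)
print(patch_features.shape)  # (16, 32)
```

Each row of `patch_features` plays the role of one predicted patch-level feature vector; in the actual system these would supervise the semantic and structural targets for the two branches.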
This architecture is significant because it substantially improves faithfulness to the actually seen images while being highly data-efficient: shared clusters and shared model components enable training with limited data and transfer across subjects. Quantitatively and visually, Brain-IT outperforms prior fMRI-to-image methods and, strikingly, achieves comparable results from only one hour of a new subject's fMRI data, versus the 40 hours on which other methods are trained. For AI/ML, this implies practical cross-subject brain decoding with fewer samples, improved structural-semantic disentanglement for generative conditioning, and a promising direction for neurotech and brain-computer interfacing research.