🤖 AI Summary
AlphaFold's hidden output, the high-dimensional embeddings produced by its Evoformer module, is powering a second wave of AI in biology. Beyond the model's headline 3D predictions (median backbone error ~0.96 Å), the Evoformer builds rich per-residue and pairwise representations by reasoning jointly over MSAs and geometric constraints across 48 stacked blocks, with cross-talk between an MSA tensor (N_sequences × N_residues) and a pair tensor (N_residues × N_residues). Two architectural features are crucial: triangular updates that enforce transitive geometric consistency (the triangle inequality), and attention-based MSA↔pair communication. Practically, this yields a per-residue "single" embedding of shape (N_residues × 384) and a pair embedding of shape (N_residues × N_residues × 128); neither is saved by default, so they must be extracted explicitly from the AlphaFold pipeline (community guides exist; see the sketch below).
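
The snippet below is a minimal sketch of that extraction step. It assumes a `prediction_result` dict shaped like the one the open-source AlphaFold/ColabFold pipelines return when representation output is enabled (the exact config flag varies by version and wrapper); here the dict is mocked with random arrays of the stated shapes so the example runs standalone.

```python
import numpy as np

# Hypothetical stand-in for the dict returned by AlphaFold's model runner
# when representation output is enabled (flag names vary by version and
# wrapper, e.g. ColabFold's return_representations option).
L = 120  # number of residues in the query protein

prediction_result = {
    "representations": {
        "single": np.random.randn(L, 384).astype(np.float32),     # per-residue embedding
        "pair":   np.random.randn(L, L, 128).astype(np.float32),  # residue-pair embedding
    }
}

def save_representations(prediction_result: dict, prefix: str) -> None:
    """Persist the Evoformer 'single' and 'pair' embeddings to .npy files."""
    reps = prediction_result["representations"]
    np.save(f"{prefix}_single.npy", reps["single"])  # shape (N_residues, 384)
    np.save(f"{prefix}_pair.npy", reps["pair"])      # shape (N_residues, N_residues, 128)

save_representations(prediction_result, "my_protein")
```

In the real pipeline the dict would come from the model runner's predict call rather than being mocked; the key point is that the embeddings live under `representations` and are discarded unless you save them yourself.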
Those embeddings have immediate downstream utility: simple regressors on differences in single representations predict ΔΔG with Pearson ≈ 0.58 (a minimal probe is sketched below); AF2BIND uses pairwise-attention "baiting" to locate binding pockets; AlphaMissense scored 71M missense variants, classifying 89% of them; and generative models such as PCMol condition molecule design on target embeddings. Compared with sequence-only protein language models (ESM-2, ProtT5; trained on 229M+ sequences), AlphaFold embeddings trade broad evolutionary coverage for geometric precision, since AlphaFold was trained on only ~200k PDB structures. The takeaway: choose embeddings by task. Structure and binding tasks favor AlphaFold; function and disorder prediction favor PLMs. The likely future is multimodal hybrids that merge structural priors with large-scale sequence semantics.
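
To make the ΔΔG claim concrete, here is a minimal sketch of the kind of linear probe the summary describes: featurize each mutation as the difference between mean-pooled "single" embeddings of mutant and wild type, then fit a ridge regressor. The embeddings and ΔΔG labels below are synthetic placeholders, so the printed score says nothing about real performance; the ≈0.58 Pearson figure comes from the studies the article cites.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-ins: in practice these would be mean-pooled Evoformer
# 'single' embeddings (dimension 384) for wild-type and mutant proteins.
n_mutations, dim = 500, 384
wt = rng.normal(size=(n_mutations, dim))
mut = wt + 0.1 * rng.normal(size=(n_mutations, dim))

# Feature = difference of single representations.
X = mut - wt

# Fake ΔΔG labels generated from a random linear rule plus noise,
# purely so the example runs end to end.
w_true = rng.normal(size=dim)
y = X @ w_true + 0.5 * rng.normal(size=n_mutations)

model = Ridge(alpha=1.0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"5-fold CV R^2: {scores.mean():.3f}")
```

The design choice worth noting is how little machinery sits on top of the embeddings: a difference vector and a linear model suffice, which is exactly what makes them attractive as general-purpose features.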
        