Generating Novel Bacteriophages Using Genome Language Models (www.biorxiv.org)

🤖 AI Summary
Researchers report the first successful generative design of whole, viable bacteriophage genomes using large genome language models. Starting from Evo 1 and Evo 2 (pretrained on millions of genomic sequences, then fine-tuned on ~15,000 Microviridae genomes), the team combined prompt engineering, predictive filters for genomic architecture and host tropism, and experimental screening to turn model outputs into working phages. The design template was ΦX174, a well-characterized 5.4 kb, 11-gene lytic phage.

Of roughly 300 synthesized genome candidates, 16 yielded viable novel phages. Computational checks (geNomad, BLAST, structure prediction) showed the genomes were phage-like yet distinct from natural sequences, and cryo-EM revealed that one designed phage incorporates an evolutionarily distant DNA-packaging protein. Several designs lysed their hosts faster than ΦX174 or outcompeted it in growth assays, and a cocktail of generated phages overcame resistance in three ΦX174-resistant E. coli strains.

The work is significant because it demonstrates genome-scale generative design of living systems, rather than of single proteins or circuits, under steerable constraints (host tropism, genomic architecture). Its technical components are large-scale pretraining, task-specific fine-tuning, inference-time guidance with biological predictors, and an experimental pipeline that converts model sequences into functioning phages. Implications include rapid generation of evolutionarily novel therapeutic phages, phage cocktails that are more resilient against resistant bacteria, and a generalizable blueprint for designing larger, functionally complex genomes.
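The filtering step described in the summary (keep only model outputs that look like a plausible genome and differ from the natural template) can be illustrated with a toy sketch. This is not the authors' pipeline: the function names, thresholds, and the k-mer Jaccard novelty check are all hypothetical stand-ins for the real predictors the summary mentions (geNomad, BLAST, host-tropism models).

```python
from typing import Callable, Iterable, List

# Every name and threshold here is hypothetical, for illustration only.
Filter = Callable[[str], bool]


def valid_dna(seq: str) -> bool:
    """Reject empty sequences and any character other than A, C, G, T."""
    return len(seq) > 0 and set(seq.upper()) <= set("ACGT")


def length_near(target: int, tolerance: float) -> Filter:
    """Build a filter keeping genomes within +/- tolerance of a target length."""
    lo, hi = target * (1 - tolerance), target * (1 + tolerance)
    return lambda seq: lo <= len(seq) <= hi


def kmer_set(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_versus(reference: str, k: int, max_jaccard: float) -> Filter:
    """Build a crude novelty filter: k-mer Jaccard similarity to the
    reference must not exceed max_jaccard (a toy proxy for a BLAST check)."""
    ref = kmer_set(reference.upper(), k)

    def check(seq: str) -> bool:
        kmers = kmer_set(seq.upper(), k)
        union = ref | kmers
        if not union:
            return False
        return len(ref & kmers) / len(union) <= max_jaccard

    return check


def screen(candidates: Iterable[str], filters: List[Filter]) -> List[str]:
    """Keep candidates that pass every filter, preserving input order."""
    return [s for s in candidates if all(f(s) for f in filters)]


# Tiny made-up sequences; a real run would use ~5.4 kb genomes.
reference = "ACGT" * 5
candidates = [
    reference,                # identical to the template, so not novel
    "ACGTNNNNACGTNNNNACGT",   # contains non-ACGT characters
    "ACG",                    # far too short
    "AACCGGTTAACCGGTTAACC",   # right length, shares no 4-mers with template
]
filters = [valid_dna, length_near(20, 0.1), novel_versus(reference, 4, 0.5)]
survivors = screen(candidates, filters)
```

Composing filters as plain predicates keeps each check independent, so a stronger predictor could be swapped in for any toy check without changing `screen`.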