🤖 AI Summary
SynthonGPT is a new Transformer-based drug-discovery LLM that claims to eliminate generative "hallucinations" by learning and operating directly in reaction-plus-synthon space. During training, a known molecule is encoded and the decoder is taught to predict a plausible reaction plus the corresponding synthons (building blocks); at inference, the model assembles molecules only from real, pre-existing Chemspace synthons and reaction templates rather than inventing arbitrary SMILES. The result is GPT-style molecular generation in which every proposal is synthetically grounded, combinatorial, orderable from suppliers and, by design, free of non-synthesizable, hallucinated outputs. The demo also highlights a search through a "Freedom Space" of 160 billion molecules and says the approach scales to trillions without storing an enumerated database.
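To make the mechanism concrete, here is a minimal sketch of the constrained-decoding idea: the decoder's logits are masked so that each generation step can only emit a synthon ID that actually exists for the chosen reaction template. Everything here is an illustrative assumption; the `REACTIONS` table, vocabulary size, and IDs are invented for this sketch and are not SynthonGPT's actual data or API.

```python
import torch

# Hypothetical vocabularies (assumptions for illustration only): a reaction
# template maps each of its slots to the set of real synthon IDs (e.g.,
# Chemspace building blocks) that are chemically compatible with that slot.
REACTIONS = {"amide_coupling": [{101, 102, 103}, {201, 202}]}
VOCAB_SIZE = 300  # made-up decoder vocabulary size

def constrained_step(logits: torch.Tensor, allowed_ids: set[int]) -> int:
    """Mask the decoder's logits so only real, pre-existing synthons can be emitted."""
    mask = torch.full_like(logits, float("-inf"))
    mask[torch.tensor(sorted(allowed_ids))] = 0.0  # unmask allowed synthon IDs
    return int(torch.argmax(logits + mask))        # greedy pick among allowed IDs

def decode(reaction: str, step_logits: list[torch.Tensor]) -> list[int]:
    """Greedy decode: one synthon per reaction slot, never a free-form SMILES."""
    slots = REACTIONS[reaction]
    return [constrained_step(l, allowed) for l, allowed in zip(step_logits, slots)]

# Example: random logits stand in for a trained decoder's per-step output.
torch.manual_seed(0)
fake_logits = [torch.randn(VOCAB_SIZE) for _ in REACTIONS["amide_coupling"]]
print(decode("amide_coupling", fake_logits))  # always returns real synthon IDs
```

Because the mask sets every out-of-pool token to negative infinity, the model cannot emit a molecule fragment that does not exist in the supplier catalog, which is the sense in which hallucination is ruled out by construction rather than filtered after the fact.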
For the AI/ML and cheminformatics communities this is significant because it directly couples generative modeling with practical synthesis constraints, tackling a major pain point of molecule generators: producing candidates that cannot actually be made. Key technical takeaways: a Transformer encoder-decoder trained on reaction/synthon pairs; decoding constrained to known synthons and reactions (Chemspace); and no need for an exhaustive enumerated molecule database (in contrast with systems like CHEESE), which improves scalability. Practical caveats include dependence on Chemspace coverage and reaction-rule quality, plus the usual need for downstream experimental validation; full access is gated and provided on request.
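The scalability claim follows from simple combinatorics: if each reaction stores only its synthon pools, the reachable chemical space is the sum over reactions of the product of pool sizes, with no molecule ever enumerated. A toy calculation (the reaction names and pool sizes below are made-up numbers, not Chemspace figures):

```python
from math import prod

# Assumed synthon pool sizes per slot for a few two- and three-component
# reactions. The reachable space is sum over reactions of the product of
# slot sizes; only the synthons themselves are ever stored.
reaction_pools = {
    "amide_coupling": [120_000, 95_000],
    "suzuki":         [80_000, 60_000],
    "ugi_3cr":        [40_000, 35_000, 25_000],
}

reachable = sum(prod(sizes) for sizes in reaction_pools.values())
stored = sum(sum(sizes) for sizes in reaction_pools.values())
print(f"{reachable:.2e} reachable molecules from {stored:,} stored synthons")
```

A few hundred thousand stored building blocks already span tens of trillions of assemblable molecules, which is why no enumerated database is needed.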