207: Semantic Design of de novo Genes with Evo

Update: 2025-11-24
Description

Episode 207: Semantic Design of de novo Genes with Evo


In this episode of PaperCast Base by Base, we explore how a genomic language model called Evo can use genomic context to design entirely new DNA sequences that encode functional genes and multi-component defence systems.


Study Highlights:
Researchers trained the Evo genomic language model on long prokaryotic and phage DNA sequences and used genomic neighbourhoods as prompts to autocomplete new genes whose functions mirror those of their neighbours. They experimentally validated Evo-designed toxin–antitoxin systems and type III toxin–antitoxin modules, discovering novel protein toxins, protein antitoxins and RNA antitoxins that strongly modulate bacterial survival despite low or absent sequence similarity to natural proteins. Using prompts from anti-CRISPR operons, they generated diverse anti-CRISPR proteins that block SpCas9 activity and protect cells from phage infection, including candidates that cannot be confidently assigned to any known protein family. Finally, they scaled this semantic design strategy to build SynGenome, a public resource of more than 120 billion base pairs of Evo-generated DNA organised by gene ontology and domain annotations to enable function-guided exploration across many biological pathways.


Conclusion:
This work shows that genomic language models can move beyond imitating nature, using semantic relationships in genomes to design de novo functional genes and systems that expand the sequence space available for protein engineering and synthetic biology.


Music:
Enjoy the music based on this article at the end of the episode.


Reference:
Merchant AT, King SH, Nguyen E, Hie BL. Semantic design of functional de novo genes from a genomic language model. Nature. 2025. https://doi.org/10.1038/s41586-025-09749-7


License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/


Support:
Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00


Official website: https://basebybase.com


Castos player: https://basebybase.castos.com


On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics.


Chapters


  • (00:00:00) - A New Way to Design New Genomes
  • (00:05:45) - Artificial Intelligence's challenge to protein design
  • (00:11:08) - Uncovering the genome's hidden secrets
  • (00:11:53) - The Secret Life of Genes

Gustavo Barra