Information Theory for Language Models: Jack Morris

Update: 2025-07-02
Description

Our last AI PhD grad student feature was Shunyu Yao, who happened to focus on Language Agents for his thesis and immediately went to work on them for OpenAI. Our pick this year is Jack Morris, who bucks the “hot” trends by -not- working on agents, benchmarks, or VS Code forks, and is instead known for his work on the information-theoretic understanding of LLMs, starting from embedding models and latent space representations (always close to our heart).

Jack has the unusual combination of doing underrated research while still being able to explain it well to a mass audience, so we felt this was a good opportunity to do a different kind of episode: going through the greatest hits of a high-profile AI PhD and relating them to questions from AI Engineering.

Papers and references mentioned

  • AI grad school: https://x.com/jxmnop/status/1933884519557353716

  • A new type of information theory: https://x.com/jxmnop/status/1904238408899101014

  • Embeddings

    • Text Embeddings Reveal (Almost) As Much As Text: https://arxiv.org/abs/2310.06816

    • Contextual document embeddings: https://arxiv.org/abs/2410.02525

    • Harnessing the Universal Geometry of Embeddings: https://arxiv.org/abs/2505.12540

  • Language models

    • GPT-style language models memorize 3.6 bits per param: https://x.com/jxmnop/status/1929903028372459909

    • Approximating Language Model Training Data from Weights: https://arxiv.org/abs/2506.15553

      • https://x.com/jxmnop/status/1936044666371146076

  • LLM Inversion

  • "There Are No New Ideas In AI.... Only New Datasets"

    • https://x.com/jxmnop/status/1910087098570338756

    • https://blog.jxmo.io/p/there-are-no-new-ideas-in-ai-only

  • misc reference: https://junyanz.github.io/CycleGAN/

For others hiring AI PhDs, Jack also wanted to shout out Zach Nussbaum, his coauthor on Nomic Embed: Training a Reproducible Long Context Text Embedder.
