Beyond Transformers: The Next Wave of AI Architectures and LLM Engineering with Maxime Labonne

Update: 2025-02-07

Description

In this episode, we sit down with Maxime Labonne, Head of Post-Training and Senior Staff Machine Learning Scientist at Liquid AI, to explore the evolving landscape of LLM engineering: Liquid AI's next-generation foundation models, automated benchmarking, model optimization, and the shift beyond Transformer architectures.


Key Takeaways

LLM Engineering is Evolving Rapidly

- Success in LLM engineering requires strong software engineering skills, expertise in fine-tuning, inference optimization, and deployment knowledge.

- As AI systems grow more complex, LLM Ops is becoming just as critical as MLOps in ensuring scalable, production-ready AI pipelines.

- The field is increasingly specialized, with roles focusing on inference, optimization, deployment, and fine-tuning techniques.

Transformer Architectures are Being Replaced

- State-space models (SSMs) and hybrid architectures are emerging as powerful alternatives, offering improved memory efficiency, inference speed, and scalability.

- Leading AI labs, including OpenAI, DeepSeek, and ByteDance, are moving away from the traditional Transformer model.

- Merging multiple fine-tuned models can combine specialized capabilities (e.g., math + coding) while reducing compute costs.
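The memory-efficiency argument for state-space models mentioned above can be made concrete with a toy sketch: a linear SSM carries a fixed-size hidden state through the sequence, so memory stays constant as context grows, unlike a Transformer KV cache that stores keys and values for every past token. This is a minimal scalar illustration under assumed toy matrices, not Liquid AI's actual architecture.

```python
def ssm_scan(xs, A, B, C):
    """Run a toy single-input/single-output linear SSM over a sequence.

    Recurrence: h_t = A @ h_{t-1} + B * x_t,  y_t = C @ h_t.
    A is a d x d state matrix, B and C are length-d vectors (plain Python
    lists; illustrative only). The state h has fixed size d regardless of
    sequence length -- the key contrast with a growing Transformer KV cache.
    """
    d = len(B)
    h = [0.0] * d  # fixed-size state, independent of len(xs)
    ys = []
    for x in xs:
        # RHS uses the previous h before reassignment
        h = [sum(A[i][j] * h[j] for j in range(d)) + B[i] * x for i in range(d)]
        ys.append(sum(C[i] * h[i] for i in range(d)))
    return ys
```

Because each step only touches the fixed-size state, inference memory is O(d) rather than O(sequence length), which is the efficiency property the takeaway refers to.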
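The simplest form of the model merging described above is linear weight averaging across fine-tuned checkpoints (tools like MergeKit, linked below, support this and more advanced methods such as SLERP). The sketch below is a hypothetical toy version using plain dicts in place of real tensor state dicts.

```python
def merge_linear(state_dicts, weights):
    """Linearly average parameters from several fine-tuned checkpoints.

    state_dicts: list of {param_name: list-of-floats} mappings (toy stand-in
    for real model state dicts); weights: one coefficient per model,
    typically summing to 1. Illustrative only, not MergeKit's API.
    """
    if len(state_dicts) != len(weights):
        raise ValueError("need one weight per model")
    merged = {}
    for name in state_dicts[0]:
        n = len(state_dicts[0][name])
        merged[name] = [
            sum(w * sd[name][i] for w, sd in zip(weights, state_dicts))
            for i in range(n)
        ]
    return merged

# e.g. blend a math-tuned and a coding-tuned checkpoint 50/50:
# merged = merge_linear([math_model, coding_model], [0.5, 0.5])
```

Because merging happens purely on weights, it combines specialized capabilities without any additional training compute, which is the cost argument in the bullet above.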

Agentic AI Workflows are Promising But Still Immature

- Current Agentic AI frameworks lack standardization, leading to inconsistent performance in real-world applications.

Fine-tuning Should be Used Selectively

- Many organizations fine-tune unnecessarily, when RAG or preference alignment would be a better, lower-cost alternative.

- Distilled models are gaining traction for being faster, cheaper, and easier to integrate while preserving reasoning capabilities.
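The RAG-before-fine-tuning point can be illustrated with a toy retrieve-then-prompt loop: instead of baking knowledge into weights, relevant documents are fetched at query time and injected into the prompt. The retriever below uses naive word overlap purely for illustration (real systems use embedding similarity), and `build_prompt` is a hypothetical helper, not any specific framework's API.

```python
def retrieve(query, docs, k=2):
    """Rank docs by word overlap with the query (toy retriever;
    production RAG uses embedding-based vector search)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    # Inject retrieved context into the prompt instead of fine-tuning it in.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Updating the document store is cheap and immediate, which is why retrieval is often the lower-cost alternative to fine-tuning for knowledge injection.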

LLM Engineering Careers are Rapidly Expanding

- The demand for specialists in inference optimization, fine-tuning, and model deployment is growing, with new roles emerging in model evaluation and LLM Ops.

- Future-proofing AI systems means designing architectures that can easily swap models and adapt to new AI innovations. 

References and Resources Mentioned:

- Maxime Labonne's LLM Course on GitHub https://github.com/mlabonne/llm-course

- Maxime Labonne's Published Articles on His Blog https://mlabonne.github.io/blog/

- The LLM Engineer's Handbook by Maxime Labonne https://www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072

- Quantizing Deep Neural Networks https://arxiv.org/abs/1609.07061

- Unsloth AI https://unsloth.ai/

- Open LLM Leaderboard https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

- Chatbot Arena by LMSys https://chat.lmsys.org/

- OpenHands GitHub Repository https://github.com/All-Hands-AI/OpenHands

- Speculative Decoding https://arxiv.org/abs/2211.17192

- Hugging Face's Implementation of Speculative Decoding https://huggingface.co/blog/whisper-speculative-decoding

- Graph Neural Networks Using Python on GitHub https://github.com/mlabonne/graph-neural-networks

- Liquid AI Benchmarks https://www.liquid.ai/benchmarks

- MergeKit GitHub Repository https://github.com/arcee-ai/mergekit

- Liquid AI Playground https://www.liquid.ai/playground

- ByteDance's Emory Architecture Paper https://arxiv.org/abs/2201.10005

- Understanding the Key-Value Cache in Transformers https://arxiv.org/abs/2006.14939

- Hugging Face Transformers Library https://github.com/huggingface/transformers

- OpenRouter https://openrouter.ai/

- Maxime Labonne's Twitter https://twitter.com/maximelabonne

- Maxime Labonne's LinkedIn https://www.linkedin.com/in/maximelabonne
