Episode 253 - The Future of Voice? Exploring Gemini 2.5's TTS Model

Update: 2025-08-29
Description

In this episode of Two Voice Devs, Mark and Allen dive into the new experimental Text-to-Speech (TTS) model in Google's Gemini 2.5. They explore its capabilities, from single-speaker to multi-speaker audio generation, and discuss how it's a significant leap from the old days of SSML. They also touch on how this new technology can be integrated with LangChainJS to create more dynamic and natural-sounding voice applications. Is this the return of voice as the primary interface for AI?


[00:00:00] Introduction
[00:00:45] Google's new experimental TTS model for Gemini
[00:01:55] Demo of single-speaker TTS in Google's AI Studio
[00:03:05] Code walkthrough for single-speaker TTS
[00:04:30] Lack of fine-grained control compared to SSML
[00:05:15] Using text cues to shape the TTS output
[00:06:20] Demo of multi-speaker TTS with a script
[00:09:50] Code walkthrough for multi-speaker TTS
[00:11:30] The model is tuned for TTS, not general conversation
[00:12:10] Using a separate LLM to generate a script for the TTS model
[00:13:30] Code walkthrough of the two-function approach with LangChainJS
[00:16:15] LangChainJS integration details
[00:19:00] Is Speech Markdown still relevant?
[00:21:20] Latency issues with the current TTS model
[00:22:00] Caching strategies for TTS
[00:23:30] Voice as the natural UI for AI
[00:25:30] Outro


#Gemini #TTS #VoiceAI #VoiceFirst #AI #Google #LangChainJS #LLM #Developer #Podcast

Mark and Allen