Voice-to-Voice Foundation Models
Description
Alan Cowen is the cofounder and CEO of Hume, a company building voice-to-voice foundation models. Hume recently raised a $50M Series B from Union Square Ventures, Nat Friedman, Daniel Gross, and others.
Alan's favorite book: 1984 (Author: George Orwell)
(00:01) Introduction
(00:06) Defining Voice-to-Voice Foundation Models
(01:26) Historical Context: Handling Voice and Speech Understanding
(03:54) Emotion Detection in Voice AI Models
(04:33) Training Models to Recognize Human Emotion in Speech
(07:19) Cultural Variations in Emotional Expressions
(09:00) Semantic Space Theory in Emotion Recognition
(12:11) Limitations of Basic Emotion Categories
(15:50) Recognizing Blended Emotional States
(20:15) Objectivity in Emotion Science
(24:37) Practical Aspects of Deploying Voice AI Systems
(28:17) Real-Time System Constraints and Latency
(31:30) Advancements in Voice AI Models
(32:54) Rapid-Fire Round
--------
Where to find Prateek Joshi:
Newsletter: https://prateekjoshi.substack.com
Website: https://prateekj.com
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19
Twitter: https://twitter.com/prateekvjoshi