Sesame AI Creates Lifelike AI Voices, Garners Praise On Social Media

It’s hard to break through the clutter in the increasingly crowded AI space at the moment, but every now and then, a product manages to do just that.

Sesame has been garnering plenty of interest over the last few days with its AI-generated voices. The company has created a research preview that allows users to speak with two AI agents, Maya and Miles. The conversations feel quite natural, with the AI agents adding in several human-like features, such as pauses, breaths, and filler words like “like” and “you know”. The content of the speech doesn’t feel clunky either, and comes across as quite human-like.

Several users have reacted to Sesame’s AI agents on X. “Man, sesame’s voice model is absolutely insane. You have to try this demo,” wrote Shopify CEO Tobi Lutke. “Absolutely astonishing voice ai demo. The whole site experience is (fire emoji),” wrote Vercel CEO Guillermo Rauch. Several people called it the “Her” moment, referring to the 2013 movie about an AI agent.

Sesame says its aim is to “bring the computer to life”. “We believe in a future where computers are lifelike. They will see, hear, and collaborate with us the way we’re used to. A natural human voice is key to unlocking this future,” it says. Sesame says it’ll build a personal companion, which it has already demoed to users. But it also intends to get into hardware and build a lightweight eyewear device, which will be designed to be worn all day and give access to the AI companion.

Sesame was founded by Brendan Iribe, Ankit Kumar and Ryan Brown, and has offices in San Francisco, New York and Bellevue. It has raised its Series A from Andreessen Horowitz. “Sesame is built around the simple, but non-obvious, idea that the answer isn’t in the screens of AR glasses; it’s in the audio,” says Andreessen Horowitz. “To date, the emotional flatness of AI audio has been exhausting and unnatural. But if you remove the visual display from AR glasses and instead focus on an amazing audio-first AI system, you can create a computing experience that feels seamless and intuitive,” the firm adds.

There are plenty of interesting startups in the AI voice space. ElevenLabs has been making a splash with its text-to-speech features, and is now worth $4 billion. Meanwhile, companies like OpenAI and xAI, the maker of Grok, have their own voice agents that have been sounding more human-like with time. And with voice agents, like everything else in AI, getting better at the speed of light, voice might end up emerging as one of the biggest use cases of AI.
