Inworld Text-to-Speech


It's time to bring your most ambitious AI applications to life with emotionally intelligent, real-time voice AI from Inworld. You can now use Inworld's pre-built voices, or clone your own from a few seconds of audio, through Inworld's API or LiveKit's Agents framework. Inworld's multilingual, expressive voices offer state-of-the-art quality with real-time latency at roughly 5% of the cost of alternatives. You can learn more about Inworld's text-to-speech (TTS) models in the Inworld Text-to-Speech documentation.
Inworld voices and TTS models are available through a plugin for LiveKit's Agents framework. This makes it easier for developers to create previously unimaginable real-time voice experiences, such as multiplayer games, agentic NPCs, customer-facing avatars, and live training simulations, at an accessible price.
Experience Inworld TTS in a voice-driven, tabletop RPG game built by the LiveKit team. You can access the GitHub code repository to build your own voice-first, multi-agent game experience.
Ready to start building? Explore the additional documentation to get started in just minutes.
Whether you are building immersive games, voice-first apps or agentic tools, Inworld + LiveKit is designed to give you full-stack control with real-world performance.
On June 17, 2025, Inworld and LiveKit hosted a Realtime AI Meetup in San Francisco. Hundreds of voice AI developers, founders, and enthusiasts gathered to explore how text-to-speech and speech-to-text models are built, key considerations for development, and how to maximize their potential in AI agents.
“Our TTS modeling framework allows us to advance voice AI's emotional and contextual understanding and easily add new functionality, while keeping costs affordable. This helps democratize access to building high-quality, real-time voice experiences.”
Jean Wang, Inworld Head of Product.
“Latency has a direct impact on user experience, and developers must consider how to manage it effectively. Fast models help, but efficient data streaming, optimized network communication, and prompting can also be crucial. Metrics like 'first response latency' are key. Developers should consider this not only in the models they use, but also in how they implement their applications.”
Michael Solati, LiveKit Developer Advocate.
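
To make the 'first response latency' metric concrete, here is a minimal sketch of how it might be measured for any streamed audio response. It is not part of the Inworld or LiveKit APIs: `measure_first_response_latency` and `fake_tts_stream` are hypothetical names, and the fake stream only stands in for a real streaming TTS call, which you would substitute yourself.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_first_response_latency(
    stream: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, int]:
    """Consume a stream of audio chunks.

    Returns (seconds until the first chunk, total seconds, chunk count).
    The timer starts just before iteration begins, so for a lazy stream
    it includes the time the producer needs to emit its first chunk.
    """
    start = clock()
    first_latency = None
    count = 0
    for _chunk in stream:
        if first_latency is None:
            # The first chunk is what the listener perceives as the response.
            first_latency = clock() - start
        count += 1
    total = clock() - start
    if first_latency is None:
        raise ValueError("stream produced no chunks")
    return first_latency, total, count


def fake_tts_stream(n_chunks: int = 3, delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response (assumption)."""
    for _ in range(n_chunks):
        time.sleep(delay_s)  # simulate synthesis and network delay
        yield b"\x00" * 320  # placeholder audio bytes


if __name__ == "__main__":
    first, total, count = measure_first_response_latency(fake_tts_stream())
    print(f"first chunk after {first:.3f}s, {count} chunks in {total:.3f}s")
```

Tracking the first-chunk time separately from the total reflects the point in the quote: a streamed response can feel fast even when full synthesis takes longer, so both the model and the way the application streams data affect the number.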