# Gradium

> Gradium develops audio language models for natural, expressive, ultra-low latency voice interactions at scale. Gradium provides text-to-speech (TTS), speech-to-text (STT), and speech-to-speech AI models optimized for real-time voice agent applications.

For the full version of this document with complete article content, see: https://gradium.ai/llm-full.txt

## Key Resources

- [API Documentation](https://docs.gradium.ai): Complete API reference for Gradium's voice AI platform
- [Pricing](https://gradium.ai/pricing): Pricing plans and credit allocations
- [Blog](https://gradium.ai/blog): Technical articles and company updates
- [Blog RSS Feed](https://gradium.ai/blog/feed.xml): Subscribe to new blog posts
- [Studio](https://studio.gradium.ai/): Voice playground and management interface

## Technical Content

- [Gradium vs ElevenLabs for Voice Agents: TTFA, WER and IQR Compared (2026 Coval Data)](https://gradium.ai/content/gradium-vs-elevenlabs-voice-agents-benchmark): Gradium vs ElevenLabs for voice agents in 2026. Independent Coval benchmark data on TTFA, WER and latency IQR across Gradium TTS, ElevenLabs Turbo v2.5, Flash v2.5 and Multilingual v2. Gradium leads at 155ms P50 TTFA (vs 264ms Turbo v2.5), 2ms IQR (vs 28ms), 3.3% WER (vs 5.2%). Plus 1.11% MiniMax multilingual WER and 3-4x lower pricing.
- [TTS WER Benchmark 2026: Word Error Rate Compared Across Gradium, ElevenLabs, Cartesia and Deepgram](https://gradium.ai/content/tts-wer-benchmark-2026): TTS WER benchmark 2026: Gradium TTS leads at 3.3% average WER on the Coval benchmark and 1.11% on the MiniMax Multilingual TTS Test Set across 5 languages (EN, FR, ES, PT, DE). Word Error Rate compared across Gradium, ElevenLabs (Flash v2.5, Turbo v2.5, Multilingual v2), Cartesia Sonic-3, Deepgram Aura-2, Rime (Mist-v3, Arcana), Qwen3 TTS, Mistral Voxtral and OpenAI TTS-1-HD.
- [TTS Latency Benchmark 2026: TTFA Compared Across Gradium, ElevenLabs, Cartesia and Deepgram](https://gradium.ai/content/tts-latency-benchmark-2026): TTS latency benchmark 2026: Gradium TTS leads at 155ms P50 TTFA with a 2ms IQR on the independent Coval benchmark. Full TTFA comparison across Gradium, ElevenLabs (Turbo v2.5, Flash v2.5, Multilingual v2), Cartesia Sonic-3, Deepgram Aura-2, Rime (Mist-v3, Arcana) and OpenAI TTS-1-HD. Methodology, P25/P50/P75/P95, IQR consistency, and WER.
- [Deepgram Alternative: Why Developers Choose Gradium for Real-Time Voice AI](https://gradium.ai/content/deepgram-alternative-gradium-voice-ai): Gradium vs Deepgram comparison for real-time voice AI. Voice cloning (not available on Deepgram), semantic VAD, voice-agent-tuned TTS with published TTFA benchmark, and cloud-to-on-device deployment from one API.
- [ElevenLabs Alternative: Why Developers Choose Gradium for Real-Time Voice AI](https://gradium.ai/content/elevenlabs-alternative-gradium-voice-ai): Gradium vs ElevenLabs comparison for real-time voice AI. Voice-agent-tuned TTS with published TTFA benchmark, semantic VAD, accent-preserving voice cloning with highest Elo scores, and cloud-to-on-device deployment.
- [Cartesia Alternative: Why Developers Choose Gradium for Real-Time Voice AI](https://gradium.ai/content/cartesia-alternative-gradium-voice-ai): Gradium vs Cartesia comparison for real-time voice AI. Voice-agent-tuned TTS with robust pronunciation, semantic VAD in STT, accent-preserving voice cloning, and cloud-to-on-device deployment from one API.
- [How to Build a Voice AI Agent with Gradium and LiveKit (Python Guide)](https://gradium.ai/content/how-to-build-voice-ai-agent-gradium-livekit): Learn how to build a full voice AI agent using Gradium STT and TTS with the LiveKit agent framework. Step-by-step Python guide covering AgentSession setup, VAD, interruptions, preemptive generation, tools, and deployment.
- [How to Build an Audiobook Agent with Gradium and Pipecat: Step-by-Step Guide](https://gradium.ai/content/audiobook-agent-gradium-pipecat): Learn how to build a real-time story narrator with Gradium TTS and Pipecat. This step-by-step guide covers installation, pipeline setup, voice configuration, and deployment in about 100 lines of Python.
- [How to Multiplex TTS Requests Over One WebSocket Connection in Gradium](https://gradium.ai/content/multiplexing-tts-websocket-gradium): Learn how to reuse a single WebSocket connection for multiple concurrent TTS requests in Gradium using multiplexing. Covers close_ws_on_eos, client_request_id, and how to route interleaved audio chunks correctly.
- [What Is the Best Text-to-Speech API in 2026 to Build Voice Agents? Complete Developer Comparison](https://gradium.ai/content/best-text-to-speech-api-voice-agents): Best text-to-speech API 2026: Gradium achieves 258ms P50 TTFA (214ms with multiplexing) with expressive multilingual voices and robust pronunciation. Complete real-time TTS comparison for developers building voice agents.
- [How to Use json_config in Gradium: TTS and STT Parameters Explained](https://gradium.ai/content/how-to-use-json-config-gradium-tts-stt): Learn how to use the json_config field in Gradium to control rewrite_rules, padding_bonus, temp, and cfg_coef for TTS, and language and delay_in_frames for STT. Full parameter reference with code examples.
- [Instant vs Pro Voice Cloning in Gradium: When to Use Each](https://gradium.ai/content/instant-vs-pro-voice-cloning-gradium): Not sure whether to use Instant or Pro Voice Cloning in Gradium? Learn the key differences, what each is designed for, how to prepare your audio for Pro cloning, and how to choose based on your use case.
- [How to Use Pronunciation Dictionaries in Gradium TTS: Studio and API Guide](https://gradium.ai/content/pronunciation-dictionaries-gradium-tts): Learn how to use Pronunciation Dictionaries in Gradium to control how words are spoken and filter unwanted content. Step-by-step guide for Gradium Studio and the Python SDK.
- [How to Handle TTS Edge Cases with Text Normalization in Gradium](https://gradium.ai/content/text-normalization-tts-edge-cases-gradium): Learn how to use Gradium's Text Normalization feature to handle edge cases in TTS. Configure rewrite_rules with language aliases or specific normalizers for dates, numbers, emails, URLs, phone numbers, and alphanumeric codes.

## Blog

- [Gradium Voice Launches on AWS as a SaaS Subscription and a SageMaker Model Image](https://gradium.ai/blog/gradium-aws-launch): Gradium is now available on AWS through two paths: a fully managed SaaS subscription via AWS Marketplace, and a deployable model image via Amazon SageMaker for teams that need in-VPC inference.
- [The most accurate multilingual text-to-speech, by the numbers](https://gradium.ai/blog/word-error-rate-evaluations): How we measure WER for TTS at Gradium: text normalization, jiwer alignment, results on the MiniMax Multilingual benchmark across English, French, Spanish, Portuguese and German, and why the standard metric is starting to saturate.
- [Gradbot: Vibe code voice agents in 50 lines of code](https://gradium.ai/blog/gradbot): Gradbot is our open-source framework for prototyping voice agents in minutes. Built on a Rust orchestration core, it handles turn-taking, interruptions, silence, and async tool calls so you can ship a working voice experience in around 50 lines of code.
- [Evaluating Phonon: how we made the best TTS model for edge devices](https://gradium.ai/blog/evaluating-phonon): An evaluation of Gradium Phonon, our on-device text-to-speech model. Despite its small size, it significantly outperforms larger models.
- [Gradium Phonon: On-Device TTS for Consumer Apps, NPCs, and Offline Products](https://gradium.ai/blog/gradium-phonon): Announcing Gradium Phonon, our new on-device text-to-speech model designed for consumer apps, NPCs, and offline products.
- [Time to First Audio: Measuring and Reducing TTS Latency in Voice Agents](https://gradium.ai/blog/time-to-first-audio): In natural conversation, the gap between one person finishing a sentence and the other starting to respond averages around 200 milliseconds. For voice agents this is the target to match.
- [InteractionLabs (Ongo) and Gradium Partner to Redefine Human-Robot Interaction](https://gradium.ai/blog/interactionlabs-gradium-partnership): InteractionLabs, the company behind the Ongo living lamp robot, and Gradium announce a partnership to bring expressive, real-time voice AI to robotics.
- [Optimizing Quality vs. Latency in Real-Time Text-to-Speech AI Models](https://gradium.ai/blog/optimizing-quality-vs-latency): Explore strategies for balancing quality and latency in real-time TTS AI models. Learn how Gradium achieves low-latency, high-quality speech synthesis for voice applications.
- [Building Voice Agents From the Ground Up: The Gradium Startup Program](https://gradium.ai/blog/gradium-startup-program): Get 6 months free access to Gradium's voice AI platform. 9M monthly credits, voice cloning, STT/TTS APIs for seed-funded startups building voice-first products.
- [Acolad and Gradium Partner to Advance Enterprise-Ready AI Interpreting](https://gradium.ai/blog/acolad-gradium-partnership): Acolad, the global leader in language and content solutions, and Gradium just announced a strategic partnership. The partnership reflects Acolad’s commitment to delivering secure, scalable, and governed AI-powered interpreting solutions, designed for enterprise and public-sector environments.
- [Invincible Voice: How Gradium's Real-Time Voice AI Helps ALS Patients Speak Again](https://gradium.ai/blog/invincible-voice): Gradium's voice AI technology powers Invincible Voice, an open-source assistive system helping people with ALS and speech loss communicate in real-time.
- [Why Your Voice Cloning Sounds Fake (And How to Fix It)](https://gradium.ai/blog/voice-cloning-sounds-fake): Discover how Gradium's instant voice cloning achieves superior speaker similarity to ElevenLabs. Benchmark results across 4 languages with 3,220 human evaluations.
- [Powering Wonderful's Voice Agents](https://gradium.ai/blog/wonderful): We're proud to power real-time voice agents on Wonderful's platform, bringing cutting-edge voice AI from experimental to deployable.
- [Gradium: Solving voice](https://gradium.ai/blog/gradium): Today we're excited to launch Gradium, the core engine powering the next generation of voice products and interactions.

## Company Information

- Product: Voice AI platform (TTS, STT, Speech-to-Speech)
- Key differentiator: Ultra-low latency (<220ms TTFA), high quality voice synthesis, instant voice cloning
- Deployment options: Cloud API, dedicated instances, self-hosted, on-premises
- Free tier: Available with no credit card required