How do I install Gradium TTS for Pipecat?

Install the Gradium extra with uv add "pipecat-ai[gradium]" (or the equivalent pip install command). You will also need a Gradium account, an API key set as the GRADIUM_API_KEY environment variable, and a voice ID from Gradium's catalogue or a custom clone. The service is then available as GradiumTTSService from pipecat.services.gradium.

What settings can I configure on GradiumTTSService?

The current runtime-configurable settings, passed through GradiumTTSService.Settings(...), are model (the model identifier, defaulting to 'default'), voice (the voice identifier), and language (the synthesis language). These can be updated mid-conversation using TTSUpdateSettingsFrame without restarting the pipeline. Older code may use the deprecated InputParams/params= pattern, replaced by Settings/settings= as of Pipecat v0.0.105.

Why is Gradium's audio output fixed at 48kHz in Pipecat?

Gradium's TTS service always outputs audio at a 48kHz sample rate when used through Pipecat, and this is set automatically rather than being a configurable option. This is documented explicitly in the Pipecat API reference. If your pipeline's transport or downstream processing expects a different fixed sample rate, you would need to add a resampling step, since the service itself does not expose a setting to change this.

Does Gradium TTS support real-time voice switching in Pipecat?

Yes. Updating the voice setting at runtime through TTSUpdateSettingsFrame automatically disconnects and reconnects Gradium's WebSocket connection with the new voice applied, so your pipeline code does not need to manage the connection manually. This is useful for agents that switch character voices or branded voice identities mid-session.

Is Gradium the default TTS provider in the Pipecat quickstart?

No. Pipecat's official quickstart uses Cartesia for TTS and Deepgram for STT by default. Gradium is a natively supported, separately maintained service in Pipecat's provider catalogue, installable as its own extra (pipecat-ai[gradium]) alongside dozens of other STT, TTS, and LLM providers. Switching a Pipecat project to use Gradium TTS instead of the quickstart default is a configuration change, not a different framework or architecture.

Where can I find a working example of Gradium with Pipecat?

Pipecat's GitHub repository includes a runnable example using Gradium TTS, linked directly from the official Gradium service documentation page on docs.pipecat.ai. For a complete step-by-step walkthrough building a full voice application, see How to Build an Audiobook Agent with Gradium and Pipecat, which covers installation, pipeline setup, voice configuration, and deployment using this same integration.

Gradium TTS in Pipecat: Setup and Integration Guide

Gradium ships as a natively supported text-to-speech provider inside Pipecat, the open-source Python framework for building real-time voice and multimodal agents. The integration is maintained as part of Pipecat's official service catalogue, with a dedicated GradiumTTSService class, a published API reference, and a working example in the Pipecat repository.

This article covers what that integration actually provides: how to install it, what each configuration setting does, what is fixed by design, and where to find the official references if you are wiring it into a production pipeline.

What Pipecat is and where Gradium fits

Pipecat's provider model: one framework, many services

Pipecat is an open-source Python framework for building real-time voice and multimodal conversational agents. It connects speech-to-text, an LLM, and text-to-speech into a single real-time pipeline, and handles the surrounding plumbing: audio transport, turn-taking, and interruption detection. Rather than building its own speech models, Pipecat orchestrates external services through a common interface. It ships installable extras for dozens of providers across STT, TTS, and LLM categories, including Deepgram, ElevenLabs, Cartesia, Hume, and Gradium, each maintained as a separate optional dependency.

This means choosing a TTS provider in Pipecat is a configuration decision, not an architectural one. The same pipeline structure, audio transport, and turn-taking logic stay in place regardless of which TTS service is plugged in.

Where Gradium sits in that ecosystem

Gradium is one of these natively supported services, installable as a Pipecat extra (pipecat-ai[gradium]) and exposed through GradiumTTSService, a class that follows the same configuration pattern as every other TTS service in the framework. Gradium maintains close integration ties with Pipecat: the service is documented directly on docs.pipecat.ai, with a dedicated API reference page and a runnable example shipped in the official Pipecat GitHub repository.

For a step-by-step build using Gradium with Pipecat, see How to Build an Audiobook Agent with Gradium and Pipecat, which walks through a complete voice application using this same service.

Setting up GradiumTTSService in Pipecat

Installation and basic configuration

Installing the Gradium extra pulls in the dependencies needed to run GradiumTTSService:

uv add "pipecat-ai[gradium]"

Before using the service, you need a Gradium account, an API key generated from the Gradium dashboard, and a voice ID, either selected from Gradium's voice catalogue or created as a custom clone. The API key is read from the GRADIUM_API_KEY environment variable.

A minimal setup looks like this:

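import os

from pipecat.services.gradium import GradiumTTSService

# Read the API key from the environment rather than hard-coding it.
tts = GradiumTTSService(
    api_key=os.getenv("GRADIUM_API_KEY"),
    settings=GradiumTTSService.Settings(
        voice="_6Aslh2DxfmnRLmP",  # voice ID from the catalogue or a custom clone
    ),
)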
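One practical note on the setup above: os.getenv returns None when the variable is unset, so a missing key may only surface later as a connection or authentication failure. A minimal sketch of a fail-fast check follows. The helper name require_env is hypothetical and is not part of Pipecat or Gradium.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the pipeline")
    return value


# Hypothetical usage: validate the key once at startup, then pass it to the service.
# api_key = require_env("GRADIUM_API_KEY")
```

Calling this once at startup turns a misconfigured environment into an immediate, readable error rather than a failure inside the running pipeline.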

The service connects to Gradium's WebSocket API endpoint, with traffic automatically routed to the nearest available region. The endpoint can be overridden to pin a specific region or a custom deployment if your infrastructure requires it.

Configurable settings: voice, model, and language

GradiumTTSService exposes its runtime-configurable options through a Settings object, which can be updated mid-conversation using TTSUpdateSettingsFrame without restarting the pipeline. The current settings are model (which model identifier to use for synthesis, defaulting to "default"), voice (the voice identifier), and language (the synthesis language).

tts = GradiumTTSService(
    api_key=os.getenv("GRADIUM_API_KEY"),
    settings=GradiumTTSService.Settings(
        model="default",
        voice="your-voice-id",
    ),
)

Pipecat's documentation notes a recent change worth flagging if you are working from older example code: the InputParams and params= pattern used in earlier versions of the service is deprecated as of Pipecat v0.0.105, replaced by the Settings and settings= pattern shown above.

What is fixed: the 48kHz output constraint

One detail worth knowing before building around this service: Gradium's TTS output through Pipecat is fixed at a 48kHz sample rate. This is set automatically and is not configurable. For WebRTC-based transports, 48kHz is a standard and well-supported rate. It is a constraint to plan for if your transport or downstream processing expects a different rate, as narrowband telephony audio at 8kHz commonly does. In that case you need a resampling step, because the service does not expose a setting to change its output rate.
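To make the resampling step concrete, here is a rough sketch of what converting 48kHz audio to 16kHz involves. It assumes 16-bit signed mono PCM in the machine's native byte order, which is an assumption about your pipeline rather than something stated in the Gradium documentation. The function name is hypothetical, and the method is a naive box-filter decimation chosen for readability, not audio quality. A production pipeline would normally use a proper resampling library instead.

```python
import array


def downsample_48k_to_16k(pcm: bytes) -> bytes:
    """Naively downsample 16-bit mono PCM from 48 kHz to 16 kHz.

    Averages each group of three samples (a crude box filter) and keeps one
    value per group. Trailing samples that do not fill a group are dropped.
    The input length must be a whole number of 16-bit samples.
    """
    samples = array.array("h")  # "h" = signed 16-bit integers
    samples.frombytes(pcm)
    out = array.array("h")
    usable = len(samples) - len(samples) % 3
    for i in range(0, usable, 3):
        out.append((samples[i] + samples[i + 1] + samples[i + 2]) // 3)
    return out.tobytes()
```

The 3:1 ratio works here only because 48000 divides evenly by 16000. Non-integer ratios need interpolation and a real low-pass filter.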

Features that matter for voice agent pipelines

Word-level timestamps

Gradium's TTS service in Pipecat provides word-level timestamps alongside the generated audio. This matters less for basic synthesis than for production features: synchronized captions, karaoke-style text highlighting, and precise alignment between spoken audio and an on-screen transcript all depend on knowing exactly when each word starts and ends in the output stream.
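As an illustration of how an application might consume word timing for karaoke-style highlighting, the sketch below looks up which word is active at a given playback time. The data shape, a list of (word, start time in seconds) pairs, is an assumption made for this example and not the documented format of Gradium's or Pipecat's timestamp output. The function name is hypothetical.

```python
from bisect import bisect_right


def active_word_index(start_times: list[float], playback_time: float) -> int:
    """Return the index of the word being spoken at playback_time.

    start_times must be sorted in ascending order. Returns -1 if playback
    has not yet reached the first word.
    """
    return bisect_right(start_times, playback_time) - 1


# Hypothetical timestamp data: (word, start time in seconds).
words = [("Hello", 0.00), ("and", 0.42), ("welcome", 0.61)]
starts = [start for _, start in words]
```

A caption renderer could call active_word_index on every UI tick and highlight words[index][0] whenever the index changes.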

Runtime voice switching

Changing the voice setting at runtime through TTSUpdateSettingsFrame automatically disconnects and reconnects the underlying WebSocket connection with the new voice configuration applied. The service handles this itself, with no manual connection management required, which matters for any agent that needs to switch character voices, languages, or branded voice identities mid-session without restarting the entire pipeline.

The service also exposes the standard Pipecat service connection events, on_connected, on_disconnected, and on_connection_error, which can be used to log connection state or trigger custom handling around the WebSocket lifecycle.

@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Gradium")

Get started

GradiumTTSService is in Pipecat's official catalogue today. Install it with uv add "pipecat-ai[gradium]", read the API reference on docs.pipecat.ai, and generate an API key at gradium.ai. For a full walkthrough, see How to Build an Audiobook Agent with Gradium and Pipecat.

Glossary

GradiumTTSService. The Pipecat service class that connects to Gradium's WebSocket text-to-speech API. Provides streaming synthesis, instant voice cloning support, word-level timestamps, and runtime-configurable voice and model settings within a Pipecat pipeline.

Pipecat extra. An optional, separately installable dependency group in the pipecat-ai Python package that adds support for a specific external service. Gradium is installed via the gradium extra (pipecat-ai[gradium]), alongside extras for dozens of other providers.

Settings object. Pipecat's pattern for exposing runtime-configurable parameters on a service, passed via a settings= constructor argument and updatable mid-conversation through frames like TTSUpdateSettingsFrame. Replaced the older InputParams/params= pattern as of Pipecat v0.0.105.

TTSUpdateSettingsFrame. A Pipecat frame type used to update a TTS service's configurable settings, such as voice or model, while a pipeline is already running, without requiring a full service restart.

Word-level timestamp. Timing metadata indicating the start and end of each word in a synthesized audio output. Used to synchronize on-screen text, captions, or transcript highlighting with spoken audio. Provided natively by Gradium's TTS service in Pipecat.