Gradium develops audio language models that deliver natural, expressive, ultra-low-latency voice interactions at scale and can perform any voice task.

© 2025 Gradium. All rights reserved.

Our Models

Try our modular speech models, each built to be exceptional alone and to integrate effortlessly into real-time voice agents.

Text-to-Speech

Convert text into natural-sounding speech

  • Expressive voices and robust pronunciation
  • Instant voice cloning
  • Streaming inference for real-time applications
  • 48 kHz audio output
  • Multilingual support: English, French, Spanish, German, and Portuguese

Speech-to-Text

Real-time speech transcription with exceptional accuracy

  • Streaming transcription with controllable latency
  • Semantic voice activity detection for smart turn-taking
  • Robust performance in noisy environments
  • Code-switching: handles speakers who change languages mid-conversation
  • Multilingual support: English, French, Spanish, German, and Portuguese