Phonon update: 1.00% WER on Seed-TTS, smaller than every model we beat
Phonon, our 100M-parameter on-device text-to-speech (TTS) model, reaches a 1.00% word error rate (WER) on the Seed-TTS English benchmark, outperforming NeuTTS Air, KaniTTS2, and NeuTTS Nano. With a fixed voice, WER drops to 0.83%, ahead of Kokoro and Magpie.
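
For readers less familiar with the metric: WER is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript of the generated audio into the reference text, divided by the number of reference words. The sketch below is only an illustration of that definition. The function name `word_error_rate` and the lowercase, whitespace-split tokenization are assumptions made for the example; the speech recognizer and text normalization used in the actual Seed-TTS evaluation are not described here and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: simple lowercase + whitespace tokenization. Real
    evaluation pipelines usually apply heavier text normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Read this way, a 1.00% WER corresponds to roughly one word error per 100 reference words, and 0.83% to roughly one per 120.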