Generate hyper-realistic speech, clone voices instantly, and design unique audio personas with the power of Qwen3-TTS
| Founded year: | 2026 |
| Country: | United States of America |
| Funding rounds: | Not set |
| Total funding amount: | Not set |
Description
Qwen3-TTS is an open-source text-to-speech (TTS) model family developed by the Qwen team at Alibaba Cloud. The website https://qwen3tts.art/ serves as the official online Qwen3-TTS platform for this technology, allowing users to try its capabilities directly in a browser. The system supports multilingual speech synthesis across 10 languages (Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, and Italian), including some dialect variations. Its main distinguishing features include:
- Zero-shot voice cloning from as little as 3 seconds of reference audio
- Voice design/creation using natural language descriptions (e.g., creating entirely new character voices from text prompts)
- Instruction-based control of emotion, prosody, speaking style, and tone
- Real-time streaming generation with low latency (reported first-packet latency around 97 ms)
- Handling of mixed-language text, complex punctuation, and special symbols
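
As a rough illustration of the constraints listed above, the following sketch checks a voice-cloning request before it is sent anywhere. It is not part of Qwen3-TTS or the website's API: the function names (`check_clone_request`, `wav_duration_seconds`, `make_silent_wav`), the two-letter language codes, and the choice of WAV input are all assumptions made for this example. Only the language list and the 3-second reference length come from the description.

```python
import io
import wave

# Languages named in the description; the two-letter codes are this sketch's
# own convention, not identifiers defined by Qwen3-TTS.
SUPPORTED_LANGUAGES = {
    "zh": "Chinese", "en": "English", "ja": "Japanese", "ko": "Korean",
    "de": "German", "fr": "French", "ru": "Russian", "pt": "Portuguese",
    "es": "Spanish", "it": "Italian",
}

# "As little as 3 seconds of reference audio", per the feature list.
MIN_REFERENCE_SECONDS = 3.0


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of an in-memory WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_clone_request(language: str, reference_wav: bytes) -> list[str]:
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language code: {language}")
    duration = wav_duration_seconds(reference_wav)
    if duration < MIN_REFERENCE_SECONDS:
        problems.append(
            f"reference audio is {duration:.1f}s; "
            f"need at least {MIN_REFERENCE_SECONDS:.0f}s"
        )
    return problems


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Build a silent 16-bit mono WAV clip, used here only as test input."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


if __name__ == "__main__":
    print(check_clone_request("en", make_silent_wav(4.0)))
    print(check_clone_request("xx", make_silent_wav(1.0)))
```

A check like this only guards the documented minimums. Whether a given clip actually produces a good clone depends on the model and the recording quality, which this sketch does not assess.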