Text to speech
Generate spoken audio from text using Valsea voice aliases. The endpoint is synchronous and returns the generated audio bytes directly.
This endpoint is OpenAI SDK compatible. Set the OpenAI client `baseURL` to `https://api.valsea.ai/v1` and use `client.audio.speech.create(...)`.
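For callers that do not use the OpenAI SDK, the same request can be sketched as a plain HTTP call. This is a minimal sketch under two assumptions not stated on this page: that the path is `/audio/speech` (the path the OpenAI SDK appends to its base URL) and that authentication uses an OpenAI-style `Authorization: Bearer` header. `buildSpeechRequest` is a hypothetical helper name.

```typescript
// Hypothetical helper: builds the raw HTTP request that the OpenAI SDK would
// send for client.audio.speech.create(...) against the Valsea base URL.
const VALSEA_BASE_URL = 'https://api.valsea.ai/v1';

interface SpeechRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

function buildSpeechRequest(apiKey: string, body: Record<string, unknown>): SpeechRequest {
  return {
    // Assumption: the endpoint lives at the OpenAI-compatible path /audio/speech.
    url: `${VALSEA_BASE_URL}/audio/speech`,
    init: {
      method: 'POST',
      headers: {
        // Assumption: OpenAI-style bearer authentication.
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (performs a network call, so it is left commented out):
// const { url, init } = buildSpeechRequest('YOUR_API_KEY', {
//   model: 'valsea-tts',
//   voice: 'valsea-neutral',
//   input: 'Hello from Valsea.',
//   language: 'english',
// });
// const res = await fetch(url, init);
// const audio = new Uint8Array(await res.arrayBuffer()); // raw audio bytes
```

Because the endpoint is synchronous, the response body is the audio itself, so there is no job ID to poll.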
Voices
Valsea exposes stable voice aliases instead of provider-specific speaker IDs. The same alias maps to the correct underlying speaker for the requested language.
| Voice | Description |
|---|---|
| `valsea-neutral` | Default balanced voice |
| `valsea-male` | Male voice |
| `valsea-female` | Female voice |
Supported languages are vietnamese and english.
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | `valsea-tts` | Yes | Public TTS model name. |
| `input` | string | Yes | Text to synthesize, up to 10,000 characters. |
| `voice` | `valsea-neutral` \| `valsea-male` \| `valsea-female` | Yes | Stable Valsea voice alias. |
| `language` | `vietnamese` \| `english` | No | Language for voice alias routing. Default: `vietnamese`. |
| `response_format` | `mp3` \| `wav` | No | Audio response format. Default: `mp3`. |
| `speed` | number | No | Playback speed from 0.25 to 4. Default: 1. |
| `normalization` | `no` \| `basic` \| `advanced` | No | Text normalization level. Default: `basic`. |
| `audio_quality` | integer | No | Bitrate/quality value. Supported: 32, 64, 128, 192, 256, 320. Default: 64. |
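The parameter table above can be mirrored in client code so that invalid requests fail before they reach the network. The following is a sketch only: the type and function names (`SpeechParams`, `withDefaults`) are hypothetical, and the server may count input characters differently from JavaScript's `string.length`.

```typescript
type Voice = 'valsea-neutral' | 'valsea-male' | 'valsea-female';
type Language = 'vietnamese' | 'english';
type ResponseFormat = 'mp3' | 'wav';
type Normalization = 'no' | 'basic' | 'advanced';

// Hypothetical client-side shape of the request body described in the table.
interface SpeechParams {
  model: 'valsea-tts';
  input: string;
  voice: Voice;
  language?: Language;
  response_format?: ResponseFormat;
  speed?: number;
  normalization?: Normalization;
  audio_quality?: number;
}

const AUDIO_QUALITIES = [32, 64, 128, 192, 256, 320];

// Fills in the documented defaults and checks the documented limits.
function withDefaults(params: SpeechParams): Required<SpeechParams> {
  const full: Required<SpeechParams> = {
    model: params.model,
    input: params.input,
    voice: params.voice,
    language: params.language ?? 'vietnamese',
    response_format: params.response_format ?? 'mp3',
    speed: params.speed ?? 1,
    normalization: params.normalization ?? 'basic',
    audio_quality: params.audio_quality ?? 64,
  };
  // Assumption: the 10,000-character limit is approximated by UTF-16 length.
  if (full.input.length > 10000) {
    throw new RangeError('input exceeds 10,000 characters');
  }
  if (!(full.speed >= 0.25 && full.speed <= 4)) {
    throw new RangeError('speed must be between 0.25 and 4');
  }
  if (AUDIO_QUALITIES.indexOf(full.audio_quality) === -1) {
    throw new RangeError('audio_quality must be one of ' + AUDIO_QUALITIES.join(', '));
  }
  return full;
}
```

Passing only the three required fields yields a body with `language: 'vietnamese'`, `response_format: 'mp3'`, `speed: 1`, `normalization: 'basic'`, and `audio_quality: 64`, matching the defaults in the table.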
Code examples
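```javascript
import fs from 'fs/promises';
import OpenAI from 'openai';

// Point the OpenAI client at the Valsea API.
const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.valsea.ai/v1',
});

const response = await client.audio.speech.create({
  model: 'valsea-tts',
  voice: 'valsea-neutral',
  input: 'Xin chao, day la giong noi Valsea.', // "Hello, this is the Valsea voice."
  response_format: 'mp3',
  // Valsea-specific field. The Node SDK sends unrecognized body fields as-is
  // (extra_body is a Python SDK option). In TypeScript, add // @ts-expect-error.
  language: 'vietnamese',
});

// The response body is the generated audio; write it to disk.
const buffer = Buffer.from(await response.arrayBuffer());
await fs.writeFile('speech.mp3', buffer);
```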
Billing
TTS is billed by generated audio duration, rounded up to the next whole minute. The response includes normal credit headers such as X-Credits-Used.
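The rounding rule can be illustrated with a short sketch. It assumes that a duration of exactly N minutes is billed as N minutes and that no minimum charge applies; neither is stated above. It also assumes that `X-Credits-Used` carries a plain numeric value. Both function names are hypothetical.

```typescript
// Billed minutes for a given audio duration, rounded up to a whole minute.
// Assumption: an exact multiple of 60 seconds is not rounded up further.
function billedMinutes(durationSeconds: number): number {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError('durationSeconds must be a non-negative finite number');
  }
  return Math.ceil(durationSeconds / 60);
}

// Reads the credit header from any object with a Headers-like get() method,
// such as the headers of a fetch Response.
// Assumption: the header value is a plain number.
function creditsUsed(headers: { get(name: string): string | null }): number | null {
  const raw = headers.get('X-Credits-Used');
  if (raw === null || raw.trim() === '') return null;
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}

console.log(billedMinutes(61)); // 2
console.log(billedMinutes(60)); // 1
console.log(billedMinutes(12.5)); // 1
```

With the OpenAI Node SDK, `client.audio.speech.create(...)` resolves to a fetch-style response, so `creditsUsed(response.headers)` is one way to read the header after a call.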