Voicebot realtime flow
Build a realtime voicebot by composing Valsea speech-to-text, OpenAI-compatible chat completions, and Valsea realtime text-to-speech.
This page describes a composable flow rather than one standalone endpoint. Use the session gateway when you need the room-based realtime session API documented on the Voicebot realtime sessions page.
Flow
- Capture microphone audio in the browser.
- Send each detected utterance to POST /v1/audio/transcriptions.
- Send the transcript to POST /v1/chat/completions with your OpenAI-compatible conversation messages.
- Stream the assistant text to WS /v1/realtime/tts for low-latency speech playback.
- Keep your conversation state server-side so browser clients never need to hold long-lived secrets.
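The steps above can be sketched as a single turn handler. This is a minimal illustration, not part of the Valsea API: `handleUtterance` and the injected `transcribe`, `complete`, and `speak` functions are hypothetical names standing in for the three calls described in the sections that follow.

```javascript
// Hypothetical orchestration of one voicebot turn. The three step functions
// are injected so the sketch stays independent of any particular client code.
async function handleUtterance(audio, history, steps) {
  // 1. Speech to text: POST /v1/audio/transcriptions
  const transcript = await steps.transcribe(audio);

  // 2. Chat completion: POST /v1/chat/completions with the running history
  const messages = [...history, { role: 'user', content: transcript }];
  const reply = await steps.complete(messages);

  // 3. Realtime TTS: WS /v1/realtime/tts
  await steps.speak(reply);

  // Return the updated history so the caller can persist it server-side.
  return [...messages, { role: 'assistant', content: reply }];
}
```

Returning a new history array instead of mutating the input keeps the state easy to store per session on the server, in line with the last step of the flow.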
Speech To Text
Use the standard transcription endpoint for utterance-level recognition.
cURL
curl -X POST https://api.valsea.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F file=@utterance.wav \
  -F model=valsea-transcribe \
  -F language=vietnamese
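Because the rest of the flow is shown in JavaScript, the same upload can be sketched with `fetch` and `FormData`. The function name `transcribeUtterance`, the injectable `fetchImpl` parameter, and the `text` field read from the response are assumptions made for illustration; check the transcription reference for the actual response shape.

```javascript
// Sketch: upload one utterance to the transcription endpoint.
// `fetchImpl` is injectable for testing; it defaults to the global fetch.
async function transcribeUtterance(audioBlob, apiKey, fetchImpl = fetch) {
  const form = new FormData();
  form.append('file', audioBlob, 'utterance.wav');
  form.append('model', 'valsea-transcribe');
  form.append('language', 'vietnamese');

  const response = await fetchImpl('https://api.valsea.ai/v1/audio/transcriptions', {
    method: 'POST',
    // Do not set Content-Type: fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Transcription failed with HTTP ${response.status}`);
  }
  const result = await response.json();
  // Assumption: the transcript is returned in a `text` field.
  return result.text;
}
```

This mirrors the form fields of the cURL request one to one, so either form can be used for utterance-level recognition.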
Conversation Turn
Use chat completions for the voicebot response.
JavaScript
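// Intended to run server-side so the API key never reaches the browser.
// transcriptText is the transcript returned by the speech-to-text step.
const response = await fetch('https://api.valsea.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'valsea-fast',
    messages: [
      { role: 'system', content: 'You are a helpful voice assistant.' },
      { role: 'user', content: transcriptText },
    ],
  }),
});
if (!response.ok) {
  throw new Error(`Chat completion failed with HTTP ${response.status}`);
}
const completion = await response.json();
const assistantText = completion.choices[0].message.content;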
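A voicebot usually spans several turns, so the `messages` array has to grow with the conversation. The following is a minimal sketch of server-side conversation state; `createConversation` and the `maxTurnMessages` limit are hypothetical helpers, not part of the Valsea API.

```javascript
// Sketch: server-side conversation state for a single voicebot session.
function createConversation(systemPrompt, maxTurnMessages = 20) {
  const turns = []; // user and assistant messages, oldest first

  return {
    addUser(text) {
      turns.push({ role: 'user', content: text });
    },
    addAssistant(text) {
      turns.push({ role: 'assistant', content: text });
    },
    // Build the `messages` array for the next chat completion request:
    // the system prompt followed by the most recent turns.
    messages() {
      const recent = turns.slice(-maxTurnMessages);
      return [{ role: 'system', content: systemPrompt }, ...recent];
    },
  };
}
```

Call `addUser(transcriptText)` before each request, pass `messages()` as the request's `messages` field, and call `addAssistant(assistantText)` once the reply arrives. Capping the number of turns is one simple way to keep requests bounded; other trimming strategies would work equally well.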
Realtime Text To Speech
Open a realtime TTS WebSocket, authenticate, and stream assistant text to receive audio chunks.
JavaScript
const socket = new WebSocket('wss://api.valsea.ai/v1/realtime/tts');

// Authenticate with the first message after the socket opens.
socket.addEventListener('open', () => {
  socket.send(JSON.stringify({ token: 'YOUR_API_KEY' }));
});

socket.addEventListener('message', (event) => {
  // Binary frames carry audio.
  if (event.data instanceof Blob) {
    // Queue audio chunks for playback.
    return;
  }

  // Text frames carry JSON control messages.
  const message = JSON.parse(event.data);
  if (message.type === 'successful-authentication') {
    // Once authenticated, request speech for the assistant text.
    socket.send(
      JSON.stringify({
        type: 'speak',
        text: assistantText,
        voice: 'valsea-default',
        audio_format: 'mp3',
      }),
    );
  }
});
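The "queue audio chunks for playback" step can be sketched as a small promise chain that plays chunks strictly in arrival order. `createPlaybackQueue` and `playChunk` are hypothetical names: `playChunk` is assumed to return a promise that resolves when one chunk has finished playing, and how a chunk is decoded and played depends on the audio format and on your player.

```javascript
// Sketch: play audio chunks strictly in arrival order.
// `playChunk` is a hypothetical function that returns a promise which
// resolves when the given chunk has finished playing.
function createPlaybackQueue(playChunk) {
  let tail = Promise.resolve();

  return {
    enqueue(chunk) {
      // Chain onto the previous chunk so playback never overlaps.
      tail = tail
        .then(() => playChunk(chunk))
        .catch((err) => console.error('Playback failed:', err));
      return tail;
    },
  };
}
```

In the message handler, the binary branch would then call `queue.enqueue(event.data)` before returning. A failed chunk is logged and skipped so that later chunks still play.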
Related APIs
| API | Docs |
|---|---|
| Transcription | /docs/api/transcribe |
| Chat completions | POST /v1/chat/completions |
| Realtime TTS | /docs/realtime-tts |
| Realtime sessions | /docs/api/voicebot-livekit |