Voicebot realtime flow

Build a realtime voicebot by composing three Valsea APIs: speech-to-text, OpenAI-compatible chat completions, and realtime text-to-speech.

This page describes a composable flow rather than one standalone endpoint. Use the session gateway when you need the room-based realtime session API documented on the Voicebot realtime sessions page.

Flow

1. Capture microphone audio in the browser.
2. Send each detected utterance to POST /v1/audio/transcriptions.
3. Send the transcript to POST /v1/chat/completions with your OpenAI-compatible conversation messages.
4. Stream the assistant text to WS /v1/realtime/tts for low-latency speech playback.
5. Keep your conversation state server-side so browser clients never need to hold long-lived secrets.
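
The steps above can be sketched as a single turn handler. This is a minimal illustration, not part of the Valsea API: `runTurn` and the injected `transcribe`, `complete`, and `speak` helpers are hypothetical names standing in for the three calls described on this page.

```javascript
// Sketch of one voicebot turn. `transcribe`, `complete`, and `speak` are
// hypothetical helpers that wrap the three calls described on this page.
async function runTurn(utteranceAudio, history, { transcribe, complete, speak }) {
  // Speech to text: utterance audio in, transcript text out.
  const transcriptText = await transcribe(utteranceAudio);
  if (!transcriptText || !transcriptText.trim()) {
    // Nothing was recognized, so leave the conversation unchanged.
    return history;
  }

  // Conversation turn: send the history plus the new user message.
  const messages = [...history, { role: 'user', content: transcriptText }];
  const assistantText = await complete(messages);

  // Text to speech: hand the reply to the realtime TTS connection.
  await speak(assistantText);

  // Return the updated state so the caller can store it server-side.
  return [...messages, { role: 'assistant', content: assistantText }];
}
```

Passing the three helpers in as arguments keeps the turn logic independent of transport details, which also makes it easy to exercise with stubs.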

Speech To Text

Use the standard transcription endpoint for utterance-level recognition.

cURL

curl -X POST https://api.valsea.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F file=@utterance.wav \
  -F model=valsea-transcribe \
  -F language=vietnamese
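
For a server written in JavaScript, the same request might look like the sketch below. It mirrors the form fields from the cURL example; the helper names are hypothetical, and the `text` field read from the response is an assumption about the response shape rather than something this page documents.

```javascript
// Build the multipart body with the same fields as the cURL example.
function buildTranscriptionForm(audioBlob, language = 'vietnamese') {
  const form = new FormData();
  form.append('file', audioBlob, 'utterance.wav');
  form.append('model', 'valsea-transcribe');
  form.append('language', language);
  return form;
}

// Hypothetical helper: sends one utterance and returns the transcript.
async function transcribeUtterance(audioBlob, apiKey) {
  const response = await fetch('https://api.valsea.ai/v1/audio/transcriptions', {
    method: 'POST',
    // Do not set Content-Type here: fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audioBlob),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed: HTTP ${response.status}`);
  }
  // Assumption: the response JSON carries the transcript in a `text` field.
  const result = await response.json();
  return result.text;
}
```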

Conversation Turn

Use chat completions for the voicebot response.

JavaScript

const response = await fetch('https://api.valsea.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'valsea-fast',
    messages: [
      { role: 'system', content: 'You are a helpful voice assistant.' },
      { role: 'user', content: transcriptText },
    ],
  }),
});

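// Surface HTTP errors here instead of failing later on a missing `choices` field.
if (!response.ok) {
  throw new Error(`Chat completion failed: HTTP ${response.status}`);
}

const completion = await response.json();
const assistantText = completion.choices[0].message.content;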
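
The example above sends a single user message. For multi-turn conversations, the flow recommends keeping state server-side; one way to do that is sketched below. The in-memory `sessions` map, the `MAX_MESSAGES` cap, and the function names are all hypothetical choices for illustration, and a production service would likely use a persistent store.

```javascript
const SYSTEM_PROMPT = { role: 'system', content: 'You are a helpful voice assistant.' };
const MAX_MESSAGES = 20; // Hypothetical cap on stored messages per session.
const sessions = new Map(); // sessionId -> array of { role, content }

// Record one message and keep the stored window bounded.
function appendTurn(sessionId, role, content) {
  const history = sessions.get(sessionId) ?? [];
  history.push({ role, content });
  // Drop the oldest messages once over the cap, and keep dropping until
  // the window opens on a user message so turns stay paired.
  while (
    history.length > MAX_MESSAGES ||
    (history.length > 0 && history[0].role !== 'user')
  ) {
    history.shift();
  }
  sessions.set(sessionId, history);
}

// Build the `messages` array for the next chat completions request.
function buildMessages(sessionId) {
  return [SYSTEM_PROMPT, ...(sessions.get(sessionId) ?? [])];
}
```

Call `appendTurn` once with the transcript and once with the assistant reply, and pass `buildMessages(sessionId)` as the `messages` field of the request body.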

Realtime Text To Speech

Open a realtime TTS WebSocket, authenticate, and stream assistant text to receive audio chunks.

JavaScript

const socket = new WebSocket('wss://api.valsea.ai/v1/realtime/tts');

socket.addEventListener('open', () => {
  // Authenticate before sending any text. Avoid embedding a long-lived key in client code.
  socket.send(JSON.stringify({ token: 'YOUR_API_KEY' }));
});

socket.addEventListener('message', (event) => {
  if (event.data instanceof Blob) {
    // Queue audio chunks for playback.
    return;
  }

  const message = JSON.parse(event.data);
  if (message.type === 'successful-authentication') {
    socket.send(
      JSON.stringify({
        type: 'speak',
        text: assistantText,
        voice: 'valsea-default',
        audio_format: 'mp3',
      }),
    );
  }
});
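
The "queue audio chunks" step in the handler above could be filled in along these lines. This is a sketch under stated assumptions: `AudioChunkBuffer` and `playBlob` are hypothetical names, the `audio/mpeg` type assumes the `mp3` format requested in the `speak` message, and this page does not specify how the end of an utterance is signaled, so the caller decides when to call `flush()`.

```javascript
// Collects binary chunks from the socket until the caller decides to play them.
class AudioChunkBuffer {
  constructor(mimeType = 'audio/mpeg') {
    this.mimeType = mimeType;
    this.chunks = [];
  }

  // Store one chunk, for example `event.data` from the message handler.
  push(chunk) {
    this.chunks.push(chunk);
  }

  // Combine everything received so far into one Blob and reset the buffer.
  flush() {
    const audio = new Blob(this.chunks, { type: this.mimeType });
    this.chunks = [];
    return audio;
  }
}

// Browser-only: play a Blob through an <audio> element.
function playBlob(audioBlob) {
  const url = URL.createObjectURL(audioBlob);
  const player = new Audio(url);
  // Release the object URL once playback has finished.
  player.addEventListener('ended', () => URL.revokeObjectURL(url));
  return player.play();
}
```

Buffering a whole utterance before playback is the simplest approach but adds latency; playing chunks as they arrive needs a streaming-capable player and is outside the scope of this sketch.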

Related APIs

API	Docs
Transcription	`/docs/api/transcribe`
Chat completions	`POST /v1/chat/completions`
Realtime TTS	`/docs/realtime-tts`
