# Valsea

> Valsea is an AI-powered speech intelligence API platform built for Southeast Asia. It provides enterprise-grade speech-to-text transcription, translation, annotation, clarification, conversion, formatting, and sentiment analysis through simple REST endpoints. The transcription endpoint is OpenAI SDK compatible. It also offers real-time live transcription via WebSocket.

- Base URL: https://api.valsea.ai
- Authentication: Bearer token via `Authorization: Bearer YOUR_API_KEY` header
- Alternative auth: `X-API-Key: YOUR_API_KEY` header
- API keys start with the `vl_` prefix
- Get API keys from the dashboard at https://valsea.ai/en/dashboard/api-keys
- All endpoints consume credits from your account balance
- All POST endpoints support two response formats: `json` (default, minimal) and `verbose_json` (extended with metadata)

## Docs

- [Introduction](https://valsea.ai/docs): Overview of the platform — transcription, translation, annotation, clarification, conversion, formatting, sentiment analysis, and live transcription
- [Quickstart](https://valsea.ai/docs/quickstart): Get started in under 5 minutes with code examples in cURL, JavaScript, Python, and the OpenAI SDK
- [Authentication](https://valsea.ai/docs/authentication): How to get and use API keys, plus an overview of the credit system
- [Live Transcription via WebSocket](https://valsea.ai/docs/realtime): Real-time streaming speech-to-text via WebSocket at `wss://api.valsea.ai/v1/realtime`

## API Reference

- [API Overview](https://valsea.ai/docs/api): Summary of all 7 REST endpoints with base URL and authentication details

### POST /v1/audio/transcriptions — Transcribe Audio

- [Full documentation](https://valsea.ai/docs/api/transcribe)
- OpenAI SDK compatible — use `baseURL: 'https://api.valsea.ai/v1'` with the official OpenAI TypeScript or Python client
- Content-Type: multipart/form-data
- Parameters:
  - `file` (binary, required): Audio file — WAV, MP3, M4A, FLAC, OGG, WEBM. Max 10 MB, max 1 hour
  - `model` (string, required): Always `valsea-transcribe`
  - `language` (string, required): One of singlish, english, chinese, korean, vietnamese, thai, indonesian, malay, filipino, tamil, khmer
  - `response_format` (string, optional): `json` or `verbose_json`. Default: `json`
  - `enable_correction` (boolean, optional): Enable grammar/language correction. Default: true
  - `enable_tags` (boolean, optional): Enable semantic tagging. Default: true
- Response (json): `{ "text": "..." }`
- Response (verbose_json): `{ "text", "raw_transcript", "detected_languages", "corrections", "semantic_tags", "annotated_text", "clarified_text" }`
- OpenAI TypeScript SDK example: `client.audio.transcriptions.create({ file, model: 'valsea-transcribe', language: 'english' })`
- OpenAI Python SDK example: `client.audio.transcriptions.create(file=open("audio.wav", "rb"), model="valsea-transcribe", language="english", extra_body={"enable_correction": True})`

### POST /v1/translations — Translate Text

- [Full documentation](https://valsea.ai/docs/api/translate)
- Content-Type: application/json
- Parameters:
  - `model` (string, required): Always `valsea-translate`
  - `text` (string, required): Text to translate
  - `source` (string, optional): Source language name (e.g. "english", "chinese") or "auto" for detection. Default: "auto"
  - `target` (string, required): Target language name (e.g. "chinese", "vietnamese", "thai")
  - `response_format` (string, optional): `json` or `verbose_json`. Default: `json`
- Response (json): `{ "translated_text": "..." }`
- Response (verbose_json): `{ "translated_text", "source_language", "target_language" }`

### POST /v1/annotations — Annotate Text

- [Full documentation](https://valsea.ai/docs/api/annotate)
- Content-Type: application/json
- Annotate text with language corrections and semantic tags. Useful for processing colloquial transcriptions.
- Parameters:
  - `model` (string, required): Always `valsea-annotate`
  - `text` (string, required): Text to annotate
  - `response_format` (string, optional): `json` or `verbose_json`. Default: `json`
  - `language` (string, optional): Language hint (e.g. "singlish")
  - `enable_correction` (boolean, optional): Enable grammar/language correction
  - `enable_tags` (boolean, optional): Enable semantic tagging
- Response (json): `{ "text", "annotations" }`
- Response (verbose_json): `{ "text", "raw_text", "accent_corrections", "semantic_tags", "annotated_text", "annotations" }`

### POST /v1/clarifications — Clarify Text

- [Full documentation](https://valsea.ai/docs/api/clarify)
- Content-Type: application/json
- Transform noisy or colloquial transcriptions into clear, grammatically correct text.
- Parameters:
  - `model` (string, required): Always `valsea-clarify`
  - `text` (string, required): Text to clarify
  - `response_format` (string, optional): `json` or `verbose_json`. Default: `json`
  - `language` (string, optional): Language hint (e.g. "singlish")
- Response (json): `{ "clarified_text": "..." }`
- Response (verbose_json): `{ "clarified_text", "raw_text", "explanations", "revisions" }`

### POST /v1/conversions — Convert Annotated Text

- [Full documentation](https://valsea.ai/docs/api/convert)
- Content-Type: application/json
- Convert annotated text (with semantic tags) into clean, readable text.
- Parameters:
  - `model` (string, required): Always `valsea-convert`
  - `annotated_text` (string, required): Annotated text to convert
  - `response_format` (string, optional): `json` or `verbose_json`. Default: `json`
  - `semantic_tags` (array, optional): Array of `{ tag, phrase, meaning }` objects
- Response (json): `{ "converted_text": "..." }`
- Response (verbose_json): `{ "converted_text", "annotated_text" }`

### POST /v1/formatting — Format Transcript

- [Full documentation](https://valsea.ai/docs/api/format)
- Content-Type: application/json
- Transform a raw transcript into structured documents like meeting minutes, sales summaries, action items, subtitles, and more.
- Parameters:
  - `model` (string, required): Always `valsea-format`
  - `transcript` (string, required): The transcript to format
  - `output_type` (string, required): One of meeting_minutes, sales_summary, service_log, subtitles, email_summary, action_items, key_quotes, interview_notes
  - `response_format` (string, optional): `json` or `verbose_json`. Default: `json`
  - `semantic_tags` (array, optional): Array of `{ tag, phrase, meaning }` objects
  - `stream` (boolean, optional): Enable streaming response

### POST /v1/sentiment — Analyze Sentiment

- [Full documentation](https://valsea.ai/docs/api/sentiment)
- Content-Type: application/json
- Analyze the overall sentiment and emotional tone of a transcript.
- Parameters:
  - `model` (string, required): Always `valsea-sentiment`
  - `transcript` (string, required): The transcript to analyze
  - `response_format` (string, optional): `json` or `verbose_json`. Default: `json`
  - `semantic_tags` (array, optional): Array of `{ tag, phrase, meaning }` objects
- Response (json): `{ "sentiment": "positive"|"neutral"|"negative", "confidence": 0.0-1.0 }`
- Response (verbose_json): `{ "sentiment", "confidence", "reasoning" }`

## Live Transcription (WebSocket)

- [Full documentation](https://valsea.ai/docs/realtime)
- Endpoint: `wss://api.valsea.ai/v1/realtime`
- Auth: Pass `Authorization: Bearer YOUR_API_KEY` or `X-API-Key: YOUR_API_KEY` as WebSocket headers
- Audio format: Raw PCM 16-bit, 16 kHz, mono, sent as base64

### Client messages

- `session.start`: Initialize session with `{ type: "session.start", model: "valsea-rtt", language: "english", enable_correction: true, hint_text: "" }`
- `audio.append`: Send audio chunk `{ type: "audio.append", audio: "BASE64_PCM16_DATA" }`
- `audio.commit`: Signal end of speech segment `{ type: "audio.commit" }`
- `session.stop`: End session gracefully `{ type: "session.stop" }`

### Server messages

- `session.created`: Connection established with sessionId and supported_models
- `session.ready`: Engine ready for audio
- `transcript.partial`: Intermediate mutable result `{ text, isFinal: false, timestampMs }`
- `transcript.final`: Stable committed segment `{ text, raw_text, isFinal: true, timestampMs, corrections }`
- `error`: Error event `{ code, message }`

### Transcript semantics

- Keep partial text in temporary UI state only (it may change).
- Persist only final segments to transcript history/storage.
- Clear the current partial once a final event is received.
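The message flow above can be sketched as a minimal Python client. This is an illustrative sketch, not official client code: it assumes the third-party `websockets` package, and the helper names (`start_message`, `append_message`, `stream`) are our own, not part of the API.

```python
import base64
import json

# Placeholder key for illustration; real keys start with the vl_ prefix.
API_KEY = "vl_your_api_key"
URL = "wss://api.valsea.ai/v1/realtime"


def start_message(language: str = "english") -> str:
    """Build the session.start payload documented above."""
    return json.dumps({
        "type": "session.start",
        "model": "valsea-rtt",
        "language": language,
        "enable_correction": True,
        "hint_text": "",
    })


def append_message(pcm16_chunk: bytes) -> str:
    """Wrap a raw PCM 16-bit / 16 kHz / mono chunk as an audio.append event."""
    return json.dumps({
        "type": "audio.append",
        "audio": base64.b64encode(pcm16_chunk).decode("ascii"),
    })


async def stream(chunks):
    """Send audio chunks, then collect final transcript segments."""
    import websockets  # third-party: pip install websockets

    async with websockets.connect(
        URL,
        # Keyword is `extra_headers` on older websockets releases.
        additional_headers={"Authorization": f"Bearer {API_KEY}"},
    ) as ws:
        await ws.send(start_message())
        for chunk in chunks:
            await ws.send(append_message(chunk))
        await ws.send(json.dumps({"type": "audio.commit"}))
        await ws.send(json.dumps({"type": "session.stop"}))

        partial = ""   # temporary UI state only: it may still change
        finals = []    # persist only these to transcript history
        async for raw in ws:
            event = json.loads(raw)
            if event["type"] == "transcript.partial":
                partial = event["text"]
            elif event["type"] == "transcript.final":
                finals.append(event["text"])
                partial = ""  # clear the partial once a final arrives
            elif event["type"] == "error":
                raise RuntimeError(f"{event['code']}: {event['message']}")
        return finals
```

The partial/final handling mirrors the transcript semantics above: partials overwrite each other in place, and only final segments are appended to history.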
## Available Models

- `valsea-transcribe`: Audio-to-text transcription with accent-aware models
- `valsea-translate`: Text translation between 50+ languages
- `valsea-annotate`: Text annotation with semantic tags and corrections
- `valsea-clarify`: Transform colloquial/noisy text into clear standard text
- `valsea-convert`: Convert annotated text into clean readable output
- `valsea-format`: Format transcripts into meeting minutes, summaries, etc.
- `valsea-sentiment`: Sentiment and emotional tone detection
- `valsea-rtt`: Real-time transcription via WebSocket

## Supported Languages (Transcription)

singlish, english, chinese, korean, vietnamese, thai, indonesian, malay, filipino, tamil, khmer

## Errors

All endpoints return these standard error codes:

- 401: Missing or invalid API key
- 402: Insufficient credits
- 413: Audio file too large or too long (transcription only — max 10 MB, max 1 hour)

## Optional

- [Dashboard](https://valsea.ai/en/dashboard): Manage API keys, view analytics, test endpoints in the interactive playground
- [API Playground](https://valsea.ai/en/dashboard/playground): Interactive playground to test all endpoints with your API key
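As an illustrative sketch of the REST calling convention and the error codes above (stdlib-only Python; the helper names are ours, not part of any official client):

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://api.valsea.ai"

# Standard error codes documented above.
ERROR_MESSAGES = {
    401: "Missing or invalid API key",
    402: "Insufficient credits",
    413: "Audio file too large or too long (max 10 MB, max 1 hour)",
}


def build_translation_request(api_key: str, text: str, target: str,
                              source: str = "auto") -> urllib.request.Request:
    """Build a POST /v1/translations request with Bearer auth."""
    body = json.dumps({
        "model": "valsea-translate",
        "text": text,
        "source": source,
        "target": target,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/translations",
        data=body,
        headers={
            # X-API-Key: YOUR_API_KEY works as an alternative auth header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def translate(api_key: str, text: str, target: str) -> str:
    """Call the endpoint and map the documented error codes to exceptions."""
    request = build_translation_request(api_key, text, target)
    try:
        with urllib.request.urlopen(request) as response:
            # Default json response format: { "translated_text": "..." }
            return json.loads(response.read())["translated_text"]
    except urllib.error.HTTPError as err:
        detail = ERROR_MESSAGES.get(err.code, err.reason)
        raise RuntimeError(f"HTTP {err.code}: {detail}") from err
```

The other JSON endpoints follow the same shape: swap the path, the `model` value, and the request fields, and read the documented response key.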