Transcribe audio

    Upload an audio file and receive a text transcription. The audio language is selected with the language parameter; the model is always valsea-transcribe. Language correction and semantic tagging are optional and enabled by default.

    Form Fields

    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | file | string (binary) | Yes | The audio file to transcribe (WAV, MP3, M4A, FLAC, OGG, WEBM). |
    | model | valsea-transcribe | Yes | The transcription model to use. Always use valsea-transcribe. Language routing is determined by the language parameter. |
    | language | string | Yes | Language of the audio. 100+ languages supported. See the full supported languages list below. |
    | response_format | json or verbose_json | No | Response verbosity level. Default: json |
    | enable_correction | boolean | No | Enable language/grammar correction on the transcript. Default: true |
    | enable_tags | boolean | No | Enable semantic tagging of the transcript. Default: true |
    | diarize | boolean | No | Enable speaker diarization. Requires response_format=verbose_json. Default: false |
    | diarization_min_speakers | integer | No | Minimum expected speaker count for diarization. Default: 2 |
    | diarization_max_speakers | integer | No | Maximum expected speaker count for diarization. Default: 6 |

    Speaker Diarization

    Set diarize=true to add speaker labels to the verbose response. Speaker diarization uses a Google Speech-to-Text diarization pass and returns speaker IDs on word objects plus grouped utterances.

    Speaker diarization is only available with response_format=verbose_json. Requests with diarize=true and response_format=json are rejected with 400 because minimal output cannot include speaker-level metadata.
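
    Because the server rejects this combination with 400, a client can catch it before uploading any audio. The sketch below is a hypothetical helper (build_form_fields is not part of any Valsea SDK) that assembles the text form fields and fails early on the invalid combination. Only the field names and the rule itself come from this page.

```python
def build_form_fields(language, response_format="json", diarize=False):
    """Assemble the non-file form fields for a transcription request.

    Hypothetical client-side helper; only the field names come from the docs.
    """
    if response_format not in ("json", "verbose_json"):
        raise ValueError("response_format must be 'json' or 'verbose_json'")
    if diarize and response_format != "verbose_json":
        # The API would answer 400 here, so fail before uploading the audio.
        raise ValueError("diarize=true requires response_format=verbose_json")
    return {
        "model": "valsea-transcribe",  # the only documented model value
        "language": language,
        "response_format": response_format,
        "diarize": "true" if diarize else "false",
    }
```

    Calling build_form_fields("english", "json", diarize=True) raises ValueError, while the same call with "verbose_json" returns a dict ready to send alongside the file part.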

    Speaker diarization bills at 2x the normal transcription credit cost. If you also use paid rate-limit bypass, the request bills at 4x normal transcription credits.

    The API playgrounds include a multi-speaker sample clip that automatically sets response_format=verbose_json, diarize=true, and a 2-speaker range for quick testing.

    Rate-Limit Bypass

    If you need to exceed the transcription RPM limit for a request, you can bypass the rate-limit check by sending one of these opt-in flags:

    • Header: X-Bypass-Rate-Limit: true
    • Query parameter: bypass_rate_limit=true
    • Query parameter: bypassRateLimit=true

    Bypass applies only to the per-organization rate-limit check. Authentication, credit checks, upload size limits, and concurrent upload limits still apply. Requests using this bypass are billed at 2x the normal transcription credit cost and include X-Credits-Multiplier: 2 in the response headers. If speaker diarization is also enabled, the multiplier becomes 4.
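
    The billing rules stated on this page (2x for diarization, 2x for bypass, 4x for both) amount to multiplying two independent factors. The following sketch shows that arithmetic together with the opt-in header; credit_multiplier and request_headers are hypothetical names chosen for illustration, not part of the API.

```python
def credit_multiplier(diarize=False, bypass_rate_limit=False):
    """Expected credit multiplier for one request, per the rules above."""
    multiplier = 1
    if diarize:
        multiplier *= 2  # speaker diarization bills at 2x
    if bypass_rate_limit:
        multiplier *= 2  # paid rate-limit bypass bills at 2x
    return multiplier


def request_headers(api_key, bypass_rate_limit=False):
    """Build request headers, adding the bypass flag only when asked."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if bypass_rate_limit:
        headers["X-Bypass-Rate-Limit"] = "true"
    return headers
```

    A client could compare credit_multiplier(...) against the X-Credits-Multiplier response header to confirm that a request was billed as expected.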

    Code Examples

    This endpoint is OpenAI SDK compatible. You can use the official OpenAI client libraries by pointing them at the Valsea API base URL.

    curl -X POST https://api.valsea.ai/v1/audio/transcriptions \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -F "file=@audio.wav" \
      -F "model=valsea-transcribe" \
      -F "language=english" \
      -F "response_format=verbose_json" \
      -F "diarize=true"
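
    The same request can also be assembled with Python's standard library alone. This is a sketch under stated assumptions: encode_multipart and build_transcription_request are hypothetical helpers, and the application/octet-stream content type on the file part is a guess at what the server accepts. An OpenAI client library or a general HTTP package would handle the multipart encoding for you.

```python
import uuid
import urllib.request

API_URL = "https://api.valsea.ai/v1/audio/transcriptions"


def encode_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        )
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\n'.encode()
    )
    # Assumption: a generic binary content type is acceptable for audio.
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(api_key, file_name, file_bytes, language):
    """Build (but do not send) a diarized verbose_json transcription request."""
    fields = {
        "model": "valsea-transcribe",
        "language": language,
        "response_format": "verbose_json",
        "diarize": "true",
    }
    body, content_type = encode_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
```

    Passing the returned object to urllib.request.urlopen sends the request; the response body can then be parsed with json.load.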
    

    Response

    Minimal (json)

    | Field | Type | Description |
    | --- | --- | --- |
    | text | string | The transcribed text. |

    Verbose (verbose_json)

    | Field | Type | Description |
    | --- | --- | --- |
    | text | string | The transcribed text. |
    | raw_transcript | string | Original transcript before corrections. |
    | detected_languages | array | Languages detected in the audio. |
    | corrections | array | List of corrections applied. |
    | semantic_tags | array | Semantic tags extracted from the text. |
    | annotated_text | string | Text with inline annotations. |
    | clarified_text | string | Clarified version of the transcript. |
    | words | array | Word-level timing and zero-based speaker labels when diarize=true. |
    | utterances | array | Contiguous speaker turns with speaker, timing, transcript, and words when diarize=true. |
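
    A diarized response is usually consumed one speaker turn at a time. The sketch below renders one line per utterance from an already-parsed verbose_json response. The exact key names inside each utterance are not specified on this page, so "speaker" and "transcript" are assumptions, and the sample data is made up for illustration.

```python
def format_utterances(response):
    """Render one line per speaker turn from a verbose_json response dict."""
    lines = []
    # "utterances" may be absent or empty when diarize was not enabled.
    for utterance in response.get("utterances") or []:
        # Key names "speaker" and "transcript" are assumed, not confirmed.
        speaker = utterance["speaker"]  # zero-based speaker label
        lines.append(f"Speaker {speaker}: {utterance['transcript']}")
    return lines


# Made-up sample shaped like the assumed response, for illustration only.
sample = {
    "text": "Hi there. Hello.",
    "utterances": [
        {"speaker": 0, "transcript": "Hi there."},
        {"speaker": 1, "transcript": "Hello."},
    ],
}
print("\n".join(format_utterances(sample)))
```

    If the real payload uses different key names, only the two lookups inside the loop need to change.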

    Errors

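    | Status | Description |
    | --- | --- |
    | 400 | Invalid request, such as diarize=true with response_format=json |
    | 401 | Missing or invalid API key |
    | 402 | Insufficient credits |
    | 429 | Rate limit exceeded |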
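
    Client code typically maps these statuses to an action. The following sketch is one possible policy, not documented behavior: classify_error is a hypothetical helper, the 400 entry reflects the diarization rule described earlier on this page, and treating only 429 as retryable is an assumption.

```python
ERROR_MEANINGS = {
    400: "Invalid request, e.g. diarize=true with response_format=json",
    401: "Missing or invalid API key",
    402: "Insufficient credits",
    429: "Rate limit exceeded",
}


def classify_error(status):
    """Return (message, retryable) for an HTTP status from the endpoint."""
    message = ERROR_MEANINGS.get(status, f"Unexpected status {status}")
    # Assumption: only 429 is worth retrying after a pause; the other
    # statuses need a changed request, key, or credit balance first.
    retryable = status == 429
    return message, retryable
```

    For a 429, a caller can either wait and retry or resend with one of the rate-limit bypass flags, accepting the higher credit cost.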

    List of Languages

    Southeast Asia

    Singlish — singlish

    Indonesian — indonesian

    Malaysian — malay

    Vietnamese — vietnamese

    Thai — thai

    Javanese — javanese

    Lao — lao

    Khmer — khmer

    Filipino/Tagalog — filipino

    English (Philippines) — english-philippines

    Middle East & North Africa

    Arabic — arabic arabic-algeria arabic-bahrain arabic-egypt arabic-israel arabic-jordan arabic-kuwait arabic-lebanon arabic-mauritania arabic-morocco arabic-oman arabic-palestine arabic-qatar arabic-saudi arabic-syria arabic-tunisia arabic-uae arabic-yemen

    Persian — persian

    Hebrew — hebrew

    Amharic — amharic

    Wolof — wolof

    Sub-Saharan Africa

    Swahili — swahili swahili-ke

    Afrikaans — afrikaans

    Akan — akan

    Bemba — bemba

    Fulani — fulani

    Ga — ga

    Hausa — hausa

    Igbo — igbo

    Luganda — luganda

    Xhosa — xhosa

    Yoruba — yoruba

    Zulu — zulu

    Northern Sotho — northern-sotho

    Nyankole — nyankole

    Oromo — oromo

    Pidgin — pidgin

    Kinyarwanda — kinyarwanda

    Shona — shona

    Sotho — sotho

    Tswana — tswana

    Twi — twi

    South Asia

    Bengali — bengali-bd bengali-in

    Hindi — hindi

    Gujarati — gujarati

    Kannada — kannada

    Malayalam — malayalam

    Marathi — marathi

    Nepali — nepali

    Oriya — oriya

    Punjabi — punjabi

    Sinhala — sinhala

    Tamil — tamil

    Telugu — telugu

    Assamese — assamese

    East Asia

    Chinese — cantonese chinese chinese-simplified chinese-traditional

    Covers a wide range of regional accents and dialects, including those from Anhui, Beijing, Chongqing, Gansu, Guangdong, Guangxi, Guizhou, Hangzhou, Hebei, Henan, Hong Kong, Hubei, Jiangsu, Jianghuai, Jiaoliao, Jilu, Lanyin, Nanjing, Northeast, Ningxia, Shaanxi, Shandong, Sichuan, Taiwan, Tianjin, and Yunnan.

    Japanese — japanese

    Korean — korean

    Mongolian — mongolian

    Central & Western Asia

    Azerbaijani — azerbaijani

    Armenian — armenian

    Georgian — georgian

    Kazakh — kazakh

    Kurdish — kurdish

    Kyrgyz — kyrgyz

    Uzbek — uzbek

    Turkish — turkish

    Western Europe

    English — english english-au english-gb english-in english-philippines english-us

    French — french french-ca

    Spanish — spanish spanish-es spanish-mexico spanish-us

    Portuguese — portuguese portuguese-br

    German — german

    Dutch — dutch

    Italian — italian

    Catalan — catalan

    Galician — galician

    Asturian — asturian

    Basque — basque

    Welsh — welsh

    Luxembourgish — luxembourgish

    Maltese — maltese

    Northern Europe

    Danish — danish

    Finnish — finnish

    Icelandic — icelandic

    Norwegian — norwegian

    Swedish — swedish

    Estonian — estonian

    Latvian — latvian

    Lithuanian — lithuanian

    Eastern Europe & Balkans

    Bulgarian — bulgarian

    Croatian — croatian

    Czech — czech

    Hungarian — hungarian

    Macedonian — macedonian

    Polish — polish

    Romanian — romanian

    Russian — russian

    Serbian — serbian

    Slovak — slovak

    Slovenian — slovenian

    Ukrainian — ukrainian

    Albanian — albanian

    Greek — greek

    Pacific & Oceania

    Maori — maori
