Doubao Speech (豆包语音) API Documentation

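Original documentation

  • Documentation home: https://www.volcengine.com/docs/6561/162929
  • Console: https://console.volcengine.com/speech/app

If anything in this document is incomplete, consult the links above for the latest content.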

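Product lineup

Doubao Speech comes in two product generations: the large-model edition (2.0) and the classic edition (1.0). The large-model edition is recommended.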


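Speech synthesis (TTS)

Large-model speech synthesis 2.0

| Interface | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| Unidirectional streaming HTTP V3 | POST /api/v3/tts/unidirectional | seed-tts-2.0 | stream-http.md |
| Unidirectional streaming WebSocket V3 | WSS /api/v3/tts/unidirectional | seed-tts-2.0 | stream-ws.md |
| Bidirectional streaming WebSocket V3 | WSS /api/v3/tts/bidirection | seed-tts-2.0 | duplex-ws.md |
| Asynchronous long text | POST /api/v3/tts/async/submit | seed-tts-2.0-concurr | async.md |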

Original documentation links:

  • https://www.volcengine.com/docs/6561/1598757 (unidirectional streaming HTTP V3)
  • https://www.volcengine.com/docs/6561/1719100 (unidirectional streaming WebSocket V3)
  • https://www.volcengine.com/docs/6561/1329505 (bidirectional streaming WebSocket V3)
  • https://www.volcengine.com/docs/6561/1330194 (asynchronous long text)
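The TTS 2.0 endpoint list above can be held as a small lookup structure so that client code picks a path and Resource ID by name. The following is a minimal sketch, not an official client: the paths and Resource IDs are copied from the list, while the host name, the short keys such as `unidirectional-http`, and the helper `endpoint_url` are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: the source lists only paths, not a host. Replace as needed.
ASSUMED_HOST = "openspeech.bytedance.com"


@dataclass(frozen=True)
class Endpoint:
    scheme: str       # "https" for POST endpoints, "wss" for WebSocket endpoints
    path: str         # path exactly as listed in this document
    resource_id: str  # Resource ID exactly as listed in this document


# Keys are hypothetical short names; values come from the TTS 2.0 list.
TTS_V2 = {
    "unidirectional-http": Endpoint("https", "/api/v3/tts/unidirectional", "seed-tts-2.0"),
    "unidirectional-ws": Endpoint("wss", "/api/v3/tts/unidirectional", "seed-tts-2.0"),
    "bidirectional-ws": Endpoint("wss", "/api/v3/tts/bidirection", "seed-tts-2.0"),
    "async-long-text": Endpoint("https", "/api/v3/tts/async/submit", "seed-tts-2.0-concurr"),
}


def endpoint_url(name: str, host: str = ASSUMED_HOST) -> str:
    """Build the full URL for one of the TTS 2.0 interfaces."""
    ep = TTS_V2[name]
    return f"{ep.scheme}://{host}{ep.path}"


if __name__ == "__main__":
    for name, ep in TTS_V2.items():
        print(name, endpoint_url(name), ep.resource_id)
```

Keeping the Resource ID next to the path makes it harder to pair the asynchronous endpoint with the streaming Resource ID by mistake.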

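Classic speech synthesis 1.0

| Interface | Endpoint | Cluster | Doc |
| --- | --- | --- | --- |
| HTTP one-shot synthesis | POST /api/v1/tts | volcano_tts | http.md |
| WebSocket streaming | WSS /api/v1/tts/ws_binary | volcano_tts | websocket.md |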

Original documentation links:

  • https://www.volcengine.com/docs/6561/79820 (HTTP interface)
  • https://www.volcengine.com/docs/6561/79821 (WebSocket interface)
  • https://www.volcengine.com/docs/6561/97465 (parameter reference)

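Premium long-text speech synthesis

| Interface | Endpoint | Doc |
| --- | --- | --- |
| Asynchronous long text | POST /api/v1/long_tts/submit | long-tts.md |

Original documentation links: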

  • https://www.volcengine.com/docs/6561/1096680

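Speech recognition (ASR)

Large-model speech recognition 2.0

| Interface | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| Streaming recognition WebSocket | WSS /api/v3/sauc/bigmodel | volc.bigasr.sauc.duration | streaming.md |
| Recorded-file recognition (standard edition) | POST /api/v3/asr/bigmodel/submit | volc.bigasr.auc.duration | file-standard.md |
| Recorded-file recognition (fast edition) | POST /api/v3/asr/bigmodel_async/submit | volc.bigasr.auc.duration | file-fast.md |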

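Original documentation links:

  • https://www.volcengine.com/docs/6561/1354869 (large-model streaming speech recognition)
  • https://www.volcengine.com/docs/6561/1354868 (large-model recorded-file recognition, standard edition)
  • https://www.volcengine.com/docs/6561/1631584 (large-model recorded-file recognition, fast edition)
  • https://www.volcengine.com/docs/6561/1840838 (large-model recorded-file recognition, off-peak edition)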

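Classic speech recognition 1.0

| Interface | Endpoint | Cluster | Doc |
| --- | --- | --- | --- |
| One-sentence recognition | POST /api/v1/asr | volcengine_input_common | one-sentence.md |
| Streaming recognition | WSS /api/v2/asr | volcengine_streaming_common | streaming.md |
| Recorded-file recognition (standard edition) | POST /api/v1/asr/submit | volc.megatts.default | file-standard.md |
| Recorded-file recognition (fast edition) | POST /api/v1/asr/async/submit | volc.megatts.default | file-fast.md |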

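Original documentation links:

  • https://www.volcengine.com/docs/6561/104897 (one-sentence recognition)
  • https://www.volcengine.com/docs/6561/80816 (streaming speech recognition)
  • https://www.volcengine.com/docs/6561/80818 (recorded-file recognition, standard edition)
  • https://www.volcengine.com/docs/6561/80820 (recorded-file recognition, fast edition)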

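Voice cloning

| Interface | Endpoint | Cluster | Doc |
| --- | --- | --- | --- |
| Training submission | POST /api/v1/mega_tts/audio/upload | volcano_icl | api.md |
| Status query | POST /api/v1/mega_tts/status | volcano_icl | api.md |
| Voice activation | POST /api/v1/mega_tts/audio/activate | volcano_icl | api.md |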

Original documentation links:

  • https://www.volcengine.com/docs/6561/1305191 (voice cloning API)
  • https://www.volcengine.com/docs/6561/1829010 (voice cloning ordering and usage guide)
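The three voice cloning endpoints suggest a submit, check, activate sequence. The sketch below only illustrates that ordering, and the ordering itself is an assumption read off the interface names rather than something this document states. The function `run_voice_clone`, the step names, and the injected `post` callable are hypothetical; request and response fields are not described here, so payloads are passed through untouched and no status polling is attempted.

```python
from typing import Callable, Dict, List, Tuple

# Paths copied from the voice cloning list; step names are hypothetical labels.
VOICE_CLONE_STEPS: List[Tuple[str, str]] = [
    ("upload", "/api/v1/mega_tts/audio/upload"),
    ("status", "/api/v1/mega_tts/status"),
    ("activate", "/api/v1/mega_tts/audio/activate"),
]


def run_voice_clone(
    post: Callable[[str, dict], dict],
    payloads: Dict[str, dict],
) -> Dict[str, dict]:
    """Call each endpoint once, in the assumed order, and collect the responses.

    `post(path, payload)` is supplied by the caller and is expected to send
    the HTTP request (including authentication) and return the decoded reply.
    """
    results: Dict[str, dict] = {}
    for step, path in VOICE_CLONE_STEPS:
        results[step] = post(path, payloads.get(step, {}))
    return results


if __name__ == "__main__":
    # Stand-in transport that just echoes the path, so the sketch runs offline.
    def fake_post(path: str, payload: dict) -> dict:
        return {"path": path, "payload": payload}

    print(run_voice_clone(fake_post, {"upload": {"note": "placeholder"}}))
```

Injecting the transport keeps the sketch independent of any HTTP library and of the authentication scheme, both of which are covered in api.md rather than here.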

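Real-time speech large model

| Interface | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| Real-time dialogue | WSS /api/v3/realtime/dialogue | volc.speech.dialog | api.md |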

Original documentation links:

  • https://www.volcengine.com/docs/6561/1257584 (end-to-end real-time speech large model API)

Podcast synthesis

| Interface | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| WebSocket V3 | WSS /api/v3/sami/podcasttts | volc.megatts.podcast | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1668014 (podcast API, WebSocket V3 protocol)

Simultaneous interpretation

| Interface | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| WebSocket V3 | WSS /api/v3/saas/simt | volc.megatts.simt | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/xxx (simultaneous interpretation 2.0 API)

Voice Notes (语音妙记, meeting minutes)

| Interface | Endpoint | Doc |
| --- | --- | --- |
| Asynchronous submission | POST /api/v1/meeting/submit | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/xxx (Doubao Voice Notes API)

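Audio and video subtitles

| Interface | Endpoint | Doc |
| --- | --- | --- |
| Subtitle generation | POST /api/v1/subtitle/submit | subtitle.md |
| Subtitle alignment | POST /api/v1/subtitle/align | align.md |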

Original documentation links:

  • https://www.volcengine.com/docs/6561/192519 (audio and video subtitle generation)
  • https://www.volcengine.com/docs/6561/113635 (automatic subtitle alignment)

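Console management API

| Interface | Endpoint | Authentication | Doc |
| --- | --- | --- | --- |
| Large-model voice list | POST /ListBigModelTTSTimbres | AK/SK | timbre.md |
| Large-model voice list (new) | POST /ListSpeakers | AK/SK | timbre.md |
| API key management | POST /ListAPIKeys | AK/SK | apikey.md |
| Service status management | POST /ServiceStatus | AK/SK | service.md |
| Quota monitoring | POST /QuotaMonitoring | AK/SK | monitoring.md |
| Voice cloning status | POST /ListMegaTTSTrainStatus | AK/SK | voice-clone-status.md |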

Original documentation links:

  • https://www.volcengine.com/docs/6561/1770994 (ListBigModelTTSTimbres)
  • https://www.volcengine.com/docs/6561/2160690 (ListSpeakers)

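Authentication

Speech API (speech services)

The speech services use the following authentication methods:

| Method | Header | Applies to |
| --- | --- | --- |
| Access Token | Authorization: Bearer; {token} | HTTP/WebSocket V1-V2 |
| X-Api authentication | X-Api-App-Id, X-Api-Access-Key | WebSocket V3 |
| Request Body | app.token | Some HTTP interfaces |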
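The three Speech API authentication methods listed above can be sketched as small helpers that produce the headers or body field each one names. This is an illustration under stated assumptions, not the official SDK: the header names and the `Bearer; {token}` format are copied from the list, the function names are hypothetical, and reading `app.token` as a `token` field nested inside an `app` object of a JSON body is an interpretation that should be checked against auth.md.

```python
import json
from typing import Dict


def access_token_headers(token: str) -> Dict[str, str]:
    """Access Token method (HTTP/WebSocket V1-V2).

    The value format, including the semicolon after "Bearer",
    is reproduced as written in this document.
    """
    return {"Authorization": f"Bearer; {token}"}


def x_api_headers(app_id: str, access_key: str) -> Dict[str, str]:
    """X-Api method (WebSocket V3)."""
    return {"X-Api-App-Id": app_id, "X-Api-Access-Key": access_key}


def with_body_token(payload: dict, token: str) -> dict:
    """Request Body method: return a copy of `payload` with app.token set.

    Assumption: "app.token" means {"app": {"token": ...}} in the JSON body.
    """
    body = dict(payload)
    app = dict(body.get("app", {}))
    app["token"] = token
    body["app"] = app
    return body


if __name__ == "__main__":
    print(access_token_headers("YOUR_TOKEN"))
    print(x_api_headers("YOUR_APP_ID", "YOUR_ACCESS_KEY"))
    print(json.dumps(with_body_token({"request": {}}, "YOUR_TOKEN")))
```

`with_body_token` copies the payload instead of mutating it, so the same request template can be reused with different credentials.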

Console API (console services)

The console API uses Volcengine OpenAPI AK/SK signature authentication:

Authorization: HMAC-SHA256 Credential={AccessKeyId}/...

See auth.md for details.


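Quick selection

| Need | Recommended interface | Doc |
| --- | --- | --- |
| Short-text real-time synthesis | TTS 2.0 unidirectional streaming HTTP V3 | stream-http.md |
| Long-text batch synthesis | TTS 2.0 asynchronous interface | async.md |
| Real-time voice interaction | Real-time dialogue API | realtime/api.md |
| Custom voice | Voice cloning API | voice-clone/api.md |
| Real-time speech recognition | ASR 2.0 streaming | asr2.0/streaming.md |
| Recorded-file transcription | ASR 2.0 file recognition | asr2.0/file-standard.md |
| Podcast generation | Podcast API | podcast/api.md |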
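The quick-selection list above is effectively a lookup from a need to a recommended interface and its doc file. A minimal sketch of that lookup follows; the doc paths are copied from the list, while the English keys, the interface labels, and the `recommend` helper are illustrative names rather than anything defined by the service.

```python
from typing import Dict, Tuple

# need -> (recommended interface, doc file). Keys are hypothetical English labels.
QUICK_PICK: Dict[str, Tuple[str, str]] = {
    "short-text real-time synthesis": ("TTS 2.0 unidirectional streaming HTTP V3", "stream-http.md"),
    "long-text batch synthesis": ("TTS 2.0 asynchronous interface", "async.md"),
    "real-time voice interaction": ("Real-time dialogue API", "realtime/api.md"),
    "custom voice": ("Voice cloning API", "voice-clone/api.md"),
    "real-time speech recognition": ("ASR 2.0 streaming", "asr2.0/streaming.md"),
    "recorded-file transcription": ("ASR 2.0 file recognition", "asr2.0/file-standard.md"),
    "podcast generation": ("Podcast API", "podcast/api.md"),
}


def recommend(need: str) -> Tuple[str, str]:
    """Return (interface, doc) for a need; raises KeyError if the need is not listed."""
    return QUICK_PICK[need.strip().lower()]


if __name__ == "__main__":
    interface, doc = recommend("Podcast generation")
    print(f"{interface}: see {doc}")
```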