voice.go: Voice/VoiceSegment interfaces
speech.go: Speech/SpeechSegment interfaces
asr.go: ASR multiplexer and interfaces
tts.go: TTS multiplexer and interfaces
segment.go: default sentence segmentation
util.go: collectors and copy helpers
- Provider implementations:
asr_doubao_sauc.go, tts_doubao_v1.go,
tts_doubao_v2.go, tts_minimax.go
- Voice:
Voice, VoiceSegment, VoiceStream
- Speech:
Speech, SpeechSegment, SpeechStream
- ASR:
StreamTranscriber, Transcriber, ASR mux + helpers
- TTS:
Synthesizer, TTS mux + helpers
- Segmentation:
SentenceSegmenter, SentenceIterator
- Global muxes
ASRMux and TTSMux provide default routing.
- ASR uses Opus frame streams (
opusrt.FrameReader).
DefaultSentenceSegmenter splits by punctuation with a rune cap.
CollectSpeech and CopySpeech help aggregate or export streams.
Transcribe falls back to streaming when the backend does not implement
the Transcriber interface.
Revoice streams existing speech into a TTS backend via a pipe.