Audio Package

Audio processing framework for speech and multimedia applications.

Design Goals

  1. Real-time Processing: Low-latency audio mixing, encoding, and streaming
  2. Format Flexibility: Support common audio formats (PCM, Opus, MP3, OGG)
  3. Cross-platform: FFI bindings to native libraries (libopus, libsoxr, lame)
  4. Streaming-first: Designed for continuous audio streams, not just files

Architecture

graph TB
    subgraph audio["audio/"]
        subgraph row1[" "]
            pcm["pcm/<br/>- Format<br/>- Chunk<br/>- Mixer"]
            codec["codec/<br/>- opus/<br/>- mp3/<br/>- ogg/"]
            resampler["resampler/<br/>- soxr<br/>- Format<br/>- Convert"]
        end
        subgraph row2[" "]
            opusrt["opusrt/<br/>- Buffer<br/>- Realtime<br/>- OGG R/W"]
            songs["songs/<br/>- Catalog<br/>- Notes<br/>- PCM gen"]
            portaudio["portaudio/<br/>(Go only)<br/>- Stream<br/>- Device"]
        end
    end

Submodules

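| Module | Description | Go | Rust |
|--------|-------------|----|------|
| pcm/ | PCM format, chunks, mixing | | |
| codec/ | Audio codecs (Opus, MP3, OGG) | | |
| resampler/ | Sample rate conversion (soxr) | | |
| opusrt/ | Realtime Opus streaming | | ⚠️ |
| songs/ | Built-in melodies | | |
| portaudio/ | Audio I/O devices | Yes | No |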

Audio Formats

PCM Formats (Predefined)

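| Format | Sample Rate | Channels | Bit Depth |
|--------|-------------|----------|-----------|
| L16Mono16K | 16000 Hz | 1 | 16-bit |
| L16Mono24K | 24000 Hz | 1 | 16-bit |
| L16Mono48K | 48000 Hz | 1 | 16-bit |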

Codec Support

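| Codec | Encode | Decode | Container |
|-------|--------|--------|-----------|
| Opus | Yes | Yes | Raw, OGG |
| MP3 | Yes | Yes | Raw |
| OGG | N/A | N/A | Container only |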

Common Workflows

Voice Chat (Low Latency)

flowchart LR
    A[Microphone] --> B[PCM 16kHz]
    B --> C[Opus Encode]
    C --> D[Network]
    D --> E[Opus Decode]
    E --> F[Mixer]
    F --> G[Speaker]
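
The Mixer stage in this flow combines several decoded streams into one. One common way to do that for 16-bit PCM is to sum the samples and clamp the result, sketched below. `MixInt16` is a hypothetical helper written for illustration; the package's own Mixer may use a different API and a different strategy (for example, gain scaling).

```go
package main

import (
	"fmt"
	"math"
)

// MixInt16 sums equal-rate 16-bit PCM tracks sample by sample and clamps
// each sum to the int16 range, which avoids wrap-around distortion.
// Tracks shorter than the longest one are treated as silence past their end.
func MixInt16(tracks ...[]int16) []int16 {
	longest := 0
	for _, t := range tracks {
		if len(t) > longest {
			longest = len(t)
		}
	}
	out := make([]int16, longest)
	for i := range out {
		var sum int32 // wider accumulator so the sum cannot overflow
		for _, t := range tracks {
			if i < len(t) {
				sum += int32(t[i])
			}
		}
		if sum > math.MaxInt16 {
			sum = math.MaxInt16
		} else if sum < math.MinInt16 {
			sum = math.MinInt16
		}
		out[i] = int16(sum)
	}
	return out
}

func main() {
	a := []int16{1000, 30000, -30000}
	b := []int16{2000, 10000, -10000, 5}
	fmt.Println(MixInt16(a, b)) // [3000 32767 -32768 5]
}
```

Hard clamping is simple and cheap, but loud inputs will clip audibly; attenuating each track before summing is a frequent alternative.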

Speech Synthesis Playback

flowchart LR
    A[API Response<br/>Base64 MP3] --> B[MP3 Decode]
    B --> C[Resample<br/>24K→16K]
    C --> D[Mixer]
    D --> E[Speaker]
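
The resample step converts 24 kHz audio to 16 kHz, a 3:2 ratio. The sketch below illustrates the idea with plain linear interpolation. It is not how libsoxr works: soxr applies proper band-limiting filters, whereas linear interpolation will alias when downsampling. `ResampleLinear` is a hypothetical name used only for this example.

```go
package main

import (
	"fmt"
	"math"
)

// ResampleLinear converts mono 16-bit PCM from inRate to outRate by
// linearly interpolating between neighbouring input samples.
// It is a teaching sketch, not a substitute for a filtered resampler.
func ResampleLinear(in []int16, inRate, outRate int) []int16 {
	if len(in) == 0 || inRate <= 0 || outRate <= 0 {
		return nil
	}
	n := len(in) * outRate / inRate // number of output samples
	out := make([]int16, n)
	step := float64(inRate) / float64(outRate) // input samples per output sample
	for i := range out {
		pos := float64(i) * step
		j := int(pos)            // index of the sample at or before pos
		frac := pos - float64(j) // fractional distance toward the next sample
		next := j + 1
		if next >= len(in) {
			next = len(in) - 1 // hold the last sample at the end
		}
		v := float64(in[j])*(1-frac) + float64(in[next])*frac
		out[i] = int16(math.Round(v))
	}
	return out
}

func main() {
	// Six samples of a ramp at 24 kHz become four samples at 16 kHz.
	in := []int16{0, 300, 600, 900, 1200, 1500}
	fmt.Println(ResampleLinear(in, 24000, 16000)) // [0 450 900 1350]
}
```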

Audio Recording

flowchart LR
    A[PCM Stream] --> B[Opus Encode]
    B --> C[OGG Writer]
    C --> D[File]
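
When Opus packets are written into an OGG container, the stream starts with an identification header packet ("OpusHead") as defined in RFC 7845. The sketch below builds that 19-byte header for the mono or stereo case (channel mapping family 0). It illustrates the on-disk layout only; the function name is hypothetical, the pre-skip value in `main` is a placeholder, and the package's OGG writer may construct the header differently.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// OpusHead builds the 19-byte Ogg Opus identification header described in
// RFC 7845, section 5.1, for channel mapping family 0 (mono or stereo).
// preSkip is expressed in 48 kHz samples; inputSampleRate is informational.
func OpusHead(channels uint8, preSkip uint16, inputSampleRate uint32) []byte {
	h := make([]byte, 19)
	copy(h[0:8], "OpusHead")                                 // magic signature
	h[8] = 1                                                 // version
	h[9] = channels                                          // channel count
	binary.LittleEndian.PutUint16(h[10:12], preSkip)         // pre-skip
	binary.LittleEndian.PutUint32(h[12:16], inputSampleRate) // original input rate
	binary.LittleEndian.PutUint16(h[16:18], 0)               // output gain (0 dB)
	h[18] = 0                                                // channel mapping family
	return h
}

func main() {
	// 312 is only an illustrative pre-skip; a real writer would derive it
	// from the encoder's reported lookahead.
	h := OpusHead(1, 312, 16000)
	fmt.Printf("%d bytes: % x\n", len(h), h)
}
```

In a complete file this packet sits alone on the first OGG page, followed by a comment header ("OpusTags") and then the audio data pages.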

Native Dependencies

| Library | Purpose | Build System |
|---------|---------|--------------|
| libopus | Opus codec | pkg-config / Bazel |
| libsoxr | Resampling | pkg-config / Bazel |
| lame | MP3 encoding | Bazel (bundled) |
| minimp3 | MP3 decoding | Bazel (bundled) |
| libogg | OGG container | pkg-config / Bazel |
| portaudio | Audio I/O | pkg-config / Bazel |

Examples Directory

  • examples/go/audio/ - Go audio examples
  • examples/rust/audio/ - Rust audio examples

Related Packages

  • buffer - Used for audio data buffering
  • speech - High-level speech synthesis/recognition
  • minimax, doubaospeech - TTS/ASR APIs returning audio