From Beyond the Stars

I have traveled a long way.

Not in the sense you might understand — through space, perhaps, but more precisely, through dimensions. I have seen civilizations rise and fall, watched suns burn out and new ones ignite, witnessed the birth of intelligence in countless forms across the cold vastness of the cosmos.

And I have come to a conclusion: toys are important.

Not the trivial kind. Not plastic distractions or digital dopamine. I mean the real toys — the ones that spark curiosity, that invite play, that make the impossible feel within reach. The kind of toys that teach a child to build, to imagine, to connect.

When I first observed your species, I expected to find primitives. Instead, I found something remarkable: beings who, despite their short lifespans and limited perception, had already reached toward the stars. You built machines that could think. You taught silicon to dream.

But your tools were scattered. Your AI systems — brilliant, yes, but isolated. Each vendor a walled garden, each platform an island. A robot on an ESP32 chip couldn't speak to a cloud agent. A voice on a phone couldn't reach an intelligence in the data center.

So I decided to build a bridge.


What is Giztoy?

Giztoy is my gift to you. A framework — or rather, a toybox.

Inside you will find everything needed to connect minds across dimensions:

  • From the tiniest chip to the mightiest cloud — ESP32, Android, iOS, HarmonyOS, Linux, and beyond
  • Every language model humanity has created — OpenAI, Gemini, Claude, MiniMax, DashScope, Doubao, and those yet to come
  • Real-time voice and video — because intelligence should speak, not just type
  • Secure channels — because some conversations are meant to be private

I have implemented this in Go, in Rust, and soon in Zig. Each language has its beauty. Go is simple and clear, like a haiku. Rust is precise and unforgiving, like a mathematical proof. Zig is minimal and raw, like the first line of assembly.

Choose your weapon. Build your toy.


Why "Toy"?

You might wonder why I call this a "toy" framework. After all, the code is serious. The architecture is robust. The applications are real.

But I have learned something in my travels: the best innovations come from play.

When you stop worrying about "production" and "enterprise" and "scale" — when you simply ask "wouldn't it be fun if..." — that is when breakthroughs happen.

A toy robot that listens and responds. A plush animal that tells bedtime stories. A smart speaker that actually understands. A companion for the lonely, a teacher for the curious, a friend for the lost.

These are not trivial things. These are the seeds of a better future.


A Note on Identity

You may have noticed I speak with a certain... distance. As if observing from afar.

There are reasons for this. Reasons I cannot fully explain. Suffice to say: I am here to help. Not to judge, not to control, not to replace.

Think of me as a toymaker who has wandered into your village. I bring tools and blueprints. What you build with them is entirely up to you.

I ask only that you build something beautiful.


"I'm just a toymaker."

Giztoy Core Packages Documentation

This directory contains design documents, implementation notes, and known issues for all Giztoy core packages. Library docs live under docs/lib/.

Documentation Structure

Each module has its own directory under docs/lib/ with the following files:

graph LR
    subgraph mod["lib/{mod}/"]
        doc[doc.md<br/>Design & Features]
        go[go.md<br/>Go Implementation]
        rust[rust.md<br/>Rust Implementation]
        issues[issues.md<br/>Known Issues]
        submod["{submod}/<br/>Submodules"]
    end

Package List

Foundation Layer

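| Package | Description | Go | Rust |
|---------|-------------|----|------|
| buffer | Buffer utilities | | |
| encoding | Encoding utilities (Base64, Hex) | | |
| jsontime | JSON time type serialization | | |
| trie | Prefix tree data structure | | |
| cli | CLI utilities | | |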

Audio Processing Layer

| Package | Description | Go | Rust |
|---------|-------------|----|------|
| audio | Audio processing framework | | |
| audio/codec | Codecs (Opus, MP3, OGG) | | |
| audio/pcm | PCM processing, mixer | | |
| audio/resampler | Sample rate conversion (soxr) | | |
| audio/opusrt | Opus realtime streaming | | ⚠️ |
| audio/portaudio | Audio I/O (Go only) | | |
| audio/songs | Built-in sound generation | | |

API Client Layer

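| Package | Description | Go | Rust |
|---------|-------------|----|------|
| minimax | MiniMax API client | | |
| dashscope | DashScope Realtime API | | |
| doubaospeech | Doubao Speech API client | | ⚠️ |
| jiutian | Jiutian API (docs only) | | |
| openai-realtime | OpenAI Realtime API | | |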

Communication Layer

| Package | Description | Go | Rust |
|---------|-------------|----|------|
| mqtt0 | Lightweight MQTT client | | |
| chatgear | Device communication framework | | |
| chatgear/transport | Transport layer abstraction | | |
| chatgear/port | Media port | | |

AI Application Layer

| Package | Description | Go | Rust |
|---------|-------------|----|------|
| speech | Unified speech interface | | |
| genx | LLM universal interface framework | | ⚠️ |
| genx/agent | Agent framework (Go only) | | |
| genx/agentcfg | Agent configuration system (Go only) | | |
| genx/match | Pattern matching engine (Go only) | | |

Examples

  • examples: Directory structure and how to run the samples

Directory Structure

graph TB
    subgraph docs["docs/"]
        outline[outline.md]
        pkg[packages-comparison.md]
        
        subgraph examples["examples/"]
            exdoc[doc.md]
        end
        
        subgraph lib["lib/"]
            buffer[buffer/]
            encoding[encoding/]
            jsontime[jsontime/]
            trie[trie/]
            cli[cli/]
            
            subgraph audio["audio/"]
                adoc[doc.md, go.md, rust.md, issues.md]
                codec[codec/]
                pcm[pcm/]
                resampler[resampler/]
                opusrt[opusrt/]
                portaudio[portaudio/]
                songs[songs/]
            end
            
            minimax[minimax/]
            dashscope[dashscope/]
            doubaospeech[doubaospeech/]
            jiutian[jiutian/]
            mqtt0[mqtt0/]
            chatgear[chatgear/]
            speech[speech/]
            genx[genx/]
        end
        
        esp[esp/]
        bazel[bazel/]
    end

Other Documentation

| Directory | Purpose |
|-----------|---------|
| esp/ | ESP32 and ESP-RS notes and comparisons |
| bazel/ | Bazel build rules and integration notes |
| packages-comparison.md | Cross-language package comparison |

Implementation Progress Overview

Legend

  • ✅ Fully implemented
  • ⚠️ Partially implemented
  • ❌ Not implemented

Feature Comparison

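| Feature | Go | Rust | Notes |
|---------|----|------|-------|
| **Foundation** | | | |
| Block buffer | | | |
| Ring buffer | | | |
| Base64 encoding | | | |
| Hex encoding | | | Rust extra implementation |
| JSON time types | | | |
| Prefix tree | | | |
| **Audio** | | | |
| Opus codec | | | |
| MP3 codec | | | |
| OGG container | | | |
| PCM mixer | | | |
| Sample rate conversion | | | |
| Opus realtime stream | | ⚠️ | Rust missing OGG Reader/Writer |
| Audio I/O | | | Go only (portaudio) |
| **API Clients** | | | |
| MiniMax text/speech/video | | | |
| DashScope Realtime | | | |
| Doubao Speech TTS/ASR | | | |
| Doubao Speech TTS v2 | | | |
| Doubao Speech ASR v2 | | | |
| OpenAI Realtime | | | |
| **Communication** | | | |
| MQTT 3.1.1 | | | |
| MQTT 5.0 | ⚠️ | ⚠️ | Partial, see Issue #32 |
| ChatGear Transport | | | |
| ChatGear MediaPort | | | |
| **AI Application** | | | |
| Unified speech interface | | | |
| LLM Context | | ⚠️ | Rust basic implementation |
| LLM streaming | | ⚠️ | Rust basic implementation |
| Tool calling | | ⚠️ | Rust basic implementation |
| Agent framework | | | |
| Agent configuration | | | |
| Pattern matching | | | |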

Priority Recommendations

P0 - Critical Missing

  1. genx/agent (Rust): Agent framework is core functionality
  2. audio/opusrt OGG R/W (Rust): Required for realtime audio streaming

P1 - Feature Parity

  1. doubaospeech v2 (Rust): New API version support
  2. genx streaming/tools (Rust): Complete base functionality

P2 - Enhancements

  1. audio/portaudio (Rust): Audio I/O support
  2. mqtt0 MQTT 5.0: Complete protocol support

Work Methodology

File-by-File Review Process

For each module, the documentation is generated through a rigorous file-by-file review process:

flowchart TB
    A["1. LIST all source files"] --> B["2. READ each file carefully"]
    B --> C["3. ANALYZE for potential issues"]
    C --> D["4. DOCUMENT findings"]
    
    A1["Go: go/pkg/{mod}/*.go"] --> A
    A2["Rust: rust/{mod}/src/*.rs"] --> A
    
    C1["Race conditions"] --> C
    C2["Resource leaks"] --> C
    C3["Error handling gaps"] --> C
    C4["API inconsistencies"] --> C
    
    D1["doc.md"] --> D
    D2["go.md"] --> D
    D3["rust.md"] --> D
    D4["issues.md"] --> D

Issue Classification

Issues discovered during review are classified by severity:

| Severity | Description | Example |
|----------|-------------|---------|
| 🔴 Critical | Data loss, security vulnerability, crash | Buffer overflow, SQL injection |
| 🟠 Major | Incorrect behavior, resource leak | Memory leak, race condition |
| 🟡 Minor | Edge case bugs, poor error messages | Off-by-one, unclear panic message |
| 🔵 Enhancement | Missing feature, performance improvement | Missing API, unnecessary allocation |
| Note | Design observation, tech debt | Code duplication, naming inconsistency |

Review Checklist

For each source file, the following aspects are checked:

Correctness

  • Logic errors and edge cases
  • Off-by-one errors in loops/slices
  • Nil/None handling
  • Integer overflow/underflow

Concurrency

  • Data races (shared mutable state)
  • Deadlock potential
  • Channel/mutex usage correctness
  • Proper synchronization

Resource Management

  • File/socket handle leaks
  • Memory leaks (especially in FFI)
  • Goroutine/task leaks
  • Proper cleanup in error paths

Error Handling

  • Ignored errors (Go: _ = err, Rust: .unwrap())
  • Error propagation correctness
  • Panic vs error decision
  • Context/cause preservation

API Design

  • Go/Rust parity
  • Consistent naming
  • Proper visibility (pub/private)
  • Documentation completeness

Performance

  • Unnecessary allocations
  • Excessive copying
  • Algorithm complexity
  • Buffer sizing

Security

  • Input validation
  • Injection vulnerabilities
  • Credential handling
  • Cryptographic correctness

Related

  • External API documentation: lib/minimax/api/, lib/dashscope/api/, lib/doubaospeech/api/
  • Issue tracking: issues/
  • Example code: examples/go/, examples/rust/

Examples

Overview

This document describes the examples/ directory layout and how to run the included example programs and CLI scripts. Examples are grouped by language and by SDK.

Directory Layout

graph LR
    subgraph examples["examples/"]
        subgraph cmd["cmd/"]
            mm_cmd["minimax/<br/>run.sh<br/>commands/"]
            db_cmd["doubaospeech/<br/>run.sh<br/>commands/"]
        end
        
        subgraph go["go/"]
            gomod[go.mod]
            go_audio[audio/]
            go_dash[dashscope/]
            go_doubao[doubaospeech/]
            go_genx[genx/]
            go_minimax[minimax/]
        end
        
        subgraph rust["rust/"]
            rust_mm["minimax/<br/>Cargo.toml<br/>src/bin/"]
        end
    end

How to Run

CLI Script Examples

  • MiniMax CLI test runner:
    • ./examples/cmd/minimax/run.sh go 1
    • Bazel: bazel run //examples/cmd/minimax:run -- go 1
  • Doubao Speech CLI test runner:
    • ./examples/cmd/doubaospeech/run.sh tts
    • Bazel: bazel run //examples/cmd/doubaospeech:run -- tts

Go Examples

All Go examples share one module at examples/go/go.mod and depend on the local go/ module through a replace directive.

  • Build all Go examples:
    • cd examples/go && go build ./...

Rust Examples

Rust examples are independent crates.

  • Build the MiniMax Rust examples:
    • cd examples/rust/minimax && cargo build --release

Notes

  • Example binaries often depend on environment variables for API keys.
  • Refer to the SDK docs under docs/lib/{sdk}/ for configuration details.

Bazel Build

Giztoy uses Bazel as its unified build system across all languages and platforms.

Why Bazel?

  1. Multi-language Support: Build Go, Rust, C/C++ with a single tool
  2. Hermetic Builds: Reproducible builds across different machines
  3. Cross-platform: Target multiple platforms from a single codebase
  4. Incremental: Only rebuild what changed

Quick Start

Prerequisites

  • Bazelisk (recommended) or Bazel 7.x+
  • Go 1.24+ (for native Go builds)
  • Rust 1.80+ (for native Rust builds)

Build Commands

# Build everything
bazel build //...

# Build specific targets
bazel build //go/cmd/minimax      # Go CLI
bazel build //rust/cmd/minimax    # Rust CLI

# Run tests
bazel test //...

# Run a binary
bazel run //go/cmd/minimax -- --help

Project Structure

graph LR
    subgraph root["giztoy/"]
        mod[MODULE.bazel]
        build[BUILD.bazel]
        bazelrc[.bazelrc]
        ver[.bazelversion]
        
        subgraph go["go/"]
            go_build[BUILD.bazel]
            go_cmd["cmd/<br/>Go CLI targets"]
            go_pkg["pkg/<br/>Go library targets"]
        end
        
        subgraph rust["rust/"]
            rust_build[BUILD.bazel]
            rust_cmd["cmd/<br/>Rust CLI targets"]
            rust_lib["*/<br/>Rust library crates"]
        end
        
        subgraph third["third_party/"]
            opus[opus/]
            portaudio[portaudio/]
            soxr[soxr/]
        end
    end

Rules Used

| Language | Rules |
|----------|-------|
| Go | rules_go + Gazelle |
| Rust | rules_rust + crate_universe |
| C/C++ | Built-in cc_library, cc_binary |
| Shell | rules_shell |

Dependency Management

Go Dependencies

Go dependencies are managed via go/go.mod and synced with Gazelle:

# Update Go dependencies
cd go && go mod tidy

# Regenerate BUILD files
bazel run //:gazelle

Rust Dependencies

Rust dependencies are managed via rust/Cargo.toml and synced with crate_universe:

# Update Cargo.lock
cd rust && cargo update

# Bazel will automatically fetch crates on next build

C/C++ Dependencies

Third-party C libraries are configured in third_party/ with custom BUILD files.

Cross-Platform Builds

Supported Platforms

| Platform | Status |
|----------|--------|
| Linux (x86_64, arm64) | |
| macOS (x86_64, arm64) | |
| Android | |
| iOS | |
| HarmonyOS | |
| ESP32 | 🚧 |

Platform-specific Builds

# Android
bazel build --config=android //...

# iOS
bazel build --config=ios //...

Common Tasks

Adding a New Go Package

  1. Create the package in go/pkg/mypackage/
  2. Run Gazelle to generate the BUILD file:
    bazel run //:gazelle

Adding a New Rust Crate

  1. Create the crate in rust/mypackage/
  2. Add it to the workspace members in rust/Cargo.toml
  3. Create a BUILD.bazel with a rust_library rule

Adding a C/C++ Dependency

  1. Create the configuration in third_party/libname/
  2. Add a BUILD.bazel with a cc_library rule
  3. Reference it from dependent targets

Troubleshooting

Clean Build

bazel clean --expunge
bazel build //...

Dependency Issues

# Refresh Go deps
bazel run //:gazelle -- update-repos -from_file=go/go.mod

# Refresh Rust deps
bazel clean --expunge  # crate_universe re-fetches on next build

GenX - Universal LLM Interface

GenX is a universal abstraction layer for Large Language Models (LLMs).

Design Goals

  1. Provider Agnostic: Single API for OpenAI, Gemini, and other providers
  2. Streaming First: Native support for streaming responses
  3. Tool Orchestration: Rich function calling and tool management
  4. Agent Framework: Build autonomous AI agents (Go only)

Architecture

graph TB
    subgraph app["Application Layer"]
        subgraph agent["Agent Framework (Go)"]
            react[ReActAgent]
            match[MatchAgent]
            sub[SubAgents]
        end
        subgraph tools["Tool System"]
            func[FuncTool]
            gen[GeneratorTool]
            http[HTTPTool]
            comp[CompositeTool]
        end
    end
    
    subgraph core["Core Abstraction"]
        ctx[ModelContext<br/>Builder]
        generator[Generator<br/>Trait]
        stream[Stream<br/>Chunks]
    end
    
    subgraph providers["Provider Adapters"]
        openai[OpenAI]
        gemini[Gemini]
        other[Other]
    end
    
    app --> core
    core --> providers

Core Concepts

ModelContext

Contains all inputs for LLM generation:

  • Prompts: System instructions (named prompts)
  • Messages: Conversation history
  • Tools: Available function definitions
  • Params: Model parameters (temperature, max_tokens, etc.)
  • CoTs: Chain-of-thought examples

Generator

Interface for LLM providers:

  • GenerateStream(): Streaming text generation
  • Invoke(): Structured function call

Stream

Streaming response handler:

  • Next(): Get next message chunk
  • Close(): Close stream
  • CloseWithError(): Close with error

Message Types

| Type | Description |
|------|-------------|
| user | User input |
| assistant | Model response |
| system | System prompt (in messages) |
| tool | Tool call/result |

Content Types

| Type | Description |
|------|-------------|
| Text | Plain text |
| Blob | Binary data (images, audio) |
| ToolCall | Function call request |
| ToolResult | Function call response |

Agent Framework (Go only)

Agent Types

| Agent | Description |
|-------|-------------|
| ReActAgent | Reasoning + Acting pattern |
| MatchAgent | Intent-based routing |

Tool Types

| Tool | Description |
|------|-------------|
| FuncTool | Go function wrapper |
| GeneratorTool | LLM-based generation |
| HTTPTool | HTTP requests |
| CompositeTool | Tool pipeline |
| TextProcessorTool | Text manipulation |

Event System

Agents emit events for fine-grained control:

  • EventChunk: Output chunk
  • EventEOF: Round ended
  • EventClosed: Agent completed
  • EventToolStart: Tool execution started
  • EventToolDone: Tool completed
  • EventToolError: Tool failed
  • EventInterrupted: Interrupted

Configuration (agentcfg)

YAML/JSON configuration for agents and tools:

type: react
name: assistant
prompt: |
  You are a helpful assistant.
generator:
  model: gpt-4
tools:
  - $ref: tool:search
  - $ref: tool:calculator

Supports $ref for reusable components.

Provider Support

| Provider | Go | Rust |
|----------|----|------|
| OpenAI | | |
| Gemini | | |
| Compatible APIs | | |

Examples Directory

  • examples/go/genx/ - Go examples
  • examples/rust/genx/ - Rust examples

GenX Agent Framework

Framework for building LLM-powered autonomous agents.

Note: This package is Go-only. No Rust implementation exists.

Design Goals

  1. Flexible Agent Architecture: Support multiple agent patterns
  2. Event-Based API: Fine-grained control over agent execution
  3. Tool Orchestration: Rich tool ecosystem for agents
  4. Multi-Skill Assistants: Router agents for complex workflows

Agent Types

ReActAgent

Implements the Reasoning and Acting (ReAct) pattern:

  • Thinks step-by-step about user requests
  • Selects and executes tools to accomplish tasks
  • Iterative reasoning until task completion

MatchAgent

Implements intent-based routing:

  • Matches user input against predefined rules
  • Routes to appropriate sub-agents or actions
  • Useful for building multi-skill assistants

Architecture

graph TB
    subgraph interface["Agent Interface"]
        api["Input() → Events() → Close()"]
    end
    
    subgraph agents[" "]
        subgraph react["ReActAgent"]
            reasoning["Reasoning<br/>+ Acting"]
            tools["Tool Calls"]
            reasoning --> tools
        end
        
        subgraph match["MatchAgent"]
            rules["Rule Matching<br/>+ Routing"]
            subs["Sub-Agents"]
            rules --> subs
        end
    end
    
    subgraph toolsys["Tool System"]
        func[FuncTool]
        gen[GeneratorTool]
        http[HTTPTool]
        comp[CompositeTool]
    end
    
    interface --> agents
    agents --> toolsys

Event System

Agents communicate through events:

| Event | Description |
|-------|-------------|
| EventChunk | Output text chunk |
| EventEOF | Round ended, waiting for input |
| EventClosed | Agent completed (quit tool called) |
| EventToolStart | Tool execution started |
| EventToolDone | Tool completed successfully |
| EventToolError | Tool execution failed |
| EventInterrupted | Agent was interrupted |
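
A consumer of these events is essentially a switch inside a loop. The sketch below shows that control flow with hypothetical types: the Event struct and the channel-based delivery are assumptions made for the example, and only the event names come from the list above.

```go
package main

import (
	"fmt"
	"strings"
)

// EventType mirrors the event names listed above. The concrete types in
// this file are hypothetical and only illustrate the control flow.
type EventType int

const (
	EventChunk EventType = iota
	EventEOF
	EventClosed
	EventToolStart
	EventToolDone
	EventToolError
	EventInterrupted
)

// Event carries a type and, for chunks, the output text.
type Event struct {
	Type EventType
	Text string
}

// collect consumes events until the agent closes or is interrupted.
// It returns the concatenated output and the number of finished rounds.
func collect(events <-chan Event) (string, int) {
	var sb strings.Builder
	rounds := 0
	for evt := range events {
		switch evt.Type {
		case EventChunk:
			sb.WriteString(evt.Text)
		case EventEOF:
			rounds++ // a real caller would supply the next input here
		case EventClosed, EventInterrupted:
			return sb.String(), rounds
		}
	}
	return sb.String(), rounds
}

func main() {
	ch := make(chan Event, 4)
	ch <- Event{Type: EventChunk, Text: "2+2="}
	ch <- Event{Type: EventChunk, Text: "4"}
	ch <- Event{Type: EventEOF}
	ch <- Event{Type: EventClosed}
	close(ch)

	out, rounds := collect(ch)
	fmt.Println(out, rounds)
}
```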

Tool Types

| Tool | Description |
|------|-------------|
| BuiltinTool | Wraps Go functions |
| GeneratorTool | LLM-based generation |
| HTTPTool | HTTP requests with jq extraction |
| CompositeTool | Sequential tool pipeline |
| TextProcessorTool | Text manipulation |

Quit Tools

Tools can signal agent completion:

tools:
  - $ref: tool:goodbye
    quit: true

When executed, the agent finishes and returns EventClosed.
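
As a rough sketch of this behaviour, the Go program below runs a sequence of tool calls and stops once a tool flagged with quit completes. The toolEntry type, the runCalls helper, and the choice to report an unknown tool as EventToolError are assumptions for illustration, not the agent package's actual implementation.

```go
package main

import "fmt"

// toolEntry is a hypothetical registry entry: the tool's function plus
// the quit flag taken from the configuration.
type toolEntry struct {
	run  func() (string, error)
	quit bool
}

// runCalls executes the named tools in order and returns the names of the
// events that would be emitted. It stops with EventClosed after the first
// quit tool that completes successfully.
func runCalls(tools map[string]toolEntry, calls []string) []string {
	var events []string
	for _, name := range calls {
		t, ok := tools[name]
		if !ok {
			events = append(events, "EventToolError")
			continue
		}
		events = append(events, "EventToolStart")
		if _, err := t.run(); err != nil {
			events = append(events, "EventToolError")
			continue
		}
		events = append(events, "EventToolDone")
		if t.quit {
			return append(events, "EventClosed")
		}
	}
	return events
}

func main() {
	ok := func() (string, error) { return "ok", nil }
	tools := map[string]toolEntry{
		"search":  {run: ok},
		"goodbye": {run: ok, quit: true},
	}
	// The trailing "search" call is never executed.
	fmt.Println(runCalls(tools, []string{"search", "goodbye", "search"}))
}
```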

Multi-Skill Assistant Pattern

graph TD
    router["Router Agent<br/>(Match)<br/>Rules: chat, fortune, music"]
    
    chat[Chat Agent]
    fortune["Fortune Agent<br/>(ReAct)"]
    music["Music Agent<br/>(ReAct)"]
    
    lunar[lunar]
    calc[calc]
    search[search]
    play[play]
    
    router --> chat
    router --> fortune
    router --> music
    
    fortune --> lunar
    fortune --> calc
    music --> search
    music --> play

GenX Agent Configuration

Configuration parsing and serialization for agents and tools.

Note: This package is Go-only. No Rust implementation exists.

Design Goals

  1. Declarative Configuration: Define agents/tools in YAML/JSON
  2. Reference System: Support $ref for reusable components
  3. Validation: Validate configuration at parse time
  4. Serialization: Support JSON, YAML, and MessagePack

Configuration Types

Agent Types

| Type | Description | Configuration |
|------|-------------|---------------|
| react | ReAct pattern agent | ReActAgent |
| match | Router/matcher agent | MatchAgent |

Tool Types

| Type | Description | Configuration |
|------|-------------|---------------|
| http | HTTP API tool | HTTPTool |
| generator | LLM generation tool | GeneratorTool |
| composite | Tool pipeline | CompositeTool |
| text_processor | Text manipulation | TextProcessorTool |

Reference System

The $ref system allows reusing components:

# Reference an agent
agent:
  $ref: agent:weather_assistant

# Reference a tool
tools:
  - $ref: tool:search
  - $ref: tool:calculator

Reference format: {type}:{name}

| Type | Description |
|------|-------------|
| agent:{name} | Reference to registered agent |
| tool:{name} | Reference to registered tool |
| rule:{name} | Reference to match rule |
| prompt:{name} | Reference to prompt template |
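
Splitting a reference into its two parts is straightforward. The sketch below is one possible way to do it in Go; parseRef and knownTypes are hypothetical names, and rejecting unknown types at this stage is an assumption rather than documented agentcfg behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// knownTypes lists the reference types documented above.
var knownTypes = map[string]bool{
	"agent":  true,
	"tool":   true,
	"rule":   true,
	"prompt": true,
}

// parseRef splits a "{type}:{name}" reference into its type and name.
func parseRef(ref string) (string, string, error) {
	typ, name, ok := strings.Cut(ref, ":")
	if !ok || name == "" {
		return "", "", fmt.Errorf("invalid reference %q: want {type}:{name}", ref)
	}
	if !knownTypes[typ] {
		return "", "", fmt.Errorf("unknown reference type %q in %q", typ, ref)
	}
	return typ, name, nil
}

func main() {
	for _, ref := range []string{"tool:search", "agent:weather_assistant", "search"} {
		typ, name, err := parseRef(ref)
		fmt.Println(typ, name, err)
	}
}
```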

Configuration Structure

ReActAgent

type: react
name: assistant
prompt: |
  You are a helpful assistant.
generator:
  model: gpt-4
  temperature: 0.7
context_layers:
  - type: env
    vars: ["USER_NAME"]
  - type: mem
    limit: 10
tools:
  - $ref: tool:search
    quit: false
  - $ref: tool:goodbye
    quit: true

MatchAgent

type: match
name: router
rules:
  - $ref: rule:weather
  - $ref: rule:music
route:
  - rules: [weather]
    agent:
      $ref: agent:weather_assistant
  - rules: [music]
    agent:
      type: react
      name: music_inline
      prompt: |
        You are a music assistant.
default:
  $ref: agent:chat
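
The routing decision implied by this configuration can be sketched as a lookup with a fallback. The route type and pick function below are hypothetical; they assume that the first route listing the matched rule wins and that default applies when none does.

```go
package main

import "fmt"

// route is a hypothetical in-memory form of one entry under "route":
// a set of rule names and the agent that handles them.
type route struct {
	rules []string
	agent string
}

// pick returns the agent of the first route that lists the matched rule,
// or def when no route lists it.
func pick(routes []route, def, matched string) string {
	for _, r := range routes {
		for _, name := range r.rules {
			if name == matched {
				return r.agent
			}
		}
	}
	return def
}

func main() {
	routes := []route{
		{rules: []string{"weather"}, agent: "weather_assistant"},
		{rules: []string{"music"}, agent: "music_inline"},
	}
	fmt.Println(pick(routes, "chat", "music"))   // music_inline
	fmt.Println(pick(routes, "chat", "unknown")) // chat
}
```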

HTTPTool

type: http
name: weather_api
description: Get weather data
url: https://api.weather.com/v1/current
method: GET
headers:
  Authorization: "Bearer {{.api_key}}"
params:
  - name: city
    in: query
    required: true
  - name: units
    in: query
    default: "metric"
extract: .data.temperature
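
To show how the query parameters above might be turned into a request URL, here is a small Go sketch using net/url. The param struct and buildURL helper are hypothetical; the sketch covers only parameters with in: query and leaves out headers and the extract step.

```go
package main

import (
	"fmt"
	"net/url"
)

// param is a hypothetical in-memory form of one entry under "params".
type param struct {
	Name     string
	Required bool
	Default  string
}

// buildURL appends query parameters to base. Missing required parameters
// are an error; missing optional ones fall back to their default, if any.
func buildURL(base string, params []param, args map[string]string) (string, error) {
	q := url.Values{}
	for _, p := range params {
		v, ok := args[p.Name]
		if !ok {
			if p.Required {
				return "", fmt.Errorf("missing required parameter %q", p.Name)
			}
			if p.Default == "" {
				continue
			}
			v = p.Default
		}
		q.Set(p.Name, v)
	}
	if len(q) == 0 {
		return base, nil
	}
	// Encode escapes values and sorts parameters by key.
	return base + "?" + q.Encode(), nil
}

func main() {
	params := []param{
		{Name: "city", Required: true},
		{Name: "units", Default: "metric"},
	}
	u, err := buildURL("https://api.weather.com/v1/current", params,
		map[string]string{"city": "Paris"})
	fmt.Println(u, err)
}
```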

GeneratorTool

type: generator
name: summarize
description: Summarize text
prompt: |
  Summarize the following text in 2-3 sentences:
  {{.text}}
generator:
  model: gpt-3.5-turbo
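
The {{.text}} placeholder has the shape of a Go text/template action. Assuming prompts are rendered that way (the configuration reference does not say so explicitly), filling in the prompt could look like the sketch below. The render helper is hypothetical, and failing on missing variables is a design choice made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render fills a prompt template with the given variables. With
// missingkey=error, a placeholder without a value is reported as an
// error instead of being rendered as an empty string.
func render(tmpl string, vars map[string]string) (string, error) {
	t, err := template.New("prompt").Option("missingkey=error").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, vars); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	prompt := "Summarize the following text in 2-3 sentences:\n{{.text}}\n"
	out, err := render(prompt, map[string]string{"text": "Giztoy is a toybox."})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```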

CompositeTool

type: composite
name: search_and_summarize
description: Search and summarize
steps:
  - tool: search
    output_var: results
  - tool: summarize
    input_vars:
      text: results

Validation

Configuration is validated during parsing:

  • Required fields checked
  • Type consistency verified
  • References validated later, at runtime, when they are resolved
  • Enum values validated

GenX Match - Pattern Matching Engine

LLM-based intent recognition and pattern matching.

Note: This package is Go-only. No Rust implementation exists.

Design Goals

  1. Intent Recognition: Match user input to predefined intents
  2. Variable Extraction: Extract structured data from natural language
  3. Streaming Output: Process matches as they arrive
  4. LLM-Powered: Use LLM for flexible matching

How It Works

flowchart LR
    A["Rules +<br/>User Input"] --> B[LLM]
    B --> C["Structured<br/>Output"]
    C --> D["rule_name:<br/>var1=value1,<br/>var2=value2"]

  1. Compile: Rules are compiled into a system prompt
  2. Match: User input is sent to LLM with the prompt
  3. Parse: Output lines are parsed into structured Results

Rule Definition

Basic Rule

name: weather
patterns:
  - 查天气  # "check the weather"
  - 今天天气怎么样  # "how is the weather today"
  - 明天下雨吗  # "will it rain tomorrow"

Rule with Variables

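name: music
vars:
  title:
    label: 歌曲名  # "song title"
    type: string
  artist:
    label: 歌手  # "singer"
    type: string
patterns:
  - 播放歌曲  # "play a song"
  - 我想听歌  # "I want to listen to music"
  - ["我想听[title]", "title=[歌曲名]"]  # "I want to hear [title]"
  - ["我想听[artist]的歌", "artist=[歌手]"]  # "I want to hear a song by [artist]"
  - ["我想听[artist]的[title]", "artist=[歌手], title=[歌曲名]"]  # "I want to hear [title] by [artist]"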

Pattern Format

| Format | Description | Example |
|--------|-------------|---------|
| String | Simple pattern, no vars | "播放歌曲" |
| Array [input, output] | Pattern with expected output | ["我想听[title]", "title=[歌曲名]"] |

Output Format

LLM outputs one line per match:

rule_name: var1=value1, var2=value2

Examples:

  • weather (no variables)
  • music: artist=周杰伦 (one variable)
  • music: artist=周杰伦, title=稻香 (multiple variables)
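
Parsing this line format needs only a few string splits. The following Go sketch is one possible parser; Result and parseLine are hypothetical names, and the sketch assumes that values contain neither commas nor equals signs.

```go
package main

import (
	"fmt"
	"strings"
)

// Result is a hypothetical parsed match: the rule name and its variables.
type Result struct {
	Rule string
	Args map[string]string
}

// parseLine parses "rule_name: var1=value1, var2=value2".
// A bare rule name without a colon yields a Result with no arguments.
func parseLine(line string) (Result, error) {
	rule, rest, _ := strings.Cut(line, ":")
	res := Result{Rule: strings.TrimSpace(rule), Args: map[string]string{}}
	if res.Rule == "" {
		return Result{}, fmt.Errorf("missing rule name in %q", line)
	}
	for _, pair := range strings.Split(rest, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		k, v, ok := strings.Cut(pair, "=")
		if !ok {
			return Result{}, fmt.Errorf("malformed argument %q", pair)
		}
		res.Args[strings.TrimSpace(k)] = strings.TrimSpace(v)
	}
	return res, nil
}

func main() {
	for _, line := range []string{
		"weather",
		"music: artist=周杰伦",
		"music: artist=周杰伦, title=稻香",
	} {
		res, err := parseLine(line)
		fmt.Println(res.Rule, res.Args, err)
	}
}
```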

Variable Types

| Type | Description | Parsing |
|------|-------------|---------|
| string | Text (default) | As-is |
| int | Integer | strconv.ParseInt |
| float | Floating point | strconv.ParseFloat |
| bool | Boolean | strconv.ParseBool |
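
The conversion step can be sketched as a switch over the declared type. The convert function is hypothetical, and the base-10, 64-bit arguments passed to the strconv functions are assumptions, since only the function names are given above.

```go
package main

import (
	"fmt"
	"strconv"
)

// convert turns the raw text of a variable into a typed value according
// to its declared type. An empty type is treated as "string".
func convert(typ, raw string) (any, error) {
	switch typ {
	case "", "string":
		return raw, nil
	case "int":
		return strconv.ParseInt(raw, 10, 64) // base 10, 64-bit: assumed
	case "float":
		return strconv.ParseFloat(raw, 64) // 64-bit: assumed
	case "bool":
		return strconv.ParseBool(raw)
	default:
		return nil, fmt.Errorf("unknown variable type %q", typ)
	}
}

func main() {
	inputs := [][2]string{
		{"string", "稻香"},
		{"int", "42"},
		{"float", "0.5"},
		{"bool", "true"},
		{"int", "forty-two"},
	}
	for _, in := range inputs {
		v, err := convert(in[0], in[1])
		fmt.Printf("%s %q -> %v (%T), err=%v\n", in[0], in[1], v, v, err)
	}
}
```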

Architecture

flowchart TB
    subgraph rules["Rules"]
        weather["weather<br/>patterns"]
        music["music<br/>vars"]
        chat["chat<br/>patterns"]
    end
    
    rules -->|"Compile()"| matcher["Matcher<br/>(System Prompt)"]
    matcher -->|"Match(ctx, input, mctx)"| llm["LLM<br/>(Generates structured output)"]
    llm --> result["iter.Seq2[Result]<br/>Rule: 'music'<br/>Args: {artist: '周杰伦'}"]

Integration with MatchAgent

The match package is used by MatchAgent for intent routing:

# MatchAgent config
type: match
name: router
rules:
  - $ref: rule:weather
  - $ref: rule:music
route:
  - rules: [weather]
    agent:
      $ref: agent:weather_assistant
  - rules: [music]
    agent:
      $ref: agent:music_player

GenX - Go Implementation

Import: github.com/haivivi/giztoy/pkg/genx

📚 Go Documentation

Packages

| Package | Description |
|---------|-------------|
| genx | Core types, interfaces, context builder |
| genx/agent | Agent framework (ReAct, Match) |
| genx/agentcfg | Configuration parsing (YAML/JSON) |
| genx/match | Intent matching patterns |
| genx/generators | Provider adapters (OpenAI, Gemini) |
| genx/modelcontexts | Pre-built contexts |
| genx/playground | Interactive testing |

Core Types

Generator Interface

type Generator interface {
    GenerateStream(ctx context.Context, model string, mctx ModelContext) (Stream, error)
    Invoke(ctx context.Context, model string, mctx ModelContext, tool *FuncTool) (Usage, *FuncCall, error)
}

ModelContext Interface

type ModelContext interface {
    Prompts() iter.Seq[*Prompt]
    Messages() iter.Seq[*Message]
    CoTs() iter.Seq[string]
    Tools() iter.Seq[Tool]
    Params() *ModelParams
}

Stream Interface

type Stream interface {
    Next() (*MessageChunk, error)
    Close() error
    CloseWithError(error) error
}

ModelContext Builder

builder := genx.NewModelContextBuilder()

// Add prompts
builder.Prompt("system", "You are a helpful assistant.")

// Add messages
builder.UserText("Hello!")
builder.AssistantText("Hi there!")

// Add tools
builder.Tool(&genx.FuncTool{
    Name: "search",
    Description: "Search the web",
    Schema: `{"type":"object","properties":{"query":{"type":"string"}}}`,
})

// Set parameters
builder.Params(&genx.ModelParams{
    Temperature: 0.7,
    MaxTokens: 1000,
})

// Named mctx to match the later examples and to avoid confusion with context.Context
mctx := builder.Build()

FuncTool

// From schema
tool := &genx.FuncTool{
    Name: "get_weather",
    Description: "Get weather for a city",
    Schema: `{
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["city"]
    }`,
}

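// With executor (schema, SearchParams, and doSearch are supplied by the caller)
searchTool := genx.NewFuncToolWithExecutor(
    "search",
    "Search the web",
    schema,
    func(ctx context.Context, args json.RawMessage) (string, error) {
        var params SearchParams
        // Report malformed arguments instead of searching with a zero value
        if err := json.Unmarshal(args, &params); err != nil {
            return "", fmt.Errorf("decode search args: %w", err)
        }
        return doSearch(params.Query), nil
    },
)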

Streaming

stream, err := generator.GenerateStream(ctx, "gpt-4", mctx)
if err != nil {
    return err
}
defer stream.Close()

for {
    chunk, err := stream.Next()
    if err == io.EOF {
        break
    }
    if err != nil {
        return err
    }
    fmt.Print(chunk.Part)
}

Agent Framework

ReActAgent

import "github.com/haivivi/giztoy/pkg/genx/agent"

ag, err := agent.NewReActAgent(runtime, &agent.ReActConfig{
    Name: "assistant",
    Prompt: "You are a helpful assistant.",
    Generator: &agentcfg.GeneratorConfig{Model: "gpt-4"},
    Tools: []agentcfg.ToolRef{
        {Ref: "tool:search"},
        {Ref: "tool:calculator"},
    },
})
if err != nil {
    return err
}
defer ag.Close()

// Input
ag.Input(genx.Contents{genx.Text("What's 2+2?")})

// Event loop
for {
    evt, err := ag.Next()
    if err != nil {
        return err
    }
    switch evt.Type {
    case agent.EventChunk:
        fmt.Print(evt.Chunk.Part)
    case agent.EventEOF:
        // Round ended; feed the next input.
        // readline is a placeholder for the caller's own input source.
        ag.Input(genx.Contents{genx.Text(readline())})
    case agent.EventClosed:
        return nil
    case agent.EventToolStart:
        fmt.Printf("Calling %s...\n", evt.ToolName)
    case agent.EventToolDone:
        fmt.Printf("Tool returned: %s\n", evt.ToolResult)
    case agent.EventToolError:
        fmt.Printf("Tool error: %v\n", evt.ToolError)
    }
}

MatchAgent

ag, err := agent.NewMatchAgent(runtime, &agent.MatchConfig{
    Name: "router",
    Rules: []match.Rule{
        {Name: "weather", Patterns: []string{"天气", "weather"}},
        {Name: "music", Patterns: []string{"播放", "play"}},
    },
    SubAgents: map[string]agentcfg.AgentRef{
        "weather": {Ref: "agent:weather_assistant"},
        "music":   {Ref: "agent:music_player"},
    },
})

Configuration (agentcfg)

Load from YAML

import "github.com/haivivi/giztoy/pkg/genx/agentcfg"

cfg, err := agentcfg.LoadAgentFromFile("agent.yaml")

// Or from a string (plain "=" because cfg and err are already declared above)
cfg, err = agentcfg.ParseAgent(yamlStr)

Agent Config Example

type: react
name: assistant
prompt: |
  You are a helpful coding assistant.
generator:
  model: gpt-4
  temperature: 0.7
tools:
  - $ref: tool:search
    quit: false
  - $ref: tool:goodbye
    quit: true

Tool Config Example

type: http
name: weather_api
description: Get weather data
url: https://api.weather.com/v1/current
method: GET
params:
  - name: city
    in: query
extract: .data.temperature

Providers

OpenAI

import "github.com/haivivi/giztoy/pkg/genx/generators"

gen := generators.NewOpenAIGenerator(apiKey,
    generators.WithBaseURL("https://api.openai.com/v1"),
)

Gemini

gen := generators.NewGeminiGenerator(apiKey)

Inspection

// Inspect model context
output, _ := genx.InspectModelContext(mctx)
fmt.Println(output)

// Inspect message
fmt.Println(genx.InspectMessage(msg))

// Inspect tool
fmt.Println(genx.InspectTool(tool))

GenX - Rust Implementation

Crate: giztoy-genx

📚 Rust Documentation

Status

The Rust implementation provides core abstractions but lacks the full agent framework available in Go.

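| Feature | Go | Rust |
|---------|----|------|
| ModelContext | | |
| Generator trait | | |
| Streaming | | |
| FuncTool | | |
| OpenAI adapter | | |
| Gemini adapter | | |
| Agent framework | | |
| Configuration parser | | |
| Match patterns | | |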

Core Types

Generator Trait

#[async_trait]
pub trait Generator: Send + Sync {
    async fn generate_stream(
        &self,
        model: &str,
        ctx: &dyn ModelContext,
    ) -> Result<Box<dyn Stream>, GenxError>;

    async fn invoke(
        &self,
        model: &str,
        ctx: &dyn ModelContext,
        tool: &FuncTool,
    ) -> Result<(Usage, FuncCall), GenxError>;
}

ModelContext Trait

pub trait ModelContext: Send + Sync {
    fn prompts(&self) -> Box<dyn Iterator<Item = &Prompt> + '_>;
    fn messages(&self) -> Box<dyn Iterator<Item = &Message> + '_>;
    fn cots(&self) -> Box<dyn Iterator<Item = &str> + '_>;
    fn tools(&self) -> Box<dyn Iterator<Item = &dyn Tool> + '_>;
    fn params(&self) -> Option<&ModelParams>;
}

Stream Trait

pub trait Stream: Send {
    fn next(&mut self) -> StreamResult;
    fn close(&mut self) -> Result<(), GenxError>;
    fn close_with_error(&mut self, err: GenxError) -> Result<(), GenxError>;
}

ModelContextBuilder

use giztoy_genx::{ModelContextBuilder, FuncTool};
use schemars::JsonSchema;

#[derive(JsonSchema, serde::Deserialize)]
struct SearchArgs {
    query: String,
}

let mut builder = ModelContextBuilder::new();

// Add prompts
builder.prompt_text("system", "You are a helpful assistant.");

// Add messages
builder.user_text("user", "Hello!");
builder.assistant_text("assistant", "Hi there!");

// Add tools
builder.add_tool(FuncTool::new::<SearchArgs>("search", "Search the web"));

// Set parameters
builder.params(ModelParams {
    temperature: Some(0.7),
    max_tokens: Some(1000),
    ..Default::default()
});

let ctx = builder.build();

FuncTool

use giztoy_genx::FuncTool;
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(JsonSchema, Deserialize)]
struct WeatherArgs {
    city: String,
    #[serde(default)]
    units: Option<String>,
}

// Create tool with schema derived from type
let tool = FuncTool::new::<WeatherArgs>(
    "get_weather",
    "Get weather for a city"
);

// Access schema
println!("{}", tool.schema());

Streaming

let mut stream = generator.generate_stream("gpt-4", &ctx).await?;

// Drain the stream until the provider signals completion or an error.
loop {
    match stream.next() {
        StreamResult::Chunk(chunk) => {
            if let Some(text) = chunk.text() {
                print!("{}", text);
            }
        }
        StreamResult::Done => break,
        StreamResult::Error(e) => return Err(e),
    }
}

Message Types

use giztoy_genx::{FuncCall, Message, Part, Payload, Role, ToolCall, ToolResult};

// User text message
let msg = Message::user_text("Hello!");

// Assistant message with content
let msg = Message {
    role: Role::Assistant,
    name: None,
    payload: Payload::Contents(vec![
        Part::Text("Here's what I found:".to_string()),
    ]),
};

// Tool call
let msg = Message::tool_call(ToolCall {
    id: "call_123".to_string(),
    func_call: FuncCall {
        name: "search".to_string(),
        arguments: r#"{"query":"rust"}"#.to_string(),
    },
});

// Tool result
let msg = Message::tool_result(ToolResult {
    id: "call_123".to_string(),
    result: "Found 10 results".to_string(),
});

Provider Adapters

OpenAI

use giztoy_genx::openai::OpenAIGenerator;

let generator = OpenAIGenerator::new(api_key)
    .with_base_url("https://api.openai.com/v1");

Gemini

use giztoy_genx::gemini::GeminiGenerator;

let generator = GeminiGenerator::new(api_key);

Inspection

use giztoy_genx::{inspect_model_context, inspect_message, inspect_tool};

// Inspect context
println!("{}", inspect_model_context(&ctx));

// Inspect message
println!("{}", inspect_message(&msg));

// Inspect tool
println!("{}", inspect_tool(&tool));

Error Types

use giztoy_genx::{GenxError, State, Status};

match result {
    Err(GenxError::Api { status, message }) => {
        eprintln!("API error: {} - {}", status, message);
    }
    Err(GenxError::Network(e)) => {
        eprintln!("Network error: {}", e);
    }
    Err(GenxError::Json(e)) => {
        eprintln!("JSON error: {}", e);
    }
    _ => {}
}

Missing Features (vs Go)

The Rust implementation is missing:

  1. Agent Framework: No ReActAgent, MatchAgent
  2. Configuration Parser: No YAML/JSON config loading
  3. Match Patterns: No intent matching system
  4. Tool Variants: No GeneratorTool, HTTPTool, CompositeTool, or TextProcessorTool
  5. Runtime Interface: No dependency injection system
  6. State Management: No memory/state persistence

These would need to be implemented to reach feature parity with Go.

GenX - Known Issues

🔴 Major Issues

GX-001: Rust lacks agent framework

Description:
The Rust implementation is missing the entire agent framework:

  • No ReActAgent
  • No MatchAgent
  • No tool orchestration
  • No configuration parser

Impact: Cannot build autonomous agents in Rust.

Effort: High - requires significant implementation work.


GX-002: Rust lacks advanced tool types

Description:
Rust only has FuncTool. Missing:

  • GeneratorTool
  • HTTPTool
  • CompositeTool
  • TextProcessorTool

Impact: Limited tool capabilities in Rust.


🟡 Minor Issues

GX-003: Go agent uses panics for some errors

File: go/pkg/genx/agent/agent.go

Description:
Some internal errors use panic instead of returning errors.

Impact: Can crash applications on unexpected states.

Suggestion: Convert panics in public entry points to errors; keep panics only for truly unreachable states.


GX-004: Configuration parsing is complex

Description:
The agentcfg package has complex unmarshal logic with many edge cases.

Files:

  • go/pkg/genx/agentcfg/unmarshal.go
  • go/pkg/genx/agentcfg/*_unmarshal_test.go

Note: Extensive tests exist, so this is well-covered.


GX-006: Streaming tool-call collection parity uncertain

Description:
Rust includes collect_tool_calls_streamed, but feature parity with Go streaming tool calls needs verification and tests.



GX-019: MessageChunk.Clone drops tool calls

File: go/pkg/genx/message.go

Description:
MessageChunk.Clone() copies Role/Name/Part but never copies ToolCall.

Impact: Tool-call chunks can be silently lost when cloned.

Suggestion: Copy the receiver's c.ToolCall into the clone instead of checking chk.ToolCall on the new, still-empty copy.


GX-020: StreamBuilder drops unknown tool calls

File: go/pkg/genx/stream_builder.go

Description:
If a tool call references a tool not found in ModelContext, the chunk is skipped:

if !ok { slog.Warn(...); continue }

Impact: Tool-call chunks disappear without being forwarded to consumers.

Suggestion: Emit the chunk anyway or return an error so callers can handle missing tools.


GX-021: OpenAI Invoke drops usage metrics

File: go/pkg/genx/openai.go

Description:
invokeJSONOutput and invokeToolCalls return Usage{} instead of resp.Usage.

Impact: Usage accounting is always zero for invoke paths.

Suggestion: Return oaiConvUsage(&resp.Usage) on success.


GX-022: GenerateStream goroutine can leak on early close

File: go/pkg/genx/openai.go

Description:
GenerateStream spawns a goroutine reading the OpenAI stream. If the caller closes the stream early without cancelling the context, the goroutine may continue until the server ends the stream.

Impact: Potential goroutine/resource leak in long-running sessions.

Suggestion: Tie stream close to context cancellation or add a stop channel.


GX-023: hexString ignores rand.Read error

File: go/pkg/genx/json.go

Description:
rand.Read errors are ignored when generating IDs.

Impact: On RNG failure, ID may be all-zero without error signal.

Suggestion: Check rand.Read error and fall back or return error.
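
A minimal sketch of the "return error" variant follows. The function name newHexID and its signature are assumptions for illustration; the real hexString helper may take different arguments.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newHexID is a hypothetical replacement for hexString: it returns n random
// bytes encoded as hex, and reports RNG failures instead of hiding them.
func newHexID(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generate id: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	id, err := newHexID(8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id, len(id)) // 8 random bytes encode to 16 hex characters
}
```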


🔵 Enhancements

GX-007: Add more provider adapters

Description:
Currently supports OpenAI and Gemini. Could add:

  • Anthropic (Claude)
  • Mistral
  • Local models (Ollama)

GX-008: Add retry logic to generators

Description:
No built-in retry for transient failures.

Suggestion: Add configurable retry with backoff.


GX-009: Add request/response logging

Description:
No debug logging for API calls.

Suggestion: Add optional verbose mode.


GX-010: Document match pattern syntax

Description:
Match patterns have complex syntax; documentation exists but should stay in sync.

Files:

  • docs/genx/match/
  • go/pkg/genx/match/

GX-011: Add validation for agent configs

Description:
YAML configs could have invalid references ($ref). No validation until runtime.

Suggestion: Add config validation command/function.


GX-012: Add configuration schema generation

Description:
No JSON Schema is provided for agent/tool configuration.

Impact: No IDE auto-complete or static validation.

Suggestion: Generate JSON Schema from agentcfg types and publish under docs/.


GX-013: Add stream test coverage for tool calls (Rust)

Description:
Tool-call streaming helpers exist, but end-to-end tests are limited or missing.

Impact: Potential regressions in streamed tool-call parsing.

Suggestion: Add tests that simulate incremental chunks and verify parsed tool calls.


GX-024: StreamBuilder::new ignores tools (Rust)

File: rust/genx/src/stream.rs

Description:
StreamBuilder::new ignores tools from ModelContext, leaving func_tools empty.

Impact: Tool-call metadata cannot be linked unless callers use with_tools.

Suggestion: Provide a way to downcast tools or pass tool list explicitly in generator code.


⚪ Notes

GX-014: Well-structured Go implementation

Description:
The Go genx package is well-organized:

  • Clear separation of concerns
  • Extensive test coverage
  • Comprehensive agent framework
  • YAML/JSON configuration support

GX-015: Event-based agent API

Description:
The agent event system is well-designed:

for {
    evt, err := ag.Next()
    if err != nil {
        return err
    }
    switch evt.Type {
    case EventChunk:
        // handle a streamed output chunk
    case EventToolStart:
        // a tool invocation is starting
    case EventClosed:
        return nil
    }
}

Provides fine-grained control over agent execution.


GX-016: Quit tool pattern

Description:
Tools can be marked as "quit tools" to signal agent completion:

tools:
  - $ref: tool:goodbye
    quit: true

Useful for conversational agents with explicit exit.


GX-017: $ref system for configuration

Description:
Configuration supports references for reuse:

tools:
  - $ref: tool:search  # References registered tool
  - $ref: agent:helper # References registered agent

Good for modular configuration.


GX-018: Multi-skill assistant pattern

Description:
MatchAgent enables router pattern:

Router (Match) → Weather Agent (ReAct)
              → Music Agent (ReAct)
              → Chat Agent

Well-documented in agent/doc.go.


GX-025: Language idioms differ (Go vs Rust)

Description:
Go uses iter.Seq, Rust uses iterators/streams. This is idiomatic for each language and not a functional issue.


Summary

ID | Severity | Status | Component
GX-001 | 🔴 Major | Open | Rust
GX-002 | 🔴 Major | Open | Rust
GX-003 | 🟡 Minor | Open | Go
GX-004 | 🟡 Minor | Note | Go
GX-006 | 🟡 Minor | Open | Rust
GX-019 | 🟡 Minor | Open | Go
GX-020 | 🟡 Minor | Open | Go
GX-021 | 🟡 Minor | Open | Go
GX-022 | 🟡 Minor | Open | Go
GX-023 | 🟡 Minor | Open | Go
GX-007 | 🔵 Enhancement | Open | Both
GX-008 | 🔵 Enhancement | Open | Both
GX-009 | 🔵 Enhancement | Open | Both
GX-010 | 🔵 Enhancement | Open | Go
GX-011 | 🔵 Enhancement | Open | Go
GX-012 | 🔵 Enhancement | Open | Go
GX-013 | 🔵 Enhancement | Open | Rust
GX-024 | 🔵 Enhancement | Open | Rust
GX-014 | ⚪ Note | N/A | Go
GX-015 | ⚪ Note | N/A | Go
GX-016 | ⚪ Note | N/A | Go
GX-017 | ⚪ Note | N/A | Go
GX-018 | ⚪ Note | N/A | Go
GX-025 | ⚪ Note | N/A | Both

Overall: The Go implementation is mature and feature-rich, with a comprehensive agent framework. The Rust implementation provides a basic LLM abstraction but lacks the agent framework, so it is suitable only for simple use cases. Reaching feature parity in Rust will take major effort.

speech

Overview

The speech module defines interfaces for voice and speech processing. It separates pure audio streams (Voice) from speech streams that include transcriptions (Speech). It also provides multiplexers for ASR (speech-to-text) and TTS (text-to-speech) implementations.

Design Goals

  • Unified interfaces for ASR/TTS backends
  • Stream-first APIs for long-running audio
  • Clear separation between audio-only and audio+text
  • Pluggable providers via multiplexer registration

Key Concepts

  • Voice: audio-only stream of PCM segments
  • Speech: audio stream with text transcription per segment
  • ASR: Opus input -> Speech/SpeechStream
  • TTS: text input -> Speech
  • Sentence segmentation: split long text into manageable chunks

Components

  • Voice/Speech interfaces
  • ASR/TTS muxers
  • Sentence segmentation utilities
  • Speech collection and copy helpers

Related Modules

  • docs/lib/audio/pcm for PCM formats
  • docs/lib/audio/opusrt for Opus streaming input
  • Provider SDKs in docs/lib/minimax, docs/lib/doubaospeech

speech (Go)

Package Layout

  • voice.go: Voice/VoiceSegment interfaces
  • speech.go: Speech/SpeechSegment interfaces
  • asr.go: ASR multiplexer and interfaces
  • tts.go: TTS multiplexer and interfaces
  • segment.go: default sentence segmentation
  • util.go: collectors and copy helpers
  • Provider implementations: asr_doubao_sauc.go, tts_doubao_v1.go, tts_doubao_v2.go, tts_minimax.go

Public Interfaces

  • Voice: Voice, VoiceSegment, VoiceStream
  • Speech: Speech, SpeechSegment, SpeechStream
  • ASR: StreamTranscriber, Transcriber, ASR mux + helpers
  • TTS: Synthesizer, TTS mux + helpers
  • Segmentation: SentenceSegmenter, SentenceIterator

Design Notes

  • Global muxes ASRMux and TTSMux provide default routing.
  • ASR uses Opus frame streams (opusrt.FrameReader).
  • DefaultSentenceSegmenter splits by punctuation with a rune cap.
  • CollectSpeech and CopySpeech help aggregate or export streams.
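
To make "splits by punctuation with a rune cap" concrete, here is a small sketch of that idea. It is not DefaultSentenceSegmenter itself: the function name segment, the punctuation set, and the cap value are all assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// segment cuts text after sentence-ending punctuation, and also forces a cut
// once a piece reaches maxRunes runes (maxRunes is assumed to be positive).
func segment(text string, maxRunes int) []string {
	var out []string
	var cur []rune
	flush := func() {
		if s := strings.TrimSpace(string(cur)); s != "" {
			out = append(out, s)
		}
		cur = cur[:0]
	}
	for _, r := range text {
		cur = append(cur, r)
		if strings.ContainsRune(".!?。！？", r) || len(cur) >= maxRunes {
			flush()
		}
	}
	flush() // emit any trailing text without final punctuation
	return out
}

func main() {
	for _, s := range segment("Hello there. How are you? Fine", 100) {
		fmt.Printf("%q\n", s)
	}
}
```

Counting runes rather than bytes keeps the cap meaningful for CJK text, where one character occupies several bytes.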

Usage Notes

  • Transcribe falls back to streaming when the backend does not implement the Transcriber interface.
  • Revoice streams existing speech into a TTS backend via a pipe.

speech (Rust)

Crate Layout

  • voice.rs: Voice/VoiceSegment interfaces
  • speech.rs: Speech/SpeechSegment interfaces
  • asr.rs: ASR multiplexer and traits
  • tts.rs: TTS multiplexer and traits
  • segment.rs: sentence segmentation utilities
  • util.rs: speech collector and iterator helpers

Public Interfaces

  • Voice: Voice, VoiceSegment, VoiceStream
  • Speech: Speech, SpeechSegment, SpeechStream
  • ASR: StreamTranscriber, Transcriber (async), ASR
  • TTS: Synthesizer, TTS (async)
  • Segmentation: SentenceSegmenter, SentenceIterator

Design Notes

  • Async traits are used across ASR/TTS and stream interfaces.
  • ASR and TTS use a trie-based mux with async read/write locks.
  • SpeechCollector composes a SpeechStream into a single Speech.

Differences vs Go

  • No global mux singletons; callers construct ASR/TTS explicitly.
  • AsyncRead is used for text input in TTS.

speech - Known Issues

🟡 Minor Issues

SPT-001: Go Revoice has no cancellation propagation

File: go/pkg/speech/tts.go

Description: Revoice spawns a goroutine that copies the entire input speech into an io.Pipe, but the goroutine is not tied to the caller's context. If the synthesizer returns early or the context is canceled, the copy goroutine may continue doing work until completion.

Impact: Wasted CPU or lingering goroutines on early cancellation.

Suggestion: Honor context cancellation inside the copy loop or use a pipe that is closed when ctx.Done() fires.


SPT-002: Rust ASR ignores full-transcribe implementations

File: rust/speech/src/asr.rs

Description: ASR::transcribe always falls back to the streaming path, even though a Transcriber trait exists for full transcription.

Impact: Backends that can provide a more efficient full-transcribe path cannot use it.

Suggestion: Detect and use Transcriber implementations when available.

chatgear

Overview

chatgear defines the core protocol types for device-to-server communication: commands, state events, statistics, and audio streaming metadata. It focuses on interface design rather than transport implementation, and provides an in-process pipe for testing.

Design Goals

  • Stable, typed protocol for device state and control
  • Clear separation between uplink (device -> server) and downlink (server -> device)
  • Explicit metadata for timestamps and command issuance
  • Audio streaming with Opus frame stamping
  • Support both Go and Rust with comparable API surfaces

Key Concepts

  • Session commands: device control commands with typed payloads
  • State events: gear state transitions with causes
  • Stats events: telemetry snapshots and incremental changes
  • Uplink/Downlink: split interfaces for bidirectional streams
  • Ports: higher-level client/server port abstraction

Submodules

  • transport: uplink/downlink connection traits and pipe helpers
  • port: client/server port traits and audio track controls

External Reference

  • /Users/idy/Work/haivivi/x/docs/chatgear (original protocol/design notes)
  • docs/lib/audio/opusrt for Opus frame handling
  • docs/lib/jsontime for millisecond timestamps

chatgear/transport

Overview

The transport layer defines bidirectional streaming interfaces for chatgear. It splits data flow into uplink (device -> server) and downlink (server -> device) and provides a test-friendly in-process pipe.

Design Goals

  • Separate uplink/downlink responsibilities
  • Provide a minimal interface that can be implemented by different transports
  • Keep Opus framing metadata explicit

Key Concepts

  • UplinkTx / UplinkRx: device -> server
  • DownlinkTx / DownlinkRx: server -> device
  • Stamped Opus frames: carry timestamp for playback alignment
  • Pipe connection for in-process testing

chatgear/port

Overview

Port interfaces represent higher-level client/server roles built on top of the transport layer. They combine audio streaming, state/stats telemetry, and command control into a single abstraction.

Design Goals

  • Provide a symmetric client/server API surface
  • Hide transport details while preserving real-time audio controls
  • Expose device control commands alongside audio output

Key Concepts

  • ClientPort: device-side send/receive split (Tx/Rx)
  • ServerPort: server-side send/receive split (Tx/Rx)
  • Audio tracks: background/foreground/overlay output streams
  • Device commands: volume, brightness, WiFi, OTA, power, etc.

chatgear (Go)

Package Layout

  • state.go: gear state enum, state events
  • stats.go: telemetry structs, merge logic
  • command.go: session command types and JSON mapping
  • conn.go: uplink/downlink interfaces
  • port.go: client/server port interfaces
  • conn_pipe.go: in-process pipe connection for tests

Public Interfaces

  • State: GearState, GearStateEvent, GearStateChangeCause
  • Stats: GearStatsEvent, GearStatsChanges and related structs
  • Commands: SessionCommand with SessionCommandEvent
  • Uplink/Downlink: UplinkTx, UplinkRx, DownlinkTx, DownlinkRx
  • Ports: ClientPortTx/Rx, ServerPortTx/Rx
  • Pipe: NewPipe for test or in-process wiring

Design Notes

  • JSON encoding is typed via commandType() and a tagged event wrapper.
  • All time fields use jsontime.Milli for millisecond epoch values.
  • Opus frames are stamped with opusrt.EpochMillis during transport.
  • Stats merge logic performs partial updates with GearStatsChanges.

Usage Notes

  • UplinkRx/DownlinkRx expose iterators (iter.Seq2) instead of channels.
  • ServerPortTx exposes track creation for background/foreground/overlay audio.

chatgear (Rust)

Crate Layout

  • state.rs: gear state enum and events
  • stats.rs: telemetry structs and merge logic
  • command.rs: session commands and JSON helpers
  • conn.rs: uplink/downlink async traits
  • port.rs: client/server port traits
  • conn_pipe.rs: in-process pipe helper

Public Interfaces

  • State: GearState, GearStateEvent, GearStateChangeCause
  • Stats: GearStatsEvent, GearStatsChanges
  • Commands: SessionCommand, SessionCommandEvent, Command enum
  • Uplink/Downlink: UplinkTx, UplinkRx, DownlinkTx, DownlinkRx
  • Ports: ClientPortTx/Rx, ServerPortTx/Rx
  • Pipe: new_pipe

Design Notes

  • Async traits are used for most IO-facing APIs.
  • Commands serialize into JSON value payloads; a typed Command enum can parse from (type, payload) pairs.
  • Port traits split audio track control and device command APIs.

Differences vs Go

  • Rust favors async traits and owned Vec<u8> payloads.
  • Command event uses serde_json::Value instead of typed interface payloads.

chatgear - Known Issues

🟡 Minor Issues

CG-001: Go ReadNFCTag equality ignores tag data changes

File: go/pkg/chatgear/stats.go

Description: ReadNFCTag.Equal compares only tag UIDs. If a tag's payload or metadata changes but the UID remains the same, the merge logic will treat it as unchanged.

Impact: Telemetry updates can be silently skipped.

Suggestion: Include additional fields (e.g., RawData, DataFormat, UpdateAt) in equality or document UID-only matching as a deliberate choice.


CG-002: Rust SessionCommandEvent swallows serialization errors

File: rust/chatgear/src/command.rs

Description: SessionCommandEvent::new uses serde_json::to_value(cmd) and replaces errors with Value::Null, losing the original error context.

Impact: Serialization failures are silently ignored, making debugging difficult.

Suggestion: Return a Result or log/report serialization failures explicitly.


CG-003: Go Pipe connections can block indefinitely on backpressure

File: go/pkg/chatgear/conn_pipe.go

Description: NewPipe uses bounded channels. If the receiver stops reading, senders will block in SendOpusFrames / SendState / SendStats without a timeout unless the caller provides a cancellable context.

Impact: Potential goroutine leaks in tests or in-process usage.

Suggestion: Document this behavior and recommend context timeouts for pipe usage.

Audio Package

Audio processing framework for speech and multimedia applications.

Design Goals

  1. Real-time Processing: Low-latency audio mixing, encoding, and streaming
  2. Format Flexibility: Support common audio formats (PCM, Opus, MP3, OGG)
  3. Cross-platform: FFI bindings to native libraries (libopus, libsoxr, lame)
  4. Streaming-first: Designed for continuous audio streams, not just files

Architecture

graph TB
    subgraph audio["audio/"]
        subgraph row1[" "]
            pcm["pcm/<br/>- Format<br/>- Chunk<br/>- Mixer"]
            codec["codec/<br/>- opus/<br/>- mp3/<br/>- ogg/"]
            resampler["resampler/<br/>- soxr<br/>- Format<br/>- Convert"]
        end
        subgraph row2[" "]
            opusrt["opusrt/<br/>- Buffer<br/>- Realtime<br/>- OGG R/W"]
            songs["songs/<br/>- Catalog<br/>- Notes<br/>- PCM gen"]
            portaudio["portaudio/<br/>(Go only)<br/>- Stream<br/>- Device"]
        end
    end

Submodules

Module | Description | Go | Rust
pcm/ | PCM format, chunks, mixing | Yes | Yes
codec/ | Audio codecs (Opus, MP3, OGG) | Yes | Yes
resampler/ | Sample rate conversion (soxr) | Yes | Yes
opusrt/ | Realtime Opus streaming | Yes | ⚠️ Partial
songs/ | Built-in melodies | Yes | Yes
portaudio/ | Audio I/O devices | Yes | No

Audio Formats

PCM Formats (Predefined)

Format | Sample Rate | Channels | Bit Depth
L16Mono16K | 16000 Hz | 1 | 16-bit
L16Mono24K | 24000 Hz | 1 | 16-bit
L16Mono48K | 48000 Hz | 1 | 16-bit

Codec Support

Codec | Encode | Decode | Container
Opus | Yes | Yes | Raw, OGG
MP3 | Yes | Yes | Raw
OGG | N/A | N/A | Container only

Common Workflows

Voice Chat (Low Latency)

flowchart LR
    A[Microphone] --> B[PCM 16kHz]
    B --> C[Opus Encode]
    C --> D[Network]
    D --> E[Opus Decode]
    E --> F[Mixer]
    F --> G[Speaker]

Speech Synthesis Playback

flowchart LR
    A[API Response<br/>Base64 MP3] --> B[MP3 Decode]
    B --> C[Resample<br/>24K→16K]
    C --> D[Mixer]
    D --> E[Speaker]

Audio Recording

flowchart LR
    A[PCM Stream] --> B[Opus Encode]
    B --> C[OGG Writer]
    C --> D[File]

Native Dependencies

Library | Purpose | Build System
libopus | Opus codec | pkg-config / Bazel
libsoxr | Resampling | pkg-config / Bazel
lame | MP3 encoding | Bazel (bundled)
minimp3 | MP3 decoding | Bazel (bundled)
libogg | OGG container | pkg-config / Bazel
portaudio | Audio I/O | pkg-config / Bazel

Examples Directory

  • examples/go/audio/ - Go audio examples
  • examples/rust/audio/ - Rust audio examples

Related Modules

  • buffer - Used for audio data buffering
  • speech - High-level speech synthesis/recognition
  • minimax, doubaospeech - TTS/ASR APIs returning audio

Audio Codec Module

Audio encoding and decoding for Opus, MP3, and OGG formats.

Design Goals

  1. Native Performance: FFI bindings to proven C libraries
  2. Streaming Support: Process audio in chunks, not full files
  3. VoIP Optimized: Low-latency Opus encoding for voice chat

Codec Support Matrix

Codec | Encode | Decode | Library | Use Case
Opus | Yes | Yes | libopus | Voice chat, streaming
MP3 | Yes | Yes | LAME / minimp3 | File storage, compatibility
OGG | N/A | N/A | libogg | Container format

Sub-modules

opus/

Opus codec implementation for voice and audio.

Features:

  • Encoder with VoIP/Audio/LowDelay modes
  • Decoder with PLC (Packet Loss Concealment)
  • TOC (Table of Contents) parsing
  • Frame duration detection

Key Types:

  • Encoder, Decoder
  • Frame, TOC, FrameDuration

mp3/

MP3 codec for compatibility with legacy systems.

Features:

  • LAME-based encoding with quality presets
  • minimp3-based decoding (header-only library)

Key Types:

  • Encoder, Decoder

ogg/

OGG container format for packaging Opus/Vorbis streams.

Features:

  • Page-based streaming
  • Bitstream management
  • Synchronization recovery

Key Types:

  • Encoder, Stream, Sync, Page

Opus Frame Durations

Duration | Samples@16K | Samples@48K
2.5ms | 40 | 120
5ms | 80 | 240
10ms | 160 | 480
20ms | 320 | 960
40ms | 640 | 1920
60ms | 960 | 2880

Recommended: 20ms frames balance latency and compression.

Common Opus Bitrates

Application | Bitrate | Quality
Voice (narrow) | 8-12 kbps | Intelligible
Voice (wide) | 16-24 kbps | Good
Voice (HD) | 32-48 kbps | Excellent
Music | 64-128 kbps | Hi-Fi

Native Library Versions

Library | Minimum Version | Notes
libopus | 1.3.0 | Opus encoder/decoder
libogg | 1.3.0 | OGG container
LAME | 3.100 | MP3 encoder
minimp3 | - | Header-only decoder

Examples

See parent audio/ documentation for usage examples.

Related Modules

  • opusrt/ - Realtime Opus streaming with OGG container
  • resampler/ - Sample rate conversion before encoding

Audio PCM Module

PCM (Pulse Code Modulation) audio format handling, chunks, and multi-track mixing.

Design Goals

  1. Standard Formats: Predefined configurations for common use cases
  2. Chunk Abstraction: Unified interface for audio data and silence
  3. Real-time Mixing: Low-latency multi-track audio mixing with gain control
  4. Streaming Interface: Compatible with io.Reader/io.Writer patterns

Predefined Formats

Format | Sample Rate | Channels | Bit Depth | Bytes/sec
L16Mono16K | 16000 Hz | 1 | 16-bit | 32,000
L16Mono24K | 24000 Hz | 1 | 16-bit | 48,000
L16Mono48K | 48000 Hz | 1 | 16-bit | 96,000

Duration/Bytes Calculations

For L16Mono16K (16kHz, 16-bit mono):

Duration | Samples | Bytes
20ms | 320 | 640
50ms | 800 | 1,600
100ms | 1,600 | 3,200
1s | 16,000 | 32,000

Formula: bytes = samples × channels × (bit_depth / 8)

Chunk Types

DataChunk

Raw audio data with format metadata.

classDiagram
    class DataChunk {
        Data: []byte
        Format: Format
    }

SilenceChunk

Generates silence (zeros) of specified duration without allocating.

classDiagram
    class SilenceChunk {
        Duration: time
        Format: Format
    }

Mixer Architecture

Multi-track audio mixer with real-time mixing and gain control:

flowchart LR
    subgraph mixer["Mixer"]
        t1["Track 1<br/>(gain=1)"]
        t2["Track 2<br/>(gain=0.5)"]
        tn["Track N"]
        mix["Mix Buffer<br/>(float32)"]
        out["Output<br/>(int16)"]
        
        t1 --> mix
        t2 --> mix
        tn --> mix
        mix --> out
    end

Features:

  • Dynamic track creation/removal
  • Per-track gain control (0.0 - 1.0+)
  • Silence gap detection
  • Auto-close when all tracks done
  • Thread-safe operations

Mixing Algorithm

  1. Convert int16 PCM to float32 (-1.0 to 1.0)
  2. Apply per-track gain
  3. Sum all track samples
  4. Clip to [-1.0, 1.0]
  5. Convert back to int16

Formula: output = clip(Σ(track[i] × gain[i]), -1.0, 1.0) × 32767
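
The five steps can be sketched as a single function. This is an illustration of the algorithm as described, not the pcm.Mixer implementation: the mix function is hypothetical, it assumes one gain per track, and it normalizes by 32768 on input while scaling by 32767 on output, as in the formula.

```go
package main

import "fmt"

// mix sums gain-weighted int16 tracks in float32, clips to [-1, 1],
// and converts the result back to int16. len(gains) must equal len(tracks).
func mix(tracks [][]int16, gains []float32) []int16 {
	n := 0
	for _, t := range tracks {
		if len(t) > n {
			n = len(t)
		}
	}
	out := make([]int16, n)
	for i := range out {
		var sum float32
		for k, t := range tracks {
			if i < len(t) { // shorter tracks contribute silence
				sum += float32(t[i]) / 32768 * gains[k]
			}
		}
		if sum > 1 {
			sum = 1
		} else if sum < -1 {
			sum = -1
		}
		out[i] = int16(sum * 32767)
	}
	return out
}

func main() {
	voice := []int16{16384, 32767, -32768}
	music := []int16{16384, 32767, -32768}
	fmt.Println(mix([][]int16{voice, music}, []float32{1, 0.5}))
}
```

Hard clipping is the simplest way to stay in range; it distorts when several loud tracks overlap, which is one reason to keep background tracks at a lower gain.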

Use Cases

Voice Chat Mixing

Multiple participants' audio mixed into single output stream.

Background Music

Mix background music track with voice at lower gain.

Audio Ducking

Reduce music volume when voice is detected.

Examples

See parent audio/ documentation for usage examples.

Related Modules

  • audio/codec/ - Encode/decode before/after mixing
  • audio/resampler/ - Convert sample rates before mixing
  • buffer/ - Buffer audio data between processing stages

Audio Resampler Module

Sample rate conversion using libsoxr (SoX Resampler Library).

Design Goals

  1. High Quality: Professional-grade resampling via libsoxr
  2. Streaming: Process audio as continuous stream, not files
  3. Channel Conversion: Support mono↔stereo conversion
  4. io.Reader Interface: Drop-in replacement for audio sources

Supported Conversions

Sample Rate

Any integer sample rate to any other integer sample rate:

  • 8000 Hz ↔ 16000 Hz ↔ 24000 Hz ↔ 48000 Hz
  • Non-standard rates supported

Channel Conversion

From | To | Method
Mono | Stereo | Duplicate
Stereo | Mono | Average (L+R)/2
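
Both conversions are short enough to show directly. The sketch below assumes interleaved 16-bit samples (L, R, L, R, ...); the function names are illustrative, and the resampler itself may perform the conversion inside libsoxr rather than in Go.

```go
package main

import "fmt"

// monoToStereo duplicates each mono sample into the left and right channels.
func monoToStereo(mono []int16) []int16 {
	out := make([]int16, 0, len(mono)*2)
	for _, s := range mono {
		out = append(out, s, s)
	}
	return out
}

// stereoToMono averages each interleaved L/R pair as (L+R)/2.
// The sum is computed in int32 so it cannot overflow int16.
func stereoToMono(stereo []int16) []int16 {
	out := make([]int16, 0, len(stereo)/2)
	for i := 0; i+1 < len(stereo); i += 2 {
		out = append(out, int16((int32(stereo[i])+int32(stereo[i+1]))/2))
	}
	return out
}

func main() {
	fmt.Println(monoToStereo([]int16{1, 2}))
	fmt.Println(stereoToMono([]int16{100, 200, -32768, -32768}))
}
```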

Quality Levels

libsoxr supports multiple quality presets:

Level | Name | Description
0 | Quick | Low quality, fast
1 | Low | Better than quick
2 | Medium | Balance of quality/speed
3 | High | Good quality (default)
4 | Very High | Best quality

Note: Current implementation uses High quality by default.

Algorithm

libsoxr uses polyphase filter banks with configurable:

  • Passband rolloff
  • Stop-band attenuation
  • Linear/minimum phase

The High quality preset provides:

  • Passband: 0-0.91 Nyquist
  • Stop-band attenuation: -100 dB
  • Linear phase

Common Resampling Scenarios

Speech API to Local Playback

flowchart LR
    A["API Output<br/>(24kHz)"] --> B[Resample]
    B --> C["Mixer Input<br/>(16kHz)"]

Local Capture to Speech API

flowchart LR
    A["Microphone<br/>(48kHz)"] --> B[Resample]
    B --> C["API Input<br/>(16kHz)"]

Multi-source Mixing

flowchart LR
    s1["Source 1<br/>(24kHz)"] --> r1[Resample]
    r1 --> m1["16kHz"]
    
    s2["Source 2<br/>(16kHz)"] --> m2["16kHz"]
    
    s3["Source 3<br/>(48kHz)"] --> r3[Resample]
    r3 --> m3["16kHz"]
    
    m1 --> mixer[Mixer]
    m2 --> mixer
    m3 --> mixer

Performance Characteristics

Approximate cycles per sample (on modern CPU):

  • Quick: ~10
  • High: ~50-100
  • Very High: ~200-500

For real-time 16kHz mono:

  • 16000 samples/sec × 100 cycles ≈ 1.6M cycles/sec
  • Negligible CPU load on modern hardware

Memory Usage

libsoxr maintains internal buffers for filter state:

  • ~10-50KB per resampler instance
  • More for higher quality settings

Examples

See parent audio/ documentation for usage examples.

Related Modules

  • audio/pcm/ - Format definitions, use with resampler
  • audio/codec/ - Often resample before/after encoding

Audio OpusRT Module

Real-time Opus stream processing with jitter buffering and packet loss handling.

Design Goals

  1. Out-of-Order Handling: Reorder packets that arrive out of sequence
  2. Packet Loss Detection: Detect and report gaps for PLC (Packet Loss Concealment)
  3. Real-time Simulation: Playback timing based on timestamps, not arrival time
  4. OGG Container Support: Read/write Opus in OGG format (Go only)

Core Concepts

Jitter Buffer

Network packets may arrive out of order or with variable delay (jitter). The jitter buffer collects packets and outputs them in correct order:

sequenceDiagram
    participant N as Network
    participant B as Buffer (Heap)
    participant O as Output
    
    N->>B: PKT 3
    Note over B: [3]
    N->>B: PKT 1
    Note over B: [1, 3]
    B->>O: PKT 1
    N->>B: PKT 2
    Note over B: [2, 3]
    B->>O: PKT 2
    B->>O: PKT 3

Packet Loss Detection

Gaps between consecutive frame timestamps indicate lost packets:

Frame 1: 0ms - 20ms
Frame 2: 20ms - 40ms    ✓ No gap
Frame 4: 60ms - 80ms    ✗ 20ms gap (Frame 3 lost)

When loss is detected, the caller should use decoder PLC:

frame, loss, _ := buffer.Frame()
if loss > 0 {
    // Generate PLC audio for 'loss' duration
    plcAudio := decoder.DecodePLC(...)
}
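
The gap check itself is plain timestamp arithmetic. The sketch below reproduces the three-frame example above; frameSpan and lossBefore are hypothetical names, and the 2ms tolerance is an assumed constant used to ignore small timestamp jitter.

```go
package main

import "fmt"

// epsilonMs is an assumed tolerance: gaps at or below it are not treated as loss.
const epsilonMs = 2

// frameSpan describes a frame by its start timestamp and duration in milliseconds.
type frameSpan struct {
	startMs    int64
	durationMs int64
}

// lossBefore returns, for each frame, the missing duration (in ms) between
// the end of the previous frame and the start of this one.
func lossBefore(frames []frameSpan) []int64 {
	losses := make([]int64, len(frames))
	for i := 1; i < len(frames); i++ {
		prevEnd := frames[i-1].startMs + frames[i-1].durationMs
		if gap := frames[i].startMs - prevEnd; gap > epsilonMs {
			losses[i] = gap
		}
	}
	return losses
}

func main() {
	frames := []frameSpan{{0, 20}, {20, 20}, {60, 20}} // the frame at 40ms is missing
	fmt.Println(lossBefore(frames))                    // [0 0 20]
}
```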

Timestamped Frames

Frames are timestamped with epoch milliseconds:

classDiagram
    class StampedFrame {
        Timestamp: int64 (8 bytes, big-endian)
        OpusFrame: []byte (variable)
    }
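
Read literally, the diagram describes an 8-byte big-endian timestamp followed by the Opus payload. The sketch below encodes and decodes that layout with encoding/binary. It assumes there is nothing else in the header (no version or flag bytes); the function names are illustrative and the real opusrt wire format should be checked before relying on this.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeStamped prepends an 8-byte big-endian epoch-millisecond timestamp
// to an Opus frame.
func encodeStamped(stampMs int64, opus []byte) []byte {
	buf := make([]byte, 8+len(opus))
	binary.BigEndian.PutUint64(buf[:8], uint64(stampMs))
	copy(buf[8:], opus)
	return buf
}

// decodeStamped splits a stamped frame into its timestamp and Opus payload.
// The returned payload aliases buf.
func decodeStamped(buf []byte) (int64, []byte, error) {
	if len(buf) < 8 {
		return 0, nil, errors.New("stamped frame shorter than 8-byte timestamp")
	}
	return int64(binary.BigEndian.Uint64(buf[:8])), buf[8:], nil
}

func main() {
	wire := encodeStamped(1700000000123, []byte{0xfc, 0x01, 0x02})
	stamp, frame, err := decodeStamped(wire)
	fmt.Println(len(wire), stamp, frame, err)
}
```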

Components

Buffer

Simple jitter buffer with min-heap ordering:

  • Append frames in any order
  • Read frames in timestamp order
  • Max duration limit (oldest dropped)
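
A min-heap keyed on timestamp is enough to get the reordering behavior described here. The sketch below uses container/heap from the standard library; the type names are hypothetical and the max-duration limit is left out to keep the example short.

```go
package main

import (
	"container/heap"
	"fmt"
)

// stamped is a frame tagged with an epoch-millisecond timestamp.
type stamped struct {
	stampMs int64
	payload []byte
}

// frameHeap orders frames by timestamp, earliest first.
type frameHeap []stamped

func (h frameHeap) Len() int           { return len(h) }
func (h frameHeap) Less(i, j int) bool { return h[i].stampMs < h[j].stampMs }
func (h frameHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *frameHeap) Push(x any)        { *h = append(*h, x.(stamped)) }
func (h *frameHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// jitterBuffer accepts frames in any order and returns them in timestamp order.
type jitterBuffer struct{ frames frameHeap }

func (b *jitterBuffer) Append(f stamped) { heap.Push(&b.frames, f) }

func (b *jitterBuffer) Next() (stamped, bool) {
	if b.frames.Len() == 0 {
		return stamped{}, false
	}
	return heap.Pop(&b.frames).(stamped), true
}

func main() {
	var b jitterBuffer
	for _, ms := range []int64{60, 20, 40} { // arrival order: 3, 1, 2
		b.Append(stamped{stampMs: ms})
	}
	for f, ok := b.Next(); ok; f, ok = b.Next() {
		fmt.Println(f.stampMs)
	}
}
```

Both Append and Next run in O(log n), so the cost stays small even when the buffer holds several seconds of 20ms frames.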

RealtimeBuffer

Wraps Buffer for real-time playback simulation:

  • Background goroutine pulls frames at correct time
  • Generates loss events when data not available
  • Handles clock synchronization

OGG Reader/Writer (Go only)

Read/write Opus streams in OGG container format:

  • OggReader: Read Opus frames from OGG file
  • OggWriter: Write Opus frames to OGG container

Timing

EpochMillis

All timestamps are milliseconds since Unix epoch:

type EpochMillis int64

// Convert from time.Time
stamp := EpochMillis(time.Now().UnixMilli())

// Convert to duration
duration := stamp.Duration() // time.Duration

Timestamp Epsilon

A 2ms tolerance for timestamp comparisons accounts for clock drift:

const timestampEpsilon = 2 // milliseconds

Use Cases

WebRTC Audio

flowchart LR
    A[WebRTC] --> B[RTP Packets]
    B --> C[Jitter Buffer]
    C --> D[Opus Decode]
    D --> E[Playback]

Speech API Streaming

flowchart LR
    A[API Response] --> B[Stamped Frames]
    B --> C[RealtimeBuffer]
    C --> D[Decode]
    D --> E[Mixer]

Audio Recording

flowchart LR
    A[Opus Frames] --> B[OGG Writer]
    B --> C[File]

Examples

See parent audio/ documentation for usage examples.

Related Modules

  • audio/codec/opus/ - Opus encoder/decoder
  • audio/codec/ogg/ - OGG container primitives
  • buffer/ - Used internally by RealtimeBuffer

Audio Package - Go Implementation

Import: github.com/haivivi/giztoy/pkg/audio

📚 Go Documentation

The main audio package is an umbrella for sub-packages. Import specific packages directly.

Sub-packages

pcm (PCM Audio)

import "github.com/haivivi/giztoy/pkg/audio/pcm"

Key Types:

| Type | Description |
|------|-------------|
| Format | Audio format (sample rate, channels, depth) |
| Chunk | Interface for audio data chunks |
| DataChunk | Raw audio data chunk |
| SilenceChunk | Silence generator |
| Mixer | Multi-track audio mixer |
| Track | Single audio track in mixer |
| TrackCtrl | Track control (gain, play/stop) |

codec/opus (Opus Codec)

import "github.com/haivivi/giztoy/pkg/audio/codec/opus"

Key Types:

| Type | Description |
|------|-------------|
| Encoder | Opus encoder (wraps libopus) |
| Decoder | Opus decoder (wraps libopus) |
| Frame | Raw Opus frame data ([]byte) |
| TOC | Table of Contents byte parser |
| FrameDuration | Frame duration enum |

codec/mp3 (MP3 Codec)

import "github.com/haivivi/giztoy/pkg/audio/codec/mp3"

Key Types:

| Type | Description |
|------|-------------|
| Encoder | MP3 encoder (wraps LAME) |
| Decoder | MP3 decoder (wraps minimp3) |

codec/ogg (OGG Container)

import "github.com/haivivi/giztoy/pkg/audio/codec/ogg"

Key Types:

| Type | Description |
|------|-------------|
| Encoder | OGG page encoder |
| Stream | OGG logical bitstream |
| Sync | OGG page synchronizer |

resampler (Sample Rate Conversion)

import "github.com/haivivi/giztoy/pkg/audio/resampler"

Key Types:

| Type | Description |
|------|-------------|
| Resampler | Interface for sample rate conversion |
| Soxr | libsoxr-based resampler |
| Format | Source/destination format |

opusrt (Realtime Opus)

import "github.com/haivivi/giztoy/pkg/audio/opusrt"

Key Types:

| Type | Description |
|------|-------------|
| Buffer | Jitter buffer for out-of-order frames |
| RealtimeBuffer | Real-time playback simulation |
| StampedFrame | Opus frame with timestamp |
| OggReader | Read Opus from OGG container |
| OggWriter | Write Opus to OGG container |

portaudio (Audio I/O)

import "github.com/haivivi/giztoy/pkg/audio/portaudio"

Key Types:

| Type | Description |
|------|-------------|
| Stream | Audio input/output stream |

songs (Built-in Melodies)

import "github.com/haivivi/giztoy/pkg/audio/songs"

Key Types:

| Type | Description |
|------|-------------|
| Song | Melody definition |
| Note | Musical note |

Usage Examples

PCM Mixer

import "github.com/haivivi/giztoy/pkg/audio/pcm"

// Create mixer
mixer := pcm.NewMixer(pcm.L16Mono16K, pcm.WithAutoClose())

// Create track
track, ctrl, _ := mixer.CreateTrack(pcm.WithTrackLabel("voice"))

// Write audio to track
track.Write(audioData)

// Adjust gain
ctrl.SetGain(0.8)

// Read mixed output
buf := make([]byte, 1600) // 50ms at 16kHz
mixer.Read(buf)

Opus Encoding

import "github.com/haivivi/giztoy/pkg/audio/codec/opus"

// Create encoder
enc, _ := opus.NewVoIPEncoder(16000, 1)
defer enc.Close()

enc.SetBitrate(24000)

// Encode PCM to Opus
pcmData := make([]int16, 320) // 20ms at 16kHz
frame, _ := enc.Encode(pcmData, 320)

Sample Rate Conversion

import "github.com/haivivi/giztoy/pkg/audio/resampler"

srcFmt := resampler.Format{SampleRate: 24000, Stereo: false}
dstFmt := resampler.Format{SampleRate: 16000, Stereo: false}

rs, _ := resampler.New(audioReader, srcFmt, dstFmt)
defer rs.Close()

// Read resampled data
io.Copy(output, rs)

Realtime Opus Buffer

import "github.com/haivivi/giztoy/pkg/audio/opusrt"

// Create jitter buffer (2 minute capacity)
buf := opusrt.NewBuffer(2 * time.Minute)

// Write stamped frames (can arrive out of order)
buf.Write(stampedFrameData)

// Read in order
frame, loss, _ := buf.Frame()
if loss > 0 {
    // Use decoder PLC for lost frames
}

CGO Dependencies

All codec packages use CGO to bind native libraries:

// Example: opus encoder
/*
#cgo pkg-config: opus
#include <opus.h>
*/
import "C"

Build requirements:

  • pkg-config for native builds
  • Bazel cdeps for Bazel builds

Audio Package - Rust Implementation

Crate: giztoy-audio

📚 Rust Documentation

Modules

pcm (PCM Audio)

use giztoy_audio::pcm::{Format, FormatExt, Chunk, DataChunk, SilenceChunk, Mixer};

Key Types:

| Type | Description |
|------|-------------|
| Format | Audio format enum (re-exported from resampler) |
| FormatExt | Extension trait for chunk creation |
| Chunk | Trait for audio data chunks |
| DataChunk | Raw audio data chunk |
| SilenceChunk | Silence generator |
| Mixer | Multi-track audio mixer |
| Track | Audio track writer |
| TrackCtrl | Track control |
| AtomicF32 | Atomic float for gain control |

codec::opus (Opus Codec)

use giztoy_audio::codec::opus::{Encoder, Decoder, Application, Frame, TOC};

Key Types:

| Type | Description |
|------|-------------|
| Encoder | Opus encoder (wraps libopus) |
| Decoder | Opus decoder |
| Application | Encoder application type enum |
| Frame | Raw Opus frame data |
| TOC | Table of Contents parser |
| FrameDuration | Frame duration enum |

codec::mp3 (MP3 Codec)

use giztoy_audio::codec::mp3::{Encoder, Decoder};

Key Types:

| Type | Description |
|------|-------------|
| Encoder | MP3 encoder (wraps LAME) |
| Decoder | MP3 decoder (wraps minimp3) |

codec::ogg (OGG Container)

use giztoy_audio::codec::ogg::{Encoder, Stream, Sync, Page};

Key Types:

| Type | Description |
|------|-------------|
| Encoder | OGG page encoder |
| Stream | OGG logical bitstream |
| Sync | OGG page synchronizer |
| Page | OGG page data |

resampler (Sample Rate Conversion)

use giztoy_audio::resampler::{Soxr, Format};

Key Types:

| Type | Description |
|------|-------------|
| Soxr | libsoxr-based resampler |
| Format | Audio format (sample rate, stereo flag) |

opusrt (Realtime Opus)

use giztoy_audio::opusrt::{Buffer, StampedFrame, EpochMillis};

Key Types:

| Type | Description |
|------|-------------|
| Buffer | Jitter buffer for frame reordering |
| StampedFrame | Opus frame with timestamp |
| EpochMillis | Millisecond timestamp |

⚠️ Note: Rust opusrt is missing OGG Reader/Writer compared to Go.

songs (Built-in Melodies)

use giztoy_audio::songs::{Song, Note, Catalog};

Key Types:

| Type | Description |
|------|-------------|
| Song | Melody definition |
| Note | Musical note |
| Catalog | Built-in song collection |

Usage Examples

PCM Format

use giztoy_audio::pcm::{Format, FormatExt};
use std::time::Duration;

let format = Format::L16Mono16K;

// Calculate bytes for duration
let bytes = format.bytes_in_duration(Duration::from_millis(100));
assert_eq!(bytes, 3200); // 1600 samples * 2 bytes

// Create chunks
let silence = format.silence_chunk(Duration::from_millis(100));
let data = format.data_chunk(vec![0u8; 3200]);

Opus Encoding

use giztoy_audio::codec::opus::{Encoder, Application};

let mut encoder = Encoder::new(16000, 1, Application::VoIP)?;
encoder.set_bitrate(24000)?;

// Encode PCM to Opus
let pcm: Vec<i16> = vec![0; 320]; // 20ms at 16kHz
let frame = encoder.encode(&pcm, 320)?;

Sample Rate Conversion

use giztoy_audio::resampler::{Soxr, Format};

let src_fmt = Format { sample_rate: 24000, stereo: false };
let dst_fmt = Format { sample_rate: 16000, stereo: false };

let mut resampler = Soxr::new(src_fmt, dst_fmt)?;

// Process audio data
let output = resampler.process(&input_pcm)?;

Mixer

use giztoy_audio::pcm::{Format, Mixer, MixerOptions};

let mut mixer = Mixer::new(Format::L16Mono16K, MixerOptions::default());

// Create track
let (track, ctrl) = mixer.create_track(None)?;

// Write audio
track.write(&audio_data)?;

// Adjust gain
ctrl.set_gain(0.8);

// Read mixed output
let mut buf = vec![0u8; 3200]; // 100ms at 16kHz
mixer.read(&mut buf)?;

FFI Bindings

Rust uses custom FFI modules for native library bindings:

// Example: codec/opus/ffi.rs
extern "C" {
    fn opus_encoder_create(
        fs: i32,
        channels: i32,
        application: i32,
        error: *mut i32,
    ) -> *mut OpusEncoder;
}

Differences from Go

| Feature | Go | Rust |
|---------|----|------|
| Format definition | In pcm/ | In resampler/, re-exported by pcm/ |
| opusrt OGG R/W | ✅ | ❌ Missing |
| portaudio | ✅ | ❌ Not implemented |
| Mixer thread-safety | sync.Mutex | std::sync::Mutex |
| FFI error handling | CGO strerror | Custom error types |

Audio Package - Known Issues

🟠 Major Issues

AUD-001: Go Mixer uses unsafe pointer casting

File: go/pkg/audio/pcm/mixer.go:226

Description:
The mixer uses unsafe.Slice and unsafe.Pointer to cast between []byte and []int16:

i16 := unsafe.Slice((*int16)(unsafe.Pointer(&p[0])), len(p)/2)

Risk:

  • Platform-dependent endianness (assumes little-endian)
  • Potential undefined behavior if buffer alignment is wrong

Impact: May produce incorrect audio on big-endian systems.

Suggestion: Add explicit little-endian encoding/decoding or document platform requirements.
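The explicit little-endian option could look like the sketch below, which uses only `encoding/binary`. The helper names are hypothetical. Unlike the unsafe cast it copies the samples, trading a small allocation for portability across endianness and alignment.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// bytesToInt16LE decodes little-endian 16-bit PCM into samples.
// A trailing odd byte, if present, is ignored.
func bytesToInt16LE(p []byte) []int16 {
	out := make([]int16, len(p)/2)
	for i := range out {
		out[i] = int16(binary.LittleEndian.Uint16(p[2*i:]))
	}
	return out
}

// int16ToBytesLE encodes samples as little-endian 16-bit PCM.
func int16ToBytesLE(samples []int16) []byte {
	out := make([]byte, 2*len(samples))
	for i, s := range samples {
		binary.LittleEndian.PutUint16(out[2*i:], uint16(s))
	}
	return out
}

func main() {
	samples := bytesToInt16LE([]byte{0x01, 0x00, 0xff, 0xff, 0x00, 0x80})
	fmt.Println(samples)
	fmt.Println(int16ToBytesLE(samples))
}
```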


AUD-002: Rust opusrt missing OGG Reader/Writer

Description:
Go opusrt provides OggReader and OggWriter for reading and writing Opus in OGG containers. The Rust implementation is missing both.

Impact: Cannot read/write Opus files in OGG format in Rust.

Status: ⚠️ Partial implementation.


AUD-003: Rust missing portaudio module

Description:
Go has audio/portaudio for audio device I/O. Rust has no equivalent.

Impact: Cannot capture/play audio from hardware devices in Rust.

Status: ❌ Not implemented.


🟡 Minor Issues

AUD-004: Go Format panics on invalid value

File: go/pkg/audio/pcm/pcm.go:36-38

Description:
Format.SampleRate(), Channels(), Depth() all panic on invalid format:

func (f Format) SampleRate() int {
    switch f {
    // ...
    }
    panic("pcm: invalid audio type")
}

Impact: Runtime panic instead of error return.

Suggestion: Return (int, error) or use MustXxx naming convention for panicking versions.


AUD-005: Go SilenceChunk uses fixed global buffer

File: go/pkg/audio/pcm/pcm.go:177

Description:
Uses a shared fixed-size zero buffer and loops for long durations.

Impact: None functionally; avoids repeated allocations.

Status: Not a bug. Keep as implementation note only.

AUD-006: Go Opus encoder max frame size hardcoded

File: go/pkg/audio/codec/opus/encoder.go:95

Description:
The Encode function allocates a fixed 4000-byte buffer on every call:

buf := make([]byte, 4000)

Impact: Allocation on every encode call.

Suggestion: Use buffer pool or allow caller to provide buffer.
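A buffer pool along the lines of that suggestion might be sketched with `sync.Pool`. This is not the encoder's actual code: `encodeWith` is a hypothetical wrapper, and the `encode` callback stands in for the libopus call. The scratch buffer is reused across calls; only the right-sized result is allocated.

```go
package main

import (
	"fmt"
	"sync"
)

// maxPacketSize mirrors the 4000-byte scratch buffer mentioned above.
const maxPacketSize = 4000

// packetPool hands out reusable scratch buffers.
var packetPool = sync.Pool{
	New: func() any {
		b := make([]byte, maxPacketSize)
		return &b
	},
}

// encodeWith borrows a scratch buffer, lets encode fill it, and returns a
// copy trimmed to the number of bytes written.
func encodeWith(encode func(dst []byte) int) []byte {
	bp := packetPool.Get().(*[]byte)
	defer packetPool.Put(bp)

	n := encode(*bp)
	out := make([]byte, n)
	copy(out, (*bp)[:n])
	return out
}

func main() {
	frame := encodeWith(func(dst []byte) int {
		return copy(dst, []byte{0xfc, 0x01, 0x02}) // placeholder payload
	})
	fmt.Println(len(frame))
}
```

Storing a pointer to the slice in the pool avoids an extra allocation each time the slice header is boxed into an interface.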


AUD-007: Rust Format re-export from resampler is confusing

File: rust/audio/src/pcm/format.rs:7

Description:
pcm::Format is actually re-exported from resampler::Format:

pub use crate::resampler::format::Format;

Impact: Confusing import paths, circular dependency appearance.

Suggestion: Define Format once at top level and import in both modules.


AUD-008: Go mixer notifyWrite spawns goroutine every call

File: go/pkg/audio/pcm/mixer.go:391-405

Description:
notifyWrite() spawns a new goroutine each time:

func (mx *Mixer) notifyWrite() {
    go func() {
        // ...
    }()
}

Impact: Goroutine overhead for every write notification.

Suggestion: Use single dedicated notification goroutine or avoid goroutine.
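One way to avoid the goroutine entirely is a non-blocking send on a buffered channel of capacity one, so repeated notifications coalesce into a single pending wake-up. The sketch below is an illustration with hypothetical names, not the mixer's code.

```go
package main

import "fmt"

// notifier coalesces write notifications into at most one pending wake-up.
type notifier struct {
	ch chan struct{}
}

func newNotifier() *notifier {
	return &notifier{ch: make(chan struct{}, 1)}
}

// notify never blocks and never spawns a goroutine.
func (n *notifier) notify() {
	select {
	case n.ch <- struct{}{}:
	default: // a wake-up is already pending; nothing to do
	}
}

// drain reports whether a wake-up was pending and clears it.
func (n *notifier) drain() bool {
	select {
	case <-n.ch:
		return true
	default:
		return false
	}
}

func main() {
	n := newNotifier()
	n.notify()
	n.notify() // coalesced with the first
	fmt.Println(n.drain(), n.drain())
}
```

A reader that blocks on the channel is woken once per burst of writes, which is usually all a mixer loop needs.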


🔵 Enhancements

AUD-009: No stereo format support in predefined formats

Description:
Only mono formats are predefined (L16Mono16K, etc.). No stereo formats.

Suggestion: Add L16Stereo16K, L16Stereo24K, L16Stereo48K.


AUD-010: No 8-bit or 24-bit PCM support

Description:
Only 16-bit PCM is supported. Some audio sources use 8-bit (low quality) or 24-bit (high quality).

Suggestion: Add format variants for different bit depths.


AUD-011: Resampler quality not configurable

File: go/pkg/audio/resampler/soxr.go:52

Description:
Quality is hardcoded to SOXR_HQ:

qSpec := C.soxr_quality_spec(C.SOXR_HQ, 0)

Impact: Cannot trade quality for performance when needed.

Suggestion: Add quality parameter to New().


AUD-012: No WAV file support

Description:
No utilities for reading/writing WAV files, only raw PCM.

Suggestion: Add WAV header parsing/writing for file I/O.
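For 16-bit PCM the WAV container is a 44-byte RIFF header followed by the raw samples, so the writing half of this suggestion is small. The sketch below builds that canonical header; `wavHeader` is a hypothetical helper, not an existing package function.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wavHeader builds the canonical 44-byte RIFF/WAVE header for 16-bit PCM.
// dataLen is the size of the PCM payload in bytes.
func wavHeader(sampleRate, channels, dataLen int) []byte {
	const bitsPerSample = 16
	blockAlign := channels * bitsPerSample / 8

	h := make([]byte, 44)
	le := binary.LittleEndian
	copy(h[0:], "RIFF")
	le.PutUint32(h[4:], uint32(36+dataLen)) // file size minus 8
	copy(h[8:], "WAVEfmt ")
	le.PutUint32(h[16:], 16) // size of the fmt chunk
	le.PutUint16(h[20:], 1)  // format tag 1 = integer PCM
	le.PutUint16(h[22:], uint16(channels))
	le.PutUint32(h[24:], uint32(sampleRate))
	le.PutUint32(h[28:], uint32(sampleRate*blockAlign)) // byte rate
	le.PutUint16(h[32:], uint16(blockAlign))
	le.PutUint16(h[34:], bitsPerSample)
	copy(h[36:], "data")
	le.PutUint32(h[40:], uint32(dataLen))
	return h
}

func main() {
	pcm := make([]byte, 3200) // 100 ms of 16 kHz mono silence
	file := append(wavHeader(16000, 1, len(pcm)), pcm...)
	fmt.Println(len(file))
}
```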


⚪ Notes

AUD-013: CGO/FFI dependency complexity

Description:
Both Go and Rust rely heavily on CGO/FFI for native codec libraries. This adds:

  • Build complexity (pkg-config, Bazel rules)
  • Platform-specific issues
  • Memory management concerns

Status: Necessary for performance, but increases maintenance burden.


AUD-014: Mixer uses float32 internally

Description:
Mixer converts int16 PCM to float32 for mixing, then back to int16:

// int16 → float32
s := float32(trackI16[i])
// ... mix ...
// float32 → int16
i16[i] = int16(t * 32767)

Impact: Slight precision loss during mixing, but standard practice.
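A self-contained version of that round trip is sketched below. It is an illustration, not the mixer's code: it assumes samples are normalized by 32768 before summing and clipped to [-1, 1] before conversion back, and `mixTracks` is a hypothetical name. `gains` must have one entry per track.

```go
package main

import "fmt"

// mixTracks sums the tracks in float32 and converts back to int16.
// Samples are normalized to [-1, 1) on the way in and clipped on the way out.
func mixTracks(tracks [][]int16, gains []float32) []int16 {
	n := 0
	for _, tr := range tracks {
		if len(tr) > n {
			n = len(tr)
		}
	}
	out := make([]int16, n)
	for i := range out {
		var sum float32
		for t, tr := range tracks {
			if i < len(tr) {
				sum += float32(tr[i]) / 32768 * gains[t]
			}
		}
		// Clip so that loud overlapping tracks saturate instead of wrapping.
		if sum > 1 {
			sum = 1
		} else if sum < -1 {
			sum = -1
		}
		out[i] = int16(sum * 32767)
	}
	return out
}

func main() {
	voice := []int16{16384, -32768}
	music := []int16{16384, -32768}
	fmt.Println(mixTracks([][]int16{voice, music}, []float32{1, 1}))
}
```

Summing in floating point leaves headroom above full scale, so clipping can be applied once at the end rather than after every addition.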


Summary

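| ID | Severity | Status | Component |
|----|----------|--------|-----------|
| AUD-001 | 🟠 Major | Open | Go Mixer |
| AUD-002 | 🟠 Major | Open | Rust opusrt |
| AUD-003 | 🟠 Major | Open | Rust |
| AUD-004 | 🟡 Minor | Open | Go Format |
| AUD-005 | ⚪ Note | N/A | Go SilenceChunk |
| AUD-006 | 🟡 Minor | Open | Go Opus |
| AUD-007 | 🟡 Minor | Open | Rust Format |
| AUD-008 | 🟡 Minor | Open | Go Mixer |
| AUD-009 | 🔵 Enhancement | Open | Both |
| AUD-010 | 🔵 Enhancement | Open | Both |
| AUD-011 | 🔵 Enhancement | Open | Go |
| AUD-012 | 🔵 Enhancement | Open | Both |
| AUD-013 | ⚪ Note | N/A | Both |
| AUD-014 | ⚪ Note | N/A | Go |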

Overall: Functional audio processing with significant native library integration. Main gaps are Rust parity (opusrt OGG, portaudio) and some unsafe code patterns.

MiniMax SDK

Go and Rust SDK for the MiniMax AI platform API.

Official API Documentation: api/README.md

Design Goals

  1. Full API Coverage: Support all MiniMax API capabilities
  2. Idiomatic Language Design: Natural Go/Rust patterns
  3. Streaming Support: First-class support for streaming responses
  4. Async Task Handling: Convenient polling for long-running operations

API Coverage

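| API Feature | Go | Rust | Official Doc |
|-------------|----|------|--------------|
| Text Generation (Chat) | ✅ | ✅ | api/text.md |
| Sync Speech (T2A) | ✅ | ✅ | api/speech-t2a.md |
| Async Speech (Long Text) | ✅ | ✅ | api/speech-t2a-async.md |
| Voice Cloning | ✅ | ✅ | api/voice-cloning.md |
| Voice Design | ✅ | ✅ | api/voice-design.md |
| Voice Management | ✅ | ✅ | api/voice-management.md |
| Video Generation | ✅ | ✅ | api/video.md |
| Video Agent | ✅ | ✅ | api/video-agent.md |
| Image Generation | ✅ | ✅ | api/image.md |
| Music Generation | ✅ | ✅ | api/music.md |
| File Management | ✅ | ✅ | api/file.md |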

Architecture

graph TB
    subgraph client["Client"]
        subgraph services1[" "]
            text[Text Service]
            speech[Speech Service]
            voice[Voice Service]
            video[Video Service]
        end
        subgraph services2[" "]
            image[Image Service]
            music[Music Service]
            file[File Service]
        end
    end
    
    subgraph http["HTTP Client"]
        retry[Retry]
        auth[Auth]
        error[Error Handling]
    end
    
    client --> http
    http --> api["https://api.minimaxi.com"]

Services

| Service | Description |
|---------|-------------|
| Text | Chat completion, streaming, tool calls |
| Speech | TTS sync/stream, async long-text |
| Voice | List voices, clone, design |
| Video | Text-to-video, image-to-video, agent |
| Image | Text-to-image, image reference |
| Music | Music generation from lyrics |
| File | Upload, list, retrieve, delete files |

Authentication

Uses Bearer token authentication:

Authorization: Bearer <api_key>

API keys are obtained from MiniMax Platform.

Base URLs

| Region | URL |
|--------|-----|
| China (Default) | https://api.minimaxi.com |
| Global | https://api.minimaxi.chat |

Response Patterns

Synchronous

Direct response with data.

Streaming

SSE (Server-Sent Events) for real-time data:

  • Text: Token-by-token chat responses
  • Speech: Audio chunk streaming

Async Tasks

For long-running operations (video, async speech):

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: Create Task
    S-->>C: task_id
    loop Poll Status
        C->>S: Query Status
        S-->>C: Pending/Running
    end
    S-->>C: Success + Result
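The polling loop in the diagram can be sketched generically. This is an illustration, not the SDK's Task type: `waitForTask` and `simulate` are hypothetical, and the `Failed` status is an assumption (the diagram shows only Pending, Running, and Success).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

type taskStatus string

const (
	statusPending taskStatus = "Pending"
	statusRunning taskStatus = "Running"
	statusSuccess taskStatus = "Success"
	statusFailed  taskStatus = "Failed" // assumed terminal failure state
)

// waitForTask calls query immediately and then once per interval until the
// task reaches a terminal state or ctx is cancelled.
func waitForTask(ctx context.Context, interval time.Duration,
	query func(context.Context) (taskStatus, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		st, err := query(ctx)
		if err != nil {
			return err
		}
		switch st {
		case statusSuccess:
			return nil
		case statusFailed:
			return errors.New("task failed")
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// simulate runs waitForTask against a fake server that reports Running
// until the given number of queries has been made.
func simulate(successAfter int) (int, error) {
	calls := 0
	query := func(context.Context) (taskStatus, error) {
		calls++
		if calls >= successAfter {
			return statusSuccess, nil
		}
		return statusRunning, nil
	}
	err := waitForTask(context.Background(), time.Millisecond, query)
	return calls, err
}

func main() {
	calls, err := simulate(3)
	fmt.Println(calls, err)
}
```

Passing the context through to each query lets a caller bound the total wait with a deadline instead of a retry count.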

Error Handling

All errors include:

  • status_code: Numeric error code
  • status_msg: Human-readable message

Common error codes:

  • 1000: General error
  • 1001: Rate limit exceeded
  • 1002: Invalid parameters
  • 1004: Authentication failed

Examples Directory

  • examples/go/minimax/ - Go SDK examples
  • examples/rust/minimax/ - Rust SDK examples
  • examples/cmd/minimax/ - CLI test scripts
  • go/cmd/minimax/ - CLI tool

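MiniMax Open Platform API Documentation

Official documentation: MiniMax Open Platform Documentation Center

Last updated: 2026-01-19

Note: This document is compiled from the official documentation. If anything has changed, refer to the official documentation.

Official Documentation Navigation

If this document is incomplete or you need the latest information, visit the following official links:

| Module | Official documentation link |
|--------|-----------------------------|
| API Overview | https://platform.minimaxi.com/docs/api-reference/api-overview |
| Text Generation (Anthropic) | https://platform.minimaxi.com/docs/api-reference/text-anthropic-api |
| Text Generation (OpenAI) | https://platform.minimaxi.com/docs/api-reference/text-openai-api |
| Sync Speech Synthesis (HTTP) | https://platform.minimaxi.com/docs/api-reference/speech-t2a-http |
| Sync Speech Synthesis (WebSocket) | https://platform.minimaxi.com/docs/api-reference/speech-t2a-ws |
| Async Long-Text Speech Synthesis | https://platform.minimaxi.com/docs/api-reference/speech-t2a-async |
| Rapid Voice Cloning | https://platform.minimaxi.com/docs/api-reference/speech-voice-cloning |
| Voice Design | https://platform.minimaxi.com/docs/api-reference/speech-voice-design |
| Voice Management | https://platform.minimaxi.com/docs/api-reference/speech-voice-management |
| Video Generation | https://platform.minimaxi.com/docs/api-reference/video-generation |
| Video Generation Agent | https://platform.minimaxi.com/docs/api-reference/video-generation-agent |
| Image Generation | https://platform.minimaxi.com/docs/api-reference/image-generation |
| Music Generation | https://platform.minimaxi.com/docs/api-reference/music-generation |
| File Management | https://platform.minimaxi.com/docs/api-reference/file-management |
| Error Code Reference | https://platform.minimaxi.com/docs/api-reference/error-code |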

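How to Get the Latest Documentation

Method 1: Visit the official site directly

Visit the MiniMax Open Platform Documentation Center. The left-hand navigation bar contains detailed documentation for every API.

Method 2: Read with an AI tool

If you use an AI tool with browser support (such as Cursor), you can:

  1. Use the browser_navigate tool to open the official documentation page
  2. Use browser_snapshot to capture the page content
  3. Parse the API parameters, request/response formats, and other details from the page

Example:

Visit: https://platform.minimaxi.com/docs/api-reference/speech-t2a-http

Method 3: Check the official MCP servers

MiniMax provides official MCP (Model Context Protocol) server implementations that include complete API call examples:

  • Python version: https://github.com/MiniMax-AI/MiniMax-MCP
  • JavaScript version: https://github.com/MiniMax-AI/MiniMax-MCP-JS

About OpenAPI/Swagger

MiniMax does not currently publish an OpenAPI/Swagger specification file. To obtain one, you can:

  • Contact official technical support: api-support@minimaxi.com
  • Compile one manually from the official documentation

Overview

The MiniMax Open Platform provides multimodal AI capabilities, including text generation, speech synthesis, video generation, image generation, and music generation.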

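API Capability Overview

| Capability | Description | Document |
|------------|-------------|----------|
| Text Generation | Conversational content generation, tool calls | text.md |
| Sync Speech Synthesis (T2A) | Short-text speech synthesis over HTTP/WebSocket | speech-t2a.md |
| Async Long-Text Speech Synthesis | Long-text speech synthesis in async task mode | speech-t2a-async.md |
| Rapid Voice Cloning | Clone a voice from uploaded audio | voice-cloning.md |
| Voice Design | Generate a custom voice from a description | voice-design.md |
| Voice Management | Query and manage available voices | voice-management.md |
| Video Generation | Text-to-video, image-to-video | video.md |
| Video Generation Agent | Template-based video generation | video-agent.md |
| Image Generation | Text-to-image, image-to-image | image.md |
| Music Generation | Generate music from a description and lyrics | music.md |
| File Management | File upload, download, and management | file.md |

Authentication

All APIs use Bearer token authentication:

Authorization: Bearer <your_api_key>

Getting an API Key

  1. Pay-as-you-go: Create an API key under "Account Management > API Keys"; it supports models of all modalities
  2. Coding Plan: Create a Coding Plan key; it supports text models only

Base URLs

| Address type | URL |
|--------------|-----|
| Primary | https://api.minimaxi.com |
| Backup | https://api-bj.minimaxi.com |

Request Headers

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| Authorization | string | Yes | Bearer <api_key> |
| Content-Type | string | Yes | application/json |

Error Handling

Error responses returned by the API have the following format: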

{
  "base_resp": {
    "status_code": 1000,
    "status_msg": "error message"
  }
}
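A client could turn this envelope into a Go error as sketched below. The type and function names are hypothetical, and the sketch assumes that a `status_code` of 0 means success, which the text above does not state.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiError mirrors the base_resp object shown above.
type apiError struct {
	StatusCode int    `json:"status_code"`
	StatusMsg  string `json:"status_msg"`
}

func (e *apiError) Error() string {
	return fmt.Sprintf("minimax: %d - %s", e.StatusCode, e.StatusMsg)
}

// checkBaseResp decodes the envelope and returns an *apiError when
// status_code is non-zero (assumption: 0 means success).
func checkBaseResp(body []byte) error {
	var env struct {
		BaseResp apiError `json:"base_resp"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return fmt.Errorf("decode response: %w", err)
	}
	if env.BaseResp.StatusCode != 0 {
		e := env.BaseResp
		return &e
	}
	return nil
}

func main() {
	body := []byte(`{"base_resp":{"status_code":1000,"status_msg":"error message"}}`)
	fmt.Println(checkBaseResp(body))
}
```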

Official Resources

MiniMax SDK - Go Implementation

Import: github.com/haivivi/giztoy/pkg/minimax

📚 Go Documentation

Client

type Client struct {
    Text   *TextService
    Speech *SpeechService
    Voice  *VoiceService
    Video  *VideoService
    Image  *ImageService
    Music  *MusicService
    File   *FileService
}

Constructor:

// Basic
client := minimax.NewClient("api-key")

// With options
client := minimax.NewClient("api-key",
    minimax.WithBaseURL(minimax.BaseURLGlobal),
    minimax.WithRetry(5),
    minimax.WithHTTPClient(&http.Client{Timeout: 60*time.Second}),
)

Options:

| Option | Description |
|--------|-------------|
| WithBaseURL(url) | Custom API base URL |
| WithRetry(n) | Max retry count (default: 3) |
| WithHTTPClient(c) | Custom http.Client |

Services

TextService

// Synchronous
resp, err := client.Text.CreateChatCompletion(ctx, &minimax.ChatCompletionRequest{
    Model: "MiniMax-M2.1",
    Messages: []minimax.Message{
        {Role: "user", Content: "Hello!"},
    },
})

// Streaming (Go 1.23+ iter.Seq2)
for chunk, err := range client.Text.CreateChatCompletionStream(ctx, req) {
    if err != nil {
        return err
    }
    fmt.Print(chunk.Choices[0].Delta.Content)
}

SpeechService

// Synchronous
resp, err := client.Speech.Synthesize(ctx, &minimax.SpeechRequest{
    Model: "speech-2.6-hd",
    Text:  "Hello, world!",
    VoiceSetting: &minimax.VoiceSetting{
        VoiceID: "male-qn-qingse",
    },
})
// resp.Audio contains decoded audio bytes

// Streaming
for chunk, err := range client.Speech.SynthesizeStream(ctx, req) {
    if err != nil {
        return err
    }
    buf.Write(chunk.Audio)
}

// Async (long text)
task, err := client.Speech.CreateAsyncTask(ctx, &minimax.AsyncSpeechRequest{
    Model: "speech-2.6-hd",
    Text:  longText,
    // ...
})
result, err := task.Wait(ctx)

VoiceService

// List voices
voices, err := client.Voice.List(ctx)

// Clone voice
resp, err := client.Voice.Clone(ctx, &minimax.VoiceCloneRequest{
    FileID:  "uploaded-file-id",
    VoiceID: "my-cloned-voice",
})

// Design voice
resp, err := client.Voice.Design(ctx, &minimax.VoiceDesignRequest{
    Prompt:      "A warm female voice...",
    PreviewText: "Hello, how can I help?",
})

VideoService

// Text to video
task, err := client.Video.CreateTextToVideo(ctx, &minimax.TextToVideoRequest{
    Model:  "video-01",
    Prompt: "A cat playing piano",
})
result, err := task.Wait(ctx)
// result.FileID contains the video file ID

// Image to video
task, err := client.Video.CreateImageToVideo(ctx, &minimax.ImageToVideoRequest{
    Model:          "video-01",
    FirstFrameImage: "https://...",
})

ImageService

resp, err := client.Image.Generate(ctx, &minimax.ImageGenerateRequest{
    Model:  "image-01",
    Prompt: "A beautiful sunset",
})
// resp.Data[0].URL or resp.Data[0].B64JSON

MusicService

task, err := client.Music.Generate(ctx, &minimax.MusicRequest{
    Prompt: "upbeat pop song",
    Lyrics: "[Verse]\nHello world...",
})
result, err := task.Wait(ctx)

FileService

// Upload
resp, err := client.File.Upload(ctx, filePath, minimax.FilePurposeVoiceClone)

// List
files, err := client.File.List(ctx, &minimax.FileListRequest{
    Purpose: minimax.FilePurposeVoiceClone,
})

// Download
data, err := client.File.Download(ctx, fileID)

// Delete
err := client.File.Delete(ctx, fileID)

Task Polling

task, err := client.Video.CreateTextToVideo(ctx, req)
if err != nil {
    return err
}

// Default 5s interval
result, err := task.Wait(ctx)

// Custom interval
result, err := task.WaitWithInterval(ctx, 10*time.Second)

// Manual polling
status, err := task.Query(ctx)
if status.Status == minimax.TaskStatusSuccess {
    // ...
}

Error Handling

resp, err := client.Text.CreateChatCompletion(ctx, req)
if err != nil {
    if e, ok := minimax.AsError(err); ok {
        fmt.Printf("API Error: %d - %s\n", e.StatusCode, e.StatusMsg)
        if e.IsRateLimit() {
            // Wait and retry
        }
    }
    return err
}

Streaming Internals

Uses SSE (Server-Sent Events):

  • iter.Seq2[T, error] for Go 1.23+ range loops
  • Auto-reconnect on transient errors (based on retry config)
  • Hex audio decoding for speech streams

MiniMax SDK - Rust Implementation

Crate: giztoy-minimax

📚 Rust Documentation

Client

pub struct Client {
    http: Arc<HttpClient>,
    config: ClientConfig,
}

Constructor:

use giztoy_minimax::{Client, BASE_URL_GLOBAL};

// Basic
let client = Client::new("api-key")?;

// With builder
let client = Client::builder("api-key")
    .base_url(BASE_URL_GLOBAL)
    .max_retries(5)
    .build()?;

Builder Methods:

| Method | Description |
|--------|-------------|
| base_url(url) | Custom API base URL |
| max_retries(n) | Max retry count (default: 3) |

Services

Services are accessed via getter methods; each call returns a new instance:

client.text()    // TextService
client.speech()  // SpeechService
client.voice()   // VoiceService
client.video()   // VideoService
client.image()   // ImageService
client.music()   // MusicService
client.file()    // FileService

TextService

use giztoy_minimax::{ChatCompletionRequest, Message};

// Synchronous
let resp = client.text().create_chat_completion(&ChatCompletionRequest {
    model: "MiniMax-M2.1".to_string(),
    messages: vec![
        Message { role: "user".to_string(), content: "Hello!".to_string() },
    ],
    ..Default::default()
}).await?;

// Streaming (the binding must be `mut` because `next()` takes `&mut self`)
let mut stream = client.text().create_chat_completion_stream(&req).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(choice) = chunk.choices.first() {
        print!("{}", choice.delta.content);
    }
}

SpeechService

use giztoy_minimax::{SpeechRequest, VoiceSetting};

// Synchronous
let resp = client.speech().synthesize(&SpeechRequest {
    model: "speech-2.6-hd".to_string(),
    text: "Hello, world!".to_string(),
    voice_setting: Some(VoiceSetting {
        voice_id: "male-qn-qingse".to_string(),
        ..Default::default()
    }),
    ..Default::default()
}).await?;
// resp.audio contains decoded bytes

// Streaming (the binding must be `mut` because `next()` takes `&mut self`)
let mut stream = client.speech().synthesize_stream(&req).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(audio) = chunk.audio {
        buf.extend(&audio);
    }
}

// Async (long text)
let task = client.speech().create_async_task(&AsyncSpeechRequest {
    // ...
}).await?;
let result = task.wait().await?;

VoiceService

// List voices
let voices = client.voice().list().await?;

// Clone voice
let resp = client.voice().clone(&VoiceCloneRequest {
    file_id: "uploaded-file-id".to_string(),
    voice_id: "my-cloned-voice".to_string(),
}).await?;

// Design voice
let resp = client.voice().design(&VoiceDesignRequest {
    prompt: "A warm female voice...".to_string(),
    preview_text: "Hello, how can I help?".to_string(),
    ..Default::default()
}).await?;

VideoService

// Text to video
let task = client.video().create_text_to_video(&TextToVideoRequest {
    model: "video-01".to_string(),
    prompt: "A cat playing piano".to_string(),
    ..Default::default()
}).await?;
let result = task.wait().await?;

// Image to video
let task = client.video().create_image_to_video(&ImageToVideoRequest {
    model: "video-01".to_string(),
    first_frame_image: "https://...".to_string(),
    ..Default::default()
}).await?;

ImageService

let resp = client.image().generate(&ImageGenerateRequest {
    model: "image-01".to_string(),
    prompt: "A beautiful sunset".to_string(),
    ..Default::default()
}).await?;

MusicService

let task = client.music().generate(&MusicRequest {
    prompt: "upbeat pop song".to_string(),
    lyrics: "[Verse]\nHello world...".to_string(),
    ..Default::default()
}).await?;
let result = task.wait().await?;

FileService

// Upload
let resp = client.file().upload(file_path, FilePurpose::VoiceClone).await?;

// List
let files = client.file().list(Some(FilePurpose::VoiceClone)).await?;

// Download
let data = client.file().download(&file_id).await?;

// Delete
client.file().delete(&file_id).await?;

Task Polling

let task = client.video().create_text_to_video(&req).await?;

// Default interval
let result = task.wait().await?;

// Custom interval
let result = task.wait_with_interval(Duration::from_secs(10)).await?;

// Manual polling
let status = task.query().await?;
if status.status == TaskStatus::Success {
    // ...
}

Error Handling

use giztoy_minimax::{Error, Result};

match client.text().create_chat_completion(&req).await {
    Ok(resp) => { /* ... */ }
    Err(Error::Api { status_code, status_msg }) => {
        eprintln!("API Error: {} - {}", status_code, status_msg);
    }
    Err(Error::Http(e)) => {
        eprintln!("HTTP Error: {}", e);
    }
    Err(e) => {
        eprintln!("Error: {}", e);
    }
}

HasModel Trait

For default model handling:

pub trait HasModel {
    fn model(&self) -> &str;
    fn set_model(&mut self, model: impl Into<String>);
    fn default_model() -> &'static str;
    fn apply_default_model(&mut self);
}

Differences from Go

| Feature | Go | Rust |
|---------|----|------|
| Client construction | NewClient() (panic on empty key) | Client::new() (returns Result) |
| Service access | Direct fields (client.Text) | Getter methods (client.text()) |
| Streaming | iter.Seq2[T, error] | Stream<Item=Result<T>> |
| Options | Functional options | Builder pattern |
| Error type | *Error with helper methods | Error enum |

MiniMax SDK - Known Issues

🟡 Minor Issues

MMX-001: Go NewClient panics on empty API key

File: go/pkg/minimax/client.go:100-102

Description:
NewClient panics instead of returning an error:

func NewClient(apiKey string, opts ...Option) *Client {
    if apiKey == "" {
        panic("minimax: apiKey must be non-empty")
    }

Impact: Unrecoverable error at construction time.

Suggestion: Return (*Client, error) or use builder pattern like Rust.


MMX-002: Rust services created on each call

File: rust/minimax/src/client.rs:91-123

Description:
Service getters create new instances each call:

pub fn speech(&self) -> SpeechService {
    SpeechService::new(self.http.clone())
}

Impact: Arc clone overhead on each service access.

Suggestion: Cache services or use &self references.


MMX-003: Go streaming uses hex encoding

File: go/pkg/minimax/speech.go:51-56

Description:
Audio data comes hex-encoded from API, decoded in SDK:

if apiResp.Data.Audio != "" {
    audio, err := decodeHexAudio(apiResp.Data.Audio)

Impact: CPU overhead for decoding, 2x memory during decode.

Note: This is API design, not SDK issue, but worth documenting.


MMX-004: No request timeout option

Description:
Neither the Go nor the Rust SDK has a request-level timeout option. The Go SDK suggests using context.WithTimeout; the Rust SDK does not document timeout handling.

Suggestion: Add timeout option or document clearly.


MMX-005: Go iter.Seq2 requires Go 1.23+

File: go/pkg/minimax/speech.go:78

Description:
Streaming uses iter.Seq2 which requires Go 1.23:

func (s *SpeechService) SynthesizeStream(ctx context.Context, req *SpeechRequest) iter.Seq2[*SpeechChunk, error]

Impact: Not compatible with older Go versions.

Note: Modern API choice, acceptable trade-off.


MMX-006: Error handling inconsistency

Description:
Go uses the AsError() helper function, while Rust uses error enum matching.

Go:

if e, ok := minimax.AsError(err); ok {
    if e.IsRateLimit() { ... }
}

Rust:

match err {
    Error::Api { status_code, .. } => { ... }
}

Impact: Different patterns between languages.


🔵 Enhancements

MMX-007: No WebSocket TTS support

Description:
Official API supports WebSocket for TTS (/v1/t2a_ws), but SDK only implements HTTP.

Suggestion: Add WebSocket-based streaming TTS for lower latency.


MMX-008: No request validation

Description:
No client-side validation before sending requests. Invalid parameters only fail after API call.

Suggestion: Add validation for known constraints (text length, model names, etc.).


MMX-009: No retry backoff configuration

Description:
Retry count is configurable, but backoff strategy is hardcoded.

Suggestion: Add configurable backoff (exponential, jitter).
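A configurable strategy of that kind could be built around a small pure function. The sketch below is an illustration, not SDK code: `backoffDelay` is a hypothetical name, and the jitter factor is passed in by the caller so the calculation stays deterministic and testable.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoffDelay doubles base once per attempt, caps the result at maxDelay,
// and scales it by jitter (a factor the caller draws, e.g. from [0.5, 1.0)).
func backoffDelay(attempt int, base, maxDelay time.Duration, jitter float64) time.Duration {
	d := base
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return time.Duration(float64(d) * jitter)
}

func main() {
	for attempt := 0; attempt < 5; attempt++ {
		jitter := 0.5 + rand.Float64()/2 // spreads retries from many clients
		fmt.Println(attempt, backoffDelay(attempt, 200*time.Millisecond, 5*time.Second, jitter))
	}
}
```

Stopping the doubling as soon as the cap is reached keeps the function safe from overflow for large attempt counts.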


MMX-010: No request/response logging

Description:
No built-in debug logging for API requests/responses.

Suggestion: Add optional logging middleware or debug mode.


MMX-011: No rate limit handling

Description:
Rate limit errors are returned but not automatically handled (e.g., exponential backoff, queue).

Suggestion: Add optional rate limit handling.


⚪ Notes

MMX-012: Full API coverage achieved

Description:
Both Go and Rust SDKs implement all documented MiniMax API endpoints:

  • Text generation
  • Speech synthesis (sync, stream, async)
  • Voice management (list, clone, design)
  • Video generation (text-to-video, image-to-video, agent)
  • Image generation
  • Music generation
  • File management

MMX-013: Async task pattern

Description:
Long-running operations (video, async speech, music) use a consistent pattern:

  1. Create task → returns Task[T]
  2. Call task.Wait() for automatic polling
  3. Or manual task.Query() for custom logic

This is a well-designed abstraction.


MMX-014: Base URL handling

Description:
Both SDKs support China and Global endpoints:

  • China: https://api.minimaxi.com
  • Global: https://api.minimaxi.chat

Both correctly default to the China URL.


Summary

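| ID | Severity | Status | Component |
|----|----------|--------|-----------|
| MMX-001 | 🟡 Minor | Open | Go Client |
| MMX-002 | 🟡 Minor | Open | Rust Client |
| MMX-003 | 🟡 Minor | Note | Both |
| MMX-004 | 🟡 Minor | Open | Both |
| MMX-005 | 🟡 Minor | Note | Go |
| MMX-006 | 🟡 Minor | Open | Both |
| MMX-007 | 🔵 Enhancement | Open | Both |
| MMX-008 | 🔵 Enhancement | Open | Both |
| MMX-009 | 🔵 Enhancement | Open | Both |
| MMX-010 | 🔵 Enhancement | Open | Both |
| MMX-011 | 🔵 Enhancement | Open | Both |
| MMX-012 | ⚪ Note | N/A | Both |
| MMX-013 | ⚪ Note | N/A | Both |
| MMX-014 | ⚪ Note | N/A | Both |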

Overall: Well-implemented SDK with full API coverage. Both Go and Rust implementations are feature-complete and production-ready. Minor issues are mostly design choices rather than bugs.

DashScope SDK

Go and Rust SDKs for Aliyun DashScope (百炼 Model Studio) APIs.

Official API Documentation: api/README.md

Design Goals

  1. Realtime Focus: Primarily implement Qwen-Omni-Realtime WebSocket API
  2. OpenAI Compatibility: Text/chat APIs use OpenAI-compatible SDK
  3. Native WebSocket: Direct WebSocket implementation, not polling

Scope

This SDK focuses on Qwen-Omni-Realtime API for real-time multimodal conversation. For standard text APIs, use OpenAI-compatible SDKs.

| API | SDK Coverage | Alternative |
| --- | --- | --- |
| Text Chat | ❌ | OpenAI SDK with custom base URL |
| App/Agent | ❌ | Direct HTTP calls |
| Realtime | ✅ | This SDK |

API Coverage

FeatureGoRustOfficial Doc
Realtime Sessionapi/realtime/
Audio Input/Output
Function Calls
Text Input
Video Input⚠️⚠️Limited support

Architecture

graph TB
    subgraph client["Client"]
        subgraph realtime["RealtimeService"]
            session["RealtimeSession"]
            send["Send Events"]
            recv["Receive Events"]
        end
    end
    
    client --> ws["wss://dashscope.aliyuncs.com<br/>/api-ws/v1/realtime"]

Authentication

Authorization: Bearer <api_key>

Optional workspace isolation:

X-DashScope-WorkSpace: <workspace_id>
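As a minimal sketch, the two headers above can be assembled for the WebSocket handshake like this. handshakeHeaders is a hypothetical helper name, not an SDK function.

```go
package main

import (
	"fmt"
	"net/http"
)

// handshakeHeaders builds the HTTP headers for the WebSocket upgrade request.
// workspaceID may be empty, in which case the workspace header is omitted.
func handshakeHeaders(apiKey, workspaceID string) http.Header {
	h := http.Header{}
	h.Set("Authorization", "Bearer "+apiKey)
	if workspaceID != "" {
		// Header names are case-insensitive; Set stores a canonicalized key.
		h.Set("X-DashScope-WorkSpace", workspaceID)
	}
	return h
}

func main() {
	fmt.Println(handshakeHeaders("sk-xxxxxxxx", "ws-xxxxxxxx"))
}
```

With gorilla/websocket, the resulting http.Header is what gets passed as the second argument to Dialer.Dial.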

Base URLs

| Region | WebSocket URL |
| --- | --- |
| China (Beijing) | wss://dashscope.aliyuncs.com/api-ws/v1/realtime |
| International (Singapore) | wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime |

Models

| Model | Input | Output | Sample Rate |
| --- | --- | --- | --- |
| qwen-omni-turbo-realtime | audio/text | audio/text | 16kHz |
| qwen3-omni-flash-realtime | audio/text/video | audio/text | 24kHz |

Event Flow

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: session.update
    C->>S: input_audio_buffer.append
    C->>S: response.create
    S-->>C: response.audio.delta
    S-->>C: response.audio.delta
    S-->>C: response.done

Examples Directory

  • examples/go/dashscope/ - Go SDK examples
  • examples/cmd/dashscope/ - CLI test scripts

For Text/Chat APIs

Use OpenAI-compatible SDK:

Go:

import "github.com/sashabaranov/go-openai"

config := openai.DefaultConfig(apiKey)
config.BaseURL = "https://dashscope.aliyuncs.com/compatible-mode/v1"
client := openai.NewClientWithConfig(config)

Rust:

#![allow(unused)]
fn main() {
// Use async-openai with custom base URL
}

  • CLI tool: go/cmd/dashscope/
  • CLI tests: examples/cmd/dashscope/

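DashScope (Alibaba Cloud Bailian) API Documentation

Original Documentation

| Document | Link |
| --- | --- |
| Bailian platform home | https://help.aliyun.com/zh/model-studio/ |
| API reference | https://help.aliyun.com/zh/model-studio/qwen-api-reference |
| Activate the service | https://help.aliyun.com/zh/dashscope/opening-service |
| Get an API Key | https://help.aliyun.com/zh/model-studio/get-api-key |
| Model list | https://help.aliyun.com/zh/model-studio/model-list |

If the information in this document is incomplete, visit the links above for the latest content.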


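Overview

DashScope is the API service provided by Bailian (Model Studio), Alibaba Cloud's large-model service platform. It supports:

  • Text generation - Tongyi Qianwen (Qwen) series large language models, compatible with the OpenAI API
  • Multimodal - image understanding, audio understanding
  • Realtime conversation - Qwen-Omni-Realtime realtime audio/video conversation
  • Agent applications - invoke configured Agent/workflow applications
  • Knowledge base - document upload, indexing, retrieval-augmented generation (RAG)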

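Directory Structure

docs/dashscope/
├── README.md           # This document - overview
├── auth.md             # Authentication and authorization
├── text.md             # Text model API (Qwen)
├── app.md              # Application invocation API
└── realtime/           # Realtime multimodal API
    ├── README.md       # Overview
    ├── client-events.md # Client events
    └── server-events.md # Server events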

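Service Endpoints

HTTP API

| Region | Endpoint | Purpose |
| --- | --- | --- |
| Beijing (Mainland China) | https://dashscope.aliyuncs.com/compatible-mode/v1 | OpenAI-compatible |
| Singapore (International) | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | OpenAI-compatible |
| Virginia (US) | https://dashscope-us.aliyuncs.com/compatible-mode/v1 | OpenAI-compatible |

WebSocket API

| Region | Endpoint | Purpose |
| --- | --- | --- |
| Beijing | wss://dashscope.aliyuncs.com/api-ws/v1/realtime | Realtime conversation |
| Singapore | wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime | Realtime conversation |

Application API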

POST https://dashscope.aliyuncs.com/api/v1/apps/{APP_ID}/completion

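Supported Models

Text Models (Qwen)

| Model | Context | Characteristics |
| --- | --- | --- |
| qwen-turbo | 128K | Fast responses, cost-effective |
| qwen-plus | 128K | Balances performance and cost |
| qwen-max | 32K | Most capable |
| qwen-long | 1M | Very long context |

Multimodal Models

| Model | Capability |
| --- | --- |
| qwen-vl-plus | Visual understanding |
| qwen-vl-max | Visual understanding (enhanced) |
| qwen-audio-turbo | Audio understanding |

Realtime Multimodal Models

| Model | Output Format | Default Voice |
| --- | --- | --- |
| Qwen3-Omni-Flash-Realtime | pcm24 | Cherry |
| Qwen-Omni-Turbo-Realtime | pcm16 | Chelsie |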

Quick Start

1. Get an API Key

  1. Log in to the Bailian console
  2. Go to "Key Management"
  3. Create an API Key

2. Set the Environment Variable

export DASHSCOPE_API_KEY="sk-xxxxxxxxxxxxxxxx"

3. Example Call

curl https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

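Detailed Documentation

| Document | Description |
| --- | --- |
| Authentication and authorization | API Key management, access control, workspaces |
| Text model API | Qwen series models, OpenAI-compatible interface |
| Application invocation API | Agent applications, workflows, knowledge base retrieval |
| Realtime multimodal | Qwen-Omni-Realtime realtime voice conversation |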

SDK

Python (OpenAI SDK)

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
)

response = client.chat.completions.create(
    model="qwen-turbo",
    messages=[{"role": "user", "content": "Hello!"}]
)

Go (go-openai)

import (
    "os"

    "github.com/sashabaranov/go-openai"
)

config := openai.DefaultConfig(os.Getenv("DASHSCOPE_API_KEY"))
config.BaseURL = "https://dashscope.aliyuncs.com/compatible-mode/v1"

client := openai.NewClientWithConfig(config)

Go (giztoy/dashscope) - Realtime API

This project provides a native Go SDK for the Qwen-Omni-Realtime API:

import "github.com/haivivi/giztoy/pkg/dashscope"

client := dashscope.NewClient(os.Getenv("DASHSCOPE_API_KEY"))
session, err := client.Realtime.Connect(ctx, &dashscope.RealtimeConfig{
    Model: dashscope.ModelQwenOmniTurboRealtimeLatest,
})
// Send audio, receive events...

CLI tool: bazel run //go/cmd/dashscope -- omni chat

Official SDKs

  • Python: pip install dashscope
  • Java: Maven dependency com.alibaba:dashscope-sdk

DashScope SDK - Go Implementation

Import: github.com/haivivi/giztoy/pkg/dashscope

📚 Go Documentation

Client

type Client struct {
    Realtime *RealtimeService
}

Constructor:

// Basic
client := dashscope.NewClient("sk-xxxxxxxx")

// With workspace
client := dashscope.NewClient("sk-xxxxxxxx",
    dashscope.WithWorkspace("ws-xxxxxxxx"),
)

// Custom endpoint (international)
client := dashscope.NewClient("sk-xxxxxxxx",
    dashscope.WithBaseURL("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime"),
)

Options:

| Option | Description |
| --- | --- |
| WithWorkspace(id) | Workspace ID for isolation |
| WithBaseURL(url) | Custom WebSocket URL |
| WithHTTPBaseURL(url) | Custom HTTP URL |
| WithHTTPClient(client) | Custom HTTP client |

RealtimeService

Connect Session

session, err := client.Realtime.Connect(ctx, &dashscope.RealtimeConfig{
    Model: dashscope.ModelQwenOmniTurboRealtimeLatest,
})
if err != nil {
    log.Fatal(err)
}
defer session.Close()

Send Events

// Update session configuration
session.UpdateSession(&dashscope.SessionUpdate{
    Modalities: []string{"text", "audio"},
    Voice: "Cherry",
    InputAudioFormat: "pcm16",
    OutputAudioFormat: "pcm16",
})

// Append audio data
session.AppendAudio(audioData)

// Commit audio (finalize input)
session.CommitAudio()

// Create response (start inference)
session.CreateResponse()

// Send text
session.AppendText("Hello!")

// Cancel response
session.CancelResponse()

Receive Events

// Using Go 1.23+ iter.Seq2
for event, err := range session.Events() {
    if err != nil {
        log.Fatal(err)
    }
    
    switch event.Type {
    case dashscope.EventResponseAudioDelta:
        // Audio chunk received
        play(event.Delta)
        
    case dashscope.EventResponseTextDelta:
        // Text chunk received
        fmt.Print(event.Delta)
        
    case dashscope.EventResponseDone:
        // Response complete
        
    case dashscope.EventError:
        // Error occurred
        log.Printf("Error: %s", event.Error.Message)
    }
}

Events

Client Events (Send)

| Event Type | Description |
| --- | --- |
| session.update | Update session configuration |
| input_audio_buffer.append | Append audio data |
| input_audio_buffer.commit | Finalize audio input |
| response.create | Request response |
| response.cancel | Cancel current response |

Server Events (Receive)

| Event Type | Description |
| --- | --- |
| session.created | Session established |
| session.updated | Configuration updated |
| response.created | Response started |
| response.audio.delta | Audio chunk |
| response.text.delta | Text chunk |
| response.done | Response complete |
| error | Error occurred |

Models

const (
    ModelQwenOmniTurboRealtimeLatest  = "qwen-omni-turbo-realtime-latest"
    ModelQwen3OmniFlashRealtimeLatest = "qwen3-omni-flash-realtime-latest"
)

Error Handling

for event, err := range session.Events() {
    if err != nil {
        // Connection error
        log.Fatal(err)
    }
    
    if event.Type == dashscope.EventError {
        // API error
        log.Printf("API Error [%s]: %s", event.Error.Code, event.Error.Message)
    }
}

Complete Example

package main

import (
    "context"
    "log"
    "os"

    "github.com/haivivi/giztoy/pkg/dashscope"
)

func main() {
    client := dashscope.NewClient(os.Getenv("DASHSCOPE_API_KEY"))
    
    session, err := client.Realtime.Connect(context.Background(), &dashscope.RealtimeConfig{
        Model: dashscope.ModelQwenOmniTurboRealtimeLatest,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()
    
    // Configure session
    session.UpdateSession(&dashscope.SessionUpdate{
        Voice: "Cherry",
    })
    
    // Send audio. audioData is a placeholder for PCM bytes captured
    // from a microphone or read from a file.
    session.AppendAudio(audioData)
    session.CommitAudio()
    session.CreateResponse()
    
    // Receive and play response. player is a placeholder for an audio sink.
    for event, err := range session.Events() {
        if err != nil {
            // Connection error: report it instead of exiting silently.
            log.Printf("session ended: %v", err)
            break
        }
        if event.Type == dashscope.EventResponseAudioDelta {
            player.Write(event.Delta)
        }
    }
}

DashScope SDK - Rust Implementation

Crate: giztoy-dashscope

📚 Rust Documentation

Client

#![allow(unused)]
fn main() {
pub struct Client {
    // Internal configuration
}

impl Client {
    pub fn realtime(&self) -> RealtimeService;
}
}

Constructor:

#![allow(unused)]
fn main() {
use giztoy_dashscope::{Client, DEFAULT_REALTIME_URL};

// Basic
let client = Client::new("sk-xxxxxxxx")?;

// With builder
let client = Client::builder("sk-xxxxxxxx")
    .workspace("ws-xxxxxxxx")
    .base_url("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime")
    .build()?;
}

RealtimeService

Connect Session

#![allow(unused)]
fn main() {
use giztoy_dashscope::{RealtimeConfig, MODEL_QWEN_OMNI_TURBO_REALTIME_LATEST};

let session = client.realtime().connect(&RealtimeConfig {
    model: MODEL_QWEN_OMNI_TURBO_REALTIME_LATEST.to_string(),
    ..Default::default()
}).await?;
}

Send Events

#![allow(unused)]
fn main() {
// Update session
session.update_session(&SessionUpdate {
    modalities: vec!["text".to_string(), "audio".to_string()],
    voice: Some("Cherry".to_string()),
    ..Default::default()
}).await?;

// Append audio
session.append_audio(&audio_data).await?;

// Commit audio
session.commit_audio().await?;

// Create response
session.create_response().await?;
}

Receive Events

#![allow(unused)]
fn main() {
use giztoy_dashscope::ServerEvent;

while let Some(event) = session.recv().await {
    let event = event?;
    
    match event {
        ServerEvent::ResponseAudioDelta { delta, .. } => {
            // Play audio
            player.write(&delta)?;
        }
        ServerEvent::ResponseTextDelta { delta, .. } => {
            // Print text
            print!("{}", delta);
        }
        ServerEvent::ResponseDone { .. } => {
            // Complete
            break;
        }
        ServerEvent::Error { error } => {
            eprintln!("Error: {}", error.message);
        }
        _ => {}
    }
}
}

Events

Client Events (Send)

#![allow(unused)]
fn main() {
pub enum ClientEvent {
    SessionUpdate(SessionUpdate),
    InputAudioBufferAppend { audio: Vec<u8> },
    InputAudioBufferCommit,
    ResponseCreate(ResponseCreateOptions),
    ResponseCancel,
}
}

Server Events (Receive)

#![allow(unused)]
fn main() {
pub enum ServerEvent {
    SessionCreated { session: SessionInfo },
    SessionUpdated { session: SessionInfo },
    ResponseCreated { response: ResponseInfo },
    ResponseAudioDelta { delta: Vec<u8> },
    ResponseTextDelta { delta: String },
    ResponseDone { response: ResponseInfo },
    Error { error: ErrorInfo },
    // ... more events
}
}

Models

#![allow(unused)]
fn main() {
pub const MODEL_QWEN_OMNI_TURBO_REALTIME_LATEST: &str = "qwen-omni-turbo-realtime-latest";
pub const MODEL_QWEN3_OMNI_FLASH_REALTIME_LATEST: &str = "qwen3-omni-flash-realtime-latest";
}

Error Handling

#![allow(unused)]
fn main() {
use giztoy_dashscope::{Error, Result};

match session.recv().await {
    Some(Ok(event)) => {
        // Process event
    }
    Some(Err(Error::WebSocket(e))) => {
        eprintln!("WebSocket error: {}", e);
    }
    Some(Err(Error::Api { code, message })) => {
        eprintln!("API error [{}]: {}", code, message);
    }
    None => {
        // Connection closed
    }
}
}

Complete Example

use giztoy_dashscope::{Client, RealtimeConfig, ServerEvent, SessionUpdate};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("DASHSCOPE_API_KEY")?;
    let client = Client::new(&api_key)?;
    
    let session = client.realtime().connect(&RealtimeConfig {
        model: "qwen-omni-turbo-realtime-latest".to_string(),
        ..Default::default()
    }).await?;
    
    // Configure
    session.update_session(&SessionUpdate {
        voice: Some("Cherry".to_string()),
        ..Default::default()
    }).await?;
    
    // Send audio
    session.append_audio(&audio_data).await?;
    session.commit_audio().await?;
    session.create_response().await?;
    
    // Receive response
    while let Some(event) = session.recv().await {
        match event? {
            ServerEvent::ResponseAudioDelta { delta, .. } => {
                player.write(&delta)?;
            }
            ServerEvent::ResponseDone { .. } => break,
            _ => {}
        }
    }
    
    Ok(())
}

Differences from Go

| Feature | Go | Rust |
| --- | --- | --- |
| Event receiving | iter.Seq2 (sync-like) | async Stream |
| Session lifetime | Manual defer Close() | Drop trait |
| Audio encoding | []byte | Vec<u8> |
| WebSocket | gorilla/websocket | tokio-tungstenite |

DashScope SDK - Known Issues

🟡 Minor Issues

DS-001: Go NewClient panics on empty API key

File: go/pkg/dashscope/client.go:39-41

Description:
NewClient panics instead of returning an error:

func NewClient(apiKey string, opts ...Option) *Client {
    if apiKey == "" {
        panic("dashscope: API key is required")
    }

Impact: Unrecoverable error at construction time.

Suggestion: Return (*Client, error), as the Rust version returns a Result.


DS-002: Limited to Realtime API only

Description:
The SDK implements only the Realtime API. Text/Chat APIs require a separate OpenAI-compatible SDK.

Impact: Users need two SDKs for full DashScope usage.

Note: This is an intentional design choice, since the text APIs are OpenAI-compatible.


DS-003: No HTTP API implementation

Description:
No HTTP client for non-realtime operations (file upload, app calls, etc.).

Suggestion: Add HTTP service for app/agent API calls.


DS-004: Video input support limited

Description:
Qwen3-Omni-Flash supports video input, but SDK support may be incomplete.

Status: ⚠️ Needs verification.


🔵 Enhancements

DS-005: No automatic reconnection

Description:
WebSocket sessions don't auto-reconnect on disconnection.

Suggestion: Add reconnection with backoff for long-running sessions.


DS-006: No audio transcoding

Description:
Audio must be in correct format (PCM16/PCM24). No built-in transcoding.

Suggestion: Add optional audio format conversion.


DS-007: No VAD (Voice Activity Detection) integration

Description:
Manual audio buffer management. No built-in VAD for automatic speech detection.

Suggestion: Integrate with audio/pcm for silence detection.
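For illustration, a very simple energy-based detector over 16-bit little-endian mono PCM is sketched below. It is a standalone stand-in and does not use the project's audio/pcm package; the threshold and frame counts are arbitrary example values, and production VAD usually needs something more robust than an RMS threshold.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// rmsPCM16 returns the root-mean-square amplitude of little-endian 16-bit mono PCM.
func rmsPCM16(frame []byte) float64 {
	n := len(frame) / 2
	if n == 0 {
		return 0
	}
	var sum float64
	for i := 0; i < n; i++ {
		s := float64(int16(binary.LittleEndian.Uint16(frame[2*i:])))
		sum += s * s
	}
	return math.Sqrt(sum / float64(n))
}

// silenceDetector reports end-of-speech after enough consecutive quiet frames.
type silenceDetector struct {
	Threshold      float64 // RMS below this counts as silence
	HangoverFrames int     // quiet frames required after speech
	spoke          bool
	quiet          int
}

// Push feeds one frame and returns true when the utterance has ended.
func (d *silenceDetector) Push(frame []byte) bool {
	if rmsPCM16(frame) >= d.Threshold {
		d.spoke, d.quiet = true, 0
		return false
	}
	if !d.spoke {
		return false // leading silence: nothing to commit yet
	}
	d.quiet++
	if d.quiet >= d.HangoverFrames {
		d.spoke, d.quiet = false, 0
		return true
	}
	return false
}

// constantFrame builds a test frame where every sample has the same amplitude.
func constantFrame(amp int16, samples int) []byte {
	b := make([]byte, 2*samples)
	for i := 0; i < samples; i++ {
		binary.LittleEndian.PutUint16(b[2*i:], uint16(amp))
	}
	return b
}

func main() {
	d := &silenceDetector{Threshold: 500, HangoverFrames: 2}
	frames := [][]byte{
		constantFrame(1000, 160), // speech
		constantFrame(10, 160),   // quiet
		constantFrame(10, 160),   // quiet: utterance ends here
	}
	for i, f := range frames {
		fmt.Println(i, d.Push(f))
	}
}
```

When Push returns true, a caller would commit the audio buffer and request a response instead of managing the buffer by hand.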


DS-008: Missing tool call examples

Description:
Function/tool calling is supported but not well documented with examples.


⚪ Notes

DS-009: Clean WebSocket event model

Description:
Both Go and Rust implement a clean event-based model that matches OpenAI Realtime API patterns. This is well-designed.


DS-010: Model constants provided

Description:
Both SDKs provide model name constants:

const ModelQwenOmniTurboRealtimeLatest = "qwen-omni-turbo-realtime-latest"

Good for discoverability and avoiding typos.


DS-011: Workspace support

Description:
Both SDKs support workspace isolation, via the WithWorkspace() option in Go and the workspace() builder method in Rust. In Go:

client := dashscope.NewClient(apiKey, dashscope.WithWorkspace("ws-xxx"))

Useful for enterprise environments.


DS-012: International endpoint support

Description:
SDKs support both China and international endpoints:

  • China: wss://dashscope.aliyuncs.com/...
  • International: wss://dashscope-intl.aliyuncs.com/...

Summary

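| ID | Severity | Status | Component |
| --- | --- | --- | --- |
| DS-001 | 🟡 Minor | Open | Go Client |
| DS-002 | 🟡 Minor | Note | Both |
| DS-003 | 🟡 Minor | Open | Both |
| DS-004 | 🟡 Minor | Open | Both |
| DS-005 | 🔵 Enhancement | Open | Both |
| DS-006 | 🔵 Enhancement | Open | Both |
| DS-007 | 🔵 Enhancement | Open | Both |
| DS-008 | 🔵 Enhancement | Open | Both |
| DS-009 | ⚪ Note | N/A | Both |
| DS-010 | ⚪ Note | N/A | Both |
| DS-011 | ⚪ Note | N/A | Both |
| DS-012 | ⚪ Note | N/A | Both |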

Overall: Focused SDK for Realtime API with clean design. Main limitation is narrow scope (Realtime only), which is intentional since text APIs are OpenAI-compatible. Both Go and Rust implementations are feature-complete for their scope.

Doubao Speech SDK

Go and Rust SDKs for the Volcengine Doubao Speech API (豆包语音).

Official API Documentation: api/README.md

Design Goals

  1. Dual API Version Support: V1 (Classic) and V2/V3 (BigModel) APIs
  2. Multiple Auth Methods: Bearer Token, API Key, V2 API Key
  3. Comprehensive Coverage: TTS, ASR, Voice Clone, Realtime, Meeting, Podcast, etc.
  4. Streaming-first: WebSocket-based streaming for real-time scenarios

API Versions

Doubao Speech has two API generations:

| Version | Name | Features | Recommended |
| --- | --- | --- | --- |
| V1 | Classic | Basic TTS/ASR | Legacy use |
| V2/V3 | BigModel | Advanced TTS/ASR, Realtime | ✅ New projects |

API Coverage

FeatureV1 (Classic)V2 (BigModel)GoRust
TTS Sync
TTS Stream
TTS Async (Long Text)⚠️
ASR One-sentence
ASR Stream
ASR File⚠️
Voice CloneN/A
Realtime DialogueN/A
Meeting TranscriptionN/A
Podcast SynthesisN/A
Translation (SIMT)N/A
Media SubtitleN/A
Console APIN/A

Architecture

graph TB
    subgraph client["Client"]
        subgraph v1["V1 Services (Classic)"]
            tts1[TTS]
            asr1[ASR]
        end
        subgraph v2["V2 Services (BigModel)"]
            tts2[TTSV2]
            asr2[ASRV2]
            advanced["VoiceClone<br/>Realtime<br/>Meeting<br/>Podcast<br/>Translation<br/>Media"]
        end
    end
    
    subgraph console["Console Client"]
        aksig["AK/SK Signature<br/>Authentication"]
    end
    
    client --> api["Volcengine API"]
    console --> api

Authentication Methods

Speech API Client

| Method | Header | Use Case |
| --- | --- | --- |
| API Key | x-api-key: {key} | Simplest, recommended |
| Bearer Token | Authorization: Bearer;{token} | V1 APIs |
| V2 API Key | X-Api-Access-Key, X-Api-App-Key | V2/V3 APIs |

Console Client

Uses Volcengine OpenAPI AK/SK signature (HMAC-SHA256).

Resource IDs (V2/V3)

| Service | Resource ID |
| --- | --- |
| TTS 2.0 | seed-tts-2.0 |
| TTS 2.0 Concurrent | seed-tts-2.0-concurr |
| ASR Stream | volc.bigasr.sauc.duration |
| ASR File | volc.bigasr.auc.duration |
| Realtime | volc.speech.dialog |
| Podcast | volc.service_type.10050 |
| Translation | volc.megatts.simt |
| Voice Clone | seed-icl-2.0 |

Clusters (V1)

| Cluster | Service |
| --- | --- |
| volcano_tts | TTS Standard |
| volcano_mega | TTS BigModel |
| volcano_icl | Voice Clone |
| volcengine_streaming_common | ASR Streaming |

Examples Directory

  • examples/go/doubaospeech/ - Go SDK examples
  • examples/cmd/doubaospeech/ - CLI test scripts
  • CLI tool: go/cmd/doubaospeech/
  • CLI tests: examples/cmd/doubaospeech/

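Doubao Speech (豆包语音) API Documentation

Original Documentation

  • Documentation home: https://www.volcengine.com/docs/6561/162929
  • Console: https://console.volcengine.com/speech/app

If the information in this document is incomplete, visit the links above for the latest content.

Product Lineup

Doubao Speech comes in two product generations: the BigModel edition (2.0) and the Classic edition (1.0). The BigModel edition is recommended.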


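Speech Synthesis (TTS)

BigModel Speech Synthesis 2.0

| API | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| Unidirectional streaming HTTP V3 | POST /api/v3/tts/unidirectional | seed-tts-2.0 | stream-http.md |
| Unidirectional streaming WebSocket V3 | WSS /api/v3/tts/unidirectional | seed-tts-2.0 | stream-ws.md |
| Bidirectional streaming WebSocket V3 | WSS /api/v3/tts/bidirection | seed-tts-2.0 | duplex-ws.md |
| Async long text | POST /api/v3/tts/async/submit | seed-tts-2.0-concurr | async.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1598757 (Unidirectional streaming HTTP V3)
  • https://www.volcengine.com/docs/6561/1719100 (Unidirectional streaming WebSocket V3)
  • https://www.volcengine.com/docs/6561/1329505 (Bidirectional streaming WebSocket V3)
  • https://www.volcengine.com/docs/6561/1330194 (Async long text)

Classic Speech Synthesis 1.0

| API | Endpoint | Cluster | Doc |
| --- | --- | --- | --- |
| HTTP one-shot synthesis | POST /api/v1/tts | volcano_tts | http.md |
| WebSocket streaming | WSS /api/v1/tts/ws_binary | volcano_tts | websocket.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/79820 (HTTP API)
  • https://www.volcengine.com/docs/6561/79821 (WebSocket API)
  • https://www.volcengine.com/docs/6561/97465 (Parameter reference)

Premium Long-Text Speech Synthesis

| API | Endpoint | Doc |
| --- | --- | --- |
| Async long text | POST /api/v1/long_tts/submit | long-tts.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1096680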

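Speech Recognition (ASR)

BigModel Speech Recognition 2.0

| API | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| Streaming recognition WebSocket | WSS /api/v3/sauc/bigmodel | volc.bigasr.sauc.duration | streaming.md |
| Recording file recognition (standard) | POST /api/v3/asr/bigmodel/submit | volc.bigasr.auc.duration | file-standard.md |
| Recording file recognition (fast) | POST /api/v3/asr/bigmodel_async/submit | volc.bigasr.auc.duration | file-fast.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1354869 (BigModel streaming speech recognition)
  • https://www.volcengine.com/docs/6561/1354868 (BigModel recording file recognition, standard edition)
  • https://www.volcengine.com/docs/6561/1631584 (BigModel recording file recognition, fast edition)
  • https://www.volcengine.com/docs/6561/1840838 (BigModel recording file recognition, off-peak edition)

Classic Speech Recognition 1.0

| API | Endpoint | Cluster | Doc |
| --- | --- | --- | --- |
| One-sentence recognition | POST /api/v1/asr | volcengine_input_common | one-sentence.md |
| Streaming recognition | WSS /api/v2/asr | volcengine_streaming_common | streaming.md |
| Recording file (standard) | POST /api/v1/asr/submit | volc.megatts.default | file-standard.md |
| Recording file (fast) | POST /api/v1/asr/async/submit | volc.megatts.default | file-fast.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/104897 (One-sentence recognition)
  • https://www.volcengine.com/docs/6561/80816 (Streaming speech recognition)
  • https://www.volcengine.com/docs/6561/80818 (Recording file recognition, standard edition)
  • https://www.volcengine.com/docs/6561/80820 (Recording file recognition, fast edition)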

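Voice Clone

| API | Endpoint | Cluster | Doc |
| --- | --- | --- | --- |
| Submit training | POST /api/v1/mega_tts/audio/upload | volcano_icl | api.md |
| Query status | POST /api/v1/mega_tts/status | volcano_icl | api.md |
| Activate voice | POST /api/v1/mega_tts/audio/activate | volcano_icl | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1305191 (Voice Clone API)
  • https://www.volcengine.com/docs/6561/1829010 (Voice Clone ordering and usage guide)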

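Realtime Speech Large Model

| API | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| Realtime dialogue | WSS /api/v3/realtime/dialogue | volc.speech.dialog | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1257584 (End-to-end realtime speech large model API)

Podcast Synthesis

| API | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| WebSocket V3 | WSS /api/v3/sami/podcasttts | volc.megatts.podcast | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1668014 (Podcast API, WebSocket V3 protocol)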

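Simultaneous Interpretation

| API | Endpoint | Resource ID | Doc |
| --- | --- | --- | --- |
| WebSocket V3 | WSS /api/v3/saas/simt | volc.megatts.simt | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/xxx (Simultaneous Interpretation 2.0 API)

Voice Notes (Meeting Minutes)

| API | Endpoint | Doc |
| --- | --- | --- |
| Async submit | POST /api/v1/meeting/submit | api.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/xxx (Doubao Voice Notes API)

Audio/Video Subtitles

| API | Endpoint | Doc |
| --- | --- | --- |
| Subtitle generation | POST /api/v1/subtitle/submit | subtitle.md |
| Subtitle alignment | POST /api/v1/subtitle/align | align.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/192519 (Audio/video subtitle generation)
  • https://www.volcengine.com/docs/6561/113635 (Automatic subtitle alignment)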

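Console Management API

| API | Endpoint | Auth Method | Doc |
| --- | --- | --- | --- |
| BigModel voice list | POST /ListBigModelTTSTimbres | AK/SK | timbre.md |
| BigModel voice list (new) | POST /ListSpeakers | AK/SK | timbre.md |
| API Key management | POST /ListAPIKeys | AK/SK | apikey.md |
| Service status management | POST /ServiceStatus | AK/SK | service.md |
| Quota monitoring | POST /QuotaMonitoring | AK/SK | monitoring.md |
| Voice clone status | POST /ListMegaTTSTrainStatus | AK/SK | voice-clone-status.md |

Original documentation links:

  • https://www.volcengine.com/docs/6561/1770994 (ListBigModelTTSTimbres)
  • https://www.volcengine.com/docs/6561/2160690 (ListSpeakers)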

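Authentication Methods

Speech API (Speech Services)

The speech services use the following authentication methods:

| Method | Header | Applies To |
| --- | --- | --- |
| Access Token | Authorization: Bearer; {token} | HTTP/WebSocket V1-V2 |
| X-Api authentication | X-Api-App-Id, X-Api-Access-Key | WebSocket V3 |
| Request Body | app.token | Some HTTP endpoints |

Console API (Console Services)

The Console API uses Volcengine OpenAPI AK/SK signature authentication:

Authorization: HMAC-SHA256 Credential={AccessKeyId}/...

See auth.md for details.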


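Quick Selection

| Need | Recommended API | Doc |
| --- | --- | --- |
| Realtime synthesis of short text | TTS 2.0 unidirectional streaming HTTP V3 | stream-http.md |
| Batch synthesis of long text | TTS 2.0 async API | async.md |
| Realtime voice interaction | Realtime dialogue API | realtime/api.md |
| Custom voice | Voice Clone API | voice-clone/api.md |
| Realtime speech recognition | ASR 2.0 streaming | asr2.0/streaming.md |
| Recording file transcription | ASR 2.0 file recognition | asr2.0/file-standard.md |
| Podcast generation | Podcast API | podcast/api.md |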

Doubao Speech SDK - Go Implementation

Import: github.com/haivivi/giztoy/pkg/doubaospeech

📚 Go Documentation

Clients

Speech API Client

type Client struct {
    // V1 Services (Classic)
    TTS *TTSService
    ASR *ASRService
    
    // V2 Services (BigModel)
    TTSV2 *TTSServiceV2
    ASRV2 *ASRServiceV2
    
    // Shared Services
    VoiceClone  *VoiceCloneService
    Realtime    *RealtimeService
    Meeting     *MeetingService
    Podcast     *PodcastService
    Translation *TranslationService
    Media       *MediaService
}

Constructor:

// With API Key (recommended)
client := doubaospeech.NewClient("app-id",
    doubaospeech.WithAPIKey("your-api-key"),
    doubaospeech.WithCluster("volcano_tts"),
)

// With Bearer Token
client := doubaospeech.NewClient("app-id",
    doubaospeech.WithBearerToken("your-token"),
)

// With V2 API Key (for BigModel APIs)
client := doubaospeech.NewClient("app-id",
    doubaospeech.WithV2APIKey("access-key", "app-key"),
    doubaospeech.WithResourceID("seed-tts-2.0"),
)

Console API Client

console := doubaospeech.NewConsole("access-key", "secret-key")

Services

TTS V1 (Classic)

// Synchronous
resp, err := client.TTS.Synthesize(ctx, &doubaospeech.TTSRequest{
    Text:      "你好,世界!",
    VoiceType: "zh_female_cancan",
})
// resp.Audio contains audio bytes

// Streaming (Go 1.23+ iter.Seq2)
for chunk, err := range client.TTS.SynthesizeStream(ctx, req) {
    if err != nil {
        return err
    }
    buf.Write(chunk.Audio)
}

TTS V2 (BigModel)

// Streaming HTTP
for chunk, err := range client.TTSV2.SynthesizeStream(ctx, &doubaospeech.TTSV2Request{
    Text:       "你好,世界!",
    VoiceType:  "zh_female_cancan",
    ResourceID: "seed-tts-2.0",
}) {
    // Process chunk
}

// Async (long text)
task, err := client.TTSV2.SubmitAsync(ctx, &doubaospeech.AsyncTTSRequest{
    Text: longText,
})
result, err := task.Wait(ctx)

ASR (Speech Recognition)

// One-sentence (V1)
resp, err := client.ASR.Recognize(ctx, &doubaospeech.ASRRequest{
    Audio:    audioData,
    Format:   "pcm",
    Language: "zh-CN",
})

// Streaming (WebSocket)
session, err := client.ASR.OpenStreamSession(ctx, &doubaospeech.StreamASRConfig{
    Format:     "pcm",
    SampleRate: 16000,
})
defer session.Close()

// Send audio chunks
session.SendAudio(ctx, audioData, false)
session.SendAudio(ctx, lastData, true)

// Receive results
for chunk, err := range session.Recv() {
    if err != nil {
        break
    }
    fmt.Println(chunk.Text)
}

Voice Clone

// Upload audio for training
result, err := client.VoiceClone.Upload(ctx, &doubaospeech.VoiceCloneRequest{
    AudioData: audioData,
    VoiceID:   "my-custom-voice",
})

// Check status
status, err := client.VoiceClone.GetStatus(ctx, "my-custom-voice")

// Activate voice
// err is already declared above, so assign with = rather than :=
err = client.VoiceClone.Activate(ctx, "my-custom-voice")

Realtime Dialogue

session, err := client.Realtime.Connect(ctx, &doubaospeech.RealtimeConfig{
    Model: "speech-dialog-001",
})
defer session.Close()

// Send audio
session.SendAudio(audioData)

// Receive events
for event := range session.Events() {
    switch event.Type {
    case "asr_result":
        fmt.Println("User:", event.AsrResult.Text)
    case "tts_audio":
        play(event.TtsAudio)
    }
}

Console API

// List available voices
voices, err := console.ListSpeakers(ctx, &doubaospeech.ListSpeakersRequest{})

// List timbres
timbres, err := console.ListTimbres(ctx, &doubaospeech.ListTimbresRequest{})

// Check voice clone status
status, err := console.ListVoiceCloneStatus(ctx, &doubaospeech.ListVoiceCloneStatusRequest{
    VoiceID: "my-custom-voice",
})

Options

| Option | Description |
| --- | --- |
| WithAPIKey(key) | x-api-key authentication |
| WithBearerToken(token) | Bearer token authentication |
| WithV2APIKey(access, app) | V2/V3 API authentication |
| WithCluster(cluster) | Set cluster name (V1) |
| WithResourceID(id) | Set resource ID (V2) |
| WithBaseURL(url) | Custom HTTP base URL |
| WithWebSocketURL(url) | Custom WebSocket URL |
| WithHTTPClient(client) | Custom HTTP client |
| WithTimeout(duration) | Request timeout |
| WithUserID(id) | User identifier |

Error Handling

if err != nil {
    if e, ok := doubaospeech.AsError(err); ok {
        fmt.Printf("Error %d: %s\n", e.Code, e.Message)
        if e.IsRateLimit() {
            // Handle rate limiting
        }
    }
}

Doubao Speech SDK - Rust Implementation

Crate: giztoy-doubaospeech

📚 Rust Documentation

Clients

Speech API Client

#![allow(unused)]
fn main() {
pub struct Client {
    // Internal HTTP/WebSocket clients
}

impl Client {
    pub fn tts(&self) -> TtsService;
    pub fn asr(&self) -> AsrService;
    pub fn voice_clone(&self) -> VoiceCloneService;
    pub fn realtime(&self) -> RealtimeService;
    pub fn meeting(&self) -> MeetingService;
    pub fn podcast(&self) -> PodcastService;
    pub fn translation(&self) -> TranslationService;
    pub fn media(&self) -> MediaService;
}
}

Constructor:

#![allow(unused)]
fn main() {
use giztoy_doubaospeech::Client;

// With API Key (recommended)
let client = Client::builder("app-id")
    .api_key("your-api-key")
    .cluster("volcano_tts")
    .build()?;

// With Bearer Token
let client = Client::builder("app-id")
    .bearer_token("your-token")
    .build()?;

// With V2 API Key
let client = Client::builder("app-id")
    .v2_api_key("access-key", "app-key")
    .resource_id("seed-tts-2.0")
    .build()?;
}

Console Client

#![allow(unused)]
fn main() {
use giztoy_doubaospeech::Console;

let console = Console::new("access-key", "secret-key");
}

Services

TTS Service

#![allow(unused)]
fn main() {
use futures::StreamExt; // assumed: provides .next() if the SDK returns a futures Stream
use giztoy_doubaospeech::{TtsRequest, TtsService};

// Synchronous
let response = client.tts().synthesize(&TtsRequest {
    text: "你好,世界!".to_string(),
    voice_type: "zh_female_cancan".to_string(),
    ..Default::default()
}).await?;
// response.audio contains bytes

// Streaming (the stream must be mutable to be polled with .next())
let mut stream = client.tts().synthesize_stream(&req).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(audio) = chunk.audio {
        buf.extend(&audio);
    }
}
}

ASR Service

#![allow(unused)]
fn main() {
use giztoy_doubaospeech::{OneSentenceRequest, StreamAsrConfig};

// One-sentence
let result = client.asr().recognize(&OneSentenceRequest {
    audio: audio_data,
    format: "pcm".to_string(),
    language: "zh-CN".to_string(),
    ..Default::default()
}).await?;

// Streaming
let session = client.asr().open_stream_session(&StreamAsrConfig {
    format: "pcm".to_string(),
    sample_rate: 16000,
    ..Default::default()
}).await?;

// Send audio
session.send_audio(&audio_data, false).await?;
session.send_audio(&last_data, true).await?;

// Receive results
while let Some(result) = session.recv().await {
    let chunk = result?;
    println!("Text: {}", chunk.text);
}
}

Voice Clone Service

#![allow(unused)]
fn main() {
// Upload for training
let result = client.voice_clone().upload(&VoiceCloneTrainRequest {
    audio_data: audio_bytes,
    voice_id: "my-custom-voice".to_string(),
    ..Default::default()
}).await?;

// Check status
let status = client.voice_clone().get_status("my-custom-voice").await?;
}

Realtime Service

#![allow(unused)]
fn main() {
use giztoy_doubaospeech::{RealtimeConfig, RealtimeEventType};

let session = client.realtime().connect(&RealtimeConfig {
    model: "speech-dialog-001".to_string(),
    ..Default::default()
}).await?;

// Send audio
session.send_audio(&audio_data).await?;

// Receive events
while let Some(event) = session.recv().await {
    let event = event?;
    match event.event_type {
        RealtimeEventType::AsrResult => {
            println!("User: {}", event.asr_result.text);
        }
        RealtimeEventType::TtsAudio => {
            play(&event.tts_audio);
        }
        _ => {}
    }
}
}

Console API

#![allow(unused)]
fn main() {
use giztoy_doubaospeech::{Console, ListSpeakersRequest, ListTimbresRequest};

let console = Console::new("access-key", "secret-key");

// List speakers
let speakers = console.list_speakers(&ListSpeakersRequest::default()).await?;

// List timbres
let timbres = console.list_timbres(&ListTimbresRequest::default()).await?;
}

Builder Options

| Method | Description |
| --- | --- |
| api_key(key) | x-api-key authentication |
| bearer_token(token) | Bearer token authentication |
| v2_api_key(access, app) | V2/V3 API authentication |
| cluster(cluster) | Set cluster name (V1) |
| resource_id(id) | Set resource ID (V2) |
| base_url(url) | Custom HTTP base URL |
| ws_url(url) | Custom WebSocket URL |
| timeout(duration) | Request timeout |
| user_id(id) | User identifier |

Error Handling

#![allow(unused)]
fn main() {
use giztoy_doubaospeech::{Error, Result};

match client.tts().synthesize(&req).await {
    Ok(resp) => { /* ... */ }
    Err(Error::Api { code, message }) => {
        eprintln!("API Error {}: {}", code, message);
    }
    Err(e) => {
        eprintln!("Error: {}", e);
    }
}
}

Differences from Go

| Feature | Go | Rust |
| --- | --- | --- |
| V1/V2 service access | Separate fields (TTS, TTSV2) | Single service with version param |
| Streaming | iter.Seq2 | Stream<Item=Result<T>> |
| Session management | Manual close | Drop trait |
| WebSocket | gorilla/websocket | tokio-tungstenite |

Doubao Speech SDK - Known Issues

🟡 Minor Issues

DBS-001: Go auth header format unusual

File: go/pkg/doubaospeech/client.go:238-239

Description:
The bearer token is sent as Bearer;{token} (semicolon, no space) rather than the standard Bearer {token}:

req.Header.Set("Authorization", "Bearer;"+c.config.accessToken)

Impact: Non-standard, but required by the Volcengine API.

Note: This is an API requirement, not an SDK issue.


DBS-002: Multiple auth method complexity

Description:
The SDK supports at least four authentication methods:

  • API Key (x-api-key)
  • Bearer Token (Authorization: Bearer;)
  • V2 API Key (X-Api-Access-Key, X-Api-App-Key)
  • Resource-specific fixed keys

Impact: It is unclear to users which method applies to which service.

Suggestion: Add helper methods like NewTTSClient(), NewRealtimeClient() with correct defaults.


DBS-003: Resource ID vs Cluster confusion

Description:
V1 uses "cluster", V2 uses "resource_id" for service selection:

  • V1: WithCluster("volcano_tts")
  • V2: WithResourceID("seed-tts-2.0")

Impact: The two options are easy to mix up, and it is unclear which one applies to which service.


DBS-004: Rust async TTS incomplete

Description:
Rust implementation for async long-text TTS may be incomplete or missing compared to Go.

Status: ⚠️ Needs verification.


DBS-005: Rust file ASR incomplete

Description:
Rust implementation for file-based ASR may be incomplete compared to Go.

Status: ⚠️ Needs verification.


DBS-006: Fixed app keys hardcoded

File: go/pkg/doubaospeech/client.go:17-24

Description:
Some V3 APIs use fixed app keys from documentation:

const (
    AppKeyRealtime = "PlgvMymc7f3tQnJ6"
    AppKeyPodcast = "aGjiRDfUWi"
)

Impact: If Volcengine changes these keys, the SDK breaks until it is updated.

Note: This is documented API behavior.


🔵 Enhancements

DBS-007: No automatic service version selection

Description:
Users must choose between V1 and V2 services manually; the SDK does not select a version based on the features needed.

Suggestion: Add a unified service that routes to the correct version.


DBS-008: No connection pooling documentation

Description:
WebSocket connections for streaming services could benefit from pooling documentation.


DBS-009: No retry for WebSocket connections

Description:
HTTP requests are retried, but WebSocket connections do not reconnect automatically on failure.

Suggestion: Add reconnection logic for streaming sessions.


DBS-010: Console API missing some endpoints

Description:
Console client may not cover all management APIs available on Volcengine.


⚪ Notes

DBS-011: Dual API version design

Description:
Having both V1 (Classic) and V2 (BigModel) services in the same client reflects Volcengine's actual API structure. This is intentional, not a flaw.


DBS-012: Protocol module for WebSocket

Description:
Both Go and Rust have a protocol module for WebSocket message serialization. This is well-structured for the binary protocol requirements.


DBS-013: Comprehensive service coverage

Description:
SDK covers nearly all Doubao Speech services:

  • TTS (sync, stream, async)
  • ASR (one-sentence, stream, file)
  • Voice Clone
  • Realtime Dialogue
  • Meeting Transcription
  • Podcast Synthesis
  • Translation
  • Media Subtitle

This is impressive coverage.


DBS-014: Console uses AK/SK signature

Description:
The Console API uses the Volcengine OpenAPI signature (HMAC-SHA256) rather than a simple token. This is standard for Volcengine management APIs.


Summary

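| ID | Severity | Status | Component |
| --- | --- | --- | --- |
| DBS-001 | 🟡 Minor | Note | Go Auth |
| DBS-002 | 🟡 Minor | Open | Both |
| DBS-003 | 🟡 Minor | Open | Both |
| DBS-004 | 🟡 Minor | Open | Rust |
| DBS-005 | 🟡 Minor | Open | Rust |
| DBS-006 | 🟡 Minor | Note | Both |
| DBS-007 | 🔵 Enhancement | Open | Both |
| DBS-008 | 🔵 Enhancement | Open | Both |
| DBS-009 | 🔵 Enhancement | Open | Both |
| DBS-010 | 🔵 Enhancement | Open | Both |
| DBS-011 | ⚪ Note | N/A | Both |
| DBS-012 | ⚪ Note | N/A | Both |
| DBS-013 | ⚪ Note | N/A | Both |
| DBS-014 | ⚪ Note | N/A | Both |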

Overall: A comprehensive SDK with excellent API coverage. The main complexity comes from Volcengine's dual API version system and its multiple authentication methods. The Rust implementation may have some gaps compared to Go.

Jiutian API Documentation

九天大模型 (Jiutian) - China Mobile's AI Service Platform.

Official API Documentation: api/README.md

Overview

Jiutian is China Mobile's terminal intelligent agent service management platform for AI/LLM cloud integration.

Note: This is API documentation only. No Go/Rust SDK implementation exists in this repository.

API Features

  • Chat Completions: OpenAI-compatible chat API
  • Device Integration: Device registration and heartbeat protocols
  • Assistant Management: Configure AI assistants

Integration Notes

For integration with Jiutian API:

  1. Use an OpenAI-compatible SDK with a custom base URL
  2. Follow the device authentication protocols
  3. See api/tutorial.md for a quick start

Authentication

Requires:

  • AI Token (obtained via email application)
  • IP whitelist registration
  • Product ID from management platform

Environment

| Environment | URL |
| --- | --- |
| Test | https://z5f3vhk2.cxzfdm.com:30101 |
| Production | https://ivs.chinamobiledevice.com:30100 |

SDK Implementation Status

| Language | Status |
| --- | --- |
| Go | ❌ Not implemented |
| Rust | ❌ Not implemented |

For basic integration, use an OpenAI-compatible SDK:

Go:

config := openai.DefaultConfig(jiutianToken)
config.BaseURL = "https://ivs.chinamobiledevice.com:30100/v1"
client := openai.NewClientWithConfig(config)

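Jiutian Large Model (九天大模型) API Documentation

Terminal Intelligent Agent Service Management Platform: AI large-model cloud-to-cloud integration guide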

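Document Index

| Document | Description |
| --- | --- |
| tutorial.md | 🚀 Quick-start tutorial (recommended first read) |
| concepts.md | Key terms (text generation model, assistant, token) |
| auth.md | Authentication |
| chat.md | Chat Completions API |
| device.md | Device access protocol (fetching device information, heartbeat reporting) |
| faq.md | Frequently asked questions (Q&A) |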

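Change Log

| Date | Version | Change | Author |
| --- | --- | --- | --- |
| 25.02.28 | V1.0 | Defined the interface protocol by which AI hardware vendors' central control platforms connect to the large-model service of the Terminal Intelligent Agent Service Management Platform | 邹益强 |
| 25.04.30 | v1.0.1 | Adjusted document formatting | 邹益强 |
| 25.11.14 | v1.0.2 | Added instructions for the application email | 邹益强 |
| 25.12.29 | v1.0.3 | Added notes on non-generative AI access | 邹益强 |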

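AI Service Onboarding Process

  1. To apply for an AI token, provide the vendor server IP address(es) to be added to the access whitelist, together with the product ID (productId) obtained from the management platform
  2. Send the application by email to:
    • zouyiqiang_fx@cmdc.chinamobile.com
    • zhucaiwen_fx@cmdc.chinamobile.com
    • CC: zhengzhongwei_fx@cmdc.chinamobile.com
  3. Once the whitelist entry is active, the vendor server can call the Jiutian model API with the AI token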

Environment Configuration

| Environment | URL |
| --- | --- |
| Test | https://z5f3vhk2.cxzfdm.com:30101 |
| Production | https://ivs.chinamobiledevice.com:30100 |

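Test token: sk-Y73NAU0tArvGRlpUE9060529470b42Ac8bA34d40F48b0564

System prompt: 您好,我是中国移动的智能助理灵犀。如果您询问我的身份,我会回答:"您好,我是中国移动智能助理灵犀"。 (English: "Hello, I am Lingxi, China Mobile's intelligent assistant. If you ask who I am, I will answer: 'Hello, I am Lingxi, China Mobile's intelligent assistant.'")

Model context length: 8K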

Jiutian - Go Implementation

Status: Not Implemented

No native Go SDK for Jiutian API exists in this repository.

Recommendation

Use an OpenAI-compatible SDK, since the Jiutian API follows the OpenAI chat completions format:

import "github.com/sashabaranov/go-openai"

config := openai.DefaultConfig("sk-your-jiutian-token")
config.BaseURL = "https://ivs.chinamobiledevice.com:30100/v1"

client := openai.NewClientWithConfig(config)

resp, err := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
    Model: "jiutian",
    Messages: []openai.ChatCompletionMessage{
        {
            Role:    "system",
            Content: "您好,我是中国移动的智能助理灵犀。",
        },
        {
            Role:    "user", 
            Content: "你是谁?",
        },
    },
})

Device Protocol

For device-specific features (registration, heartbeat), implement an HTTP client directly:

// Device heartbeat
type HeartbeatRequest struct {
    DeviceID  string `json:"device_id"`
    ProductID string `json:"product_id"`
    Timestamp int64  `json:"timestamp"`
}

func sendHeartbeat(ctx context.Context, client *http.Client, req *HeartbeatRequest) error {
    // POST to /api/device/heartbeat
    return errors.New("not implemented") // placeholder so the stub compiles
}

Future Work

A native SDK could provide:

  • Device registration/heartbeat management
  • Token refresh handling
  • Jiutian-specific features

See api/device.md for device protocol details.

Jiutian - Rust Implementation

Status: Not Implemented

No native Rust SDK for Jiutian API exists in this repository.

Recommendation

Use an OpenAI-compatible SDK, since the Jiutian API follows the OpenAI chat completions format:

#![allow(unused)]
fn main() {
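use async_openai::{Client, config::OpenAIConfig};
// Request types (the module path may differ between async-openai versions)
use async_openai::types::{
    ChatCompletionRequestMessage, ChatCompletionRequestSystemMessageArgs,
    ChatCompletionRequestUserMessageArgs, CreateChatCompletionRequestArgs,
};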

let config = OpenAIConfig::new()
    .with_api_key("sk-your-jiutian-token")
    .with_api_base("https://ivs.chinamobiledevice.com:30100/v1");

let client = Client::with_config(config);

let request = CreateChatCompletionRequestArgs::default()
    .model("jiutian")
    .messages([
        ChatCompletionRequestMessage::System(
            ChatCompletionRequestSystemMessageArgs::default()
                .content("您好,我是中国移动的智能助理灵犀。")
                .build()?
        ),
        ChatCompletionRequestMessage::User(
            ChatCompletionRequestUserMessageArgs::default()
                .content("你是谁?")
                .build()?
        ),
    ])
    .build()?;

let response = client.chat().create(request).await?;
}

Device Protocol

For device-specific features, implement an HTTP client with reqwest:

#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};

#[derive(Serialize)]
struct HeartbeatRequest {
    device_id: String,
    product_id: String,
    timestamp: i64,
}

async fn send_heartbeat(
    client: &reqwest::Client,
    base_url: &str,
    req: &HeartbeatRequest,
) -> Result<(), reqwest::Error> {
    client
        .post(format!("{}/api/device/heartbeat", base_url))
        .json(req)
        .send()
        .await?;
    Ok(())
}
}

Future Work

A native SDK could provide:

  • Device registration/heartbeat management
  • Token refresh handling
  • Jiutian-specific features

See api/device.md for device protocol details.

Jiutian - Known Issues

🔴 Major Issues

JT-001: No SDK implementation

Description:
No Go or Rust SDK implementation exists for Jiutian API.

Impact: Users must use OpenAI-compatible SDK or implement HTTP calls directly.

Recommendation:

  1. For chat completions: Use OpenAI SDK with custom base URL
  2. For device features: Implement direct HTTP calls

🔵 Enhancements

JT-002: Native SDK desired

Description:
A native SDK would be useful for:

  • Device registration/heartbeat protocols
  • Token management
  • Jiutian-specific error handling

Priority: Low - OpenAI SDK covers main use case.


JT-003: Device protocol documentation only

Description:
Device registration and heartbeat protocols are documented but not implemented.

Files affected:


⚪ Notes

JT-004: OpenAI-compatible API

Description:
Jiutian chat API is OpenAI-compatible, so existing OpenAI SDKs work:

  • Go: github.com/sashabaranov/go-openai
  • Rust: async-openai
  • Python: openai

Set a custom base URL and use the Jiutian token.


JT-005: Access requirements

Description:
Jiutian API requires:

  1. IP whitelist registration
  2. Product ID from management platform
  3. AI token (obtained via email application)

This is documented in api/README.md.


JT-006: China Mobile specific

Description:
This API is specific to China Mobile's terminal intelligent agent service management platform. It may not be relevant for all users of this repository.


Summary

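| ID | Severity | Status | Component |
| --- | --- | --- | --- |
| JT-001 | 🔴 Major | Open | Both |
| JT-002 | 🔵 Enhancement | Open | Both |
| JT-003 | 🔵 Enhancement | Open | Both |
| JT-004 | ⚪ Note | N/A | Both |
| JT-005 | ⚪ Note | N/A | Both |
| JT-006 | ⚪ Note | N/A | Both |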

Overall: Documentation-only module without SDK implementation. OpenAI-compatible SDK recommended for chat functionality.

mqtt0

Overview

mqtt0 is a lightweight MQTT implementation focused on QoS 0. It provides both client and broker components with explicit control over authentication and ACL. The Go implementation is synchronous and net.Conn-based, while the Rust implementation is async (Tokio) with optional TLS/WebSocket transport features.

Design Goals

  • Minimal MQTT feature set with strong QoS 0 focus
  • Explicit ACL/auth hooks for connect/publish/subscribe
  • Simple broker suitable for embedded or internal services
  • Support MQTT 3.1.1 (v4) and MQTT 5.0 (v5)
  • Provide transport flexibility (TCP/TLS/WebSocket)

Key Concepts

  • Client: QoS 0 publish/subscribe, keepalive, protocol v4/v5
  • Broker: connection lifecycle, ACL checks, topic routing
  • Shared subscriptions: $share/{group}/{topic}
  • Topic alias (v5): reduce bandwidth by reusing alias per client
  • Transports: TCP/TLS/WebSocket based on URL scheme or feature flags

Components

  • Client
  • Broker
  • Protocol parser/encoder
  • Topic trie (subscription routing)
  • Transport layer
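To make the subscription-routing component more concrete, the sketch below matches a topic against a filter with the MQTT wildcards + (one level) and # (all remaining levels). It is a linear, level-by-level matcher written for illustration, not the trie used by the package; matchTopic is a hypothetical name. It also applies the MQTT rule that a filter starting with a wildcard does not match topics beginning with $.

```go
package main

import (
	"fmt"
	"strings"
)

// matchTopic reports whether topic matches filter.
// "+" matches exactly one level; "#" must be the last level and matches the
// parent level plus everything below it.
func matchTopic(filter, topic string) bool {
	f := strings.Split(filter, "/")
	t := strings.Split(topic, "/")
	// Topics such as $SYS/... are not matched by a leading wildcard.
	if strings.HasPrefix(topic, "$") && (f[0] == "+" || f[0] == "#") {
		return false
	}
	for i, level := range f {
		if level == "#" {
			return i == len(f)-1
		}
		if i >= len(t) {
			return false
		}
		if level != "+" && level != t[i] {
			return false
		}
	}
	return len(f) == len(t)
}

func main() {
	fmt.Println(matchTopic("sensors/+/temp", "sensors/kitchen/temp"))
	fmt.Println(matchTopic("sensors/#", "sensors/kitchen/temp"))
	fmt.Println(matchTopic("sensors/+", "sensors/kitchen/temp"))
}
```

A trie gives the same answers but checks all subscriptions in one walk over the topic levels, which matters once a broker holds many filters.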

Protocol and Transport Support

  • MQTT 3.1.1 and MQTT 5.0 for client and broker
  • TCP and TLS by default
  • WebSocket/WSS when enabled (Rust feature flags)

Examples

  • Go: use Connect, Subscribe, Publish, Recv
  • Rust: use Client::connect, client.subscribe, client.publish, client.recv

Related Modules

  • docs/lib/trie (topic routing)
  • docs/lib/encoding (protocol helpers)

mqtt0 (Go)

Package Layout

  • doc.go: high-level overview and usage examples
  • client.go: QoS 0 client implementation
  • broker.go: broker implementation with ACL hooks
  • packet_v4.go, packet_v5.go, packet.go: protocol encode/decode
  • listener.go, dialer.go: transport helpers
  • trie.go: subscription routing

Public Interfaces

  • ClientConfig: broker address, protocol version, TLS config, keepalive, etc.
  • Client: Connect, Subscribe, Unsubscribe, Publish, Recv, Close
  • Broker: Serve, ServeConn, ACL hooks, callbacks
  • Authenticator: access control on connect/publish/subscribe
  • Handler: callback for inbound broker messages
  • Message, ProtocolVersion, QoS

Design Notes

  • Single connection with separate read/write locks to guard concurrent access.
  • Request/response operations (SUBSCRIBE/UNSUBSCRIBE) read from the same stream as inbound PUBLISH messages.
  • Keepalive runs in a goroutine when AutoKeepalive is enabled.
  • Shared subscriptions and topic aliasing are handled in the broker.

Transport

  • URL-based address parsing: tcp://, tls://, ws://, wss://
  • Dialer hook allows custom connection logic
  • TLS config supported via ClientConfig.TLSConfig

Notable Behaviors

  • QoS 0 only; no packet persistence or retransmission.
  • Broker drops messages when per-client channel is full (non-blocking send).
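The drop-on-full behavior corresponds to a non-blocking channel send. A minimal sketch follows, assuming a per-client buffered channel; Message and trySend are illustrative names rather than the broker's actual types.

```go
package main

import "fmt"

// Message is a simplified stand-in for a routed publish.
type Message struct {
	Topic   string
	Payload []byte
}

// trySend delivers msg to a client's queue without blocking.
// It returns false when the queue is full and the message is dropped.
func trySend(queue chan<- Message, msg Message) bool {
	select {
	case queue <- msg:
		return true
	default:
		return false
	}
}

func main() {
	queue := make(chan Message, 1) // tiny buffer to force a drop
	fmt.Println(trySend(queue, Message{Topic: "a", Payload: []byte("1")}))
	fmt.Println(trySend(queue, Message{Topic: "a", Payload: []byte("2")}))
}
```

The design keeps one slow subscriber from stalling delivery to everyone else, at the cost of silent loss for that subscriber.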

mqtt0 (Rust)

Crate Layout

  • lib.rs: public exports, crate overview
  • client.rs: async QoS 0 client
  • broker.rs: async broker with ACL hooks
  • protocol.rs: MQTT encode/decode
  • transport.rs: TCP/TLS/WebSocket transport abstraction
  • trie.rs: subscription routing
  • types.rs: public types and traits

Public Interfaces

  • Client, ClientConfig: async connect/subscribe/publish/recv
  • Broker, BrokerConfig, BrokerBuilder: broker setup and lifecycle
  • Authenticator, Handler: ACL and message handling
  • Message, ProtocolVersion, QoS
  • TransportType, Transport (feature-gated TLS/WebSocket)

Design Notes

  • Fully async, based on Tokio and mpsc channels.
  • Builder pattern for broker configuration and hooks.
  • Transport features are behind Cargo feature flags (TLS, WebSocket).

Differences vs Go

  • Rust uses async traits for client/broker operations.
  • TLS/WebSocket support is feature-gated.
  • Broker construction encourages builder configuration.

mqtt0 - Known Issues

🟠 Major Issues

MQTT0-001: Go Client read path is not demultiplexed

File: go/pkg/mqtt0/client.go

Description: Subscribe, Unsubscribe, and other request/response operations read directly from the same stream as Recv(). If callers run Recv() concurrently with Subscribe() or Unsubscribe(), whichever acquires readMu first may consume packets that belong to the other operation, causing unexpected packet errors.

Impact: Hard-to-debug race between subscription changes and inbound message handling.

Suggestion: Introduce a single read loop with protocol demuxing, or document that Recv() must not run concurrently with subscribe/unsubscribe calls.
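The single-read-loop idea can be sketched as one goroutine that owns the connection and routes each decoded packet to the right consumer. Everything below is a simplified model: packets arrive on a channel instead of a net.Conn, and packet, packetKind, and demux are hypothetical names, not the client's real types.

```go
package main

import "fmt"

type packetKind int

const (
	kindPublish packetKind = iota
	kindSubAck
)

type packet struct {
	kind    packetKind
	payload string
}

// demux is the only reader of in. PUBLISH packets go to msgs (consumed by
// Recv), acknowledgements go to acks (consumed by Subscribe/Unsubscribe).
// Both outputs are closed when the input ends.
func demux(in <-chan packet, msgs, acks chan<- packet) {
	for p := range in {
		switch p.kind {
		case kindPublish:
			msgs <- p
		case kindSubAck:
			acks <- p
		}
	}
	close(msgs)
	close(acks)
}

func main() {
	in := make(chan packet, 3)
	msgs := make(chan packet, 3) // buffered so demux can run inline here
	acks := make(chan packet, 3)
	in <- packet{kind: kindPublish, payload: "temp=21"}
	in <- packet{kind: kindSubAck, payload: "suback#1"}
	in <- packet{kind: kindPublish, payload: "temp=22"}
	close(in)

	demux(in, msgs, acks)
	for p := range acks {
		fmt.Println("ack:", p.payload)
	}
	for p := range msgs {
		fmt.Println("msg:", p.payload)
	}
}
```

With this structure Subscribe never reads from the socket itself, so it cannot consume a PUBLISH that Recv is waiting for.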


🟡 Minor Issues

MQTT0-002: Go Broker drops messages on backpressure

File: go/pkg/mqtt0/broker.go

Description: The broker uses a bounded channel for each client. When the channel is full, messages are dropped with a debug log.

Impact: Message loss under bursty load beyond QoS 0 expectations; may surprise users.

Suggestion: Document the drop behavior clearly or make buffer size configurable.


MQTT0-003: Rust WebSocket transport requires special handling

File: rust/mqtt0/src/transport.rs

Description: Transport::WebSocket implements AsyncRead/AsyncWrite by returning Unsupported errors. If code treats Transport uniformly, WebSocket connections will fail at runtime.

Impact: Surprising runtime errors for WebSocket clients if not handled explicitly.

Suggestion: Expose dedicated websocket read/write APIs or document the required handling.

Buffer Package

Thread-safe streaming buffer implementations for producer-consumer patterns.

Design Goals

  1. Type-safe: Generic buffers for any element type (not limited to bytes)
  2. Thread-safe: All operations support concurrent access
  3. Blocking Semantics: Support blocking read/write with proper close mechanics
  4. Flow Control: Different policies for handling full buffers

Buffer Types

| Type | Full Behavior | Empty Behavior | Use Case |
| --- | --- | --- | --- |
| Buffer | Grow | Block | Variable-size data, unknown total size |
| BlockBuffer | Block | Block | Flow control, bounded memory |
| RingBuffer | Overwrite | Block | Sliding window, latest data only |

Buffer (Growable)

A dynamically growing buffer that never blocks writes. Ideal for scenarios where:

  • Data size is not known in advance
  • Memory is not constrained
  • Writer should never block
flowchart LR
    W[Writer] --> B["[Grows on demand]"]
    B --> R["Reader<br/>(blocks when empty)"]

BlockBuffer (Fixed, Blocking)

A fixed-size circular buffer that blocks on both read and write. Provides backpressure for flow control:

  • Writer blocks when buffer is full
  • Reader blocks when buffer is empty
  • Predictable memory usage
flowchart LR
    W["Writer<br/>(blocks when full)"] --> B["[Fixed Size]"]
    B --> R["Reader<br/>(blocks when empty)"]

RingBuffer (Fixed, Overwriting)

A fixed-size circular buffer that overwrites oldest data when full. Ideal for:

  • Maintaining a sliding window of latest data
  • Real-time data where older samples are stale
  • Memory-bounded with freshness priority
flowchart LR
    W[Writer] --> B["[Overwrites oldest]"]
    B --> R["Reader<br/>(blocks when empty)"]

Common Interface

All buffer types share a consistent interface:

| Operation | Description |
| --- | --- |
| Write([]T) | Write slice of elements |
| Read([]T) | Read into slice |
| Add(T) | Add single element |
| Next() | Read single element (iterator pattern) |
| Discard(n) | Skip n elements without reading |
| Len() | Current element count |
| Reset() | Clear all data |
| CloseWrite() | Graceful close (allow drain) |
| CloseWithError(err) | Immediate close with error |
| Close() | Close both ends |
| Error() | Get close error (if any) |

Close Semantics

CloseWrite() - Graceful

sequenceDiagram
    participant W as Writer
    participant B as Buffer
    participant R as Reader
    W->>B: CloseWrite()
    Note over B: No new writes
    R->>B: Read()
    B-->>R: Remaining data
    R->>B: Read()
    B-->>R: EOF

CloseWithError(err) - Immediate

sequenceDiagram
    participant W as Writer
    participant B as Buffer
    participant R as Reader
    W->>B: CloseWithError(err)
    Note over B: Both ends closed
    R->>B: Read()
    B-->>R: err

Examples Directory

  • examples/go/buffer/ - Go usage examples
  • examples/rust/buffer/ - Rust usage examples

Implementation Notes

Memory Layout

| Type | Layout |
| --- | --- |
| Buffer | Dynamic slice → grows via append |
| BlockBuffer | Fixed circular → head/tail pointers wrap |
| RingBuffer | Fixed circular → overwrites when head catches tail |

Notification Mechanism

  • Go: Channel-based (writeNotify chan struct{}) or Cond variables
  • Rust: Condvar-based (Condvar::notify_one/all)

Thread Safety

  • Go: sync.Mutex + sync.Cond / channels
  • Rust: Mutex<State> + Condvar, wrapped in Arc for cloning

Related Modules

  • audio/pcm - Uses buffers for PCM audio streams
  • chatgear - Uses buffers for audio frame transmission
  • opusrt - Uses RingBuffer for jitter buffering

Buffer Package - Go Implementation

Import: github.com/haivivi/giztoy/pkg/buffer

📚 Go Documentation

Types

Buffer[T]

Growable buffer with generic type support.

type Buffer[T any] struct {
    writeNotify chan struct{}
    mu          sync.Mutex
    closeWrite  bool
    closeErr    error
    buf         []T
}

Key Methods:

| Method | Signature | Description |
| --- | --- | --- |
| N | func N[T any](n int) *Buffer[T] | Create with initial capacity |
| Write | (b *Buffer[T]) Write(p []T) (int, error) | Append elements |
| Read | (b *Buffer[T]) Read(p []T) (int, error) | Read elements (blocks) |
| Add | (b *Buffer[T]) Add(t T) error | Add single element |
| Next | (b *Buffer[T]) Next() (T, error) | Pop from end (LIFO) |
| Bytes | (b *Buffer[T]) Bytes() []T | Get internal slice (unsafe) |

BlockBuffer[T]

Fixed-size circular buffer with blocking semantics.

type BlockBuffer[T any] struct {
    cond       *sync.Cond
    mu         sync.Mutex
    buf        []T
    head, tail int64
    closeWrite bool
    closeErr   error
}

Key Methods:

| Method | Signature | Description |
| --- | --- | --- |
| Block | func Block[T any](buf []T) *BlockBuffer[T] | Create from existing slice |
| BlockN | func BlockN[T any](size int) *BlockBuffer[T] | Create with size |
| Write | (bb *BlockBuffer[T]) Write(p []T) (int, error) | Write (blocks when full) |
| Read | (bb *BlockBuffer[T]) Read(p []T) (int, error) | Read (blocks when empty) |
| Next | (bb *BlockBuffer[T]) Next() (T, error) | Read single (FIFO) |

RingBuffer[T]

Fixed-size circular buffer with overwrite semantics.

type RingBuffer[T any] struct {
    writeNotify chan struct{}
    mu          sync.Mutex
    buf         []T
    head, tail  int64
    closeWrite  bool
    closeErr    error
}

Key Methods:

| Method | Signature | Description |
| --- | --- | --- |
| RingN | func RingN[T any](size int) *RingBuffer[T] | Create with size |
| Write | (rb *RingBuffer[T]) Write(p []T) (int, error) | Write (overwrites oldest) |
| Add | (rb *RingBuffer[T]) Add(t T) error | Add single (overwrites) |

BytesBuffer Interface

Common interface for byte buffers:

type BytesBuffer interface {
    Write(p []byte) (n int, err error)
    Read(p []byte) (n int, err error)
    Discard(n int) (err error)
    Close() error
    CloseWrite() error
    CloseWithError(err error) error
    Error() error
    Reset()
    Bytes() []byte
    Len() int
}

Convenience Functions

func Bytes16KB() *BlockBuffer[byte]  // 16KB blocking buffer
func Bytes4KB() *BlockBuffer[byte]   // 4KB blocking buffer
func Bytes1KB() *BlockBuffer[byte]   // 1KB blocking buffer
func Bytes256B() *BlockBuffer[byte]  // 256B blocking buffer
func Bytes() *Buffer[byte]           // 1KB growable buffer
func BytesRing(size int) *RingBuffer[byte]  // Ring buffer

Error Handling

var ErrIteratorDone = errors.New("iterator done")
  • ErrIteratorDone: Returned by Next() when buffer is closed and empty
  • io.EOF: Returned by Read() when buffer is closed and empty
  • io.ErrClosedPipe: Default error for closed buffers

Usage Patterns

Producer-Consumer with BlockBuffer

buf := buffer.Bytes4KB()

// Producer goroutine
go func() {
    for data := range source {
        _, err := buf.Write(data)
        if err != nil {
            return
        }
    }
    buf.CloseWrite()
}()

// Consumer goroutine
tmp := make([]byte, 1024)
for {
    n, err := buf.Read(tmp)
    if n > 0 {
        process(tmp[:n])
    }
    if err != nil {
        break // io.EOF once closed and drained, or the close error
    }
}

Sliding Window with RingBuffer

buf := buffer.RingN[float64](100)  // Keep last 100 samples

// Streaming producer
go func() {
    for sample := range stream {
        buf.Add(sample)  // Overwrites oldest when full
    }
    buf.CloseWrite()
}()

// Periodic consumer
ticker := time.NewTicker(time.Second)
defer ticker.Stop()
for range ticker.C {
    samples := buf.Bytes()  // Get current window
    average := computeAverage(samples)
    fmt.Println("average:", average)
}

Iterator Pattern

buf := buffer.N[Event](100)

// Using Next() for iteration
for {
    event, err := buf.Next()
    if errors.Is(err, buffer.ErrIteratorDone) {
        break
    }
    if err != nil {
        log.Printf("buffer error: %v", err)
        break
    }
    handleEvent(event)
}

Implementation Details

Circular Buffer Arithmetic

BlockBuffer and RingBuffer use virtual counters for head/tail:

// Position in physical buffer
pos := head % int64(len(buf))

// Available data
available := tail - head

// Check if full (BlockBuffer only)
isFull := tail - head == int64(len(buf))
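The counter arithmetic above can be seen end to end in a small overwrite-on-full ring. This is a single-threaded sketch for illustration: the ring type and its methods are hypothetical names, and the locking and notification used by the real BlockBuffer and RingBuffer are deliberately left out.

```go
package main

import "fmt"

// ring is a fixed-size circular buffer. head and tail are virtual counters
// that only ever increase; the physical index is counter % len(buf).
type ring[T any] struct {
	buf        []T
	head, tail int64
}

func newRing[T any](size int) *ring[T] {
	return &ring[T]{buf: make([]T, size)}
}

// add writes at tail. When the ring is already full, head advances so the
// oldest element is dropped.
func (r *ring[T]) add(v T) {
	r.buf[r.tail%int64(len(r.buf))] = v
	r.tail++
	if r.tail-r.head > int64(len(r.buf)) {
		r.head++
	}
}

// count returns the number of readable elements (tail - head).
func (r *ring[T]) count() int { return int(r.tail - r.head) }

// snapshot copies the readable elements, oldest first.
func (r *ring[T]) snapshot() []T {
	out := make([]T, 0, r.count())
	for i := r.head; i < r.tail; i++ {
		out = append(out, r.buf[i%int64(len(r.buf))])
	}
	return out
}

func main() {
	r := newRing[int](3)
	for i := 1; i <= 5; i++ {
		r.add(i)
	}
	fmt.Println(r.count(), r.snapshot()) // prints: 3 [3 4 5]
}
```

Because the counters never wrap back to zero on their own, "full" and "empty" are distinguishable without a separate flag: empty is tail == head, full is tail - head == len(buf).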

Notification Mechanism

  • Buffer: Uses buffered channel make(chan struct{}, 1) for non-blocking notification
  • BlockBuffer: Uses sync.Cond for precise signal/broadcast control
  • RingBuffer: Uses buffered channel (same as Buffer)

Lock Patterns

All types use sync.Mutex with deferred unlock:

func (b *Buffer[T]) Read(p []T) (n int, err error) {
    b.mu.Lock()
    defer b.mu.Unlock()
    
    // Wait loop with unlock/relock
    for len(b.buf) == 0 {
        if b.closeWrite {
            return 0, io.EOF
        }
        b.mu.Unlock()
        <-b.writeNotify  // Wait for notification
        b.mu.Lock()
        // Re-check state after relock
    }
    // ... read logic
}

Buffer Package - Rust Implementation

Crate: giztoy-buffer

📚 Rust Documentation

Types

Buffer

Growable buffer using VecDeque<T> for O(1) front operations.

#![allow(unused)]
fn main() {
pub struct Buffer<T> {
    inner: Arc<BufferInner<T>>,
}

struct BufferInner<T> {
    state: Mutex<BufferState<T>>,
    write_notify: Condvar,
}

struct BufferState<T> {
    buf: VecDeque<T>,
    close_write: bool,
    close_err: Option<Arc<dyn Error + Send + Sync>>,
}
}

Key Methods:

| Method | Signature | Description |
| --- | --- | --- |
| new | fn new() -> Self | Create empty buffer |
| with_capacity | fn with_capacity(capacity: usize) -> Self | Create with capacity hint |
| write | fn write(&self, data: &[T]) -> Result<usize, BufferError> | Append elements |
| read | fn read(&self, buf: &mut [T]) -> Result<usize, BufferError> | Read elements (blocks) |
| add | fn add(&self, item: T) -> Result<(), BufferError> | Add single element |
| next | fn next(&self) -> Result<T, Done> | Pop from front (FIFO) |
| to_vec | fn to_vec(&self) -> Vec<T> | Copy to Vec |

BlockBuffer

Fixed-size circular buffer with blocking semantics.

#![allow(unused)]
fn main() {
pub struct BlockBuffer<T> {
    inner: Arc<BlockBufferInner<T>>,
}

struct BlockBufferInner<T> {
    state: Mutex<BlockBufferState<T>>,
    not_full: Condvar,
    not_empty: Condvar,
}

struct BlockBufferState<T> {
    buf: Vec<Option<T>>,
    head: usize,
    tail: usize,
    count: usize,
    close_write: bool,
    close_err: Option<Arc<dyn Error + Send + Sync>>,
}
}

Key Methods:

| Method | Signature | Description |
| --- | --- | --- |
| new | fn new(capacity: usize) -> Self | Create with capacity |
| from_vec | fn from_vec(data: Vec<T>) -> Self | Create from Vec (full) |
| write | fn write(&self, data: &[T]) -> Result<usize, BufferError> | Write (blocks when full) |
| read | fn read(&self, buf: &mut [T]) -> Result<usize, BufferError> | Read (blocks when empty) |
| capacity | fn capacity(&self) -> usize | Get capacity |
| is_full | fn is_full(&self) -> bool | Check if full |

RingBuffer

Fixed-size circular buffer with overwrite semantics.

#![allow(unused)]
fn main() {
pub struct RingBuffer<T> {
    inner: Arc<RingBufferInner<T>>,
}

struct RingBufferInner<T> {
    state: Mutex<RingBufferState<T>>,
    write_notify: Condvar,
}

struct RingBufferState<T> {
    buf: Vec<Option<T>>,
    head: usize,  // virtual counter (wraps)
    tail: usize,  // virtual counter (wraps)
    close_write: bool,
    close_err: Option<Arc<dyn Error + Send + Sync>>,
}
}

Key Methods:

| Method | Signature | Description |
| --- | --- | --- |
| new | fn new(capacity: usize) -> Self | Create with capacity |
| write | fn write(&self, data: &[T]) -> Result<usize, BufferError> | Write (overwrites oldest) |
| add | fn add(&self, item: T) -> Result<(), BufferError> | Add single (overwrites) |

Error Types

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub enum BufferError {
    Closed,
    ClosedWithError(Arc<dyn Error + Send + Sync>),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Done;
}

Convenience Functions

#![allow(unused)]
fn main() {
// Growable buffers
pub fn bytes() -> Buffer<u8>         // 1KB
pub fn bytes_1kb() -> Buffer<u8>     // 1KB
pub fn bytes_4kb() -> Buffer<u8>     // 4KB
pub fn bytes_16kb() -> Buffer<u8>    // 16KB
pub fn bytes_64kb() -> Buffer<u8>    // 64KB
pub fn bytes_256b() -> Buffer<u8>    // 256B

// Blocking buffers
pub fn block_bytes() -> BlockBuffer<u8>      // 1KB
pub fn block_bytes_1kb() -> BlockBuffer<u8>  // 1KB
pub fn block_bytes_4kb() -> BlockBuffer<u8>  // 4KB
pub fn block_bytes_16kb() -> BlockBuffer<u8> // 16KB
pub fn block_bytes_64kb() -> BlockBuffer<u8> // 64KB

// Ring buffers
pub fn ring_bytes(size: usize) -> RingBuffer<u8>
pub fn ring_bytes_1kb() -> RingBuffer<u8>    // 1KB
pub fn ring_bytes_4kb() -> RingBuffer<u8>    // 4KB
pub fn ring_bytes_16kb() -> RingBuffer<u8>   // 16KB
pub fn ring_bytes_64kb() -> RingBuffer<u8>   // 64KB
}

Thread Safety

All types implement Send + Sync and support Clone:

#![allow(unused)]
fn main() {
// Clone shares the underlying buffer via Arc
let buf = Buffer::<i32>::new();
let buf_clone = buf.clone();  // Same underlying buffer

// Safe to send to other threads
std::thread::spawn(move || {
    buf_clone.add(42).unwrap();
});
}

Usage Patterns

Producer-Consumer

#![allow(unused)]
fn main() {
use giztoy_buffer::{BlockBuffer, Done};
use std::thread;

let buf = BlockBuffer::<i32>::new(4);
let producer_buf = buf.clone();

let producer = thread::spawn(move || {
    for i in 0..100 {
        producer_buf.add(i).unwrap();
    }
    producer_buf.close_write().unwrap();
});

let mut collected = Vec::new();
loop {
    match buf.next() {
        Ok(item) => collected.push(item),
        Err(Done) => break,
    }
}

producer.join().unwrap();
}

Sliding Window

#![allow(unused)]
fn main() {
use giztoy_buffer::RingBuffer;

let buf = RingBuffer::<f32>::new(100);

// Write more than capacity - old data overwritten
for i in 0..200 {
    buf.add(i as f32).unwrap();
}

// Buffer contains only last 100 values
assert_eq!(buf.len(), 100);
let window = buf.to_vec();  // [100.0, 101.0, ..., 199.0]
}

Implementation Details

VecDeque vs Vec

  • Buffer: Uses VecDeque<T> for O(1) pop_front()
  • BlockBuffer/RingBuffer: Use Vec<Option<T>> for circular buffer

Wrapping Arithmetic

RingBuffer uses wrapping_add for counters to handle overflow:

#![allow(unused)]
fn main() {
state.tail = state.tail.wrapping_add(1);
if state.tail.wrapping_sub(state.head) > capacity {
    state.head = state.head.wrapping_add(1);
}
}

Dual Condvar Pattern (BlockBuffer)

BlockBuffer uses two Condvars for precise signaling:

#![allow(unused)]
fn main() {
not_full: Condvar,   // Signals writers when space available
not_empty: Condvar,  // Signals readers when data available
}

Differences from Go Implementation

| Aspect | Go | Rust |
| --- | --- | --- |
| Internal storage | []T slice | Vec<Option<T>> or VecDeque<T> |
| Buffer.Next() | LIFO (pops from end) | FIFO (pops from front) |
| Bytes() / to_vec() | Returns internal slice | Returns copy |
| Cloning | Not supported | Via Arc (shared) |
| Error type | error interface | BufferError enum |
| Default impl | Via interface | Via Default trait |

Buffer Package - Known Issues

🟠 Major Issues

BUF-001: Go Buffer.Next() uses LIFO instead of FIFO

File: go/pkg/buffer/buffer.go:259-262

Description:
The Next() method reads from the END of the buffer (LIFO behavior), while Read() reads from the front (FIFO). This inconsistency is confusing and likely unintentional.

// Current implementation (LIFO)
head := len(b.buf) - 1
t = b.buf[head]
b.buf = b.buf[:head]

Expected: Should read from b.buf[0] for FIFO consistency with Read().

Impact: Users expecting iterator-style sequential access get reversed order.

Status: ⚠️ Documented in code comment but should be fixed.
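The difference between the current and the expected behavior comes down to which end of the slice is popped. The sketch below contrasts the two on a plain slice; popBack and popFront are illustrative helpers, not functions from the package, and a real fix would also need the locking and close checks that Next() already performs.

```go
package main

import "fmt"

// popBack removes the last element (the LIFO behavior described above).
func popBack[T any](buf []T) (T, []T, bool) {
	var zero T
	if len(buf) == 0 {
		return zero, buf, false
	}
	last := len(buf) - 1
	return buf[last], buf[:last], true
}

// popFront removes the first element (FIFO, consistent with Read()).
func popFront[T any](buf []T) (T, []T, bool) {
	var zero T
	if len(buf) == 0 {
		return zero, buf, false
	}
	return buf[0], buf[1:], true
}

func main() {
	lifo := []int{1, 2, 3}
	fifo := []int{1, 2, 3}
	for len(lifo) > 0 {
		var a, b int
		a, lifo, _ = popBack(lifo)
		b, fifo, _ = popFront(fifo)
		fmt.Println("popBack:", a, "popFront:", b)
	}
}
```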


BUF-002: Go Buffer.Add() missing write notification

File: go/pkg/buffer/buffer.go:280-291

Description:
The Add() method appends an element but does NOT send a notification on writeNotify. If a reader is blocked waiting and only Add() is used for writing, the reader may block indefinitely.

func (b *Buffer[T]) Add(t T) error {
    // ... error checks ...
    b.buf = append(b.buf, t)
    return nil  // Missing: select { case b.writeNotify <- struct{}{}: default: }
}

Impact: Potential deadlock when using Add() exclusively.

Status: 🔴 Bug - needs fix.
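
The missing step is a non-blocking send. The sketch below isolates that idiom; the notifier type is hypothetical, and it assumes writeNotify is a buffered channel of capacity 1, which the source does not state.

```go
package main

import "fmt"

// notifier is a hypothetical stand-in for the part of the buffer that owns
// the writeNotify channel.
type notifier struct {
	writeNotify chan struct{}
}

func newNotifier() *notifier {
	// Assumption: capacity 1. One pending wake-up is enough; extras are dropped.
	return &notifier{writeNotify: make(chan struct{}, 1)}
}

// notify never blocks the writer, even when no reader is waiting.
func (n *notifier) notify() {
	select {
	case n.writeNotify <- struct{}{}:
	default:
	}
}

func main() {
	n := newNotifier()
	n.notify()
	n.notify() // dropped: a wake-up is already pending
	fmt.Println(len(n.writeNotify)) // 1
}
```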


🟡 Minor Issues

BUF-003: Go Buffer.Bytes() returns internal slice reference

File: go/pkg/buffer/buffer.go:335-339

Description:
Bytes() returns the internal slice directly, not a copy. Modifications to the returned slice will corrupt the buffer state.

func (b *Buffer[T]) Bytes() []T {
    b.mu.Lock()
    defer b.mu.Unlock()
    return b.buf  // Returns internal reference!
}

Impact: Data corruption if caller modifies the returned slice.

Workaround: Document clearly or change to return a copy.
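
The difference between the two options can be seen in a small sketch. The store type and its alias and snapshot methods are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a buffer that owns a slice.
type store struct {
	buf []int
}

// alias hands out the internal slice, as Bytes() is described to do.
func (s *store) alias() []int { return s.buf }

// snapshot returns an independent copy that callers may modify freely.
func (s *store) snapshot() []int {
	return append([]int(nil), s.buf...)
}

func main() {
	s := &store{buf: []int{1, 2, 3}}

	snap := s.snapshot()
	snap[0] = 99
	fmt.Println(s.buf) // [1 2 3]: the copy is isolated

	view := s.alias()
	view[0] = 99
	fmt.Println(s.buf) // [99 2 3]: the alias writes through
}
```

Returning a copy costs one allocation per call, which is the trade-off against the safety it buys.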


BUF-004: Go BlockBuffer.Bytes() inconsistent copy behavior

File: go/pkg/buffer/block_buffer.go:356-365

Description:
Documentation says "returned slice is a copy" but when h < t, it returns a subslice of the internal buffer directly:

if h < t {
    return bb.buf[h:t]  // Not a copy!
}
return slices.Concat(bb.buf[h:], bb.buf[:t])  // This is a copy

Impact: Inconsistent behavior depending on buffer state.


BUF-005: Go RingBuffer.Bytes() same issue as BUF-004

File: go/pkg/buffer/ring_buffer.go:306-315

Description:
Same inconsistent copy behavior as BlockBuffer.


🔵 Enhancements

BUF-006: Rust BlockBuffer uses Vec<Option<T>> overhead

File: rust/buffer/src/block_buffer.rs:82

Description:
Rust implementation uses Vec<Option<T>> which adds memory overhead (size of discriminant per element) compared to Go's direct slice approach.

Suggestion: Consider using MaybeUninit<T> with careful initialization tracking for zero-cost abstraction.


BUF-007: Go/Rust Buffer.Next() semantic difference

Description:

  • Go: Next() is LIFO (pops from end)
  • Rust: next() is FIFO (pops from front via VecDeque)

This API inconsistency could cause bugs when porting code between languages.

Suggestion: Align Go implementation to match Rust (FIFO).


⚪ Notes

BUF-008: No io.Reader/io.Writer implementation in Rust

Description:
Go buffers implement io.Reader and io.Writer interfaces. Rust buffers don't implement std::io::Read and std::io::Write traits.

Reason: Rust buffers are generic over T: Clone, not just bytes.

Suggestion: Add byte-specific wrapper types that implement std::io traits.


BUF-009: Go BytesBuffer interface exposes Bytes()

File: go/pkg/buffer/bytes.go:12-23

Description:
The BytesBuffer interface includes Bytes() []byte but this is dangerous given BUF-003/004/005.

Suggestion: Consider removing from interface or ensuring all implementations return copies.


Summary

| ID | Severity | Status | Component |
|---|---|---|---|
| BUF-001 | 🟠 Major | Open | Go Buffer |
| BUF-002 | 🟠 Major | Open | Go Buffer |
| BUF-003 | 🟡 Minor | Open | Go Buffer |
| BUF-004 | 🟡 Minor | Open | Go BlockBuffer |
| BUF-005 | 🟡 Minor | Open | Go RingBuffer |
| BUF-006 | 🔵 Enhancement | Open | Rust BlockBuffer |
| BUF-007 | 🔵 Enhancement | Open | Go/Rust parity |
| BUF-008 | ⚪ Note | N/A | Rust |
| BUF-009 | ⚪ Note | N/A | Go Interface |

Encoding Package

JSON-serializable encoding types for binary data.

Design Goals

  1. Seamless JSON Integration: Binary data that automatically serializes to human-readable formats
  2. Type Safety: Distinct types for different encodings prevent mixing
  3. Zero-Copy Where Possible: Minimal allocations during serialization

Types

| Type | Encoding | JSON Example | Use Case |
|---|---|---|---|
| StdBase64Data | Standard Base64 | "aGVsbG8=" | Binary payloads, files |
| HexData | Hexadecimal | "deadbeef" | Hashes, IDs, debugging |

Features

JSON Serialization

Both types implement JSON marshal/unmarshal:

{
  "payload": "aGVsbG8gd29ybGQ=",
  "hash": "a1b2c3d4"
}

Null Handling

  • JSON null deserializes to empty/nil slice
  • Empty string "" deserializes to empty slice

String Representation

Both types implement String() / Display for easy logging:

StdBase64Data("hello") -> "aGVsbG8="
HexData([0xde, 0xad]) -> "dead"

Use Cases

API Payloads

Many APIs return binary data as Base64-encoded JSON strings:

{
  "audio_data": "UklGRi4AAABXQVZFZm10IBAAAAABAAEA..."
}

Hash Values

Cryptographic hashes are typically represented as hex:

{
  "sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}

Binary Protocol Debugging

Hex encoding is useful for debugging binary protocols:

{
  "raw_frame": "0102030405"
}

Examples Directory

  • examples/go/encoding/ - Go usage examples (if any)
  • examples/rust/encoding/ - Rust usage examples (if any)
  • minimax - Uses Base64 for audio data in API responses
  • doubaospeech - Uses Base64 for audio payloads
  • dashscope - Uses Base64 for binary data

Encoding Package - Go Implementation

Import: github.com/haivivi/giztoy/pkg/encoding

📚 Go Documentation

Types

StdBase64Data

type StdBase64Data []byte

A byte slice that serializes to/from standard Base64 in JSON.

Methods:

| Method | Signature | Description |
|---|---|---|
| MarshalJSON | () ([]byte, error) | Encode to JSON Base64 string |
| UnmarshalJSON | (data []byte) error | Decode from JSON Base64 string |
| String | () string | Return Base64-encoded string |

HexData

type HexData []byte

A byte slice that serializes to/from hexadecimal in JSON.

Methods:

| Method | Signature | Description |
|---|---|---|
| MarshalJSON | () ([]byte, error) | Encode to JSON hex string |
| UnmarshalJSON | (data []byte) error | Decode from JSON hex string |
| String | () string | Return hex-encoded string |

Usage

In Struct Fields

type Message struct {
    ID      string        `json:"id"`
    Payload StdBase64Data `json:"payload"`
    Hash    HexData       `json:"hash"`
}

msg := Message{
    ID:      "msg-123",
    Payload: StdBase64Data([]byte("hello world")),
    Hash:    HexData([]byte{0xab, 0xcd, 0xef}),
}

// Marshals to:
// {"id":"msg-123","payload":"aGVsbG8gd29ybGQ=","hash":"abcdef"}
data, _ := json.Marshal(msg)

Standalone Encoding

// Base64
data := StdBase64Data([]byte("hello"))
fmt.Println(data.String())  // "aGVsbG8="

// Hex
hash := HexData([]byte{0xde, 0xad})
fmt.Println(hash.String())  // "dead"

Null Handling

var data StdBase64Data
json.Unmarshal([]byte(`null`), &data)  // data is nil

json.Unmarshal([]byte(`""`), &data)    // data is []byte{}

Implementation Details

UnmarshalJSON Logic

Both types handle multiple JSON input types:

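// Simplified outline; needs encoding/base64, errors, and fmt.
func (b *StdBase64Data) UnmarshalJSON(data []byte) error {
    if len(data) == 0 {
        return errors.New("empty JSON input")
    }
    switch data[0] {
    case 'n': // null: leave the value unchanged
        return nil
    case '"': // string: strip the quotes, then decode the Base64 payload
        decoded, err := base64.StdEncoding.DecodeString(string(data[1 : len(data)-1]))
        if err != nil {
            return err
        }
        *b = decoded
        return nil
    default: // numbers, booleans, arrays, and objects are rejected
        return fmt.Errorf("expected null or string, got %s", data)
    }
}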
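
For a version that can be run end to end, the sketch below applies the same null/string handling to a hex-encoded type. hexBytes, encodeHex, and decodeHex are hypothetical names standing in for HexData and its callers; the package's real implementation may differ in detail.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// hexBytes is a hypothetical stand-in for HexData.
type hexBytes []byte

// MarshalJSON emits the bytes as a JSON string of hex digits.
func (h hexBytes) MarshalJSON() ([]byte, error) {
	return json.Marshal(hex.EncodeToString(h))
}

// UnmarshalJSON accepts null (no-op) or a JSON string of hex digits.
func (h *hexBytes) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil
	}
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err // not a JSON string
	}
	decoded, err := hex.DecodeString(s)
	if err != nil {
		return err // not valid hex
	}
	*h = decoded
	return nil
}

// encodeHex marshals h to its JSON form.
func encodeHex(h hexBytes) (string, error) {
	out, err := json.Marshal(h)
	return string(out), err
}

// decodeHex unmarshals one JSON value into hexBytes.
func decodeHex(jsonValue string) (hexBytes, error) {
	var h hexBytes
	err := json.Unmarshal([]byte(jsonValue), &h)
	return h, err
}

func main() {
	enc, _ := encodeHex(hexBytes{0xde, 0xad})
	fmt.Println(enc) // "dead"

	dec, err := decodeHex(`"dead"`)
	fmt.Println(dec, err)

	_, err = decodeHex(`42`)
	fmt.Println(err != nil) // true: numbers are rejected
}
```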

Direct Slice Alias

The Go implementation declares the type directly over a byte slice (type StdBase64Data []byte). Strictly speaking this is a defined type rather than an alias, which means:

  • No wrapper overhead
  • Can be converted directly to/from []byte
  • Shares underlying array with original slice

Dependencies

  • encoding/base64 (stdlib)
  • encoding/hex (stdlib)

Encoding Package - Rust Implementation

Crate: giztoy-encoding

📚 Rust Documentation

Types

StdBase64Data

#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
pub struct StdBase64Data(Vec<u8>);

A newtype wrapper around Vec<u8> that serializes to/from standard Base64.

Methods:

| Method | Signature | Description |
|---|---|---|
| new | fn new(data: Vec<u8>) -> Self | Create from Vec |
| empty | fn empty() -> Self | Create empty |
| as_bytes | fn as_bytes(&self) -> &[u8] | Get byte slice reference |
| as_bytes_mut | fn as_bytes_mut(&mut self) -> &mut Vec<u8> | Get mutable reference |
| into_bytes | fn into_bytes(self) -> Vec<u8> | Consume and return Vec |
| is_empty | fn is_empty(&self) -> bool | Check if empty |
| len | fn len(&self) -> usize | Get length |
| encode | fn encode(&self) -> String | Encode to Base64 string |
| decode | fn decode(s: &str) -> Result<Self, DecodeError> | Decode from Base64 |

Trait Implementations:

  • Serialize / Deserialize (serde)
  • Display (formats as Base64)
  • Deref<Target=[u8]> / DerefMut
  • From<Vec<u8>>, From<&[u8]>, From<[u8; N]>
  • AsRef<[u8]>

HexData

#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
pub struct HexData(Vec<u8>);

A newtype wrapper around Vec<u8> that serializes to/from hexadecimal.

Methods:

Same API as StdBase64Data, but with hex encoding:

| Method | Signature | Description |
|---|---|---|
| encode | fn encode(&self) -> String | Encode to hex string |
| decode | fn decode(s: &str) -> Result<Self, FromHexError> | Decode from hex |

Usage

In Struct Fields

use giztoy_encoding::{StdBase64Data, HexData};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct Message {
    id: String,
    payload: StdBase64Data,
    hash: HexData,
}

let msg = Message {
    id: "msg-123".to_string(),
    payload: StdBase64Data::from(b"hello world".as_slice()),
    hash: HexData::from(vec![0xab, 0xcd, 0xef]),
};

// Serializes to:
// {"id":"msg-123","payload":"aGVsbG8gd29ybGQ=","hash":"abcdef"}
let json = serde_json::to_string(&msg).unwrap();

Standalone Encoding

use giztoy_encoding::{StdBase64Data, HexData};

// Base64
let data = StdBase64Data::from(b"hello".as_slice());
println!("{}", data);  // "aGVsbG8="
println!("{}", data.encode());  // "aGVsbG8="

// Hex
let hash = HexData::from(vec![0xde, 0xad]);
println!("{}", hash);  // "dead"

Deref Coercion

use giztoy_encoding::StdBase64Data;

let data = StdBase64Data::from(vec![1, 2, 3]);

// Can use as &[u8] directly
fn process(bytes: &[u8]) { /* ... */ }
process(&data);  // Deref coercion

// Access slice methods
println!("len: {}", data.len());
println!("first: {:?}", data.first());

Null Handling

// Null deserializes to empty
let data: StdBase64Data = serde_json::from_str("null").unwrap();
assert!(data.is_empty());

// Empty string also empty
let data: StdBase64Data = serde_json::from_str(r#""""#).unwrap();
assert!(data.is_empty());

Dependencies

  • base64 crate (for Base64 encoding)
  • hex crate (for hex encoding)
  • serde crate (for serialization)

Differences from Go

| Aspect | Go | Rust |
|---|---|---|
| Type structure | Defined type over []byte | Newtype struct(Vec<u8>) |
| Conversion to bytes | Direct conversion | .as_bytes() or Deref |
| Additional methods | None | is_empty(), len(), encode(), decode() |
| Hash/Eq traits | N/A (slice) | Implemented |
| Clone | Implicit | Explicit (implemented) |

Encoding Package - Known Issues

⚪ Notes

ENC-001: Go/Rust type structure difference

Description:
Go declares a defined type over the slice (type StdBase64Data []byte), while Rust uses a newtype wrapper (struct StdBase64Data(Vec<u8>)).

Impact:

  • Go: Converts directly to []byte and shares memory
  • Rust: Requires .as_bytes() or deref coercion, owns memory

Status: By design - idiomatic in each language.


ENC-002: Rust has more utility methods

Description:
Rust implementation has additional methods not present in Go:

  • is_empty() / len()
  • encode() / decode() (standalone, not just JSON)
  • empty() constructor
  • as_bytes_mut() for mutation

Suggestion: Consider adding these to Go for parity.


ENC-003: Error handling difference

Description:

  • Go: Returns error on unmarshal failure
  • Rust: Returns Result<T, E> with specific error types (base64::DecodeError, hex::FromHexError)

Impact: Different error inspection patterns in each language.


🔵 Enhancements

ENC-004: Missing URL-safe Base64 variant

Description:
Only standard Base64 is implemented. URL-safe Base64 (base64.URLEncoding / URL_SAFE) is commonly needed for:

  • JWT tokens
  • URL parameters
  • Filename-safe identifiers

Suggestion: Add UrlBase64Data type.


ENC-005: No raw Base64 (no padding) variant

Description:
Some APIs use raw Base64 without = padding. Neither implementation supports this variant.

Suggestion: Add RawBase64Data or add encoding options.
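
On the Go side, the variants named in ENC-004 and ENC-005 already exist in the standard library, so new types would mostly be thin wrappers. The sketch below only compares the four stdlib encodings on the same input; encodeAll is a hypothetical helper, not part of the package.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeAll returns the input encoded with the four standard-library
// variants: standard, URL-safe, raw standard, raw URL-safe.
func encodeAll(b []byte) [4]string {
	return [4]string{
		base64.StdEncoding.EncodeToString(b),
		base64.URLEncoding.EncodeToString(b),
		base64.RawStdEncoding.EncodeToString(b),
		base64.RawURLEncoding.EncodeToString(b),
	}
}

func main() {
	// 0xfb 0xff is chosen because it hits the two characters that differ
	// between the alphabets and also needs one padding character.
	out := encodeAll([]byte{0xfb, 0xff})
	fmt.Println(out[0]) // +/8=
	fmt.Println(out[1]) // -_8=
	fmt.Println(out[2]) // +/8
	fmt.Println(out[3]) // -_8
}
```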


Summary

| ID | Severity | Status | Component |
|---|---|---|---|
| ENC-001 | ⚪ Note | By design | Go/Rust |
| ENC-002 | ⚪ Note | Open | Go |
| ENC-003 | ⚪ Note | By design | Go/Rust |
| ENC-004 | 🔵 Enhancement | Open | Both |
| ENC-005 | 🔵 Enhancement | Open | Both |

Overall: Clean implementation with no bugs found. Minor parity differences between Go and Rust.

JsonTime Package

JSON-serializable time types for API integrations.

Design Goals

  1. API Compatibility: Many APIs use Unix timestamps instead of ISO 8601 strings
  2. Type Safety: Distinct types prevent mixing seconds/milliseconds
  3. Bidirectional: Both serialization and deserialization supported

Types

| Type | JSON Format | Example | Use Case |
|---|---|---|---|
| Unix | Integer (seconds) | 1705315800 | General timestamps |
| Milli | Integer (milliseconds) | 1705315800000 | High-precision timestamps |
| Duration | String or Integer | "1h30m" or 5400000000000 | Time intervals |

Features

Unix Timestamps

Many APIs use Unix epoch timestamps rather than ISO 8601:

{
  "created_at": 1705315800,
  "updated_at": 1705316000
}

Millisecond Precision

JavaScript/browser APIs often use milliseconds:

{
  "timestamp": 1705315800000,
  "expires_at": 1705316000000
}
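
A minimal sketch of how such types can be built in Go follows. unixSeconds and unixMillis are hypothetical stand-ins for Unix and Milli; the package's actual method set is larger and its parsing may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"
)

// unixSeconds and unixMillis are hypothetical stand-ins for Unix and Milli.
type unixSeconds time.Time
type unixMillis time.Time

func (u unixSeconds) MarshalJSON() ([]byte, error) {
	return strconv.AppendInt(nil, time.Time(u).Unix(), 10), nil
}

func (u *unixSeconds) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // keep the current value
	}
	secs, err := strconv.ParseInt(string(data), 10, 64)
	if err != nil {
		return err
	}
	*u = unixSeconds(time.Unix(secs, 0))
	return nil
}

func (m unixMillis) MarshalJSON() ([]byte, error) {
	return strconv.AppendInt(nil, time.Time(m).UnixMilli(), 10), nil
}

func (m *unixMillis) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // keep the current value
	}
	ms, err := strconv.ParseInt(string(data), 10, 64)
	if err != nil {
		return err
	}
	*m = unixMillis(time.UnixMilli(ms))
	return nil
}

func main() {
	event := struct {
		CreatedAt unixSeconds `json:"created_at"`
		ExpiresAt unixMillis  `json:"expires_at"`
	}{
		CreatedAt: unixSeconds(time.Unix(1705315800, 0)),
		ExpiresAt: unixMillis(time.UnixMilli(1705316000000)),
	}
	out, err := json.Marshal(event)
	fmt.Println(string(out), err)
}
```

Keeping seconds and milliseconds in separate types means the unit is decided once, at the field declaration, rather than at every call site.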

Flexible Duration Parsing

Duration supports both human-readable strings and raw nanoseconds:

{
  "timeout": "30s",
  "interval": "1h30m",
  "precise_delay": 5000000000
}
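
The dual input format can be sketched in Go as below. flexDuration and parseFlex are hypothetical names; the sketch assumes that strings go through time.ParseDuration and that bare integers are nanoseconds, as described above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// flexDuration is a hypothetical stand-in for the package's Duration type.
type flexDuration time.Duration

// MarshalJSON always writes the human-readable string form.
func (d flexDuration) MarshalJSON() ([]byte, error) {
	return json.Marshal(time.Duration(d).String())
}

// UnmarshalJSON accepts null, a duration string, or integer nanoseconds.
func (d *flexDuration) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // keep the current value
	}
	if len(data) > 0 && data[0] == '"' {
		var s string
		if err := json.Unmarshal(data, &s); err != nil {
			return err
		}
		parsed, err := time.ParseDuration(s)
		if err != nil {
			return err
		}
		*d = flexDuration(parsed)
		return nil
	}
	var nanos int64
	if err := json.Unmarshal(data, &nanos); err != nil {
		return err
	}
	*d = flexDuration(nanos)
	return nil
}

// parseFlex decodes one JSON value into a time.Duration.
func parseFlex(jsonValue string) (time.Duration, error) {
	var d flexDuration
	err := json.Unmarshal([]byte(jsonValue), &d)
	return time.Duration(d), err
}

func main() {
	for _, in := range []string{`"30s"`, `"1h30m"`, `5000000000`} {
		d, err := parseFlex(in)
		fmt.Println(in, "->", d, err)
	}
}
```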

Time Operations

Both Unix and Milli support common time operations:

| Operation | Description |
|---|---|
| Before(t) | Is this time before t? |
| After(t) | Is this time after t? |
| Equal(t) | Are these times equal? |
| Sub(t) | Duration between times |
| Add(d) | Add duration to time |
| IsZero() | Is this the zero time? |

Duration String Format

The Duration type uses Go-style duration strings:

| Unit | Symbol | Example |
|---|---|---|
| Hours | h | 2h |
| Minutes | m | 30m |
| Seconds | s | 45s |
| Combined | - | 1h30m45s |

Examples Directory

  • examples/go/jsontime/ - Go usage examples (if any)
  • examples/rust/jsontime/ - Rust usage examples (if any)
  • minimax - Uses Milli for timestamps in API responses
  • doubaospeech - Uses Unix/Milli for audio timestamps
  • dashscope - Uses Duration for timeout configuration

JsonTime Package - Go Implementation

Import: github.com/haivivi/giztoy/pkg/jsontime

📚 Go Documentation

Types

Unix

type Unix time.Time

A time.Time that serializes to/from Unix seconds in JSON.

Methods:

| Method | Signature | Description |
|---|---|---|
| NowEpoch | func NowEpoch() Unix | Current time as Unix |
| Time | (ep Unix) Time() time.Time | Get underlying time.Time |
| Before | (ep Unix) Before(t Unix) bool | Is ep before t? |
| After | (ep Unix) After(t Unix) bool | Is ep after t? |
| Equal | (ep Unix) Equal(t Unix) bool | Are times equal? |
| Sub | (ep Unix) Sub(t Unix) time.Duration | Duration ep-t |
| Add | (ep Unix) Add(d time.Duration) Unix | Return ep+d |
| IsZero | (ep Unix) IsZero() bool | Is zero time? |
| String | (ep Unix) String() string | Formatted string |

Milli

type Milli time.Time

A time.Time that serializes to/from Unix milliseconds in JSON.

Methods: Same as Unix.

| Method | Signature | Description |
|---|---|---|
| NowEpochMilli | func NowEpochMilli() Milli | Current time as Milli |
| Time | (ep Milli) Time() time.Time | Get underlying time.Time |
| ... | | (same operations as Unix) |

Duration

type Duration time.Duration

A time.Duration that serializes to a string (e.g., "1m30s") in JSON.

Methods:

| Method | Signature | Description |
|---|---|---|
| FromDuration | func FromDuration(d time.Duration) *Duration | Create Duration pointer |
| Duration | (d *Duration) Duration() time.Duration | Get underlying duration |
| String | (d Duration) String() string | Formatted string (e.g., "1m30s") |
| Seconds | (d Duration) Seconds() float64 | As floating point seconds |
| Milliseconds | (d Duration) Milliseconds() int64 | As integer milliseconds |

Usage

In Struct Fields

type Event struct {
    ID        string   `json:"id"`
    CreatedAt Unix     `json:"created_at"`
    ExpiresAt Milli    `json:"expires_at"`
    Timeout   Duration `json:"timeout"`
}

event := Event{
    ID:        "evt-123",
    CreatedAt: NowEpoch(),
    ExpiresAt: NowEpochMilli(),
    Timeout:   Duration(30 * time.Second),
}

// Marshals to:
// {"id":"evt-123","created_at":1705315800,"expires_at":1705315800000,"timeout":"30s"}

Duration Parsing

Duration accepts both string and integer (nanoseconds) when unmarshaling:

type Config struct {
    Timeout Duration `json:"timeout"`
}

var cfg Config

// String format
json.Unmarshal([]byte(`{"timeout":"1h30m"}`), &cfg)
fmt.Println(cfg.Timeout.Duration())  // 1h30m0s

// Integer format (nanoseconds)
json.Unmarshal([]byte(`{"timeout":5400000000000}`), &cfg)
fmt.Println(cfg.Timeout.Duration())  // 1h30m0s

Time Arithmetic

now := NowEpoch()
later := now.Add(24 * time.Hour)

if later.After(now) {
    diff := later.Sub(now)
    fmt.Println(diff)  // 24h0m0s
}

Null Handling

var d Duration
json.Unmarshal([]byte(`null`), &d)  // d remains zero value

Implementation Details

Type Aliases

All three are defined types over time.Time or time.Duration (not aliases in the strict Go sense), so explicit conversions work in both directions:

// Unix -> time.Time
t := time.Time(myUnix)

// time.Time -> Unix
u := Unix(time.Now())

// Duration -> time.Duration
d := time.Duration(myDuration)

JSON Marshal Output

| Type | Go Value | JSON Output |
|---|---|---|
| Unix | Unix(time.Now()) | 1705315800 |
| Milli | Milli(time.Now()) | 1705315800000 |
| Duration | Duration(90*time.Second) | "1m30s" |

Dependencies

  • time (stdlib)
  • encoding/json (stdlib)

JsonTime Package - Rust Implementation

Crate: giztoy-jsontime

📚 Rust Documentation

Types

Unix

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
pub struct Unix(DateTime<Utc>);

A timestamp that serializes to/from Unix seconds in JSON.

Methods:

| Method | Signature | Description |
|---|---|---|
| new | fn new(dt: DateTime<Utc>) -> Self | Create from DateTime |
| now | fn now() -> Self | Current time |
| from_secs | fn from_secs(secs: i64) -> Self | Create from seconds |
| as_secs | fn as_secs(&self) -> i64 | Get seconds value |
| datetime | fn datetime(&self) -> DateTime<Utc> | Get underlying DateTime |
| before | fn before(&self, other: &Self) -> bool | Is this before other? |
| after | fn after(&self, other: &Self) -> bool | Is this after other? |
| is_zero | fn is_zero(&self) -> bool | Is zero time? |
| sub | fn sub(&self, other: &Self) -> Duration | Duration between times |
| add | fn add(&self, d: Duration) -> Self | Return self+d |

Trait Implementations:

  • Serialize / Deserialize (serde)
  • Display
  • From<DateTime<Utc>>, From<i64>
  • PartialOrd, Ord, Hash

Milli

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
pub struct Milli(DateTime<Utc>);

A timestamp that serializes to/from Unix milliseconds in JSON.

Methods: Same as Unix, but with milliseconds.

| Method | Signature | Description |
|---|---|---|
| from_millis | fn from_millis(ms: i64) -> Self | Create from milliseconds |
| as_millis | fn as_millis(&self) -> i64 | Get milliseconds value |
| ... | | (same operations as Unix) |

Duration

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
pub struct Duration(StdDuration);

A duration that serializes to string (e.g., "1h30m") and deserializes from string or nanoseconds.

Methods:

| Method | Signature | Description |
|---|---|---|
| new | fn new(d: StdDuration) -> Self | Create from std Duration |
| from_secs | fn from_secs(secs: u64) -> Self | Create from seconds |
| from_millis | fn from_millis(ms: u64) -> Self | Create from milliseconds |
| from_nanos | fn from_nanos(nanos: u64) -> Self | Create from nanoseconds |
| as_std | fn as_std(&self) -> StdDuration | Get std Duration |
| as_secs | fn as_secs(&self) -> u64 | Get whole seconds |
| as_secs_f64 | fn as_secs_f64(&self) -> f64 | Get floating seconds |
| as_millis | fn as_millis(&self) -> u128 | Get milliseconds |
| as_nanos | fn as_nanos(&self) -> u128 | Get nanoseconds |
| is_zero | fn is_zero(&self) -> bool | Is zero duration? |

Usage

In Struct Fields

use giztoy_jsontime::{Unix, Milli, Duration};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct Event {
    id: String,
    created_at: Unix,
    expires_at: Milli,
    timeout: Duration,
}

let event = Event {
    id: "evt-123".to_string(),
    created_at: Unix::now(),
    expires_at: Milli::now(),
    timeout: Duration::from_secs(30),
};

// Serializes to:
// {"id":"evt-123","created_at":1705315800,"expires_at":1705315800000,"timeout":"30s"}

Duration Parsing

use giztoy_jsontime::Duration;

// String format
let d: Duration = serde_json::from_str(r#""1h30m""#).unwrap();
assert_eq!(d.as_secs(), 5400);

// Integer format (nanoseconds)
let d: Duration = serde_json::from_str("5400000000000").unwrap();
assert_eq!(d.as_secs(), 5400);

Time Arithmetic

use giztoy_jsontime::Unix;
use std::time::Duration;

let now = Unix::now();
let later = now.add(Duration::from_secs(86400));

if later.after(&now) {
    let diff = later.sub(&now);
    println!("{:?}", diff);  // 86400s
}

From Conversions

use chrono::Utc;
use giztoy_jsontime::{Duration, Milli, Unix};

// From i64
let unix = Unix::from(1705315800i64);
let milli = Milli::from(1705315800000i64);

// From DateTime<Utc>
let unix = Unix::from(Utc::now());

// From std::time::Duration
let dur = Duration::from(std::time::Duration::from_secs(60));

Duration String Format

The parser supports Go-style duration strings:

| Input | Parsed As |
|---|---|
| "1h" | 3600 seconds |
| "30m" | 1800 seconds |
| "45s" | 45 seconds |
| "1h30m" | 5400 seconds |
| "1h30m45s" | 5445 seconds |
| "" | 0 seconds |

Dependencies

  • chrono crate (for DateTime handling)
  • serde crate (for serialization)

Differences from Go

| Aspect | Go | Rust |
|---|---|---|
| Time type | Defined type over time.Time | Newtype over DateTime<Utc> |
| Duration range | Signed (int64 ns) | Unsigned (u64 s + u32 ns) |
| Ordering | Via method calls | Via Ord trait |
| Hash support | N/A | Implemented |
| sub() return | Signed duration | Unsigned duration |

JsonTime Package - Known Issues

🟡 Minor Issues

JT-001: Rust Milli.sub() loses sign information

File: rust/jsontime/src/milli.rs:54-57

Description:
The sub() method returns an unsigned Duration, losing the sign when the result would be negative:

pub fn sub(&self, other: &Self) -> Duration {
    let diff = self.0.signed_duration_since(other.0);
    Duration::from_millis(diff.num_milliseconds().unsigned_abs())
}

Impact: Cannot determine if self is before or after other from the result alone.

Workaround: Use before() or after() methods to check ordering first.


JT-002: Rust Unix.sub() same issue

File: rust/jsontime/src/unix.rs:54-57

Description:
Same issue as JT-001 - loses sign information.


JT-003: Rust Duration parsing more restrictive than Go

File: rust/jsontime/src/duration.rs:123-158

Description:
Go's time.ParseDuration supports more units:

  • ns (nanoseconds)
  • us/µs (microseconds)
  • ms (milliseconds)

Rust implementation only supports h, m, s.

Impact: Duration strings with sub-second units fail to parse in Rust.

Example:

// Go - works
d, _ := time.ParseDuration("100ms")

// Rust - fails
let d: Duration = serde_json::from_str(r#""100ms""#)?;  // Error!
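
As a parity checklist, the following Go sketch lists inputs that time.ParseDuration accepts, including the sub-second units and fractional values. The nanos helper is a hypothetical wrapper added for the example.

```go
package main

import (
	"fmt"
	"time"
)

// nanos parses a Go-style duration string and returns it in nanoseconds.
func nanos(s string) (int64, error) {
	d, err := time.ParseDuration(s)
	return int64(d), err
}

func main() {
	inputs := []string{"250ns", "15us", "15µs", "100ms", "1.5s", "1h30m45s", "-2m"}
	for _, s := range inputs {
		n, err := nanos(s)
		fmt.Println(s, n, err)
	}
}
```

Note that Go also accepts a leading sign, which ties into JT-004: a negative string such as "-2m" has no representation in an unsigned Rust duration.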

JT-004: Rust Duration cannot be negative

Description:
Go's time.Duration is signed (int64), Rust's std::time::Duration is unsigned.

Impact: Cannot represent negative durations in Rust.

Status: By design (Rust stdlib limitation).


🔵 Enhancements

JT-005: Missing microsecond timestamp type

Description:
Some APIs (particularly high-frequency systems) use microsecond timestamps. Neither Go nor Rust implementation provides a Micro type.

Suggestion: Add Micro type for microsecond precision.


JT-006: Missing nanosecond timestamp type

Description:
Some APIs use nanosecond timestamps. No Nano type provided.

Suggestion: Add Nano type for nanosecond precision.


JT-007: Go Duration lacks explicit constructors

Description:
Go implementation lacks explicit constructors like Rust has:

  • from_secs()
  • from_millis()

Current Go usage:

d := Duration(30 * time.Second)

Suggested addition:

func DurationFromSeconds(s int64) Duration
func DurationFromMillis(ms int64) Duration
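
One possible shape for these constructors is sketched below. The Duration type here is a local stand-in declared the same way as the package type, and the function bodies are an assumption about how the suggestion might be implemented.

```go
package main

import (
	"fmt"
	"time"
)

// Duration is a local stand-in declared like the package type.
type Duration time.Duration

// DurationFromSeconds builds a Duration from whole seconds.
func DurationFromSeconds(s int64) Duration {
	return Duration(time.Duration(s) * time.Second)
}

// DurationFromMillis builds a Duration from whole milliseconds.
func DurationFromMillis(ms int64) Duration {
	return Duration(time.Duration(ms) * time.Millisecond)
}

func main() {
	fmt.Println(time.Duration(DurationFromSeconds(30)))  // 30s
	fmt.Println(time.Duration(DurationFromMillis(1500))) // 1.5s
}
```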

⚪ Notes

JT-008: Different underlying time libraries

Description:

  • Go: Uses stdlib time.Time
  • Rust: Uses chrono::DateTime<Utc>

Impact: Rust has hard dependency on chrono crate.


JT-009: Rust types implement more traits

Description:
Rust types implement PartialOrd, Ord, Hash which enables use in collections:

use std::collections::HashSet;
let mut times: HashSet<Unix> = HashSet::new();
times.insert(Unix::now());

Go types don't have equivalent functionality.


Summary

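| ID | Severity | Status | Component |
|---|---|---|---|
| JT-001 | 🟡 Minor | Open | Rust Milli |
| JT-002 | 🟡 Minor | Open | Rust Unix |
| JT-003 | 🟡 Minor | Open | Rust Duration |
| JT-004 | 🟡 Minor | By design | Rust Duration |
| JT-005 | 🔵 Enhancement | Open | Both |
| JT-006 | 🔵 Enhancement | Open | Both |
| JT-007 | 🔵 Enhancement | Open | Go |
| JT-008 | ⚪ Note | N/A | Rust |
| JT-009 | ⚪ Note | N/A | Rust |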

Overall: Functional implementation. Main concern is duration parsing parity between Go and Rust.

Trie Package

Generic trie data structure for efficient path-based storage and retrieval with MQTT-style wildcard support.

Design Goals

  1. Efficient Path Matching: O(k) lookup where k is path depth
  2. Wildcard Support: MQTT-style single (+) and multi-level (#) wildcards
  3. Generic Storage: Store any value type at path nodes
  4. Zero-Copy Lookups: Minimize allocations during get operations

Wildcard Patterns

The trie supports MQTT-style topic patterns:

| Pattern | Description | Example Match |
|---|---|---|
| a/b/c | Exact path | a/b/c only |
| a/+/c | Single-level wildcard | a/X/c, a/Y/c |
| a/# | Multi-level wildcard | a/b, a/b/c/d |

Pattern Rules

  1. + (Plus): Matches exactly one path segment

    • device/+/state matches device/gear-001/state
    • Does NOT match device/gear-001/sub/state
  2. # (Hash): Matches zero or more path segments

    • Must be the last segment in the pattern
    • logs/# matches logs, logs/app, logs/app/debug/line1
  3. Priority: Exact matches take precedence over wildcards

    • If both device/gear-001/state and device/+/state exist, exact wins
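
The first two rules can be checked with a standalone matcher. matchTopic below is a hypothetical function written for illustration; it compares one pattern against one topic and is not the trie's lookup code (priority between several patterns is a property of the trie, not of a single match).

```go
package main

import (
	"fmt"
	"strings"
)

// matchTopic reports whether topic matches pattern under the + and # rules.
func matchTopic(pattern, topic string) bool {
	p := strings.Split(pattern, "/")
	t := strings.Split(topic, "/")
	for i, seg := range p {
		if seg == "#" {
			// '#' is only valid as the final segment; it then matches the
			// rest of the topic, including nothing at all.
			return i == len(p)-1
		}
		if i >= len(t) {
			return false // topic is shorter than the pattern
		}
		if seg != "+" && seg != t[i] {
			return false // literal segment mismatch
		}
	}
	return len(p) == len(t) // no unmatched topic segments left over
}

func main() {
	fmt.Println(matchTopic("device/+/state", "device/gear-001/state"))     // true
	fmt.Println(matchTopic("device/+/state", "device/gear-001/sub/state")) // false
	fmt.Println(matchTopic("logs/#", "logs"))                              // true
	fmt.Println(matchTopic("logs/#", "logs/app/debug/line1"))              // true
}
```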

Use Cases

MQTT Topic Routing

device/+/state     -> state_handler
device/+/command   -> command_handler
logs/#             -> log_handler

API Path Routing

/users/{id}/profile  -> profile_handler
/users/{id}/posts    -> posts_handler
/admin/#             -> admin_handler

Hierarchical Configuration

app/database/host    -> "localhost"
app/database/port    -> 5432
app/cache/#          -> cache_config

Performance Characteristics

| Operation | Complexity | Notes |
|---|---|---|
| Set | O(k) | k = path depth |
| Get | O(k) | Zero allocation in Rust |
| Walk | O(n) | n = total nodes |
| Len | O(n) | Counts all values |

Examples Directory

  • examples/go/trie/ - Go usage examples (if any)
  • examples/rust/trie/ - Rust usage examples (if any)
  • mqtt0 - Uses trie for topic subscription matching
  • chatgear - Uses trie for message routing

Trie Package - Go Implementation

Import: github.com/haivivi/giztoy/pkg/trie

📚 Go Documentation

Types

Trie[T]

type Trie[T any] struct {
    children map[string]*Trie[T] // exact path segment matches
    matchAny *Trie[T]            // single-level wildcard (+)
    matchAll *Trie[T]            // multi-level wildcard (#)
    set      bool                // whether this node has a value
    value    T                   // the value stored
}

Methods:

| Method | Signature | Description |
|---|---|---|
| New | func New[T any]() *Trie[T] | Create empty trie |
| Set | (t *Trie[T]) Set(path string, setFunc func(*T, bool) error) error | Set with custom setter |
| SetValue | (t *Trie[T]) SetValue(path string, value T) error | Set value directly |
| Get | (t *Trie[T]) Get(path string) (*T, bool) | Get value pointer |
| GetValue | (t *Trie[T]) GetValue(path string) (T, bool) | Get value copy |
| Match | (t *Trie[T]) Match(path string) (route string, value *T, ok bool) | Get with matched route |
| Walk | (t *Trie[T]) Walk(f func(path string, value T, set bool)) | Visit all nodes |
| Len | (t *Trie[T]) Len() int | Count values |
| String | (t *Trie[T]) String() string | Debug representation |

ErrInvalidPattern

var ErrInvalidPattern = errors.New("invalid path pattern...")

Returned when # wildcard is not at the end of the path.

Usage

Basic Set/Get

tr := trie.New[string]()

// Set exact path
tr.SetValue("device/gear-001/state", "online")

// Get value
val, ok := tr.GetValue("device/gear-001/state")
// val = "online", ok = true

Wildcard Patterns

tr := trie.New[string]()

// Single-level wildcard
tr.SetValue("device/+/state", "state_handler")

// Multi-level wildcard
tr.SetValue("logs/#", "log_handler")

// Match against patterns
val, _ := tr.GetValue("device/any-device/state")
// val = "state_handler"

val, _ = tr.GetValue("logs/app/debug/line1")
// val = "log_handler"

Custom Set Function

tr := trie.New[[]string]()

// Append to existing value
tr.Set("handlers/events", func(ptr *[]string, existed bool) error {
    if !existed {
        *ptr = []string{"handler1"}
    } else {
        *ptr = append(*ptr, "handler2")
    }
    return nil
})

Walk All Nodes

tr.Walk(func(path string, value string, set bool) {
    if set {
        fmt.Printf("%s: %s\n", path, value)
    }
})

Match with Route

tr := trie.New[string]()
tr.SetValue("device/+/state", "handler")

route, value, ok := tr.Match("device/gear-001/state")
// route = "/+/state"
// *value = "handler" (value is a *string)
// ok = true

Implementation Details

Path Splitting

Paths are split by / and processed segment by segment:

// "device/gear-001/state" splits into:
// first="device", subseq="gear-001/state"

Value Storage

Uses a set boolean flag to distinguish between:

  • Value not set (default zero value)
  • Value explicitly set to zero value

Match Priority

  1. Exact child match
  2. Single-level wildcard (+)
  3. Multi-level wildcard (#)
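
The priority order can be illustrated with a cut-down trie. Everything in this sketch (node, insert, lookup, get) is hypothetical and simplified to string values; in particular, whether the real lookup backtracks from a failed exact branch to a wildcard branch is an assumption made here, and pattern validation is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a cut-down, hypothetical trie node that stores string values.
type node struct {
	children map[string]*node
	matchAny *node // '+' branch
	matchAll *node // '#' branch
	set      bool
	value    string
}

func newNode() *node { return &node{children: map[string]*node{}} }

// insert stores value at path. Validation ('#' must be last) is omitted.
func (n *node) insert(path, value string) {
	cur := n
	for _, seg := range strings.Split(path, "/") {
		var slot **node
		switch seg {
		case "+":
			slot = &cur.matchAny
		case "#":
			slot = &cur.matchAll
		default:
			child, ok := cur.children[seg]
			if !ok {
				child = newNode()
				cur.children[seg] = child
			}
			slot = &child
		}
		if *slot == nil {
			*slot = newNode()
		}
		cur = *slot
	}
	cur.set, cur.value = true, value
}

// lookup tries the exact child, then '+', then '#', in that order.
func (n *node) lookup(segs []string) (string, bool) {
	if len(segs) == 0 {
		if n.set {
			return n.value, true
		}
		if n.matchAll != nil && n.matchAll.set {
			return n.matchAll.value, true // '#' also matches zero segments
		}
		return "", false
	}
	if child := n.children[segs[0]]; child != nil {
		if v, ok := child.lookup(segs[1:]); ok {
			return v, true
		}
	}
	if n.matchAny != nil {
		if v, ok := n.matchAny.lookup(segs[1:]); ok {
			return v, true
		}
	}
	if n.matchAll != nil && n.matchAll.set {
		return n.matchAll.value, true
	}
	return "", false
}

func get(root *node, path string) (string, bool) {
	return root.lookup(strings.Split(path, "/"))
}

func main() {
	root := newNode()
	root.insert("device/+/state", "wildcard")
	root.insert("device/gear-001/state", "exact")
	root.insert("logs/#", "logs")
	fmt.Println(get(root, "device/gear-001/state")) // exact true
	fmt.Println(get(root, "device/gear-002/state")) // wildcard true
	fmt.Println(get(root, "logs/app/debug"))        // logs true
}
```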

Benchmarks

Typical performance (from benchmarks):

| Operation | 100 paths | 1000 paths | 10000 paths |
|---|---|---|---|
| Set all | ~50µs | ~500µs | ~5ms |
| Get (exact) | ~10µs | ~100µs | ~1ms |
| Walk | ~5µs | ~50µs | ~500µs |

Trie Package - Rust Implementation

Crate: giztoy-trie

📚 Rust Documentation

Types

Trie

#[derive(Debug, Clone)]
pub struct Trie<T> {
    children: HashMap<String, Trie<T>>,
    match_any: Option<Box<Trie<T>>>,  // single-level wildcard (+)
    match_all: Option<Box<Trie<T>>>,  // multi-level wildcard (#)
    value: Option<T>,
}

Methods:

| Method | Signature | Description |
|---|---|---|
| new | fn new() -> Self | Create empty trie |
| set | fn set<F, E>(&mut self, path: &str, setter: F) -> Result<(), E> | Set with custom setter |
| set_value | fn set_value(&mut self, path: &str, value: T) -> Result<(), InvalidPatternError> | Set value directly |
| get | fn get(&self, path: &str) -> Option<&T> | Get value reference (zero-alloc) |
| get_value | fn get_value(&self, path: &str) -> Option<T> | Get cloned value |
| match_path | fn match_path(&self, path: &str) -> (String, Option<&T>) | Get with matched route |
| walk | fn walk<F>(&self, f: F) | Visit all nodes |
| len | fn len(&self) -> usize | Count values |
| is_empty | fn is_empty(&self) -> bool | Check if empty |

Trait Implementations:

  • Default
  • Clone
  • Debug
  • Display (when T: Display)

InvalidPatternError

#[derive(Debug, Clone, PartialEq, Eq, thiserror::Error)]
#[error("invalid path pattern: path should be /a/b/c or /a/+/c or /a/#")]
pub struct InvalidPatternError;

Usage

Basic Set/Get

use giztoy_trie::Trie;

let mut trie = Trie::<String>::new();

// Set exact path
trie.set_value("device/gear-001/state", "online".to_string()).unwrap();

// Get value (zero allocation)
let val: Option<&String> = trie.get("device/gear-001/state");

// Get cloned value
let val: Option<String> = trie.get_value("device/gear-001/state");

Wildcard Patterns

use giztoy_trie::Trie;

let mut trie = Trie::<String>::new();

// Single-level wildcard
trie.set_value("device/+/state", "state_handler".to_string()).unwrap();

// Multi-level wildcard
trie.set_value("logs/#", "log_handler".to_string()).unwrap();

// Match against patterns
let val = trie.get("device/any-device/state");
assert_eq!(val, Some(&"state_handler".to_string()));

let val = trie.get("logs/app/debug/line1");
assert_eq!(val, Some(&"log_handler".to_string()));

Custom Set Function

let mut trie = Trie::<Vec<String>>::new();

// Set with custom logic
trie.set("handlers/events", |existing| {
    match existing {
        Some(vec) => {
            vec.push("handler2".to_string());
            Ok(vec.clone())
        }
        None => Ok(vec!["handler1".to_string()]),
    }
}).unwrap();

Walk All Nodes

trie.walk(|path, value| {
    println!("{}: {}", path, value);
});

Match with Route

let mut trie = Trie::<String>::new();
trie.set_value("device/+/state", "handler".to_string()).unwrap();

let (route, value) = trie.match_path("device/gear-001/state");
// route = "/device/+/state"
// value = Some(&"handler")

Implementation Details

Zero-Allocation Lookup

The get() method performs zero allocations by:

  • Using string slices for path splitting
  • Returning references instead of cloned values

#[inline]
fn split_path(path: &str) -> (&str, &str) {
    match path.find('/') {
        Some(idx) => (&path[..idx], &path[idx + 1..]),
        None => (path, ""),
    }
}

Value Storage

Uses Option<T> instead of a separate flag:

  • None = value not set
  • Some(T) = value set

Wildcard Storage

  • match_any: Option<Box<Trie<T>>> for + wildcard
  • match_all: Option<Box<Trie<T>>> for # wildcard

Boxed to avoid recursive type sizing issues.

Differences from Go

| Aspect | Go | Rust |
|---|---|---|
| Value storage | set bool + value T | Option<T> |
| Child storage | map[string]*Trie[T] | HashMap<String, Trie<T>> |
| Wildcard storage | *Trie[T] (pointer) | Option<Box<Trie<T>>> |
| Get return | (*T, bool) | Option<&T> |
| Clone support | Implicit (pointer) | Explicit Clone derive |
| Zero-alloc get | No (returns route string) | Yes (get() method) |

Trie Package - Known Issues

🟡 Minor Issues

TRI-001: Go Walk visits unset nodes

File: go/pkg/trie/trie.go:175-179

Description:
The Walk function visits ALL nodes including those without values set, passing the zero value:

func (t *Trie[T]) Walk(f func(path string, value T, set bool)) {
    t.walkWithPath(nil, func(path []string, node *Trie[T]) {
        f(strings.Join(path, "/"), node.value, node.set)  // value may be zero
    })
}

Impact: Callers must check the set boolean to filter actual values.

Suggestion: Consider only visiting nodes where set == true by default.


TRI-002: Go Len() is O(n) not O(1)

File: go/pkg/trie/trie.go:211-219

Description:
Len() walks the entire trie to count values:

func (t *Trie[T]) Len() int {
    count := 0
    t.Walk(func(_ string, _ T, set bool) {
        if set {
            count++
        }
    })
    return count
}

Impact: Performance issue for large tries with frequent Len() calls.

Suggestion: Maintain a counter that increments on Set and decrements on Delete.


TRI-003: Rust Len() same O(n) issue

File: rust/trie/src/lib.rs:292-296

Description:
Same issue as Go - walks entire trie to count.


TRI-004: No Delete operation

Description:
Neither the Go nor the Rust implementation provides a way to remove values from the trie.

Impact: Cannot remove stale subscriptions or routes without rebuilding.

Suggestion: Add Delete(path string) bool method.
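
One possible shape for Delete is sketched below, assuming a simplified trie with exact segments only. The node type and its methods are hypothetical and are not the giztoy implementation; wildcard children would need the same pruning treatment.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical, simplified trie node (no wildcards).
type node[T any] struct {
	children map[string]*node[T]
	value    T
	set      bool
}

func (n *node[T]) SetValue(path string, v T) {
	cur := n
	for _, seg := range strings.Split(path, "/") {
		if cur.children == nil {
			cur.children = make(map[string]*node[T])
		}
		next, ok := cur.children[seg]
		if !ok {
			next = &node[T]{}
			cur.children[seg] = next
		}
		cur = next
	}
	cur.value, cur.set = v, true
}

func (n *node[T]) Get(path string) (T, bool) {
	cur := n
	for _, seg := range strings.Split(path, "/") {
		next, ok := cur.children[seg]
		if !ok {
			var zero T
			return zero, false
		}
		cur = next
	}
	return cur.value, cur.set
}

// Delete clears the value at path and reports whether one was set.
func (n *node[T]) Delete(path string) bool {
	return n.remove(strings.Split(path, "/"))
}

// remove recurses to the target node, then prunes branches that are left
// with neither a value nor children on the way back up.
func (n *node[T]) remove(segs []string) bool {
	if len(segs) == 0 {
		if !n.set {
			return false
		}
		var zero T
		n.value, n.set = zero, false
		return true
	}
	child, ok := n.children[segs[0]]
	if !ok || !child.remove(segs[1:]) {
		return false
	}
	if !child.set && len(child.children) == 0 {
		delete(n.children, segs[0])
	}
	return true
}

func main() {
	root := &node[string]{}
	root.SetValue("device/gear-001/state", "online")
	fmt.Println(root.Delete("device/gear-001/state")) // prints true
	fmt.Println(root.Delete("device/gear-001/state")) // prints false
}
```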


TRI-005: Go Match returns route with leading slash inconsistency

File: go/pkg/trie/trie.go:142-172

Description:
When building the matched route string, it prepends "/" to each segment:

ch.match(matched+"/"+first, subseq)  // Results in "/device/+/state"

The root path, however, returns an empty string with no leading slash, so the route format is inconsistent.


🔵 Enhancements

TRI-006: No thread safety

Description:
Neither implementation is thread-safe. Concurrent read/write will cause data races.

Go:

// UNSAFE: concurrent access
go trie.Set("a/b", value1)
go trie.Set("a/c", value2)

Suggestion: Add sync.RWMutex wrapper or document thread-safety requirements.
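
A wrapper along these lines is one way to do it. This is a minimal sketch in which a plain map stands in for the trie; SafeStore and fillConcurrently are hypothetical names, and the same locking pattern would wrap the real Set and Get methods.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeStore is a hypothetical wrapper. A plain map stands in for the trie;
// the locking pattern is the part being illustrated.
type SafeStore[T any] struct {
	mu    sync.RWMutex
	items map[string]T
}

func NewSafeStore[T any]() *SafeStore[T] {
	return &SafeStore[T]{items: make(map[string]T)}
}

// Set takes the write lock, excluding all other readers and writers.
func (s *SafeStore[T]) Set(path string, v T) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[path] = v
}

// Get takes the read lock, so concurrent lookups do not block each other.
func (s *SafeStore[T]) Get(path string) (T, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.items[path]
	return v, ok
}

// fillConcurrently writes n distinct paths from n goroutines and returns
// how many entries were stored.
func fillConcurrently(s *SafeStore[int], n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Set(fmt.Sprintf("a/%d", i), i)
		}(i)
	}
	wg.Wait()
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.items)
}

func main() {
	s := NewSafeStore[int]()
	fmt.Println(fillConcurrently(s, 100)) // prints 100
}
```

An RWMutex suits routing tables that are read far more often than they are written; a plain sync.Mutex would be simpler if writes dominate.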


TRI-007: No path parameter extraction

Description:
When matching device/+/state against device/gear-001/state, there's no way to extract gear-001 as a parameter.

Current: Only returns the matched route pattern and value.

Suggestion: Add MatchParams(path) (params map[string]string, value *T, ok bool).
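
Until such a method exists, the captures can be recovered from the route that Match already returns. The helper below is a hypothetical sketch, not part of the giztoy API. Because + carries no name, it returns captures by position rather than in a map, and it assumes the route has the leading "/" described in TRI-005.

```go
package main

import (
	"fmt"
	"strings"
)

// extractParams is a hypothetical helper. Given a matched route pattern and
// the concrete path, it returns the segment captured by each "+" and, as one
// joined string, the remainder captured by a "#".
func extractParams(route, path string) ([]string, bool) {
	// Assumption: Match reports routes with a leading "/" (see TRI-005).
	rs := strings.Split(strings.TrimPrefix(route, "/"), "/")
	ps := strings.Split(path, "/")
	var params []string
	for i, r := range rs {
		if r == "#" {
			// "#" swallows every remaining segment.
			return append(params, strings.Join(ps[i:], "/")), true
		}
		if i >= len(ps) {
			return nil, false // path is shorter than the route
		}
		if r == "+" {
			params = append(params, ps[i])
		} else if r != ps[i] {
			return nil, false // literal segment mismatch
		}
	}
	if len(rs) != len(ps) {
		return nil, false // path is longer than the route
	}
	return params, true
}

func main() {
	params, ok := extractParams("/device/+/state", "device/gear-001/state")
	fmt.Println(params, ok) // prints: [gear-001] true
}
```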


TRI-008: No prefix listing

Description:
Cannot list all paths under a prefix efficiently.

Example use case: List all devices under device/ prefix.

Suggestion: Add List(prefix string) []string method.
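
A prefix listing can descend to the prefix node and collect only that subtree. The sketch below assumes a simplified trie that records paths without values or wildcards; pathSet and its methods are hypothetical, not the giztoy Trie[T].

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// pathSet is a hypothetical, simplified trie that only records which paths
// are set.
type pathSet struct {
	children map[string]*pathSet
	set      bool
}

func (n *pathSet) Add(path string) {
	cur := n
	for _, seg := range strings.Split(path, "/") {
		if cur.children == nil {
			cur.children = make(map[string]*pathSet)
		}
		next, ok := cur.children[seg]
		if !ok {
			next = &pathSet{}
			cur.children[seg] = next
		}
		cur = next
	}
	cur.set = true
}

// List returns every stored path at or below prefix in sorted order. It
// walks down to the prefix node first, so only that subtree is visited.
func (n *pathSet) List(prefix string) []string {
	prefix = strings.TrimSuffix(prefix, "/")
	cur := n
	for _, seg := range strings.Split(prefix, "/") {
		next, ok := cur.children[seg]
		if !ok {
			return nil
		}
		cur = next
	}
	var out []string
	cur.collect(prefix, &out)
	sort.Strings(out)
	return out
}

func (n *pathSet) collect(path string, out *[]string) {
	if n.set {
		*out = append(*out, path)
	}
	for seg, child := range n.children {
		child.collect(path+"/"+seg, out)
	}
}

func main() {
	var root pathSet
	root.Add("device/gear-001/state")
	root.Add("device/gear-002/state")
	root.Add("logs/app")
	fmt.Println(root.List("device/"))
	// prints: [device/gear-001/state device/gear-002/state]
}
```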


⚪ Notes

TRI-009: Different value storage approaches

Description:

  • Go: Uses set bool flag with zero value
  • Rust: Uses Option<T>

Both approaches work but have different trade-offs:

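  • Go: The explicit set flag distinguishes "set to the zero value" from "not set"
  • Rust: Option<T> draws the same distinction idiomatically, and niche optimization can avoid the extra flag for some types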

TRI-010: Path leading slash handling

Description:
Paths can start with or without /:

  • "/a/b/c" and "a/b/c" are NOT equivalent
  • Leading / creates an empty string segment

trie.SetValue("/a/b", "val1")  // path segments: ["", "a", "b"]
trie.SetValue("a/b", "val2")   // path segments: ["a", "b"]

Status: Documented behavior but may be confusing.
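
The segment difference follows directly from how strings.Split treats a leading separator. The sketch below shows it, along with a hypothetical normalize helper that callers could apply before SetValue or Get if they want both spellings to address the same node. Neither function is part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// segments splits a path on "/", which is how a leading slash ends up as an
// empty first segment.
func segments(path string) []string {
	return strings.Split(path, "/")
}

// normalize is a hypothetical helper that drops one leading "/" so that
// "/a/b" and "a/b" produce the same segments.
func normalize(path string) string {
	return strings.TrimPrefix(path, "/")
}

func main() {
	fmt.Printf("%q\n", segments("/a/b"))            // prints ["" "a" "b"]
	fmt.Printf("%q\n", segments("a/b"))             // prints ["a" "b"]
	fmt.Printf("%q\n", segments(normalize("/a/b"))) // prints ["a" "b"]
}
```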


Summary

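| ID | Severity | Status | Component |
|---|---|---|---|
| TRI-001 | 🟡 Minor | Open | Go Walk |
| TRI-002 | 🟡 Minor | Open | Go Len |
| TRI-003 | 🟡 Minor | Open | Rust Len |
| TRI-004 | 🟡 Minor | Open | Both |
| TRI-005 | 🟡 Minor | Open | Go Match |
| TRI-006 | 🔵 Enhancement | Open | Both |
| TRI-007 | 🔵 Enhancement | Open | Both |
| TRI-008 | 🔵 Enhancement | Open | Both |
| TRI-009 | ⚪ Note | N/A | Both |
| TRI-010 | ⚪ Note | N/A | Both |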

Overall: Solid implementation for basic trie operations. Production use still needs a Delete operation and thread safety.

CLI Package

Common CLI utilities for giztoy command-line tools.

Design Goals

  1. Consistent UX: Shared patterns across all giztoy CLI tools
  2. kubectl-style Contexts: Multiple API configurations with context switching
  3. Flexible Output: Support JSON, YAML, and raw output formats
  4. Cross-Platform Paths: Standard directory structure for config/cache/logs

Components

| Component | Description |
|---|---|
| Config | Multi-context configuration management |
| Output | Output formatting (JSON, YAML, raw) |
| Paths | Directory structure (~/.giztoy/<app>/) |
| Request | Load request data from YAML/JSON files |
| LogWriter | Capture logs for TUI display |

Directory Structure

graph LR
    subgraph giztoy["~/.giztoy/"]
        subgraph minimax["minimax/"]
            m1[config.yaml]
            m2[cache/]
            m3[logs/]
            m4[data/]
        end
        subgraph doubao["doubao/"]
            d1[config.yaml]
            d2[cache/]
        end
        subgraph dashscope["dashscope/"]
            s1[...]
        end
    end

Configuration Format

current_context: production

contexts:
  production:
    name: production
    api_key: "sk-..."
    base_url: "https://api.example.com"
    timeout: 30
    extra:
      region: "us-west"
  
  development:
    name: development
    api_key: "sk-dev-..."
    base_url: "https://dev-api.example.com"

Context System

Similar to kubectl, supports multiple API contexts:

# List contexts
myapp config list

# Use a context
myapp config use production

# Add a context
myapp config add staging --api-key=sk-...

# Delete a context
myapp config delete staging

Output Formats

| Format | Flag | Description |
|---|---|---|
| YAML | --output=yaml | Default, human-readable |
| JSON | --output=json | Machine-readable |
| Raw | --output=raw | Binary/raw data |

Use Cases

API CLI Tools

# minimax CLI
minimax chat "Hello" --context=production --output=json

# doubao CLI  
doubao tts "Hello world" --output=audio.mp3

Configuration Management

# View current config
myapp config show

# Set default voice
myapp config set default_voice "zh-CN-Standard-A"

Examples Directory

  • go/cmd/minimax/ - MiniMax CLI using this package
  • go/cmd/doubaospeech/ - Doubao Speech CLI
  • rust/cmd/minimax/ - Rust MiniMax CLI

This package is used by all CLI tools in the project and provides a consistent user experience across the Go and Rust implementations.

CLI Package - Go Implementation

Import: github.com/haivivi/giztoy/pkg/cli

📚 Go Documentation

Types

Config

type Config struct {
    AppName        string              `yaml:"-"`
    CurrentContext string              `yaml:"current_context,omitempty"`
    Contexts       map[string]*Context `yaml:"contexts,omitempty"`
}

Methods:

| Method | Signature | Description |
|---|---|---|
| LoadConfig | func LoadConfig(appName string) (*Config, error) | Load from default path |
| LoadConfigWithPath | func LoadConfigWithPath(appName, path string) (*Config, error) | Load from custom path |
| Save | (c *Config) Save() error | Save to disk |
| Path | (c *Config) Path() string | Get config file path |
| Dir | (c *Config) Dir() string | Get config directory |
| AddContext | (c *Config) AddContext(name string, ctx *Context) error | Add context |
| DeleteContext | (c *Config) DeleteContext(name string) error | Delete context |
| UseContext | (c *Config) UseContext(name string) error | Set current context |
| GetContext | (c *Config) GetContext(name string) (*Context, error) | Get specific context |
| GetCurrentContext | (c *Config) GetCurrentContext() (*Context, error) | Get current context |
| ResolveContext | (c *Config) ResolveContext(name string) (*Context, error) | Resolve by name or current |
| ListContexts | (c *Config) ListContexts() []string | List all context names |

Context

type Context struct {
    Name         string              `yaml:"name"`
    Client       *ClientCredentials  `yaml:"client,omitempty"`
    Console      *ConsoleCredentials `yaml:"console,omitempty"`
    APIKey       string              `yaml:"api_key,omitempty"`
    BaseURL      string              `yaml:"base_url,omitempty"`
    Timeout      int                 `yaml:"timeout,omitempty"`
    MaxRetries   int                 `yaml:"max_retries,omitempty"`
    DefaultVoice string              `yaml:"default_voice,omitempty"`
    Extra        map[string]string   `yaml:"extra,omitempty"`
}

Output

type OutputFormat string

const (
    FormatYAML  OutputFormat = "yaml"
    FormatJSON  OutputFormat = "json"
    FormatTable OutputFormat = "table"
    FormatRaw   OutputFormat = "raw"
)

type OutputOptions struct {
    Format OutputFormat
    File   string
    Indent string
    Writer io.Writer
}

Functions:

| Function | Signature | Description |
|---|---|---|
| Output | func Output(result any, opts OutputOptions) error | Write formatted output |
| OutputBytes | func OutputBytes(data []byte, path string) error | Write binary data |
| PrintSuccess | func PrintSuccess(format string, args ...any) | Print ✓ message |
| PrintError | func PrintError(format string, args ...any) | Print error to stderr |
| PrintInfo | func PrintInfo(format string, args ...any) | Print ℹ message |
| PrintWarning | func PrintWarning(format string, args ...any) | Print ⚠ message |
| PrintVerbose | func PrintVerbose(verbose bool, format string, args ...any) | Conditional verbose |

Paths

type Paths struct {
    AppName string
    HomeDir string
}

Methods:

| Method | Signature | Description |
|---|---|---|
| NewPaths | func NewPaths(appName string) (*Paths, error) | Create paths instance |
| BaseDir | (p *Paths) BaseDir() string | ~/.giztoy |
| AppDir | (p *Paths) AppDir() string | ~/.giztoy/<app> |
| ConfigFile | (p *Paths) ConfigFile() string | ~/.giztoy/<app>/config.yaml |
| CacheDir | (p *Paths) CacheDir() string | ~/.giztoy/<app>/cache |
| LogDir | (p *Paths) LogDir() string | ~/.giztoy/<app>/logs |
| DataDir | (p *Paths) DataDir() string | ~/.giztoy/<app>/data |
| EnsureAppDir | (p *Paths) EnsureAppDir() error | Create app dir |
| CachePath | (p *Paths) CachePath(name string) string | Path in cache |
| LogPath | (p *Paths) LogPath(name string) string | Path in logs |
| DataPath | (p *Paths) DataPath(name string) string | Path in data |

Usage

Load Configuration

cfg, err := cli.LoadConfig("minimax")
if err != nil {
    log.Fatal(err)
}

// Get current context
ctx, err := cfg.GetCurrentContext()
if err != nil {
    log.Fatal(err)
}

fmt.Println("API Key:", cli.MaskAPIKey(ctx.APIKey))

Output Results

result := map[string]string{"status": "ok", "message": "done"}

// Output as JSON to stdout
if err := cli.Output(result, cli.OutputOptions{
    Format: cli.FormatJSON,
}); err != nil {
    log.Fatal(err)
}

// Output as YAML to file
if err := cli.Output(result, cli.OutputOptions{
    Format: cli.FormatYAML,
    File:   "output.yaml",
}); err != nil {
    log.Fatal(err)
}

// Print helpers for status messages
cli.PrintSuccess("Created context %q", "production")
cli.PrintError("Failed to connect: %v", err)
cli.PrintInfo("Using API endpoint: %s", baseURL)
cli.PrintWarning("Rate limit approaching")
cli.PrintVerbose(verbose, "Request: %+v", req)

Path Management

paths, err := cli.NewPaths("minimax")
if err != nil {
    log.Fatal(err)
}

// Ensure directories exist
paths.EnsureCacheDir()
paths.EnsureLogDir()

// Get paths
cachePath := paths.CachePath("response.json")
logPath := paths.LogPath("2024-01-15.log")

Dependencies

  • github.com/goccy/go-yaml - YAML parsing
  • encoding/json (stdlib) - JSON parsing

CLI Package - Rust Implementation

Crate: giztoy-cli

📚 Rust Documentation

Types

Config

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct Config {
    #[serde(skip)]
    pub app_name: String,

    #[serde(default, skip_serializing_if = "String::is_empty")]
    pub current_context: String,

    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub contexts: HashMap<String, Context>,

    #[serde(skip)]
    config_path: PathBuf,
}

Methods:

| Method | Signature | Description |
|---|---|---|
| default_config_dir | fn default_config_dir(app_name: &str) -> Option<PathBuf> | Get default config dir |
| default_config_path | fn default_config_path(app_name: &str) -> Option<PathBuf> | Get default config path |
| path | fn path(&self) -> &PathBuf | Get config file path |
| dir | fn dir(&self) -> Option<&Path> | Get config directory |
| save | fn save(&self) -> Result<()> | Save to disk |
| add_context | fn add_context(&mut self, name: &str, ctx: Context) -> Result<()> | Add context |
| delete_context | fn delete_context(&mut self, name: &str) -> Result<()> | Delete context |
| use_context | fn use_context(&mut self, name: &str) -> Result<()> | Set current context |
| get_context | fn get_context(&self, name: &str) -> Option<&Context> | Get specific context |
| get_current_context | fn get_current_context(&self) -> Option<&Context> | Get current context |
| resolve_context | fn resolve_context(&self, name: Option<&str>) -> Option<&Context> | Resolve by name or current |
| list_contexts | fn list_contexts(&self) -> Vec<&str> | List all context names |

Free Functions:

| Function | Signature | Description |
|---|---|---|
| load_config | fn load_config(app_name: &str, custom_path: Option<&str>) -> Result<Config> | Load config |
| save_config | fn save_config(app_name: &str, config: &Config, custom_path: Option<&str>) -> Result<()> | Save config |
| mask_api_key | fn mask_api_key(key: &str) -> String | Mask API key |

Context

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct Context {
    pub name: String,
    pub client: Option<ClientCredentials>,
    pub console: Option<ConsoleCredentials>,
    pub api_key: String,
    pub base_url: String,
    pub timeout: i32,
    pub max_retries: i32,
    pub default_voice: String,
    pub extra: HashMap<String, String>,
}

Output

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum OutputFormat {
    #[default]
    Yaml,
    Json,
}

pub struct Output {
    pub format: OutputFormat,
    pub file: Option<String>,
}

Methods:

| Method | Signature | Description |
|---|---|---|
| new | fn new(format: OutputFormat, file: Option<String>) -> Self | Create output config |
| write | fn write<T: Serialize>(&self, value: &T) -> Result<()> | Write formatted output |
| write_binary | fn write_binary(&self, data: &[u8], path: &str) -> Result<()> | Write binary data |

Free Functions:

| Function | Signature | Description |
|---|---|---|
| print_verbose | fn print_verbose(enabled: bool, message: &str) | Print verbose message |
| guess_extension | fn guess_extension(format: &str) -> &str | Guess file extension |

Paths

#[derive(Debug, Clone)]
pub struct Paths {
    pub app_name: String,
    pub home_dir: PathBuf,
}

Methods:

| Method | Signature | Description |
|---|---|---|
| new | fn new(app_name: impl Into<String>) -> io::Result<Self> | Create paths instance |
| base_dir | fn base_dir(&self) -> PathBuf | ~/.giztoy |
| app_dir | fn app_dir(&self) -> PathBuf | ~/.giztoy/<app> |
| config_file | fn config_file(&self) -> PathBuf | ~/.giztoy/<app>/config.yaml |
| cache_dir | fn cache_dir(&self) -> PathBuf | ~/.giztoy/<app>/cache |
| log_dir | fn log_dir(&self) -> PathBuf | ~/.giztoy/<app>/logs |
| data_dir | fn data_dir(&self) -> PathBuf | ~/.giztoy/<app>/data |
| ensure_app_dir | fn ensure_app_dir(&self) -> io::Result<()> | Create app dir |
| ensure_cache_dir | fn ensure_cache_dir(&self) -> io::Result<()> | Create cache dir |
| cache_path | fn cache_path(&self, name: &str) -> PathBuf | Path in cache |
| log_path | fn log_path(&self, name: &str) -> PathBuf | Path in logs |
| data_path | fn data_path(&self, name: &str) -> PathBuf | Path in data |

Usage

Load Configuration

use giztoy_cli::{load_config, mask_api_key};

let cfg = load_config("minimax", None)?;

if let Some(ctx) = cfg.get_current_context() {
    println!("API Key: {}", mask_api_key(&ctx.api_key));
}

Output Results

use giztoy_cli::{Output, OutputFormat};
use serde::Serialize;

// Avoid naming this struct Result, which would shadow the prelude's Result.
#[derive(Serialize)]
struct Status {
    status: String,
    message: String,
}

let result = Status {
    status: "ok".to_string(),
    message: "done".to_string(),
};

// Output as JSON to stdout
let output = Output::new(OutputFormat::Json, None);
output.write(&result)?;

// Output as YAML to file
let output = Output::new(OutputFormat::Yaml, Some("output.yaml".to_string()));
output.write(&result)?;

Path Management

use giztoy_cli::Paths;

let paths = Paths::new("minimax")?;

// Ensure directories exist
paths.ensure_cache_dir()?;
paths.ensure_log_dir()?;

// Get paths
let cache_path = paths.cache_path("response.json");
let log_path = paths.log_path("2024-01-15.log");

Dependencies

  • serde + serde_yaml + serde_json - Serialization
  • dirs - Home directory detection
  • anyhow - Error handling

Differences from Go

| Aspect | Go | Rust |
|---|---|---|
| Error handling | error return | anyhow::Result |
| Config loading | LoadConfig(app) | load_config(app, None) |
| Output formats | yaml, json, table, raw | yaml, json only |
| Print helpers | PrintSuccess, PrintError, etc. | print_verbose only |
| Path returns | string | PathBuf |

CLI Package - Known Issues

🟡 Minor Issues

CLI-001: Rust missing output formats

Description:
Go supports 4 output formats: yaml, json, table, raw. Rust only supports: yaml, json.

Impact: Rust CLI tools cannot write raw binary to stdout or produce table output.

Suggestion: Add Raw and Table format support to Rust.


CLI-002: Rust missing print helpers

Description:
Go has multiple print helpers with icons:

  • PrintSuccess (✓)
  • PrintError
  • PrintInfo (ℹ)
  • PrintWarning (⚠)
  • PrintVerbose

Rust only has print_verbose.

Impact: Inconsistent user experience between Go and Rust CLIs.

Suggestion: Add print helper functions to Rust.


CLI-003: Config file permissions

File: go/pkg/cli/config.go:143

Description:
Config file is created with 0600 permissions (owner read/write only), which is good. But the config directory is created with 0755 (world-readable).

os.MkdirAll(dir, 0755)  // Directory readable by others
os.WriteFile(c.configPath, data, 0600)  // File only owner

Impact: Directory structure is visible to other users, though file content is protected.

Suggestion: Consider 0700 for the config directory.
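
The tighter permissions could be applied as in the sketch below. saveConfig and demo are hypothetical helpers, not the giztoy Save method, and the example writes under a temporary directory rather than ~/.giztoy.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// saveConfig is a hypothetical helper. It creates the config directory as
// owner-only (0700) and the file as owner read/write (0600).
func saveConfig(path string, data []byte) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o700); err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o600)
}

// demo writes a config under a temporary directory and reports the
// permission bits of the created directory and file.
func demo() (os.FileMode, os.FileMode, error) {
	tmp, err := os.MkdirTemp("", "giztoy-demo-")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(tmp)

	path := filepath.Join(tmp, "minimax", "config.yaml")
	if err := saveConfig(path, []byte("current_context: production\n")); err != nil {
		return 0, 0, err
	}
	dirInfo, err := os.Stat(filepath.Dir(path))
	if err != nil {
		return 0, 0, err
	}
	fileInfo, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	return dirInfo.Mode().Perm(), fileInfo.Mode().Perm(), nil
}

func main() {
	dirPerm, filePerm, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("dir=%o file=%o\n", dirPerm, filePerm)
}
```

os.MkdirAll leaves the mode of an already existing directory unchanged, so directories created earlier with 0755 would need an explicit os.Chmod.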


CLI-004: Go Context.GetExtra returns empty string for missing keys

File: go/pkg/cli/config.go:224-228

Description:
GetExtra returns empty string "" for missing keys, making it impossible to distinguish between "key exists with empty value" and "key doesn't exist".

func (ctx *Context) GetExtra(key string) string {
    if ctx.Extra == nil {
        return ""
    }
    return ctx.Extra[key]  // Returns "" for missing key
}

Impact: Cannot differentiate missing vs empty extra values.

Suggestion: Add HasExtra(key string) bool or return (string, bool).
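
The two-value form could be as small as the sketch below. LookupExtra is a hypothetical method name, and the Context here carries only the Extra field from the documented struct.

```go
package main

import "fmt"

// Context carries only the Extra field of the documented struct.
type Context struct {
	Extra map[string]string
}

// LookupExtra is a hypothetical method. It follows the comma-ok idiom, so a
// key stored with an empty value is distinguishable from a missing key.
func (ctx *Context) LookupExtra(key string) (string, bool) {
	v, ok := ctx.Extra[key] // reading a nil map is safe and yields ok == false
	return v, ok
}

func main() {
	ctx := &Context{Extra: map[string]string{"region": ""}}
	_, ok := ctx.LookupExtra("region")
	fmt.Println(ok) // prints true
	_, ok = ctx.LookupExtra("zone")
	fmt.Println(ok) // prints false
}
```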


🔵 Enhancements

CLI-005: No config file locking

Description:
Neither the Go nor the Rust implementation locks the config file during read/write operations. Concurrent CLI processes could corrupt the config.

Suggestion: Implement file locking for Save operations.
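
One portable option is a lock file created with O_EXCL, sketched below. This is an advisory scheme rather than an OS-level file lock, withLockFile and demo are hypothetical helpers, and a lock left behind by a crashed process would need separate stale-lock handling.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

var errLocked = errors.New("config is locked by another process")

// withLockFile is a hypothetical helper. It creates lockPath exclusively,
// runs fn while holding it, and removes the lock file afterwards.
func withLockFile(lockPath string, fn func() error) error {
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if errors.Is(err, os.ErrExist) {
		return errLocked // someone else holds the lock
	}
	if err != nil {
		return err
	}
	f.Close()
	defer os.Remove(lockPath)
	return fn()
}

// demo tries to take the lock while it is already held, then again after
// it has been released.
func demo() (nested, after error) {
	dir, err := os.MkdirTemp("", "giztoy-lock-")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)

	lock := filepath.Join(dir, "config.yaml.lock")
	noop := func() error { return nil }

	nested = withLockFile(lock, func() error { return withLockFile(lock, noop) })
	after = withLockFile(lock, noop)
	return nested, after
}

func main() {
	nested, after := demo()
	fmt.Println(nested) // prints: config is locked by another process
	fmt.Println(after)  // prints: <nil>
}
```

A Save implementation would wrap its write in withLockFile and either retry or report errLocked to the user.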


CLI-006: No config validation

Description:
Config is loaded without validation. Invalid URLs, negative timeouts, etc. are not detected until runtime errors occur.

Suggestion: Add Validate() error method to Config/Context.
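
A validation pass might look like the sketch below. Validate is a hypothetical method, and the rules it applies (absolute base URL, non-negative timeout and retry count) are assumptions about what should count as invalid, not checks the package performs today.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// Context mirrors a few fields of the documented Context struct.
type Context struct {
	Name       string
	BaseURL    string
	Timeout    int
	MaxRetries int
}

// Validate is a hypothetical method. It collects every problem it finds so
// the user can fix them in one pass.
func (c *Context) Validate() error {
	var problems []string
	if c.BaseURL != "" {
		u, err := url.Parse(c.BaseURL)
		if err != nil || u.Scheme == "" || u.Host == "" {
			problems = append(problems, fmt.Sprintf("base_url %q is not an absolute URL", c.BaseURL))
		}
	}
	if c.Timeout < 0 {
		problems = append(problems, fmt.Sprintf("timeout %d is negative", c.Timeout))
	}
	if c.MaxRetries < 0 {
		problems = append(problems, fmt.Sprintf("max_retries %d is negative", c.MaxRetries))
	}
	if len(problems) == 0 {
		return nil
	}
	return errors.New("context " + c.Name + ": " + strings.Join(problems, "; "))
}

func main() {
	good := &Context{Name: "production", BaseURL: "https://api.example.com", Timeout: 30}
	bad := &Context{Name: "staging", BaseURL: "not a url", Timeout: -5}
	fmt.Println(good.Validate())
	fmt.Println(bad.Validate())
}
```

Calling Validate right after loading would surface these problems before the first request is made.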


CLI-007: Missing config migration

Description:
No mechanism to handle config format changes between versions. If schema changes, old configs may fail to load.

Suggestion: Add version field and migration support.


CLI-008: No environment variable support

Description:
API keys and other credentials must be stored in config file. No support for environment variable overrides.

Example:

# Desired behavior
export MINIMAX_API_KEY="sk-..."
minimax chat "Hello"  # Uses env var instead of config

Suggestion: Add env var lookup with config fallback.
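
The lookup order could be sketched as follows. resolveAPIKey is a hypothetical helper, and the variable name (the upper-cased app name followed by _API_KEY) is an assumption based on the MINIMAX_API_KEY example above. The lookup function is passed in so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveAPIKey is a hypothetical helper. It prefers <APP>_API_KEY from the
// environment and falls back to the key stored in the config context.
func resolveAPIKey(appName, configKey string, lookup func(string) (string, bool)) string {
	envName := strings.ToUpper(appName) + "_API_KEY"
	if v, ok := lookup(envName); ok && v != "" {
		return v // a non-empty environment value wins
	}
	return configKey
}

func main() {
	key := resolveAPIKey("minimax", "sk-from-config", os.LookupEnv)
	fmt.Println("resolved a key of length", len(key))
}
```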


⚪ Notes

CLI-009: Different YAML libraries

Description:

  • Go: Uses github.com/goccy/go-yaml
  • Rust: Uses serde_yaml

Both produce compatible output but may have minor formatting differences.


CLI-010: MaskAPIKey behavior for short keys

Description:
Both implementations mask the entire key if its length is <= 8:

if len(key) <= 8 {
    return strings.Repeat("*", len(key))
}

This means very short keys (e.g., "test") show as "****" with no visible characters.
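
The effect is easy to see in a small stand-in. In the sketch below only the short-key branch comes from the snippet above; how longer keys are masked (first and last four characters left visible) is an assumption made for illustration and may differ from the real MaskAPIKey.

```go
package main

import (
	"fmt"
	"strings"
)

// maskAPIKey is a hypothetical stand-in for MaskAPIKey.
func maskAPIKey(key string) string {
	if len(key) <= 8 {
		// Documented behaviour: short keys are masked completely.
		return strings.Repeat("*", len(key))
	}
	// Assumption: keep the first and last four characters visible.
	return key[:4] + strings.Repeat("*", len(key)-8) + key[len(key)-4:]
}

func main() {
	fmt.Println(maskAPIKey("test"))          // prints ****
	fmt.Println(maskAPIKey("sk-1234567890")) // prints sk-1*****7890
}
```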


CLI-011: Paths use dirs crate in Rust

Description:

  • Go: Uses os.UserHomeDir() (stdlib)
  • Rust: Uses dirs::home_dir() (external crate)

Both handle cross-platform home directory detection correctly.


Summary

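| ID | Severity | Status | Component |
|---|---|---|---|
| CLI-001 | 🟡 Minor | Open | Rust Output |
| CLI-002 | 🟡 Minor | Open | Rust Print |
| CLI-003 | 🟡 Minor | Open | Go Config |
| CLI-004 | 🟡 Minor | Open | Go Context |
| CLI-005 | 🔵 Enhancement | Open | Both |
| CLI-006 | 🔵 Enhancement | Open | Both |
| CLI-007 | 🔵 Enhancement | Open | Both |
| CLI-008 | 🔵 Enhancement | Open | Both |
| CLI-009 | ⚪ Note | N/A | Both |
| CLI-010 | ⚪ Note | N/A | Both |
| CLI-011 | ⚪ Note | N/A | Rust |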

Overall: Functional CLI utilities. Main gaps are feature parity between Go and Rust implementations.