Overview
- Skill Key: emzod/speakturbo-tts
- Author: emzod
- Source Repo: openclaw/skills
- Version: -
- Source Path: skills/emzod/speakturbo-tts
- Latest Commit SHA: ae1bf26f909eb7c88eaa7d88e3f8c25bab99fa7b
Give your agent the ability to speak to you in real time. Talk to your Claude! Ultra-fast text-to-speech (TTS) voice synthesis and audio output with ~90ms latency. 8 built-in voices for instant voice responses. For voice cloning, use the speak skill.
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
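Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the speakturbo-tts skill. If it is already installed, install the speakturbo-tts skill directly.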
# speakturbo - Talk to your Claude!
Give your agent the ability to speak to you in real time. Ultra-fast text-to-speech with ~90ms latency and 8 built-in voices.
## Quick Start
```bash
# Play immediately - you should hear "Hello world" through your speakers
speakturbo "Hello world"
# Output: ⚡ 92ms → ▶ 93ms → ✓ 1245ms
# Verify it's working by saving to file
speakturbo "Hello world" -o test.wav
ls -lh test.wav # Should show ~50-100KB file
```
**Output explained:** `⚡` = first audio received, `▶` = playback started, `✓` = done
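If an agent wants to log these timings, the status line can be parsed into numbers. The following is a minimal sketch that assumes the line always looks like the sample above (symbol, then a millisecond value); the function and key names are hypothetical, and whether the CLI prints this line to stdout or stderr is not stated here.

```python
import re

# One "<symbol> <number>ms" stage, e.g. "⚡ 92ms" (format assumed from the sample line).
STAGE_PATTERN = re.compile(r"(⚡|▶|✓)\s*(\d+)ms")

# Hypothetical key names for the three documented stages.
STAGE_NAMES = {
    "⚡": "first_audio_ms",
    "▶": "playback_started_ms",
    "✓": "done_ms",
}


def parse_status_line(line: str) -> dict[str, int]:
    """Return the millisecond value for every stage found in the status line."""
    return {STAGE_NAMES[symbol]: int(ms) for symbol, ms in STAGE_PATTERN.findall(line)}


timings = parse_status_line("⚡ 92ms → ▶ 93ms → ✓ 1245ms")
print(timings["first_audio_ms"])  # 92
```

Stages without a millisecond value (for example a line ending in `✓ done`) are simply left out of the result.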
## First Run
The **first execution takes 2-5 seconds** while the daemon starts and loads the model into memory. Subsequent calls take ~90ms to first sound.
```bash
# First run (slow - daemon starting)
speakturbo "Starting up" # ~2-5 seconds
# Second run (fast - daemon already running)
speakturbo "Now I'm fast" # ~90ms
```
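To see the startup cost on your own machine, you can time two consecutive invocations. This is a rough sketch, not a benchmark: `timed_run` is a hypothetical helper, and it measures the wall-clock time of the whole command (including playback), so the numbers will be larger than the time-to-first-sound figures quoted above. The first call is only slow if the daemon was not already running.

```python
import shutil
import subprocess
import time


def timed_run(argv: list[str]) -> float:
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


if shutil.which("speakturbo"):
    first = timed_run(["speakturbo", "Starting up", "-q"])
    second = timed_run(["speakturbo", "Now I'm fast", "-q"])
    print(f"first call: {first:.2f}s, second call: {second:.2f}s")
else:
    print("speakturbo not found on PATH; skipping timing")
```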
## Usage
```bash
# Basic - plays immediately (default voice: alba)
speakturbo "Hello world"
# Save to file (no audio playback)
speakturbo "Hello" -o output.wav
# Save to specific file
speakturbo "Goodbye" -o goodbye.wav
# Quiet mode (suppress status messages, still plays audio)
speakturbo "Hello" -q
# List available voices
speakturbo --list-voices
```
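When calling the CLI from an agent or script, the flags above can be assembled programmatically. The sketch below uses only the options shown in this section (`-o` and `-q`); `build_command` and `speak` are hypothetical helper names, not part of speakturbo.

```python
import subprocess
from typing import Optional


def build_command(text: str, output: Optional[str] = None, quiet: bool = False) -> list[str]:
    """Build the argument list for one speakturbo invocation."""
    argv = ["speakturbo", text]
    if output is not None:
        argv += ["-o", output]  # save to file instead of playing
    if quiet:
        argv.append("-q")  # suppress status messages
    return argv


def speak(text: str, output: Optional[str] = None, quiet: bool = False) -> None:
    """Run speakturbo and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(text, output, quiet), check=True)


print(build_command('She said "hello"', output="output.wav", quiet=True))
```

Because the text is passed as a single list element rather than through a shell, quotes inside it need no escaping.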
## Available Voices
| Voice | Type |
|-------|------|
| `alba` | Female (default) |
| `marius` | Male |
| `javert` | Male |
| `jean` | Male |
| `fantine` | Female |
| `cosette` | Female |
| `eponine` | Female |
| `azelma` | Female |
## Performance
| Metric | Value |
|--------|-------|
| Time to first sound | ~90ms (daemon warm) |
| First run | 2-5s (daemon startup) |
| Real-time factor | ~4x faster than real time |
| Sample rate | 24kHz mono |
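
The figures in this table can be turned into back-of-the-envelope estimates. The sketch below assumes 16-bit PCM samples and a 44-byte WAV header (neither is stated here), and reads "~4x" as synthesis running about four times faster than playback.

```python
SAMPLE_RATE_HZ = 24_000   # from the table: 24kHz
CHANNELS = 1              # from the table: mono
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM
WAV_HEADER_BYTES = 44     # assumption: canonical PCM WAV header
REAL_TIME_FACTOR = 4.0    # from the table: ~4x


def wav_size_bytes(duration_s: float) -> int:
    """Estimate the size of a WAV file holding duration_s seconds of audio."""
    audio_bytes = round(duration_s * SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE)
    return WAV_HEADER_BYTES + audio_bytes


def synthesis_time_s(duration_s: float) -> float:
    """Estimate how long it takes to synthesize duration_s seconds of audio."""
    return duration_s / REAL_TIME_FACTOR


print(wav_size_bytes(1.0))     # 48044
print(synthesis_time_s(10.0))  # 2.5
```

Under these assumptions one second of speech is about 47 KB, so a clip of a little over a second lands in the ~50-100KB range mentioned in Quick Start.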
## Architecture
```
speakturbo (Rust CLI, 2.2MB)
│
│ HTTP streaming (port 7125)
▼
speakturbo-daemon (Python + pocket-tts)
│
│ Model in memory, auto-shutdown after 1hr idle
▼
Audio playback (rodio)
```
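Since the CLI talks to the daemon over a local port, a script can check whether the daemon is already up before deciding to expect the slow first run. This is a minimal sketch that assumes the daemon listens on `127.0.0.1`; only the port number (7125) comes from the diagram, and `daemon_is_listening` is a hypothetical helper.

```python
import socket

DAEMON_PORT = 7125  # port shown in the architecture diagram


def daemon_is_listening(host: str = "127.0.0.1", port: int = DAEMON_PORT,
                        timeout_s: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if daemon_is_listening():
    print("daemon appears to be running; expect a fast response")
else:
    print("nothing listening on port 7125; the next call may take a few seconds")
```

A successful connection only shows that some process is bound to the port, not that it is speakturbo-daemon specifically.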
## Text Input
- **Encoding:** UTF-8
- **Quotes in text:** Use escaping: `speakturbo "She said \"hello\""`
- *...
```
███████╗██████╗ ███████╗ █████╗ ██╗ ██╗ ████████╗██╗ ██╗██████╗ ██████╗ ██████╗
██╔════╝██╔══██╗██╔════╝██╔══██╗██║ ██╔╝ ╚══██╔══╝██║ ██║██╔══██╗██╔══██╗██╔═══██╗
███████╗██████╔╝█████╗ ███████║█████╔╝ ██║ ██║ ██║██████╔╝██████╔╝██║ ██║
╚════██║██╔═══╝ ██╔══╝ ██╔══██║██╔═██╗ ██║ ██║ ██║██╔══██╗██╔══██╗██║ ██║
███████║██║ ███████╗██║ ██║██║ ██╗ ██║ ╚██████╔╝██║ ██║██████╔╝╚██████╔╝
╚══════╝╚═╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝╚═════╝ ╚═════╝
```
<h3 align="center">Talk to your Claude.</h3>
<p align="center">
<a href="https://speakturbo-site.vercel.app"><img src="https://img.shields.io/badge/website-speakturbo-f97316.svg" alt="Website"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="License"></a>
<img src="https://img.shields.io/badge/latency-~90ms-brightgreen.svg" alt="Latency">
<img src="https://img.shields.io/badge/platform-Apple%20Silicon-orange.svg" alt="Platform">
</p>
<p align="center">
<strong>~90ms to first sound. Realistic. Local. Private. Fast.</strong>
</p>
<p align="center">
<code>speakturbo "Hello world"</code> → <code>⚡ 92ms → ▶ 93ms → ✓ done</code>
</p>
---
## Install
**For AI Agents** (Claude Code, Cursor, Windsurf):
```bash
npx skills add EmZod/Speak-Turbo
```
**CLI only:**
```bash
pip install pocket-tts uvicorn fastapi
cd speakturbo-cli && cargo build --release
```
---
## Usage
```bash
speakturbo "Hello world" # Play instantly
speakturbo "Hello" -o out.wav # Save to file
speakturbo "Hello" -q # Quiet mode
speakturbo --list-voices # Show voices
```
---
## Voices
```
alba ██████████ Female (default)
marius ██████████ Male
javert ██████████ Male
jean ██████████ Male
fantine ██████████ Female
cosette ██████████ Female
eponine ██████████ Female
azelma ██████████ Female
```
---
## Performance
```
Time to...
```