voice-tts

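Voice processing skill: a complete voice input and output solution. Features: (1) Automatic speech recognition (ASR): transcribes the user's voice messages to text using Whisper. (2) Text-to-speech (TTS): converts text to speech using Edge TTS. Triggers: the user sends a voice message, or explicitly asks to "read ... aloud by voice", for a "voice reply", and so on. Supported platforms: Telegram, Discord, WhatsApp, Feishu/Lark. Every voice reply must be sent as both text and voice.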

Status: ACTIVE
Visibility: PUBLIC

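Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the voice-tts skill. If it is already installed, install the voice-tts skill directly.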

Overview

Skill Key: believe3344/voice-tts
Author: believe3344
Source Repo: openclaw/skills
Version: -
Source Path: skills/believe3344/voice-tts
Latest Commit SHA: a4dc0ada033fbf396877077400aa679167a79338

Extracted Content

SKILL.md excerpt

> ⚠️ After installation, remove the `.txt` suffix from the file names in the scripts/ directory, otherwise the scripts will not work!
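The renaming step in the warning above can be done by hand, or with a small helper like the sketch below. It is only an illustration: the function name `strip_txt_suffix` is hypothetical, and it assumes the scripts sit directly inside the `scripts/` directory (no subfolders).

```python
from pathlib import Path


def strip_txt_suffix(scripts_dir: Path) -> list[Path]:
    """Rename every `*.txt` file in scripts_dir to the same name without `.txt`.

    Hypothetical helper, not part of the skill. Returns the new paths.
    """
    renamed = []
    for src in sorted(scripts_dir.glob("*.txt")):
        dst = src.with_suffix("")  # e.g. "whisper.txt" -> "whisper"
        if dst.exists():
            continue  # never overwrite a file that is already in place
        src.rename(dst)
        renamed.append(dst)
    return renamed
```

Calling `strip_txt_suffix(Path("scripts"))` from the skill's installation directory would rename the files in place and leave everything else untouched.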

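# Voice Speech Processing Skill

A complete voice input and output solution that supports both automatic speech recognition (ASR) and text-to-speech (TTS).

## Feature overview

| Direction | Technology | Description |
|------|------|------|
| Speech → text | Whisper (local) | Automatically transcribes voice messages sent by the user |
| Text → speech | Edge TTS | Generates voice replies |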

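## Install dependencies

```bash
# Required packages (OpenAI Whisper is published on PyPI as openai-whisper;
# the PyPI package named "whisper" is an unrelated project)
pip install edge-tts openai-whisper torch click

# If ffmpeg is missing (required for audio processing)
# macOS: brew install ffmpeg
# Ubuntu: sudo apt install ffmpeg
```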
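To confirm the installation before wiring the skill into OpenClaw, a quick check along these lines may help. It is a sketch, not part of the skill: `missing_dependencies` is a hypothetical name, and the module names passed to it (`edge_tts`, `whisper`, `torch`, `click`) are the usual import names for the packages listed above.

```python
import importlib.util
import shutil


def missing_dependencies(modules, binaries):
    """Return the Python modules and executables that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when the executable is not on PATH.
    missing += [b for b in binaries if shutil.which(b) is None]
    return missing


if __name__ == "__main__":
    # An empty list means everything the skill needs was found.
    print(missing_dependencies(["edge_tts", "whisper", "torch", "click"], ["ffmpeg"]))
```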

## OpenClaw configuration

To use speech recognition in OpenClaw, edit `openclaw.json`:

```json
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "python3",
            "args": [
              "{{SkillPath}}/voice-tts/scripts/whisper",
              "--model",
              "base",
              "{{MediaPath}}"
            ]
          }
        ]
      }
    }
  }
}
```

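**Notes**:
- `{{SkillPath}}` is automatically replaced with the skill installation path
- `--model base` can be changed to another model such as `turbo`
- After editing, run `openclaw gateway restart` for the change to take effect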
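If `openclaw.json` already contains other settings, the audio entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that on an in-memory dictionary. The function name `enable_whisper` is hypothetical, and it assumes the file is plain JSON whose other keys should be left untouched.

```python
from copy import deepcopy

# The CLI model entry from the configuration above; index 2 of "args" is the model name.
WHISPER_ENTRY = {
    "type": "cli",
    "command": "python3",
    "args": [
        "{{SkillPath}}/voice-tts/scripts/whisper",
        "--model",
        "base",
        "{{MediaPath}}",
    ],
}


def enable_whisper(config: dict, model: str = "base") -> dict:
    """Return a copy of config with tools.media.audio enabled and the entry added."""
    cfg = deepcopy(config)
    audio = cfg.setdefault("tools", {}).setdefault("media", {}).setdefault("audio", {})
    audio["enabled"] = True
    entry = deepcopy(WHISPER_ENTRY)
    entry["args"][2] = model  # swap "base" for e.g. "turbo"
    models = audio.setdefault("models", [])
    if entry not in models:  # keep the operation idempotent
        models.append(entry)
    return cfg
```

A caller would load the file with `json.load`, pass the result through `enable_whisper`, write it back with `json.dump`, and then restart the gateway as described in the notes.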

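## Trigger scenarios

### Scenario 1: the user sends a voice message (ASR)

When the user sends a **voice message**:
1. The system automatically transcribes it to text (the transcript)
2. You need to understand the user's intent and reply
3. **Reply with both voice and text**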
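Step 3 can be pictured as pairing the reply text with a synthesized audio file. The sketch below builds a command line for the `edge-tts` CLI that the `edge-tts` package installs; it does not use the skill's own TTS script, which is not shown in this excerpt. The function names, the `voice_file` and `tts_command` keys, and the default voice `zh-CN-XiaoxiaoNeural` are illustrative assumptions.

```python
def edge_tts_command(text: str, out_path: str, voice: str = "zh-CN-XiaoxiaoNeural") -> list[str]:
    """Build the argv for the edge-tts CLI (assumed to be on PATH)."""
    return ["edge-tts", "--voice", voice, "--text", text, "--write-media", out_path]


def build_reply(text: str, out_path: str) -> dict:
    """Hypothetical reply structure: the text plus the audio file to attach."""
    return {
        "text": text,                # always send the written reply
        "voice_file": out_path,      # audio produced by running tts_command
        "tts_command": edge_tts_command(text, out_path),
    }
```

Running the command, for example with `subprocess.run(reply["tts_command"], check=True)`, would write the audio to `out_path`. How the file is then delivered depends on the platform and is outside this sketch.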

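### Scenario 2: the user explicitly asks for a voice reply (TTS)

When the user says any of the following:
- "用语音读..." ("read ... aloud by voice")
- "语音回复" ("voice reply")
- "读给我听" ("read it to me")
- "说出来" ("say it out loud")
- "text to speech"
- "TTS"
- "飞书语音" ("Feishu voice")
- or any other explicit request for voice output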
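The two scenarios can be approximated with a simple check, sketched below. This is only an illustration: the skill leaves "any other explicit request" to the assistant's judgment, which keyword matching cannot capture, and the name `wants_voice_reply` is hypothetical.

```python
import re

# Chinese trigger phrases from the list above, matched as plain substrings.
CJK_PHRASES = ("用语音读", "语音回复", "读给我听", "说出来", "飞书语音")

# Latin triggers must not sit inside a longer word (so "watts" does not match "tts").
LATIN_PATTERN = re.compile(r"(?<![a-z])(?:text to speech|tts)(?![a-z])", re.IGNORECASE)


def wants_voice_reply(message: str, is_voice_message: bool = False) -> bool:
    """Return True when the reply should include voice as well as text."""
    if is_voice_message:  # Scenario 1: the user sent a voice message
        return True
    if any(phrase in message for phrase in CJK_PHRASES):  # Scenario 2, Chinese
        return True
    return LATIN_PATTERN.search(message) is not None      # Scenario 2, Latin
```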

## Scripts

The skill ships with two scripts:

### 1. Speech recognition: whisper

Location: `{{SkillPath}}/scripts/whisper`

```bash
# Basic usage
python3 {{SkillPath}}/scripts/whisper audio.mp3

# Choose a model (default: base)
python3 {{SkillPath}}/scripts/whisper audio.mp3 --model turbo

# Output JSON (includes language detection)
python3 {{SkillPath}}/scripts/whisper audio.mp3 --json

# Include timestamps
python3 {{SkillPath}}/scripts/whisper audio.mp3 --timestamps
```
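When calling the script from Python rather than a shell, the same invocations can be assembled as an argument list. The sketch below uses only the flags shown above; `whisper_command` and `transcribe` are hypothetical names, the script path must be the real installed path (the `{{SkillPath}}` placeholder is not expanded here), and whether `--json` and `--timestamps` can be combined is an assumption. The output format is not documented in this excerpt, so `transcribe` returns the raw stdout.

```python
import subprocess


def whisper_command(script, audio, model="base", as_json=False, timestamps=False):
    """Build the argv for the skill's whisper script using the documented flags."""
    cmd = ["python3", script, audio, "--model", model]
    if as_json:
        cmd.append("--json")
    if timestamps:
        cmd.append("--timestamps")
    return cmd


def transcribe(script, audio, **options):
    """Run the script and return whatever it prints on stdout, unparsed."""
    result = subprocess.run(
        whisper_command(script, audio, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```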

**Available models**:

| Model | Size | Speed | Accuracy |
|------|------|------|------|
| tiny | 39M | Fastest |...

Analysis Signals

Dependencies

bun pip python edge-tts

External Services

telegram discord x