TopRank Skills


voice-recognition

Local speech-to-text with the OpenAI Whisper CLI. Supports 100+ languages, including Chinese and English, with translation to English and summarization.

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

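Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the voice-recognition skill. If it is already installed, install the voice-recognition skill directly.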

Overview

Skill Key: gykdly/voice-recognition
Author: gykdly
Source Repo: openclaw/skills
Version: 1.0.0
Source Path: skills/gykdly/voice-recognition
Latest Commit SHA: 6876d5f184b5c3eb8679e4747aa8b06eff06f471

Extracted Content

SKILL.md excerpt

# Voice Recognition (Whisper)

Local speech-to-text with OpenAI Whisper CLI.

## Features

- **Local processing** - Runs on your machine; no API key or usage fees
- **Multi-language** - 100+ languages, including Chinese and English
- **Translation** - Translates speech in other languages to English
- **Summarization** - Generates a quick summary of the transcript

## Usage

### Basic

```bash
# Chinese recognition
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a

# Force Chinese
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --zh

# English recognition
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --en

# Translate to English
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --translate

# With summary
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --summarize
```
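The script's internals are not shown in this excerpt, so the following is only a minimal sketch of how flags like `--zh`, `--en`, and `--translate` could map onto the Whisper CLI. The helper name `build_whisper_command` and its parameters are hypothetical; `--model`, `--language`, and `--task` are options of the `whisper` command itself. Summarization is not a Whisper feature, so it is left out here.

```python
from __future__ import annotations


def build_whisper_command(
    audio_path: str,
    language: str | None = None,
    translate: bool = False,
    model: str = "medium",
) -> list[str]:
    """Build an argument list for the `whisper` CLI (hypothetical helper).

    language: a code such as "zh" or "en"; None lets Whisper auto-detect.
    translate: if True, ask Whisper to translate the speech to English.
    """
    cmd = ["whisper", audio_path, "--model", model]
    if language:
        cmd += ["--language", language]
    if translate:
        cmd += ["--task", "translate"]
    return cmd


if __name__ == "__main__":
    # Roughly what a `--zh` invocation might run under the hood (assumption).
    print(" ".join(build_whisper_command("audio.m4a", language="zh")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the `whisper` command is installed.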

### Quick Command (add to ~/.zshrc)

```bash
alias voice="python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py"
```

Then use:

```bash
voice ~/Downloads/audio.m4a --zh
```

## Requirements

- OpenAI Whisper CLI: `brew install openai-whisper`
- Python 3.10+

## Files

- `scripts/voice识别_升级版.py` - Main script
- `scripts/voice_tool_README.md` - Documentation

## Supported Formats

- MP3, M4A, WAV, OGG, FLAC, WebM
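A wrapper could reject unsupported files before invoking Whisper. The sketch below checks the file extension against the formats listed above; the names `SUPPORTED_EXTENSIONS` and `is_supported_audio` are illustrative and not taken from the skill's script.

```python
from pathlib import Path

# Extensions taken from the "Supported Formats" list.
SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".ogg", ".flac", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the listed formats.

    The comparison is case-insensitive and looks only at the extension,
    not at the file's actual contents.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```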

## Language Support

100+ languages including:
- Chinese (zh)
- English (en)
- Japanese (ja)
- Korean (ko)
- And more...

## Notes

- Default model: `medium` (a balance of speed and accuracy)
- The first run downloads the model to `~/.cache/whisper`
- Processing time varies with audio length and model size
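To tell in advance whether the first-run download will happen, one option is to look for the model file in the cache directory. This is a sketch under two assumptions: that the cache is at `~/.cache/whisper` as noted above, and that the model is stored as `<model>.pt`. The helper names are hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def whisper_cache_dir() -> Path:
    """Cache location from the Notes section (assumed, not queried from Whisper)."""
    return Path.home() / ".cache" / "whisper"


def is_model_cached(model: str = "medium", cache_dir: Path | None = None) -> bool:
    """Return True if `<model>.pt` exists in the cache directory.

    The `<model>.pt` file name is an assumption about how Whisper stores models.
    """
    directory = cache_dir if cache_dir is not None else whisper_cache_dir()
    return (directory / f"{model}.pt").is_file()


if __name__ == "__main__":
    status = "cached" if is_model_cached() else "will be downloaded on first run"
    print(f"medium model: {status}")
```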
