TopRank Skills

qwen3-audio

High-performance audio library for Apple Silicon with text-to-speech (TTS) and speech-to-text (STT).

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

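Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

Please first check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the qwen3-audio skill. If it is already installed, install the qwen3-audio skill directly.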

Overview

Skill Key: darknoah/qwen3-audio
Author: darknoah
Source Repo: openclaw/skills
Version: 0.0.3
Source Path: skills/darknoah/qwen3-audio
Latest Commit SHA: 0463e0cda3088a20a3786735b1afc571dfe12974

Extracted Content

SKILL.md excerpt

# Qwen3-Audio

## Overview

Qwen3-Audio is a high-performance audio processing library optimized for Apple Silicon (M1/M2/M3/M4). It delivers fast, efficient TTS and STT with support for multiple models, languages, and audio formats.

## Prerequisites

- Python 3.10+
- Apple Silicon Mac (M1/M2/M3/M4)

### Environment checks

Before using any capability, verify that all items in `./references/env-check-list.md` are complete.
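
The two prerequisites listed above can also be checked from Python. This is a minimal sketch, not part of the skill: `meets_prerequisites` is a hypothetical helper, and it assumes an Apple Silicon Mac reports `Darwin` and `arm64` through the standard `platform` module. It does not replace the checklist in `./references/env-check-list.md`, whose contents are not shown here.

```python
import platform
import sys


def meets_prerequisites(version=None, system=None, machine=None):
    """Return True if the interpreter and host match the listed prerequisites.

    Hypothetical helper. Arguments default to the running interpreter and host
    so the function can also be exercised with explicit values.
    """
    version = tuple(version or sys.version_info[:2])
    system = system or platform.system()
    machine = machine or platform.machine()

    python_ok = version >= (3, 10)
    # Assumption: Apple Silicon Macs report "Darwin" and "arm64".
    apple_silicon = system == "Darwin" and machine == "arm64"
    return python_ok and apple_silicon
```

Calling `meets_prerequisites()` with no arguments checks the current machine. A Python build running under Rosetta may report `x86_64` even on Apple Silicon hardware, so a `False` result is a prompt to look closer rather than a definitive answer.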

## Capabilities

### Text to Speech
```bash
uv run --python ".venv/bin/python" "./scripts/mlx-audio.py" tts --text "hello world" --output "/path_to_save.wav"
```

**Returns (JSON):**
```json
{
  "audio_path": "/path_to_save.wav",
  "duration": 1.234,
  "sample_rate": 24000
}
```
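
A caller will usually want to read this result programmatically. The sketch below assumes the script prints exactly the JSON object shown above on standard output; `parse_tts_result` is a hypothetical helper, and only the three field names come from the example.

```python
import json


def parse_tts_result(stdout_text):
    """Parse the JSON result of the tts command.

    Returns (audio_path, duration_seconds, sample_rate).
    Raises ValueError if an expected field is missing.
    """
    result = json.loads(stdout_text)
    # Field names are taken from the example output above.
    required = {"audio_path", "duration", "sample_rate"}
    missing = required - set(result)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return (
        result["audio_path"],
        float(result["duration"]),
        int(result["sample_rate"]),
    )
```

Failing loudly on a missing field makes it easier to notice if the script's output format differs from the excerpt.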

### Voice Cloning
Clone any voice from a reference audio sample. Provide the WAV file and its transcript:
```bash
uv run --python ".venv/bin/python" "./scripts/mlx-audio.py" tts --text "hello world" --output "/path_to_save.wav" --ref_audio "sample_audio.wav" --ref_text "This is what my voice sounds like."
```
- `--ref_audio`: reference audio to clone
- `--ref_text`: transcript of the reference audio
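
Before cloning, it can help to confirm that the reference sample is a readable WAV file and to see how long it is. The sketch below uses Python's standard `wave` module, which reads uncompressed PCM files. `describe_wav` is a hypothetical helper, and the source does not state any length or sample-rate requirement for the reference audio.

```python
import wave


def describe_wav(path):
    """Return (duration_seconds, sample_rate, channels) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()
    # Duration is the frame count divided by frames per second.
    return frames / rate, rate, channels
```

For example, `describe_wav("sample_audio.wav")` raises `wave.Error` if the file is not a WAV file the module can read, which is a quick way to catch a wrong file before starting synthesis.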

### Use Created Voice (Shortcut)
Use a voice created with `voice create` by its ID:
```bash
uv run --python ".venv/bin/python" "./scripts/mlx-audio.py" tts --text "hello world" --output "/path_to_save.wav" --ref_voice "my-voice-id"
```
This automatically loads `ref_audio` and `ref_text` from the voice profile.

### CustomVoice (Emotion Control)
Use predefined voices with emotion/style instructions:
```bash
uv run --python ".venv/bin/python" "./scripts/mlx-audio.py" tts --text "hello world" --output "/path_to_save.wav" --speaker "Ryan" --language "English" --instruct "Very happy and excited."
```
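
The `tts` commands in this section share the same prefix and differ only in the options that follow `--output`. A minimal sketch of building the argument list in Python, using only flag names that appear in the commands above; `build_tts_command` is a hypothetical helper, not part of the skill.

```python
def build_tts_command(text, output, python=".venv/bin/python",
                      script="./scripts/mlx-audio.py", **options):
    """Build the argument list for the tts subcommand (hypothetical helper)."""
    command = ["uv", "run", "--python", python, script,
               "tts", "--text", text, "--output", output]
    for name, value in options.items():
        if value is not None:
            # Each keyword becomes a "--name value" pair, e.g. speaker -> --speaker.
            command += [f"--{name}", str(value)]
    return command


command = build_tts_command(
    "hello world",
    "/path_to_save.wav",
    speaker="Ryan",
    language="English",
    instruct="Very happy and excited.",
)
```

The list can be passed to `subprocess.run(command, check=True, capture_output=True, text=True)`. Passing a list rather than a single shell string avoids quoting problems when the text or instruction contains quotes or other shell metacharacters.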

### VoiceDesign (Create Any Voice)
Create any voice from a text description:
```bash
uv run --python ".venv/bin/python" "./scripts/mlx-audio.py" tts --text "hello world" --output "/path_to_save.wav" --language "English" --instruct "A cheerful young female voice with high pitch and energetic t...
```