Overview
- Skill Key
- franklu0819-lang/zhipu-asr
- Author
- franklu0819-lang
- Source Repo
- openclaw/skills
- Version
- -
- Source Path
- skills/franklu0819-lang/zhipu-asr
- Latest Commit SHA
- 42bc707031f51f68def488a534c976f74d838e7d
Automatic Speech Recognition (ASR) using the Zhipu AI (BigModel) GLM-ASR model. Use when you need to transcribe audio files to text. Supports Chinese audio transcription with context prompts, custom hotwords, and multiple audio formats.
Stars
0
Installs
0
Status
ACTIVE
Visibility
PUBLIC
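Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the zhipu-asr skill. If it is already installed, install the zhipu-asr skill directly.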
# Zhipu AI Automatic Speech Recognition (ASR)

Transcribe Chinese audio files to text using Zhipu AI's GLM-ASR model.

## Setup

**1. Get your API Key:** Get a key from [Zhipu AI Console](https://bigmodel.cn/usercenter/proj-mgmt/apikeys)

**2. Set it in your environment:**

```bash
export ZHIPU_API_KEY="your-key-here"
```

## Supported Audio Formats

- **WAV** - Recommended, best quality
- **MP3** - Widely supported
- **OGG** - Auto-converted to MP3
- **M4A** - Auto-converted to MP3
- **AAC** - Auto-converted to MP3
- **FLAC** - Auto-converted to MP3
- **WMA** - Auto-converted to MP3

> **Note:** The API accepts only WAV and MP3. The script uses ffmpeg to convert other formats to MP3 automatically, so you can pass any format that ffmpeg supports.

## File Constraints

- **Maximum file size:** 25 MB
- **Maximum duration:** 30 seconds
- **Recommended sample rate:** 16000 Hz or higher
- **Audio channels:** Mono or stereo

## Usage

### Basic Transcription

Transcribe an audio file with default settings:

```bash
bash scripts/speech_to_text.sh recording.wav
```

### Transcription with Context

Provide previous transcription or context for better accuracy:

```bash
bash scripts/speech_to_text.sh recording.wav "这是之前的转录内容,有助于提高准确性"
```

### Transcription with Hotwords

Use custom vocabulary to improve recognition of specific terms:

```bash
bash scripts/speech_to_text.sh recording.mp3 "" "人名,地名,专业术语,公司名称"
```

### Full Options

Combine context and hotwords:

```bash
bash scripts/speech_to_text.sh recording.wav "会议记录片段" "张三,李四,项目名称"
```

**Parameters:**

- `audio_file` (required): Path to audio file (.wav or .mp3; other formats are converted to MP3 first)
- `prompt` (optional): Previous transcription or context text (max 8000 chars)
- `hotwords` (optional): Comma-separated list of specific terms (max 100 words)

## Features

### Context Prompts

**Why use context prompts:**

- Improves accuracy in long conversations
- Helps with domain-specific terminology
- Mai...
# Zhipu AI ASR Skill

Automatic Speech Recognition (ASR) using the Zhipu AI (BigModel) GLM-ASR model. Transcribe Chinese audio files to text with high accuracy.

## Features

- 🎤 **Multiple Audio Formats**: WAV, MP3, OGG, M4A, AAC, FLAC, WMA
- 🇨🇳 **Chinese Language Support**: Optimized for Mandarin Chinese
- 📝 **Context Prompts**: Improve accuracy with previous transcription context
- 🔥 **Hotwords**: Custom vocabulary for specific terms (names, jargon, etc.)
- ⚡ **Fast Processing**: Real-time or faster transcription speed
- 🔄 **Auto Format Conversion**: Automatically converts unsupported formats to MP3

## Requirements

- `jq` - JSON processor
- `ffmpeg` - Audio format conversion
- `ZHIPU_API_KEY` environment variable

## Quick Start

```bash
# Install dependencies (if needed)
sudo apt-get install jq ffmpeg

# Set your API key
export ZHIPU_API_KEY="your-key-here"

# Transcribe an audio file
bash scripts/speech_to_text.sh recording.wav

# With context and hotwords
bash scripts/speech_to_text.sh recording.wav "previous context" "term1,term2,term3"
```

## File Constraints

- **Max file size**: 25 MB
- **Max duration**: 30 seconds
- **Supported formats**: WAV (recommended), MP3
- **Other formats**: Auto-converted to MP3

## Use Cases

- 🎙️ Meeting transcription
- 📚 Lecture recording
- 💼 Voice memos
- 🎞️ Video subtitle generation
- 📞 Call recording transcription

## Author

franklu0819-lang

## License

MIT
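
## Pre-flight Check (Sketch)

The file constraints and format rules listed above could be checked locally before a file is sent for transcription. The following is a minimal sketch and is not part of the skill: the function names `format_class`, `size_ok`, and `duration_ok` are hypothetical, the 25 MB limit is assumed to mean 25 × 1024 × 1024 bytes, and `duration_ok` assumes `ffprobe` (installed alongside ffmpeg) is on the `PATH`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not shipped with the zhipu-asr skill.

MAX_BYTES=$((25 * 1024 * 1024))  # assumption: "25 MB" means 25 MiB
MAX_SECONDS=30                   # maximum duration from File Constraints

# Prints "native" for formats the API accepts directly (WAV, MP3),
# and "convert" for anything else (which would need ffmpeg conversion).
format_class() {
  local ext="${1##*.}"
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    wav|mp3) echo "native" ;;
    *)       echo "convert" ;;
  esac
}

# Returns 0 if the file exists and is no larger than MAX_BYTES.
size_ok() {
  [ -f "$1" ] || return 1
  local bytes
  bytes="$(wc -c < "$1" | tr -d ' ')"
  [ "$bytes" -le "$MAX_BYTES" ]
}

# Returns 0 if ffprobe reports a numeric duration of at most MAX_SECONDS.
duration_ok() {
  local secs
  secs="$(ffprobe -v error -show_entries format=duration \
            -of default=noprint_wrappers=1:nokey=1 "$1")" || return 1
  awk -v s="$secs" -v max="$MAX_SECONDS" \
    'BEGIN { exit !(s ~ /^[0-9]+(\.[0-9]+)?$/ && s + 0 <= max) }'
}

# Example (commented out so that sourcing this file has no side effects):
# if size_ok recording.wav && duration_ok recording.wav; then
#   bash scripts/speech_to_text.sh recording.wav
# fi
```

Keeping each check in its own function lets a caller skip `duration_ok` on machines without ffprobe while still enforcing the size limit.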