TopRank Skills

Home / Claw Skills / 其他 / alicloud-ai-audio-asr-realtime
Official OpenClaw rules 15%

alicloud-ai-audio-asr-realtime

Use when low-latency realtime speech recognition is needed with Alibaba Cloud Model Studio Qwen ASR Realtime models, including streaming microphone input, live captions, or duplex voice agents.

Stars

0

Installs

0

Status

ACTIVE

Visibility

PUBLIC

安装方式

直接复制以下提示词,发送给你的 AI 助手即可完成安装。

请先检查是否已安装 SkillHub 商店,若未安装,请根据 https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md 安装 SkillHub 商店,然后安装 alicloud-ai-audio-asr-realtime 技能。 若已安装,则直接安装 alicloud-ai-audio-asr-realtime 技能。

Overview

Skill Key
cinience/alicloud-ai-audio-asr-realtime
Author
cinience
Source Repo
openclaw/skills
Version
1.0.0
Source Path
skills/cinience/alicloud-ai-audio-asr-realtime
Latest Commit SHA
7f8e36aa59e9017a2575fa730fb811154c6d1765

Extracted Content

SKILL.md excerpt

Category: provider

# Model Studio Qwen ASR Realtime

## Validation

```bash
mkdir -p output/alicloud-ai-audio-asr-realtime
python -m py_compile skills/ai/audio/alicloud-ai-audio-asr-realtime/scripts/prepare_realtime_asr_request.py && echo "py_compile_ok" > output/alicloud-ai-audio-asr-realtime/validate.txt
```

Pass criteria: command exits 0 and `output/alicloud-ai-audio-asr-realtime/validate.txt` is generated.

## Output And Evidence

- Save session payloads and response samples under `output/alicloud-ai-audio-asr-realtime/`.

## Critical model names

Use one of these exact model strings:
- `qwen3-asr-flash-realtime`
- `qwen3-asr-flash-realtime-2026-02-10`

## Use cases

- Realtime subtitles and captions
- Voice-agent duplex input
- Streaming speech-to-text in browser or terminal clients

## Prerequisites

- Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.
- Realtime sessions generally require WebSocket or streaming session handling in the client.

## Normalized interface (asr.realtime)

### Request
- `model` (string, optional): default `qwen3-asr-flash-realtime`
- `language_hints` (array<string>, optional)
- `format` (string, optional): e.g. `pcm`, `wav`
- `sample_rate` (int, optional): e.g. `16000`
- `chunk_ms` (int, optional): frame size in milliseconds

### Response
- `text` (string): recognized transcript fragment
- `is_final` (bool): finalization marker
- `usage` (object, optional)

## Quick start

Generate a request template:

```bash
python skills/ai/audio/alicloud-ai-audio-asr-realtime/scripts/prepare_realtime_asr_request.py \
  --output output/alicloud-ai-audio-asr-realtime/request.json
```

## Operational guidance

- Prefer 16kHz mono PCM unless your client stack requires another format.
- Keep chunks small enough for responsive partial results.
- If you only have recorded files, use `skills/ai/audio/alicloud-ai-audio-asr/` instead.

## References

- `references/sources.md`

Related Claw Skills