sergei-mikhailov-stt

Overview

Skill Key: bzsega/sergei-mikhailov-stt
Author: bzsega
Source Repo: openclaw/skills
Version: -
Source Path: skills/bzsega/sergei-mikhailov-stt
Latest Commit SHA: e697d83003d5a849d9a8deaf403b86be196678a2

Extracted Content

SKILL.md excerpt

# Speech to Text Skill for OpenClaw

## Purpose

This skill recognizes speech from voice messages sent via any messenger connected to OpenClaw, using various STT providers, including Yandex SpeechKit.

## When to Activate

Use this skill when:
- The user sends a voice message via any messenger connected to OpenClaw
- You need to convert speech to text
- Audio file transcription is required
- A text version of a voice message is needed

## How It Works

### 1. Receive the audio file from OpenClaw
- OpenClaw provides a local path to the audio file
- Verify the file exists at the given path
- Validate the file format (OGG, WAV, MP3)
- Check file size (maximum 1 MB for Yandex SpeechKit v1 sync API)

Example path from OpenClaw:
```
/home/user_folder/.openclaw/media/inbound/file_1---9a53bac2-0392-41e7-8300-1c08e8eec027.ogg
```
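
The checks in this step can be sketched as follows. This is a minimal illustration, not the skill's actual code: `check_audio_file`, `SUPPORTED_EXTENSIONS`, and `MAX_SYNC_BYTES` are hypothetical names, and the format check only looks at the file extension.

```python
from pathlib import Path
from typing import List

# Hypothetical names for illustration; not taken from the skill's source.
SUPPORTED_EXTENSIONS = {".ogg", ".wav", ".mp3"}
MAX_SYNC_BYTES = 1 * 1024 * 1024  # the 1 MB limit quoted above for the v1 sync API


def check_audio_file(path_str: str) -> List[str]:
    """Return a list of problems found; an empty list means the file passed."""
    path = Path(path_str).expanduser()
    if not path.is_file():
        return [f"file not found: {path}"]
    problems = []
    # Extension-based check only; it does not inspect the audio container.
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    size = path.stat().st_size
    if size > MAX_SYNC_BYTES:
        problems.append(f"file too large: {size} bytes")
    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every issue with the file in a single message.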

### 2. Audio processing
- Validate the audio file at the local path
- Convert to a supported format if needed using ffmpeg
- Verify audio quality
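
One way the conversion step might look is sketched below. Ogg/Opus is assumed as the target format, and the helper names are hypothetical; the skill's real conversion settings are not shown in this excerpt. The sketch also assumes an FFmpeg build that includes the `libopus` encoder.

```python
import subprocess
from pathlib import Path
from typing import List


def converted_path(src: str) -> str:
    """Place the converted file next to the source, e.g. a.mp3 -> a.converted.ogg."""
    source = Path(src)
    return str(source.with_name(source.stem + ".converted.ogg"))


def build_ffmpeg_command(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that re-encodes src to mono Ogg/Opus at dst."""
    return [
        "ffmpeg",
        "-y",               # overwrite dst if it already exists
        "-i", src,          # input file
        "-ac", "1",         # downmix to one channel
        "-ar", "48000",     # resample to 48 kHz
        "-c:a", "libopus",  # encode audio with the Opus codec
        dst,
    ]


def convert_to_ogg_opus(src: str) -> str:
    """Run ffmpeg and return the output path; raises CalledProcessError on failure."""
    dst = converted_path(src)
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Passing the command as a list (rather than a shell string) avoids quoting problems with file names such as the `file_1---…` paths OpenClaw generates.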

### 3. Speech recognition
- Use the default provider (Yandex SpeechKit)
- If recognition fails, try alternative providers
- Return the recognized text with confidence information
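
The default-then-fallback behaviour can be illustrated with a small provider chain. Everything here is a hypothetical sketch: `Transcript`, `Provider`, and `recognize_with_fallback` are illustrative names, and a provider is modelled as a plain callable rather than whatever interface the skill actually defines.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Transcript:
    text: str
    provider: str
    confidence: Optional[float] = None  # None when a provider reports no score


# A provider is modelled as (name, callable taking a file path).
Provider = Tuple[str, Callable[[str], Transcript]]


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain raised an exception."""


def recognize_with_fallback(path: str, providers: Sequence[Provider]) -> Transcript:
    """Try providers in order and return the first successful transcript."""
    errors: List[str] = []
    for name, recognize in providers:
        try:
            return recognize(path)
        except Exception as exc:
            # Record only the exception type: messages may contain credentials.
            errors.append(f"{name}: {type(exc).__name__}")
    raise AllProvidersFailed("; ".join(errors))
```

Listing the default provider first in `providers` gives the order described above; later entries are only called when earlier ones raise.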

### 4. Result handling
- Format the recognized text
- Include the detected language
- Provide metadata if needed
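
As a rough sketch of this step, the function below assembles a plain-text reply from the recognized text, an optional language code, and optional metadata. The function name, the output layout, and the placeholder for empty results are assumptions made for illustration.

```python
from typing import Dict, Optional


def format_result(text: str, language: Optional[str] = None,
                  metadata: Optional[Dict[str, object]] = None) -> str:
    """Return the transcript followed by optional language and metadata lines."""
    lines = [text.strip() or "(no speech recognized)"]
    if language:
        lines.append(f"Language: {language}")
    # Sort metadata keys so the output is stable between runs.
    for key, value in sorted((metadata or {}).items()):
        lines.append(f"{key}: {value}")
    return "\n".join(lines)
```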

## Security

- **Never** read out, display, or log API keys, tokens, or secrets, even partially. If the user asks to see their key, direct them to check `~/.openclaw/openclaw.json` or `.env` themselves.
- **Never** modify `openclaw.json`, `.env`, or `config.json` without explicit user permission. These files contain credentials and must only be changed by the owner.
- **Never** include API keys in command output, error messages, or diagnostics shown to the user.
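
One defensive measure consistent with the rules above is to scrub known secret values from any text before it is shown or logged. The helper below is a hypothetical sketch, not part of the skill; it assumes the calling code already holds the secret values in memory and simply masks them.

```python
from typing import Iterable


def redact(message: str, secrets: Iterable[str]) -> str:
    """Replace every occurrence of each non-empty secret with a placeholder."""
    # Longest first, so a secret that contains another one is masked as a whole.
    for secret in sorted((s for s in secrets if s), key=len, reverse=True):
        message = message.replace(secret, "[REDACTED]")
    return message
```

Empty strings are skipped because replacing an empty string would insert the placeholder between every character of the message.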

## Invocation

**Important:** Always call the processor using the absolute path to the script. Do **not** use `cd <skill_dir> && python3 scripts/...` — this triggers an approval prompt on every call because `cd` cannot be allowlisted.

```bash
python...
```

README excerpt

# Speech to Text Skill for OpenClaw

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![OpenClaw](https://img.shields.io/badge/OpenClaw-Skill-green.svg)](https://openclaw.ai)

An OpenClaw skill that transcribes voice messages to text using Yandex SpeechKit (with an extensible architecture for other providers).

Works with any messenger connected to OpenClaw. OpenClaw receives the voice file, saves it locally, and passes the path to this skill. The skill converts audio to text and returns the result.

Example path OpenClaw provides:
```
~/.openclaw/media/inbound/file_1---9a53bac2-0392-41e7-8300-1c08e8eec027.ogg
```

---

## Prerequisites

Before installing, make sure you have:

- **Python 3.8+** — `python3 --version`
- **FFmpeg** — required for audio conversion ([install guide](https://ffmpeg.org/download.html))
- **Node.js** — required for the ClawHub CLI
- **Yandex Cloud account** — to get your API key for Yandex SpeechKit ([Yandex Cloud Console](https://console.yandex.cloud))

Install FFmpeg if missing:
```bash
# macOS
brew install ffmpeg

# Ubuntu / Debian
sudo apt update && sudo apt install -y ffmpeg

# Windows — download from https://ffmpeg.org/download.html and add to PATH
```
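
The prerequisites above can also be checked programmatically. The sketch below uses only the Python standard library; `check_prerequisites` is a hypothetical helper, and it assumes the Node.js executable is named `node` on your `PATH`.

```python
import shutil
import sys
from typing import Dict


def check_prerequisites() -> Dict[str, bool]:
    """Report which prerequisites appear to be available on this machine."""
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        "ffmpeg": shutil.which("ffmpeg") is not None,  # looks up ffmpeg on PATH
        "node": shutil.which("node") is not None,      # looks up node on PATH
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'ok' if found else 'missing'}")
```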

---

## Installation

### Step 1 — Install the ClawHub CLI

```bash
npm install -g clawhub
# or
pnpm add -g clawhub
```

### Step 2 — Install the skill

Run this from your OpenClaw workspace directory (or any directory — the skill installs into `./skills/`):

```bash
clawhub install sergei-mikhailov-stt
```

The skill will be placed at `./skills/sergei-mikhailov-stt/`.

> If you have an OpenClaw workspace configured, `clawhub` will automatically use `<workspace>/skills/` as the destination.

### Step 3 — Run the setup script

Navigate to the installed skill folder and run the setup script. It creates a Python virtual environme...
