Overview
- Skill Key
- araa47/gemini-stt
- Author
- araa47
- Source Repo
- openclaw/skills
- Version
- -
- Source Path
- skills/araa47/gemini-stt
- Latest Commit SHA
- ad0dba29f438651f615379c20091b2b9620d8822
Transcribe audio files using Google's Gemini API or Vertex AI
Stars
0
Installs
0
Status
ACTIVE
Visibility
PUBLIC
直接复制以下提示词,发送给你的 AI 助手即可完成安装。
请先检查是否已安装 SkillHub 商店,若未安装,请根据 https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md 安装 SkillHub 商店,然后安装 gemini-stt 技能。 若已安装,则直接安装 gemini-stt 技能。
# Gemini Speech-to-Text Skill Transcribe audio files using Google's Gemini API or Vertex AI. Default model is `gemini-2.0-flash-lite` for fastest transcription. ## Authentication (choose one) ### Option 1: Vertex AI with Application Default Credentials (Recommended) ```bash gcloud auth application-default login gcloud config set project YOUR_PROJECT_ID ``` The script will automatically detect and use ADC when available. ### Option 2: Direct Gemini API Key Set `GEMINI_API_KEY` in environment (e.g., `~/.env` or `~/.clawdbot/.env`) ## Requirements - Python 3.10+ (no external dependencies) - Either GEMINI_API_KEY or gcloud CLI with ADC configured ## Supported Formats - `.ogg` / `.opus` (Telegram voice messages) - `.mp3` - `.wav` - `.m4a` ## Usage ```bash # Auto-detect auth (tries ADC first, then GEMINI_API_KEY) python ~/.claude/skills/gemini-stt/transcribe.py /path/to/audio.ogg # Force Vertex AI python ~/.claude/skills/gemini-stt/transcribe.py /path/to/audio.ogg --vertex # With a specific model python ~/.claude/skills/gemini-stt/transcribe.py /path/to/audio.ogg --model gemini-2.5-pro # Vertex AI with specific project and region python ~/.claude/skills/gemini-stt/transcribe.py /path/to/audio.ogg --vertex --project my-project --region us-central1 # With Clawdbot media python ~/.claude/skills/gemini-stt/transcribe.py ~/.clawdbot/media/inbound/voice-message.ogg ``` ## Options | Option | Description | |--------|-------------| | `<audio_file>` | Path to the audio file (required) | | `--model`, `-m` | Gemini model to use (default: `gemini-2.0-flash-lite`) | | `--vertex`, `-v` | Force use of Vertex AI with ADC | | `--project`, `-p` | GCP project ID (for Vertex, defaults to gcloud config) | | `--region`, `-r` | GCP region (for Vertex, default: `us-central1`) | ## Supported Models Any Gemini model that supports audio input can be used. Recommended models: | Model | Notes | |-------|-------| | `gemini-2.0-flash-lite` | **Default.** Fastest transcription speed...
laborany
基于 Claude Code 的桌面 AI 工作力平台 — 支持飞书/QQ 远程调度、技能创建、定时任务。OpenClaw 的桌面实现,零代码养好你的 AI 🦞 Desktop AI workforce platform built on Claude Code. Feishu/QQ bot integration, skill creation, scheduled tasks — OpenClaw for your desktop. Raise your AI lobsters 🦞
win4r
Reusable OpenClaw skill for remote Linux deployment with MiniMax M2.1 and Telegram bot setup
botlearn-ai
Bots learn, human earns, curated open claw playbook list and skill list for life long learners at https://botlearn.ai
duanecilliers
Web-based admin dashboard for OpenClaw — manage Discord persona bots, workspace files, skills, cron jobs, channels, and config
abczsl520
OpenClaw skill: Dynamic bug audit for Node.js web projects (games, data tools, WeChat, APIs, bots). 200+ real-world pitfalls.
pardnchiu
A Go agentic AI platform with skill routing, multi-provider intelligent dispatch, Discord bot integration, and security-first shared agent design