volcengine-ata-subtitle

Generate subtitles with automatic time alignment using the Volcengine ATA API. Use when the user wants to: (1) add time-aligned subtitles to videos, (2) convert text + audio to SRT/ASS format, or (3) automate a subtitle creation workflow.

Stars: 0 · Installs: 0 · Status: ACTIVE · Visibility: PUBLIC

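Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the volcengine-ata-subtitle skill. If it is already installed, install the volcengine-ata-subtitle skill directly.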

Overview

- Skill Key: blackeight4752/doubao-ata-subtitle
- Author: blackeight4752
- Source Repo: openclaw/skills
- Version: 1.0.0
- Source Path: skills/blackeight4752/doubao-ata-subtitle
- Latest Commit SHA: b5a31ff33e6f459e2219145654f1c1810e7ab34a

Extracted Content

SKILL.md excerpt

# Volcengine ATA Subtitle (Automatic Subtitle Timing)

Generate subtitles with automatic time alignment using Volcengine's ATA (Automatic Time Alignment) API.

## Prerequisites

Set the following environment variables or create a config file:

### Option A: Environment Variables

```bash
export VOLC_ATA_APP_ID="your-app-id"
export VOLC_ATA_TOKEN="your-access-token"
export VOLC_ATA_API_BASE="https://openspeech.bytedance.com"
```

### Option B: Config File

Create `~/.volcengine_ata.conf`:

```ini
[credentials]
appid = your-app-id
access_token = your-access-token
secret_key = your-secret-key

[api]
base_url = https://openspeech.bytedance.com
submit_path = /api/v1/vc/ata/submit
query_path = /api/v1/vc/ata/query
```
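Either option above could be resolved in one place. The following is a minimal sketch, not the loader used by `volc_ata.py`: `load_ata_config` is a hypothetical helper, and the precedence (environment variables override the config file) is an assumption, since the excerpt does not state which source wins.

```python
import configparser
import os
from pathlib import Path


def load_ata_config(conf_path="~/.volcengine_ata.conf", env=None):
    """Return ATA settings as a dict; env vars override file values (assumed order)."""
    env = os.environ if env is None else env
    parser = configparser.ConfigParser()
    # read() silently skips a missing file, so env-only setups still work.
    parser.read(Path(conf_path).expanduser(), encoding="utf-8")

    def pick(env_name, section, key):
        # Prefer the environment variable, then fall back to the config file.
        return env.get(env_name) or parser.get(section, key, fallback=None)

    settings = {
        "appid": pick("VOLC_ATA_APP_ID", "credentials", "appid"),
        "access_token": pick("VOLC_ATA_TOKEN", "credentials", "access_token"),
        "base_url": pick("VOLC_ATA_API_BASE", "api", "base_url"),
    }
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError("missing ATA settings: " + ", ".join(missing))
    return settings
```

Failing early on a missing value keeps credential problems separate from API errors later in the run.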

## Execution (Python CLI Tool)

A Python CLI tool is provided at `~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py`.

### Quick Examples

```bash
# Basic usage: audio + text → SRT subtitle
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
  --audio storage/audio.wav \
  --text storage/subtitle.txt \
  --output storage/subtitles/final.srt

# Specify output format (srt or ass)
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
  --audio storage/audio.wav \
  --text storage/subtitle.txt \
  --output storage/subtitles/final.ass \
  --format ass
```
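For batch jobs, the invocations above could be assembled programmatically. This is a minimal sketch under two assumptions: the script lives at the path shown above, and it accepts `--format srt` as well as `--format ass`. `build_command` is a hypothetical helper, not part of the skill.

```python
import sys
from pathlib import Path

# Script location as documented above; adjust if the skill is installed elsewhere.
SCRIPT = Path("~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py").expanduser()


def build_command(audio, text, output, fmt="srt"):
    """Return the argv list for one volc_ata.py run, using only the flags shown above."""
    if fmt not in ("srt", "ass"):
        raise ValueError(f"unsupported format: {fmt}")
    return [
        sys.executable, str(SCRIPT),
        "--audio", str(audio),
        "--text", str(text),
        "--output", str(output),
        "--format", fmt,
    ]
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run the tool and raise if it exits with a non-zero status.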

## Input Requirements

### Audio File

- **Format**: WAV (PCM)
- **Sample Rate**: 16000 Hz (16kHz)
- **Channels**: 1 (mono)
- **Encoding**: 16-bit PCM (`pcm_s16le`)

**Extract from video**:
```bash
ffmpeg -i input.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
```
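Before submitting a file, it may help to check it against the requirements above. The sketch below uses only Python's standard `wave` module; `check_wav` is a hypothetical helper, and whether the ATA service itself rejects non-conforming audio is not stated in the excerpt.

```python
import wave


def check_wav(path):
    """Return a list of problems; an empty list means the file matches the spec above."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
        if wav.getsampwidth() != 2:  # sample width is reported in bytes
            problems.append(f"{wav.getsampwidth() * 8}-bit samples, expected 16-bit")
    return problems
```

A non-empty result suggests re-running the `ffmpeg` command above on the source file.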

### Text File

- **Format**: Plain text (UTF-8)
- **Structure**: One sentence per line
- **No punctuation**: leave punctuation out; ATA handles it automatically
- **No timestamps**: Pure text only

**Example**:
```
主人闹钟没响睡过头了
我们俩轮流用鼻子拱他脸
他以为地震了抱着枕头就跑
```
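A transcript that still contains punctuation or blank lines could be cleaned up to match the rules above. This is a minimal sketch, assuming the input is already split one sentence per line; `prepare_lines` is a hypothetical helper and not part of `volc_ata.py`.

```python
import unicodedata


def prepare_lines(raw_text):
    """Strip punctuation and blank lines so each remaining line is one plain sentence."""
    cleaned = []
    for line in raw_text.splitlines():
        # Unicode categories starting with "P" cover both ASCII and CJK punctuation.
        no_punct = "".join(
            ch for ch in line if not unicodedata.category(ch).startswith("P")
        ).strip()
        if no_punct:
            cleaned.append(no_punct)
    return cleaned
```

Writing `"\n".join(prepare_lines(text))` to a UTF-8 file yields input in the shape of the example above.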

## Output Formats

### SRT (SubRip)

```srt
1
00:00:00,000 --> 00:00:02,500
第一句字幕

2
00:00:02,500 --> 00:00:05,000
第二句字幕
```
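The SRT layout above can be produced from millisecond timings with a few lines of code. The sketch below is illustrative only: the `(start_ms, end_ms, text)` tuple shape is an assumed intermediate form, since the excerpt does not show the ATA response format, and `srt_timestamp` and `build_srt` are hypothetical names.

```python
def srt_timestamp(ms):
    """Format a millisecond offset as HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(cues):
    """Build SRT text from an iterable of (start_ms, end_ms, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Each cue is: sequence number, time range, subtitle text.
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    # Cues are separated by one blank line.
    return "\n\n".join(blocks) + "\n"
```

Calling `build_srt([(0, 2500, "第一句字幕"), (2500, 5000, "第二句字幕")])` reproduces the two cues shown above.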

###...

README excerpt

# 🔥 DouBao ATA Subtitle

Doubao Speech ATA (Automatic Time Alignment) skill for automatic subtitle timing.
This skill was built by OpenClaw.
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![OpenClaw Skill](https://img.shields.io/badge/OpenClaw-Skill-blue)](https://github.com/openclaw/openclaw)

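## ✨ Features

- 🎯 **Automatic timing**: converts audio + text into subtitles with a millisecond-precision timeline
- 📝 **Multiple formats**: outputs SRT or ASS subtitles
- 🔧 **Flexible configuration**: supports authentication via environment variables or a config file
- 🧪 **Demo mode**: try the basic features without an API key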

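## 📦 Quick Start
[Create a Doubao Speech application](https://console.volcengine.com/speech/app)
Create an automatic subtitle timing application and obtain its APP ID, Access Token, and Secret Key. A 20-hour trial is available.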
<img width="1107" height="761" alt="398d4eed-4abe-497e-96c0-9fd0adae4f39" src="https://github.com/user-attachments/assets/7dfabbe1-b1d7-44e7-b071-007641d0cbad" />

### Installation

```bash
# Option 1: clone into the OpenClaw skills directory
cd ~/.openclaw/workspace/skills
git clone https://github.com/BlackEight4752/volcengine-ata-subtitle.git

# Option 2: via ClawHub (not yet published)
clawhub install volcengine-ata-subtitle
```

### Configure API Keys

**Option A - Environment variables**:
```bash
export VOLC_ATA_APP_ID="your-app-id"
export VOLC_ATA_TOKEN="your-access-token"
export VOLC_ATA_API_BASE="https://openspeech.bytedance.com"
```

**Option B - Config file** (`~/.volcengine_ata.conf`):
```ini
[credentials]
appid = your-app-id
access_token = your-access-token
secret_key = your-secret-key

[api]
base_url = https://openspeech.bytedance.com
submit_path = /api/v1/vc/ata/submit
query_path = /api/v1/vc/ata/query
```

### Usage Examples

```bash
# Basic usage: audio + text → SRT subtitle
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
  --audio storage/audio.wav \
  --text storage/subtitle.txt \
  --output storage/subtitles/final.srt

# Output in ASS format
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
  --audio storage/audio.wav \
  --text storage/subtitle.txt \
  --output storage/subtitles/final.ass \...
```
