pdf-translate

Translates PDF documents to Chinese with professional typography. Extracts text, translates section-by-section into well-structured Markdown, then generates PDF via weasyprint with full CJK support. Use when user asks to translate a PDF, says "翻译PDF", "translate this document", or "pdf translate".

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

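Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the pdf-translate skill. If it is already installed, install the pdf-translate skill directly.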

Overview

Skill Key: chrislee121/pdf-translate
Author: chrislee121
Source Repo: openclaw/skills
Version: -
Source Path: skills/chrislee121/pdf-translate
Latest Commit SHA: 3e52c669f53bcbce420145dcc04bf7babc68153f

Extracted Content

SKILL.md excerpt

# PDF Translation Skill

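Translates PDF documents and produces well-typeset Chinese documents. Outputs both Markdown and PDF.

## Version Info

**Current version**: v4.0.0
**Release date**: 2026-02-21

### v4.0.0 Changes

- Adopts a Markdown-first workflow: generate structured Markdown first, then convert it to PDF
- Switches the PDF engine from reportlab to weasyprint (full HTML/CSS layout support)
- Fixes garbled Chinese text in code blocks (adds a CJK font fallback)
- Fully supports heading levels, code blocks, tables, lists, blockquotes, and bold/italic text
- Adds `scripts/md2pdf.py`, a general-purpose conversion script

## Core Workflow

### Step 1: Extract the PDF Text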

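```python
import pdfplumber

# Open the source PDF; the context manager closes the file when done.
with pdfplumber.open("input.pdf") as pdf:
    for i, page in enumerate(pdf.pages, start=1):
        # extract_text() can come back empty for pages without a text layer.
        text = page.extract_text()
        if text:
            print(f"--- Page {i} ---")
            print(text)
```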

For long documents (>20 pages), extract the first few pages to understand the structure, then extract the rest in batches.
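One way to organize batched extraction is to compute page ranges first and then pull each range separately. The sketch below is illustrative only: the helper names `page_batches` and `extract_batch` and the default batch size are assumptions, not part of the skill.

```python
from typing import Iterator, List, Tuple


def page_batches(total_pages: int, batch_size: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) page index ranges, end-exclusive, covering all pages."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total_pages, batch_size):
        yield start, min(start + batch_size, total_pages)


def extract_batch(pdf_path: str, start: int, end: int) -> List[str]:
    """Extract text for pages[start:end]; requires pdfplumber to be installed."""
    import pdfplumber  # imported lazily so page_batches works without it

    with pdfplumber.open(pdf_path) as pdf:
        # Pages without a text layer are represented by an empty string.
        return [page.extract_text() or "" for page in pdf.pages[start:end]]


print(list(page_batches(45, batch_size=20)))  # prints [(0, 20), (20, 40), (40, 45)]
```

Each `(start, end)` pair can be passed to `extract_batch` in turn, so only one batch of page text is held at a time.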

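### Step 2: Analyze the Document Structure

Read the full text, identify the following elements, and plan how each maps to Markdown:

| Source element | Markdown mapping |
|---------|-------------|
| Document title | `#` |
| Chapter | `##` |
| Section | `###` |
| Subsection | `####` |
| Table of contents | Link list `- [Chapter name](#anchor)` |
| Body paragraph | Paragraph (separated by blank lines) |
| Code block | ` ``` ` fence (**do not translate** the contents) |
| Table | `\| Col 1 \| Col 2 \|` syntax |
| Ordered list | Starts with `1. ` |
| Unordered list | Starts with `- ` |
| Blockquote / callout box | `> ` syntax |
| Footer / page number | Discard |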
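The mapping table can be expressed as a small lookup, which may help when the structure is planned programmatically. This is a minimal sketch: the element labels (`"chapter"`, `"footer"`, and so on) and the function `to_markdown` are hypothetical names chosen for illustration; only the Markdown prefixes come from the table.

```python
from typing import Optional

FENCE = "`" * 3  # triple backtick, built this way to keep the example fence-safe

# Hypothetical element labels mapped to the Markdown prefixes from the table.
MARKDOWN_PREFIX = {
    "title": "# ",
    "chapter": "## ",
    "section": "### ",
    "subsection": "#### ",
    "ordered_item": "1. ",
    "unordered_item": "- ",
    "quote": "> ",
}

# Elements that are dropped rather than converted.
DISCARDED = {"footer", "page_number"}


def to_markdown(kind: str, text: str) -> Optional[str]:
    """Render one element as Markdown, or return None if it should be discarded."""
    if kind in DISCARDED:
        return None
    if kind == "code":
        # Code is wrapped in a fence and left untranslated.
        return f"{FENCE}\n{text}\n{FENCE}"
    # Unknown kinds (for example body paragraphs) pass through unchanged.
    return MARKDOWN_PREFIX.get(kind, "") + text


print(to_markdown("chapter", "Getting Started"))  # prints "## Getting Started"
```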

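### Step 3: Translate into Chinese Markdown, Section by Section

**Translate one section at a time**; do not output the whole document in one pass. Append each section to the file as soon as it is finished.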
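The append-as-you-go pattern described above might look like the following sketch. The function `translate_incrementally` and its `translate` callback are hypothetical; in the skill itself the translation is done by the assistant, so the demo substitutes upper-casing as a stand-in.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def translate_incrementally(
    sections: Iterable[str],
    translate: Callable[[str], str],
    out_path: Path,
) -> int:
    """Translate sections one at a time, appending each result to out_path."""
    out = Path(out_path)
    out.write_text("", encoding="utf-8")  # start from an empty file
    count = 0
    for section in sections:
        translated = translate(section)
        # Append immediately so finished sections survive a later failure.
        with out.open("a", encoding="utf-8") as handle:
            handle.write(translated.strip() + "\n\n")
        count += 1
    return count


# Demo with a stand-in translator (upper-casing) instead of a real translation step.
demo_path = Path(tempfile.mkdtemp()) / "translated.md"
translate_incrementally(["## One\n\nfirst", "## Two\n\nsecond"], str.upper, demo_path)
print(demo_path.read_text(encoding="utf-8"))
```

Writing after every section also leaves a blank line between blocks, which matches the spacing the translation rules ask for.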

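#### Translation Rules

1. **Keep proper nouns in English**: on first occurrence, give the English in parentheses, e.g. "渐进式披露(Progressive Disclosure)"
2. **Do not translate code blocks**: code inside ` ``` ` stays as in the source; translate only the explanatory text outside the fences
3. **Do not translate inline code**: identifiers, commands, and file names inside backticks stay in English
4. **Preserve the heading hierarchy**: `#` → `##` → `###` → `####`, with no skipped levels
5. **Blank lines between blocks are required**: leave a blank line before and after every paragraph, list, code block, and table
6. **List format**: start items with `- ` or `1. `; indent nested items by 2 spaces
7. **Table format**: use `| Col 1 | Col 2 |` syntax; the `|---|---|` separator row is required
8. **Blockquote format**: start lines with `> `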
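#### Translation Quality Standards

See [translation-standards.md](references/translation-standards.md)

- Three-step translation workflow: rewrite a first draft → diagnose problems → polish the final version
- Four major language conversions...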

README excerpt

# PDF Translate Skill

> Academic-quality PDF translation tool powered by Claude — translates English PDFs into beautifully typeset Chinese documents.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Skill Version](https://img.shields.io/badge/version-4.0.0-blue.svg)](SKILL.md)
[![Claude Compatible](https://img.shields.io/badge/Claude-compatible-green.svg)](https://claude.ai/)
[![Cursor Compatible](https://img.shields.io/badge/Cursor-compatible-green.svg)](https://cursor.com/)

**[中文文档](README_CN.md)**

## Features

- **Markdown-first workflow** — Translate into structured Markdown first, then generate PDF for maximum formatting fidelity
- **Professional typography** — Dark code blocks, alternating-row tables, blue-bordered blockquotes, hierarchical headings
- **Full CJK support** — PingFang SC / STHeiti / Microsoft YaHei / Noto Sans CJK fallback chain; Chinese renders correctly everywhere, including code blocks
- **Dual output** — Produces both `.md` and `.pdf` so you get an editable source and a polished document
- **Academic-quality translation** — Three-step workflow (rewrite → diagnose → polish) that eliminates "translationese"
- **Cross-platform** — macOS, Windows, and Linux with automatic font detection
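To make the Markdown-first workflow and the CJK fallback chain concrete, here is a rough sketch of a Markdown → HTML → PDF conversion. It is not the skill's `scripts/md2pdf.py`: the function names are hypothetical, the monospace fonts `Menlo` and `Consolas` are assumptions, and the installed Noto family name may carry a region suffix such as `Noto Sans CJK SC`.

```python
from typing import Sequence

# Font names taken from the fallback chain listed in the features above.
CJK_FONTS = ("PingFang SC", "STHeiti", "Microsoft YaHei", "Noto Sans CJK")


def font_family(fonts: Sequence[str], generic: str = "sans-serif") -> str:
    """Build a CSS font-family value with quoted names and a generic fallback."""
    quoted = ", ".join(f'"{name}"' for name in fonts)
    return f"{quoted}, {generic}"


def build_css(fonts: Sequence[str] = CJK_FONTS) -> str:
    """Return minimal CSS that gives body text and code blocks a CJK fallback."""
    body = font_family(fonts)
    # Assumed monospace fonts first, then the CJK chain so Chinese in code renders.
    code = font_family(("Menlo", "Consolas") + tuple(fonts), generic="monospace")
    return f"body {{ font-family: {body}; }}\npre, code {{ font-family: {code}; }}"


def markdown_to_pdf(md_text: str, pdf_path: str) -> None:
    """Convert Markdown to PDF; requires the markdown and weasyprint packages."""
    import markdown
    from weasyprint import HTML

    body = markdown.markdown(md_text, extensions=["tables", "fenced_code"])
    html = (
        "<html><head><meta charset='utf-8'>"
        f"<style>{build_css()}</style></head><body>{body}</body></html>"
    )
    HTML(string=html).write_pdf(pdf_path)


print(build_css())
```

Putting the CJK fonts after the monospace fonts in the `pre, code` rule lets Latin code keep a fixed-width face while Chinese characters fall through to a font that has the glyphs.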

## Quick Start

### Prerequisites

```bash
# macOS
brew install pango
pip3 install pdfplumber markdown weasyprint

# Linux (Debian/Ubuntu)
sudo apt install libpango1.0-dev
pip3 install pdfplumber markdown weasyprint

# Windows
pip3 install pdfplumber markdown weasyprint
```
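After installing, a quick check that the three Python packages are importable can save a failed run later. The helper below is a hypothetical convenience, not part of the skill; it uses only the standard library and does not verify that the native Pango libraries load.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Package names from the pip install commands above.
REQUIRED = ("pdfplumber", "markdown", "weasyprint")


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the top-level packages that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```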

### Usage with Claude / Cursor

1. Place (or symlink) this skill into your skills directory:

   ```bash
   # Claude Code
   ln -s /path/to/pdf-translate ~/.claude/skills/pdf-translate

   # Cursor
   ln -s /path/to/pdf-translate .cursor/skills/pdf-translate
   ```

2. Ask in natural language:

   ```
   Translate this PDF: report.pdf
   翻译这个PDF:report.pdf
   ```

3. The skill automatically:
   - Extracts text with `p...
