pdf-translate

Overview

Skill Key: chrislee121/pdf-translate
Author: chrislee121
Source Repo: openclaw/skills
Version: -
Source Path: skills/chrislee121/pdf-translate
Latest Commit SHA: 3e52c669f53bcbce420145dcc04bf7babc68153f

Extracted Content

SKILL.md excerpt

# PDF Translation Skill

Translates PDF documents into well-typeset Chinese documents. Output comes in two formats: Markdown and PDF.

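## Version Info

**Current version**: v4.0.0
**Release date**: 2026-02-21

### Changes in v4.0.0

- Adopts a Markdown-first workflow: generate structured Markdown first, then convert it to PDF
- Switches the PDF engine from reportlab to weasyprint (full HTML/CSS layout support)
- Fixes garbled Chinese text in code blocks (adds a CJK font fallback)
- Fully supports heading levels, code blocks, tables, lists, blockquotes, and bold and italic text
- Adds `scripts/md2pdf.py`, a general-purpose conversion script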

## Core Workflow

### Step 1: Extract the PDF text

```python
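import pdfplumber

# Open the source PDF; the context manager closes the file when done.
with pdfplumber.open("input.pdf") as pdf:
    for i, page in enumerate(pdf.pages, start=1):
        text = page.extract_text()
        # Pages without a text layer (e.g. scanned images) yield no text.
        if text:
            print(f"--- Page {i} ---")
            print(text)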
```

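For long documents (more than 20 pages), extract the first few pages to understand the structure, then extract the rest in batches.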
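The batching idea for long documents can be sketched as follows. This is a minimal illustration, not part of the skill itself: the function names `page_batches` and `extract_batch` and the default batch size of 10 are assumptions made for this example.

```python
from typing import Iterator, Tuple


def page_batches(total_pages: int, batch_size: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) page indices, 0-based with an exclusive end."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total_pages, batch_size):
        yield start, min(start + batch_size, total_pages)


def extract_batch(pdf_path: str, start: int, end: int) -> str:
    """Extract text from pages [start, end) of a PDF (hypothetical helper)."""
    import pdfplumber  # imported lazily so page_batches works without it

    chunks = []
    with pdfplumber.open(pdf_path) as pdf:
        for i, page in enumerate(pdf.pages[start:end], start=start):
            text = page.extract_text()
            if text:
                chunks.append(f"--- Page {i + 1} ---\n{text}")
    return "\n".join(chunks)
```

A 25-page document with the default batch size would be processed as pages 1-10, 11-20, and 21-25, calling `extract_batch` once per range.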

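### Step 2: Analyze the document structure

Read through the full text, identify the elements below, and plan how each maps to Markdown:

| Source element | Markdown mapping |
|---------|-------------|
| Document title | `#` |
| Chapter | `##` |
| Section | `###` |
| Subsection | `####` |
| Table of contents | List of links: `- [Chapter name](#anchor)` |
| Body paragraph | Paragraph (separated by blank lines) |
| Code block | ` ``` ` fence (**do not translate** the contents) |
| Table | `\| Col 1 \| Col 2 \|` syntax |
| Ordered list | Starts with `1. ` |
| Unordered list | Starts with `- ` |
| Quote / callout box | `> ` syntax |
| Footer / page number | Discard |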
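To make the mapping concrete, here is a small sketch that renders labeled elements as Markdown. The element labels (`"title"`, `"chapter"`, `"footer"`, and so on) and the function `to_markdown` are hypothetical names chosen for this example; the skill itself performs this step by reading the text, not by running code.

```python
# Hypothetical element labels; the Markdown prefixes follow the mapping table.
HEADING_LEVELS = {"title": 1, "chapter": 2, "section": 3, "subsection": 4}
FENCE = "`" * 3  # built this way to avoid a literal fence inside the example


def to_markdown(kind: str, text: str) -> str:
    """Render one labeled source element as a Markdown block."""
    if kind in HEADING_LEVELS:
        return "#" * HEADING_LEVELS[kind] + " " + text
    if kind == "paragraph":
        return text
    if kind == "code":
        # Code content is passed through untranslated.
        return f"{FENCE}\n{text}\n{FENCE}"
    if kind == "quote":
        return "\n".join("> " + line for line in text.splitlines())
    if kind == "bullet":
        return "- " + text
    if kind == "footer":
        return ""  # footers and page numbers are discarded
    raise ValueError(f"unknown element kind: {kind}")
```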

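### Step 3: Translate into Chinese Markdown chapter by chapter

**Translate chapter by chapter**; do not output the whole document at once. After finishing each chapter, append it to the file.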
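Appending each finished chapter could look like the sketch below. The helper name `append_chapter` is an assumption for illustration; it only shows one way to keep a blank line between chapters when writing incrementally.

```python
from pathlib import Path


def append_chapter(output_path: str, chapter_markdown: str) -> None:
    """Append one translated chapter, separated from earlier ones by a blank line."""
    path = Path(output_path)
    body = chapter_markdown.strip() + "\n"
    # Add a separating blank line only when the file already has content.
    prefix = "\n" if path.exists() and path.stat().st_size > 0 else ""
    with path.open("a", encoding="utf-8") as f:
        f.write(prefix + body)
```

Writing in append mode means a failure partway through a long document leaves the chapters already translated intact on disk.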

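#### Translation rules

1. **Keep the English for proper nouns**: on first occurrence, give the English in parentheses, e.g. "渐进式披露（Progressive Disclosure）"
2. **Do not translate code blocks**: keep the code inside ` ``` ` fences as is; translate only the explanatory text outside the fences
3. **Do not translate inline code**: identifiers, commands, and file names inside backticks stay in English
4. **Preserve the heading hierarchy**: `#` → `##` → `###` → `####`, with no skipped levels
5. **Blank lines between blocks are required**: put a blank line before and after every paragraph, list, code block, and table
6. **List format**: start items with `- ` or `1. `; indent nested items by 2 spaces
7. **Table format**: use `| Col 1 | Col 2 |` syntax; a `|---|---|` separator row is required
8. **Blockquote format**: start lines with `> `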
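Two of these rules (no skipped heading levels, and a separator row under every table header) are mechanical enough to check automatically. The sketch below is one possible checker; `lint_markdown` and its message wording are assumptions for this example and are not part of the skill.

```python
import re
from typing import List

HEADING_RE = re.compile(r"^(#{1,6})\s+\S")
# Matches separator rows such as |---|---| or | :--- | ---: |
TABLE_SEP_RE = re.compile(r"^\|(\s*:?-+:?\s*\|)+\s*$")


def lint_markdown(text: str) -> List[str]:
    """Report skipped heading levels and table headers without a separator row."""
    problems = []
    in_fence = False
    prev_level = 0
    lines = text.splitlines()
    for n, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith("```"):
            in_fence = not in_fence  # ignore everything inside code fences
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            if level > prev_level + 1:
                problems.append(f"line {n}: heading skips from level {prev_level} to {level}")
            prev_level = level
            continue
        if stripped.startswith("|"):
            prev = lines[n - 2].strip() if n >= 2 else ""
            if not prev.startswith("|"):  # this is the first row of a table
                nxt = lines[n].strip() if n < len(lines) else ""
                if not TABLE_SEP_RE.match(nxt):
                    problems.append(f"line {n}: table header has no separator row")
    return problems
```

An empty result means neither rule was violated; the checker does not look at the other six rules.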

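#### Translation quality standards

See [translation-standards.md](references/translation-standards.md)

- Three-step translation workflow: rewrite a first draft → diagnose problems → polish the final version
- Four major language conversions...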

README excerpt

# PDF Translate Skill

> Academic-quality PDF translation tool powered by Claude — translates English PDFs into beautifully typeset Chinese documents.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Skill Version](https://img.shields.io/badge/version-4.0.0-blue.svg)](SKILL.md)
[![Claude Compatible](https://img.shields.io/badge/Claude-compatible-green.svg)](https://claude.ai/)
[![Cursor Compatible](https://img.shields.io/badge/Cursor-compatible-green.svg)](https://cursor.com/)

**[中文文档](README_CN.md)**

## Features

- **Markdown-first workflow** — Translate into structured Markdown first, then generate PDF for maximum formatting fidelity
- **Professional typography** — Dark code blocks, alternating-row tables, blue-bordered blockquotes, hierarchical headings
- **Full CJK support** — PingFang SC / STHeiti / Microsoft YaHei / Noto Sans CJK fallback chain; Chinese renders correctly everywhere, including code blocks
- **Dual output** — Produces both `.md` and `.pdf` so you get an editable source and a polished document
- **Academic-quality translation** — Three-step workflow (rewrite → diagnose → polish) that eliminates "translationese"
- **Cross-platform** — macOS, Windows, and Linux with automatic font detection
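As a rough illustration of the Markdown-first workflow and the CJK fallback chain, the sketch below converts Markdown to PDF with the `markdown` and `weasyprint` packages. It is not the skill's `scripts/md2pdf.py`, whose contents are not shown here. The font names are copied verbatim from the list above, so the family names actually installed on a given system may differ; `"Menlo"` and `"Consolas"` are assumed Latin monospace fonts added for this example.

```python
from typing import List

# Fallback chain as listed in the README; installed family names may differ.
CJK_FONTS = ["PingFang SC", "STHeiti", "Microsoft YaHei", "Noto Sans CJK"]


def font_stack(latin_fonts: List[str], generic: str) -> str:
    """Build a CSS font-family value: Latin fonts, CJK fallbacks, generic family."""
    names = list(latin_fonts) + CJK_FONTS
    return ", ".join(f'"{name}"' for name in names) + f", {generic}"


def build_css() -> str:
    """Minimal stylesheet giving body text and code blocks a CJK fallback."""
    body_fonts = font_stack([], "sans-serif")
    code_fonts = font_stack(["Menlo", "Consolas"], "monospace")  # assumed fonts
    return (
        f"body {{ font-family: {body_fonts}; }}\n"
        f"pre, code {{ font-family: {code_fonts}; }}\n"
    )


def markdown_to_pdf(md_text: str, pdf_path: str) -> None:
    """Render Markdown to HTML, then HTML to PDF."""
    import markdown  # third-party packages, imported lazily
    from weasyprint import HTML

    body = markdown.markdown(md_text, extensions=["tables", "fenced_code"])
    html = (
        "<html><head><meta charset='utf-8'>"
        f"<style>{build_css()}</style></head><body>{body}</body></html>"
    )
    HTML(string=html).write_pdf(pdf_path)
```

Placing the CJK fonts after the Latin monospace fonts lets code keep its fixed-width look while Chinese characters fall through to a font that contains them.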

## Quick Start

### Prerequisites

```bash
# macOS
brew install pango
pip3 install pdfplumber markdown weasyprint

# Linux (Debian/Ubuntu)
sudo apt install libpango1.0-dev
pip3 install pdfplumber markdown weasyprint

# Windows
pip3 install pdfplumber markdown weasyprint
```
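After installing, a quick check like the one below can confirm that the three Python packages are importable. The helper name `missing_modules` is hypothetical. Note that it only checks the Python packages: it does not verify that the Pango system library used by weasyprint is present.

```python
import importlib.util
from typing import Iterable, List

REQUIRED = ("pdfplumber", "markdown", "weasyprint")


def missing_modules(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the names of top-level modules that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules()` with no arguments returns an empty list when all three packages are installed, or the names that still need `pip3 install`.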

### Usage with Claude / Cursor

1. Place (or symlink) this skill into your skills directory:

   ```bash
   # Claude Code
   ln -s /path/to/pdf-translate ~/.claude/skills/pdf-translate

   # Cursor
   ln -s /path/to/pdf-translate .cursor/skills/pdf-translate
   ```

2. Ask in natural language:

   ```
   Translate this PDF: report.pdf
   翻译这个PDF：report.pdf
   ```

3. The skill automatically:
   - Extracts text with `p...

Related Claw Skills

dojo.md

wps-macos-helper

firecrawl

Tweet Processor

md-docs-search

duckdb-en