Overview
- Skill Key: etoile04/mineru-pdf
- Author: etoile04
- Source Repo: openclaw/skills
- Version: -
- Source Path: skills/etoile04/mineru-pdf
- Latest Commit SHA: 7454f1996e3ed8b5f17184173c52280bea34fd3a
Parse PDF documents with MinerU MCP to extract text, tables, and formulas. Supports multiple backends including MLX-accelerated inference on Apple Silicon.
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
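Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the mineru-pdf skill. If it is already installed, install the mineru-pdf skill directly.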
# MinerU PDF Parser
Parse PDF documents using MinerU MCP to extract structured content including text, tables, and formulas with MLX acceleration on Apple Silicon.
## Installation
### Option 1: Install MinerU MCP (for Claude Code)
```bash
claude mcp add --transport stdio --scope user mineru -- \
  uvx --from mcp-mineru python -m mcp_mineru.server
```
This registers the MinerU MCP server at user scope, so it is available in every Claude Code project. Models are downloaded on first use.
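The install command above relies on two executables, `claude` and `uvx`, being on your `PATH`. As a minimal pre-flight sketch (the helper name `missing_tools` is hypothetical and not part of the skill), you could check for them before running the command:

```python
import shutil


def missing_tools(names=("claude", "uvx")):
    """Return the entries of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("claude and uvx are both available")
```

This only confirms that the executables can be located. It does not verify versions or that the MCP server starts correctly.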
### Option 2: Use the Direct Tool (preserves files)
The skill includes a direct parsing tool that saves output to a persistent directory:
```bash
python /Users/lwj04/clawd/skills/mineru-pdf/parse.py <pdf_path> <output_dir> [options]
```
**Advantages:**
- ✅ Files are saved permanently (not auto-deleted)
- ✅ Full control over output location
- ✅ No MCP overhead
- ✅ Works with any Python environment that has MinerU
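Because the direct tool writes into a directory you choose, the results can be inspected after the run. The sketch below groups whatever files exist under an output directory by extension. It assumes nothing about the names or layout MinerU produces, and `summarize_outputs` is a hypothetical helper, not part of the skill:

```python
from collections import defaultdict
from pathlib import Path


def summarize_outputs(output_dir):
    """Group every file under output_dir by lowercase suffix.

    Returns a dict mapping a suffix such as '.md' to a list of paths
    relative to output_dir. Files without a suffix are grouped under
    '(no extension)'.
    """
    root = Path(output_dir)
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.suffix.lower() or "(no extension)"
            groups[key].append(path.relative_to(root).as_posix())
    return dict(groups)


if __name__ == "__main__":
    # "/path/to/output" is the placeholder used throughout this README.
    for suffix, files in summarize_outputs("/path/to/output").items():
        print(suffix, len(files))
```

An empty result after a run would suggest the parse wrote nothing to that location.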
## Quick Start
### Method 1: Using the Direct Tool (Recommended)
```bash
# Parse entire PDF
python /Users/lwj04/clawd/skills/mineru-pdf/parse.py \
  "/path/to/document.pdf" \
  "/path/to/output"

# Parse specific pages
python /Users/lwj04/clawd/skills/mineru-pdf/parse.py \
  "/path/to/document.pdf" \
  "/path/to/output" \
  --start-page 0 --end-page 2

# Use Apple Silicon optimization
python /Users/lwj04/clawd/skills/mineru-pdf/parse.py \
  "/path/to/document.pdf" \
  "/path/to/output" \
  --backend vlm-mlx-engine

# Text only (faster)
python /Users/lwj04/clawd/skills/mineru-pdf/parse.py \
  "/path/to/document.pdf" \
  "/path/to/output" \
  --no-table --no-formula
```
```
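If you call the direct tool from another script, the commands above can be assembled programmatically. The following is a sketch only: `build_parse_command` is a hypothetical helper, it uses only the flags that appear in the commands above (`--start-page`, `--end-page`, `--backend`, `--no-table`, `--no-formula`), and it assumes, without confirmation from the source, that `parse.py` accepts them in combination.

```python
import subprocess
import sys


def build_parse_command(script, pdf_path, output_dir, start_page=None,
                        end_page=None, backend=None, tables=True,
                        formulas=True):
    """Build the argument list for parse.py from the flags shown above."""
    cmd = [sys.executable, str(script), str(pdf_path), str(output_dir)]
    if start_page is not None:
        cmd += ["--start-page", str(start_page)]
    if end_page is not None:
        cmd += ["--end-page", str(end_page)]
    if backend is not None:
        cmd += ["--backend", backend]
    if not tables:
        cmd.append("--no-table")
    if not formulas:
        cmd.append("--no-formula")
    return cmd


if __name__ == "__main__":
    cmd = build_parse_command(
        "/Users/lwj04/clawd/skills/mineru-pdf/parse.py",
        "/path/to/document.pdf",
        "/path/to/output",
        start_page=0,
        end_page=2,
    )
    # check=True raises CalledProcessError if parse.py exits non-zero.
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.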
### Method 2: Using MinerU MCP (Temporary Files)
#### Parse a PDF document
```bash
uvx --from mcp-mineru python -c "
import asyncio
from mcp_mineru.server import call_tool

async def parse_pdf():
    result = await call_tool(
        name='parse_pdf',
        arguments={
            'file_path': '/path/to/document.pdf',
            'backend': 'pipeline',
            'formula_enable': True,
            'table_enable': True,
            'sta...
```