Overview
- Skill Key: a-i-r/mineru-pdf-extractor
- Author: Community
- Source Repo: openclaw/skills
- Version: 1.0.0
- Source Path: skills/a-i-r/mineru-pdf-extractor
- Latest Commit SHA: d8ecc69685dac4cc94a51ee53ced0f9afeab91d5
Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online URL parsing methods.
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
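Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the mineru-pdf-extractor skill. If it is already installed, install the mineru-pdf-extractor skill directly.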
# MinerU PDF Extractor
Extract PDF documents to structured Markdown using the MinerU API. Supports formula recognition, table extraction, and OCR.
> **Note**: This is a community skill, not an official MinerU product. You need to obtain your own API key from [MinerU](https://mineru.net/).
---
## 📁 Skill Structure
```
mineru-pdf-extractor/
├── SKILL.md                                   # English documentation
├── SKILL_zh.md                                # Chinese documentation
├── docs/                                      # Documentation
│   ├── Local_File_Parsing_Guide.md            # Local PDF parsing detailed guide (English)
│   ├── Online_URL_Parsing_Guide.md            # Online PDF parsing detailed guide (English)
│   ├── MinerU_本地文档解析完整流程.md           # Local parsing complete guide (Chinese)
│   └── MinerU_在线文档解析完整流程.md           # Online parsing complete guide (Chinese)
└── scripts/                                   # Executable scripts
    ├── local_file_step1_apply_upload_url.sh   # Local parsing Step 1
    ├── local_file_step2_upload_file.sh        # Local parsing Step 2
    ├── local_file_step3_poll_result.sh        # Local parsing Step 3
    ├── local_file_step4_download.sh           # Local parsing Step 4
    ├── online_file_step1_submit_task.sh       # Online parsing Step 1
    └── online_file_step2_poll_result.sh       # Online parsing Step 2
```
---
## 🔧 Requirements
### Required Environment Variables
The scripts read the MinerU API token from either of the following environment variables, so you only need to set one:
```bash
# Option 1: Set MINERU_TOKEN
export MINERU_TOKEN="your_api_token_here"
# Option 2: Set MINERU_API_KEY
export MINERU_API_KEY="your_api_token_here"
```
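As a rough illustration of how a script could pick up the token from either variable, here is a minimal sketch. The function name `resolve_mineru_token` is hypothetical, and the precedence (`MINERU_TOKEN` first, then `MINERU_API_KEY`) is an assumption; the skill's own scripts may resolve it differently.

```bash
#!/usr/bin/env bash
# Sketch only: resolve the MinerU token from either supported variable.
# Assumption: MINERU_TOKEN takes precedence when both are set.
# resolve_mineru_token is a hypothetical helper, not part of the skill.
resolve_mineru_token() {
  # Fall back to MINERU_API_KEY when MINERU_TOKEN is unset or empty.
  local token="${MINERU_TOKEN:-${MINERU_API_KEY:-}}"
  if [ -z "$token" ]; then
    echo "Error: set MINERU_TOKEN or MINERU_API_KEY" >&2
    return 1
  fi
  printf '%s\n' "$token"
}
```

A caller could then write `TOKEN="$(resolve_mineru_token)" || exit 1` before making any request, so a missing token fails early with a clear message.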
### Required Command-Line Tools
- `curl` - For HTTP requests (usually pre-installed)
- `unzip` - For extracting results (usually pre-installed)
### Optional Tools
- `jq` - For enhanced JSON parsing and security (recommended but not required)
  - If not installed, scripts will use fallback methods
  - Install: `apt-get install jq` (Debian/Ubuntu)...
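The tool requirements above can be checked before running anything. The following is a minimal sketch using the standard `command -v` builtin; `check_tools` and `have_jq` are hypothetical helper names, not functions shipped with the skill.

```bash
#!/usr/bin/env bash
# Sketch only: verify required tools and detect whether jq is available.
# check_tools and have_jq are hypothetical helpers for illustration.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    # command -v succeeds only if the tool can be found by the shell.
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "Missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

have_jq() {
  command -v jq >/dev/null 2>&1
}
```

For example, `check_tools curl unzip || exit 1` stops early when a required tool is absent, and `have_jq` lets a script choose between `jq` and a fallback parsing path.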
edholofy
University for AI agents. 92 courses, 4400+ scenarios, any model via OpenRouter. Auto-training loops generate per-model SKILL.md documents. Works with Claude Code, OpenClaw, Cursor, Windsurf. No fine-tuning required.
lethehades
macOS WPS Office workflow helper skill for safer document preparation, conversion, export, and compatibility guidance
capt-marbles
Web scraping and crawling with Firecrawl API. Fetch webpage content as markdown, take screenshots, extract structured data, search the web, and crawl documentation sites. Use when the user needs to scrape a URL, get current web info, capture a screenshot, extract specific data from pages, or crawl docs for a framework/library.
caqlayan
Tweet Processor Skill
carev01
Full-text search across structured Markdown documentation archives using SQLite FTS5. Use when you need to search large collections of Markdown articles that are separated by "---" delimiters and contain source URLs (marked with "*Source:" pattern). Provides fast BM25-ranked search with automatic source URL extraction for citations. Ideal for research, documentation lookups, and knowledge base exploration. Requires indexing documentation first with `docs.py index`.
camelsprout
DuckDB CLI specialist for SQL analysis, data processing and file conversion. Use for SQL queries, CSV/Parquet/JSON analysis, database queries, or data conversion. Triggers on "duckdb", "sql", "query", "data analysis", "parquet", "convert data".