Overview
- Skill Key: decrystal/agent-mineru
- Author: decrystal
- Source Repo: openclaw/skills
- Version: -
- Source Path: skills/decrystal/agent-mineru
- Latest Commit SHA: c089770109e08a8d44d67b57124087776b80852a
MinerU document parsing CLI with layout.json post-processing and S3 integration. Parse PDF/Word/PPT/images to structured Markdown with formula, table, and code extraction.
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
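Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the agent-mineru skill. If it is already installed, install the agent-mineru skill directly.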
# Document Parsing with agent-mineru

## Installation

```bash
npm install -g agent-mineru
```

## Authentication

```bash
export MINERU_TOKEN="your_api_token"
```

Get your token at: https://mineru.net/apiManage/docs

## Quick start

```bash
agent-mineru parse https://arxiv.org/pdf/2410.17247       # Parse PDF
agent-mineru extract ./task_id/layout.json                # Extract formulas/tables
agent-mineru convert ./task_id/layout.json -o custom.md   # Custom Markdown
```

## Important: HTML vs PDF output difference

- **PDF/Doc/PPT/Image** → ZIP with `layout.json` + `full.md` + `images/` → supports fine-grained post-processing
- **HTML** → only `full.md` → no layout.json, no post-processing available

## Commands

### Parse (single file)

```bash
agent-mineru parse <url|file>                         # Auto-detect type, parse & download
agent-mineru parse ./paper.pdf                        # Local file
agent-mineru parse https://example.com/doc.pdf --model pipeline
agent-mineru parse https://example.com/page.html      # Auto-selects MinerU-HTML
agent-mineru parse ./paper.pdf --no-wait              # Submit only, don't wait
agent-mineru parse ./paper.pdf --json                 # JSON output for piping
agent-mineru parse ./paper.pdf --s3                   # Auto-upload to S3 after download
```

### Parse batch (multiple URLs)

```bash
agent-mineru parse-batch url1.pdf url2.pdf
agent-mineru parse-batch --file urls.txt              # URLs from file
agent-mineru parse-batch --file urls.txt --model vlm
```

### Upload (local files)

```bash
agent-mineru upload ./paper1.pdf ./paper2.pdf
agent-mineru upload ./docs/*.pdf --model pipeline
```

### Check status

```bash
agent-mineru status <task_id>                         # Single task status
agent-mineru status <task_id> --json                  # JSON output
agent-mineru status-batch <batch_id>                  # Batch task status
```

### Extract elements (PDF only, needs layout.json)

```bash
agent-mineru extract <json_file>                      # All elements as JSON
agent-mineru extract layout.json --types formula...
```
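
The output difference described above (a `layout.json` for PDF/Doc/PPT/Image results, only `full.md` for HTML results) can be checked before any post-processing is attempted. The following is a minimal sketch, assuming the result has already been downloaded and unpacked into a directory. The function name `postprocess_mode` and its directory argument are hypothetical helpers for illustration and are not part of agent-mineru.

```bash
# Hypothetical helper (not part of agent-mineru): inspect an unpacked
# result directory and report which kind of output it contains.
#   layout         -> layout.json present, post-processing is possible
#   markdown-only  -> only full.md present (HTML result), no post-processing
#   missing        -> neither file found
postprocess_mode() {
  if [ -f "$1/layout.json" ]; then
    echo "layout"
  elif [ -f "$1/full.md" ]; then
    echo "markdown-only"
  else
    echo "missing"
    return 1
  fi
}
```

One possible use is to call `postprocess_mode ./task_id` and run `agent-mineru extract ./task_id/layout.json` only when it prints `layout`.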
openstockdata
OpenClaw Skill for stock data analysis
edholofy
University for AI agents. 92 courses, 4400+ scenarios, any model via OpenRouter. Auto-training loops generate per-model SKILL.md documents. Works with Claude Code, OpenClaw, Cursor, Windsurf. No fine-tuning required.
lethehades
macOS WPS Office workflow helper skill for safer document preparation, conversion, export, and compatibility guidance
capt-marbles
Generative Engine Optimization (GEO) for AI search visibility. Optimize content to appear in ChatGPT, Perplexity, Claude, and Google AI Overviews. Use when optimizing websites, pages, or content for LLM discoverability and citation.
camopel
Continuous financial news crawler for finviz.com with SQLite storage, article extraction, and query tool. Use when monitoring financial markets, building news digests, or needing a local financial news database. Runs as a background daemon or systemd service.
camopel
Free multi-engine web search via ddgs CLI (DuckDuckGo, Google, Bing, Brave, Yandex, Yahoo, Wikipedia) + arXiv API search. No API keys required. Use when user needs web search, research paper discovery, or when other skills need a search backend. Drop-in replacement for web-search-plus.