Overview
- Skill Key: chunhualiao/wechat-article-extractor
- Author: chunhualiao
- Source Repo: openclaw/skills
- Version: -
- Source Path: skills/chunhualiao/wechat-article-extractor
- Latest Commit SHA: eb46cc32205396ccb0a500f3517058a93ecaf2f9
Extract full text and figures from a WeChat public account (微信公众号) article URL and save as a clean Markdown file. Handles WeChat's bot-detection by finding mirror sites automatically. Use when the user shares an mp.weixin.qq.com URL and asks to save, archive, extract, or read the article.
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
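Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the wechat-article-extractor skill. If it is already installed, install the wechat-article-extractor skill directly.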
# WeChat Article Extractor
Extract WeChat public account articles to clean Markdown. WeChat blocks headless browsers with a CAPTCHA page (环境异常, "abnormal environment"), and `web_fetch` returns empty JS-rendered pages, so the reliable approach is to find a mirror on an aggregator site and extract the content from there.
## Scope & Boundaries
**This skill handles:**
- Extracting article text, images, and metadata from WeChat article URLs
- Finding mirror copies when direct access is blocked
- Converting HTML to clean Markdown
- Saving output as `.md` files
**This skill does NOT handle:**
- Publishing or syncing to note-taking apps (that's the user's workflow)
- Batch extraction of multiple articles (handle one at a time)
- WeChat login, authentication, or account management
- Translating article content
## Inputs
| Input | Required | Description |
|-------|----------|-------------|
| WeChat URL | Yes | An `mp.weixin.qq.com` link |
| Output filename | No | Defaults to kebab-case of article title |
| Save location | No | Defaults to `/tmp/` |
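The defaults in the table can be sketched as a small helper. This is a minimal illustration, not the skill's actual code: `default_output_path` is a hypothetical name, and the decision to keep CJK characters in the slug (rather than transliterate them) is an assumption, since the source only says "kebab-case of article title".

```python
import re
import unicodedata
from pathlib import Path


def default_output_path(title: str, save_dir: str = "/tmp/") -> Path:
    """Build a default .md path from an article title (hypothetical helper)."""
    # Normalize width variants (e.g. full-width punctuation) and lowercase.
    text = unicodedata.normalize("NFKC", title).lower()
    # Collapse every run of non-word characters or underscores into one hyphen.
    # \w is Unicode-aware in Python 3, so CJK characters survive (assumption:
    # this is acceptable for Chinese titles).
    slug = re.sub(r"[\W_]+", "-", text).strip("-")
    # Fall back to a fixed name if the title had no usable characters.
    return Path(save_dir) / f"{slug or 'untitled'}.md"
```

For a title such as `Hello World: AI Agents!` this yields a file named `hello-world-ai-agents.md` under the save directory.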
## Outputs
- A Markdown file with full article content, images, and metadata header
- Console confirmation with file path and character count
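One way to produce both outputs is sketched below. The header fields follow the README's Output Format section; the function name `save_article`, its parameters, and the wording of the confirmation message are assumptions for illustration only.

```python
from pathlib import Path


def save_article(path, title, author, account, date, url, body):
    """Write the Markdown file and print a confirmation (hypothetical helper)."""
    lines = [
        f"# {title}",
        "",
        f"**作者:** {author}",
        f"**来源:** 微信公众号「{account}」",
        f"**日期:** {date}",
        f"**原文:** {url}",
        "",
        "---",
        "",
        body.strip(),
        "",  # ensures the file ends with a single newline
    ]
    document = "\n".join(lines)
    out = Path(path)
    out.write_text(document, encoding="utf-8")
    # Console confirmation: file path and character count.
    print(f"Saved {out} ({len(document)} characters)")
    return len(document)
```

The count reported here is the number of characters in the whole document, header included; counting only the article body would be an equally reasonable reading of "character count".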
## Workflow
### Step 1 — Try direct fetch (fast path)
```
web_fetch(url, extractMode="markdown", maxChars=50000)
```
**Success check:** If result `rawLength > 500` AND content has real paragraphs (not just nav/footer text) → skip to Step 4 Option B.
**Failure indicators:** `rawLength <= 500`, content is navigation/boilerplate only, or content contains "环境异常" → go to Step 2.
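The success and failure checks could be combined into one predicate, roughly as follows. This is a sketch under stated assumptions: the shape of the `web_fetch` result is not shown in the source, so the function takes the raw length and the content as plain arguments, and the "real paragraphs" test (at least three blocks of 40 or more characters) is an invented heuristic.

```python
import re

BLOCK_MARKER = "环境异常"  # CAPTCHA text named in the failure indicators
MIN_RAW_LENGTH = 500       # threshold from the success check


def direct_fetch_succeeded(raw_length: int, content: str,
                           min_paragraphs: int = 3,
                           min_paragraph_chars: int = 40) -> bool:
    """Return True if a direct fetch looks like a real article (heuristic)."""
    # Too short (exactly 500 is treated as failure) or blocked by the CAPTCHA.
    if raw_length <= MIN_RAW_LENGTH or BLOCK_MARKER in content:
        return False
    # Split on blank lines and keep blocks long enough to be body text;
    # short blocks are assumed to be nav/footer items.
    blocks = [p.strip() for p in re.split(r"\n\s*\n", content)]
    real = [p for p in blocks if len(p) >= min_paragraph_chars]
    return len(real) >= min_paragraphs
```

A `False` result means the workflow continues with Step 2. The thresholds are parameters so they can be tuned against real pages.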
### Step 2 — Extract article metadata
From the URL or any partial content, identify:
- Article title (from `<title>` or og:title)
- Author / account name (from og:description or page content)
If metadata is unavailable from the URL, ask the user for the article title.
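When some HTML is available, the title and description can be pulled out with the standard library. The sketch below is illustrative only: `MetaExtractor` and `extract_metadata` are hypothetical names, and preferring `og:title` over `<title>` is an assumption, since the source lists both without ranking them.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <title> text and <meta> property/name values."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags use "property"; some pages use "name" instead.
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content"):
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html: str) -> dict:
    """Return the title and og:description, or None where missing."""
    parser = MetaExtractor()
    parser.feed(html)
    title = parser.meta.get("og:title") or parser.title.strip()
    return {
        "title": title or None,
        "description": parser.meta.get("og:description") or None,
    }
```

If `title` comes back as `None`, the step falls through to asking the user for the article title. The `description` value is returned raw; deciding whether it holds the author or account name is left to the caller.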
### Step 3 — Search for mirrors
```
web_search("<article title> <author/account name>")
```
**Mirror site priority** (ranked by content quality and reliability):
1. **53ai.com** — full content, re...
# wechat-article-extractor

Extract WeChat public account (微信公众号) articles to clean Markdown files with images and metadata.

## Problem

WeChat articles are notoriously difficult to archive:

- Direct scraping is blocked by bot detection (环境异常 CAPTCHA)
- `web_fetch` gets empty JavaScript-rendered shells
- Headless browsers trigger anti-bot measures

This skill works around these limitations by automatically finding mirror copies on aggregator sites, then extracting clean content.

## How It Works

1. Attempts direct fetch (works ~10% of the time)
2. If blocked, searches for mirror copies on aggregator sites (53ai.com, ofweek.com, juejin.cn, etc.)
3. Downloads mirror HTML and extracts article content, images, and metadata
4. Outputs clean Markdown with proper formatting

Falls back to Chrome Extension Relay for very new or niche articles with no mirrors.

## Installation

Copy the skill directory to your OpenClaw skills folder:

```bash
cp -r wechat-article-extractor ~/.openclaw/<workspace>/skills/
```

### Requirements

- Python 3.8+
- `curl` (for downloading mirror pages)
- OpenClaw tools: `web_fetch`, `web_search`, `exec`
- Optional: `browser` tool (for Chrome Relay fallback)

## Usage

Share a WeChat article URL with your agent:

> "Save this article: https://mp.weixin.qq.com/s/example123"

The skill triggers automatically on `mp.weixin.qq.com` URLs.

### Trigger Phrases

- Any `mp.weixin.qq.com` URL
- "extract wechat article"
- "save wechat article"
- "archive wechat"
- "提取公众号文章"
- "保存公众号文章"

## Output Format

```markdown
# Article Title

**作者:** Author Name
**来源:** 微信公众号「Account Name」
**日期:** 2024-01-15
**原文:** https://mp.weixin.qq.com/s/...

---

Full article content with images preserved...
```

## Extraction Script

The included Python script handles HTML-to-Markdown conversion:

```bash
# Extract from downloaded HTML
python3 scripts/extract_wechat.py article.html output.md

# With source URL for metadata
python3 scr...
```