TopRank Skills


web-fetcher

Smart web content fetcher - articles and videos from WeChat, Feishu, Bilibili, Zhihu, Toutiao, YouTube, etc. Triggers: '抓取文章', '下载网页', '保存文章', 'fetch URL', '下载视频', '抓取飞书文档', '抓取微信文章', '把这个链接内容保存下来', '下载B站视频', 'download video', 'scrape article'.

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

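Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the web-fetcher skill. If it is already installed, install the web-fetcher skill directly.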

Overview

Skill Key: alexxxiong/web-fetcher
Author: alexxxiong
Source Repo: openclaw/skills
Version: 0.1.1
Source Path: skills/alexxxiong/web-fetcher
Latest Commit SHA: a0fc2f53b1d67b89891227c8619eadb6b4837274

Extracted Content

SKILL.md excerpt

# Web Fetcher

Smart web content fetcher for Claude Code. Automatically detects the platform from a URL and uses the best strategy to fetch articles or download videos.

## Quick Start

```bash
# Fetch an article
python3 {SKILL_DIR}/fetcher.py "URL" -o ~/docs/

# Download a video
python3 {SKILL_DIR}/fetcher.py "https://b23.tv/xxx" -o ~/videos/

# Batch fetch from file
python3 {SKILL_DIR}/fetcher.py --urls-file urls.txt -o ~/docs/
```

## Install Dependencies

Install only what you need; dependencies are checked at runtime:

| Dependency | Purpose | Install |
|-----------|---------|---------|
| scrapling | Article fetching (HTTP + browser) | `pip install scrapling` |
| yt-dlp | Video download | `pip install yt-dlp` |
| camoufox | Anti-detection browser (Xiaohongshu, Weibo) | `pip install camoufox && python3 -m camoufox fetch` |
| html2text | HTML to Markdown conversion | `pip install html2text` |
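
The runtime check mentioned above can be pictured roughly as follows. This is a minimal sketch, not the skill's actual code: `INSTALL_HINTS` and `missing_dependencies` are hypothetical names, and it assumes `yt-dlp` is imported as `yt_dlp` while the other packages import under their own names.

```python
import importlib.util

# Import name -> install command, copied from the table above.
# Assumption: yt-dlp is imported as "yt_dlp".
INSTALL_HINTS = {
    "scrapling": "pip install scrapling",
    "yt_dlp": "pip install yt-dlp",
    "camoufox": "pip install camoufox && python3 -m camoufox fetch",
    "html2text": "pip install html2text",
}


def missing_dependencies(modules):
    """Return {module: install hint} for every module that cannot be found."""
    missing = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing[name] = INSTALL_HINTS.get(name, f"pip install {name}")
    return missing
```

Calling `missing_dependencies(["yt_dlp"])` just before a video download would let a tool print the matching install command instead of failing on an import error.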

## Smart Routing

The fetcher automatically detects the platform from the URL:

| Platform | Method | Notes |
|----------|--------|-------|
| mp.weixin.qq.com | scrapling | Extracts `data-src` images, handles SVG placeholders |
| *.feishu.cn | Virtual scroll | Collects all blocks via scrolling, downloads images with cookies |
| zhuanlan.zhihu.com | scrapling | `.Post-RichText` selector |
| www.zhihu.com | scrapling | `.RichContent` selector |
| www.toutiao.com | scrapling | Handles `toutiaoimg.com` base64 placeholders |
| www.xiaohongshu.com | camoufox | Anti-bot protection requires stealth browser |
| www.weibo.com | camoufox | Anti-bot protection requires stealth browser |
| bilibili.com / b23.tv | yt-dlp | Video download, supports quality selection |
| youtube.com / youtu.be | yt-dlp | Video download |
| douyin.com | yt-dlp | Video download |
| Unknown URLs | scrapling | Generic fetch with fallback tiers |
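
One way to read the table is as a host-suffix lookup with a generic fallback. The sketch below is only an illustration of that idea: `ROUTES` and `pick_method` are hypothetical names, and the method strings are borrowed from the values the `--method` option accepts (`scrapling`, `camoufox`, `ytdlp`, `feishu`). The real fetcher may match URLs differently.

```python
from urllib.parse import urlparse

# (host suffix, method) pairs transcribed from the routing table; first match wins.
ROUTES = [
    ("mp.weixin.qq.com", "scrapling"),
    ("feishu.cn", "feishu"),
    ("zhuanlan.zhihu.com", "scrapling"),
    ("www.zhihu.com", "scrapling"),
    ("www.toutiao.com", "scrapling"),
    ("www.xiaohongshu.com", "camoufox"),
    ("www.weibo.com", "camoufox"),
    ("bilibili.com", "ytdlp"),
    ("b23.tv", "ytdlp"),
    ("youtube.com", "ytdlp"),
    ("youtu.be", "ytdlp"),
    ("douyin.com", "ytdlp"),
]


def pick_method(url):
    """Return the fetch method for a URL, falling back to generic scrapling."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, method in ROUTES:
        # Match the host itself or any subdomain of it.
        if host == suffix or host.endswith("." + suffix):
            return method
    return "scrapling"
```

Note that this sketch also matches the bare host `feishu.cn`, which is slightly broader than the `*.feishu.cn` pattern in the table.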

## CLI Reference

```
python3 {SKILL_DIR}/fetcher.py [URL] [OPTIONS]

Arguments:
  url                    URL to fetch

Options:
  -o, --output DIR       Output directory (default:...
```

README excerpt

# Web Fetcher

Smart web content fetcher for Claude Code. Automatically detects the platform from a URL and uses the best strategy to fetch articles or download videos.

## Supported Platforms

| Platform | Type | Method |
|----------|------|--------|
| WeChat (mp.weixin.qq.com) | Article | scrapling + image extraction |
| Feishu (*.feishu.cn) | Article | Virtual scroll collection |
| Zhihu (zhuanlan/www) | Article | scrapling |
| Toutiao | Article | scrapling + image extraction |
| Xiaohongshu | Article | camoufox (anti-bot) |
| Weibo | Article | camoufox (anti-bot) |
| Bilibili / b23.tv | Video | yt-dlp |
| YouTube / youtu.be | Video | yt-dlp |
| Douyin | Video | yt-dlp |
| Any other URL | Article | scrapling (generic) |
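
The "image extraction" step for WeChat and Toutiao deals with lazy-loaded images, where the real URL sits in a `data-src` attribute and `src` holds a placeholder. A minimal sketch of that idea using only the standard library is shown here; `ImageSrcCollector` and `image_urls` are hypothetical names, and the skill itself may extract images through scrapling instead.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collect image URLs, preferring data-src over src."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # Lazy-loaded images keep the real URL in data-src.
        url = attr.get("data-src") or attr.get("src")
        # Skip inline placeholders such as data:image/svg+xml or base64 stubs.
        if url and not url.startswith("data:"):
            self.urls.append(url)


def image_urls(html):
    """Return the downloadable image URLs found in an HTML fragment."""
    collector = ImageSrcCollector()
    collector.feed(html)
    return collector.urls
```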

## Install

### As Claude Code Skill

```bash
git clone https://github.com/inspirai-store/web-fetcher ~/.claude/skills/web-fetcher
```

### Manual

```bash
git clone https://github.com/inspirai-store/web-fetcher
cd web-fetcher
```

## Quick Start

```bash
# Fetch a WeChat article
python3 fetcher.py "https://mp.weixin.qq.com/s/xxx" -o ~/docs/

# Download a Bilibili video
python3 fetcher.py "https://b23.tv/xxx" -o ~/videos/

# Fetch a Feishu document
python3 fetcher.py "https://xxx.feishu.cn/wiki/xxx" -o ~/docs/

# Batch fetch from file
python3 fetcher.py --urls-file urls.txt -o ~/docs/

# Extract audio only
python3 fetcher.py "https://b23.tv/xxx" -o ~/audio/ --audio-only
```
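
The file passed to `--urls-file` holds one URL per line, with `#` marking comments. A parser for that format could look like the sketch below. `read_urls` is a hypothetical helper, and it assumes blank lines are skipped and that only whole-line comments are supported, which the documentation does not spell out.

```python
def read_urls(lines):
    """Return the URLs from an iterable of lines, skipping blanks and # comments."""
    urls = []
    for line in lines:
        entry = line.strip()
        # Assumption: a comment occupies the whole line and starts with '#'.
        if not entry or entry.startswith("#"):
            continue
        urls.append(entry)
    return urls
```

With a file on disk, `with open("urls.txt") as f: urls = read_urls(f)` yields the list to fetch.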

## CLI Reference

```
python3 fetcher.py [URL] [OPTIONS]

Arguments:
  url                    URL to fetch

Options:
  -o, --output DIR       Output directory (default: current)
  -q, --quality N        Video quality: 1080, 720, 480 (default: 1080)
  --method METHOD        Force method: scrapling, camoufox, ytdlp, feishu
  --selector CSS         Force CSS selector for content extraction
  --urls-file FILE       File with URLs (one per line, # for comments)
  --audio-only           Extract audio only (video downloads)
  --no-images            Skip image download (articles)
  -...
```
