Overview
- Skill Key
- aurthes/aurthes-web-fetcher
- Author
- aurthes
- Source Repo
- openclaw/skills
- Version
- 1.2.0
- Source Path
- skills/aurthes/aurthes-web-fetcher
- Latest Commit SHA
- dd9ed0dbcba39fa67f056668c8d4daaafd2e9316
Fetch web pages and extract readable content for AI use. Use when reading, summarizing, or crawling a specific URL or small set of URLs. Prefer low-friction URL-to-Markdown services first, then fall back to browser-based retrieval, search snippets, or cached/indexed copies when sites are protected by Cloudflare or similar bot checks.
Stars
0
Installs
0
Status
ACTIVE
Visibility
PUBLIC
直接复制以下提示词,发送给你的 AI 助手即可完成安装。
请先检查是否已安装 SkillHub 商店,若未安装,请根据 https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md 安装 SkillHub 商店,然后安装 web-fetcher 技能。 若已安装,则直接安装 web-fetcher 技能。
# Web Fetcher
Fetch readable web content with a reliability-first fallback chain.
## Core rule
Do **not** promise direct access to every site. Some sites use Cloudflare, login walls, bot detection, or legal restrictions. In those cases, switch to the next fallback instead of insisting the first method should work.
## Preferred fetch order
### 1) Direct readable fetch
Try lightweight conversion services first:
1. **r.jina.ai**
```
https://r.jina.ai/http://example.com
```
2. **markdown.new**
```
https://markdown.new/https://example.com
```
3. **defuddle**
```
https://defuddle.md/https://example.com
```
For deterministic retries, use the bundled script:
```bash
python {baseDir}/scripts/fetch_url.py "https://example.com/article"
```
The script returns JSON with:
- chosen method
- attempt history
- blocked/thin-content detection
- final content when successful
Use these when the user wants article text, page summaries, or structured extraction from normal public pages.
### 2) Detect failure modes early
Treat the fetch as failed or unreliable if you see signs like:
- `Just a moment...`
- `Performing security verification`
- `Enable JavaScript and cookies`
- CAPTCHA / challenge pages
- login wall instead of target content
- obvious truncation / missing article body
When this happens, **stop treating the result as the page content**.
### 3) Browser fallback for protected sites
For sites blocked behind Cloudflare or requiring real browser execution:
- Prefer a real browser session via OpenClaw browser tools when available.
- If the user is using the Chrome relay/extension, ask them to attach the tab and then inspect the live rendered page.
- Snapshot the page and extract only the needed fields.
Use browser fallback for:
- JS-heavy pages
- Cloudflare-protected pages
- sites that render key content after load
- pages where the direct markdown services return verification screens
### 4) Search / indexed fallback
If direct fetch...
youmind-openlab
AI skill for OpenClaw & Claude Code — recommend from 10000+ Nano Banana Pro (Gemini) image prompts. Smart search by use case, content remix, sample images.
23blocks-os
AI Agent Orchestrator with Skills System - Give AI Agents superpowers: memory search, code graph queries, agent-to-agent messaging. Manage Claude, Codex or any AI Agent from one dashboard. Move Agents between computers and locations
hashgraph-online
AI agent skills for the Universal Registry - search, chat, and register 72,000+ agents across 14+ protocols. Works with Claude, Codex, Cursor, OpenClaw, and any AI assistant.
rito-w
A cross-platform skills manager for AI IDEs. Search marketplace, download locally, and install to Claude, Cursor, Windsurf, and more with one click.
besoeasy
Battle-tested skill library for AI agents. Save 98% of API costs with ready-to-use code for crypto, PDFs, search, web scraping & more. No trial-and-error, no expensive APIs.
zeropointrepo
YouTube Transcript API skills for AI agents. Get transcripts, search videos, browse channels. Works with OpenClaw, ClawdBot, Claude Code, Cursor, Windsurf.