TopRank Skills


scrapling-mcp

Advanced web scraping with Scrapling: MCP-native guidance for extraction, crawling, and anti-bot handling. For execution, call the `scrapling` MCP server through mcporter; this skill supplies strategy, recipes, and best practices.

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

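Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install the SkillHub store by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the scrapling-mcp skill. If it is already installed, install the scrapling-mcp skill directly.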

Overview

Skill Key: devbd1/scrapling-web-scraping
Author: devbd1
Source Repo: openclaw/skills
Version: -
Source Path: skills/devbd1/scrapling-web-scraping
Latest Commit SHA: 20ebe1179834baa3ecd7dcef1993ced66afb8941

Extracted Content

SKILL.md excerpt

# Scrapling MCP — Web Scraping Guidance

> **Guidance Layer + MCP Integration**  
> Use this skill for **strategy and patterns**. For execution, call Scrapling's MCP server via `mcporter`.

## Quick Start (MCP)

### 1. Install Scrapling with MCP support
```bash
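# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install "scrapling[mcp]"
# Or, for full features (browser-based fetchers):
pip install "scrapling[mcp,playwright]"
python -m playwright install chromium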
```

### 2. Add to OpenClaw MCP config
```json
{
  "mcpServers": {
    "scrapling": {
      "command": "python",
      "args": ["-m", "scrapling.mcp"]
    }
  }
}
```
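
If an MCP config already exists, the entry above has to be merged in without overwriting other servers. A minimal sketch of that merge, assuming the config is plain JSON with a top-level `mcpServers` object as shown; the helper name `add_scrapling_server` and the `other` server are hypothetical, and where OpenClaw stores the file is not covered here:

```python
import json


def add_scrapling_server(config: dict) -> dict:
    """Return a copy of `config` with the scrapling server entry added."""
    # Deep-copy through a JSON round-trip so the caller's dict is untouched.
    merged = json.loads(json.dumps(config))
    servers = merged.setdefault("mcpServers", {})
    # Same command and args as the config snippet in step 2.
    servers["scrapling"] = {"command": "python", "args": ["-m", "scrapling.mcp"]}
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_scrapling_server(existing)
print(json.dumps(updated, indent=2))
```

Writing the result back to disk is left out on purpose, since the config location depends on the installation.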

### 3. Call via mcporter
```bash
mcporter call scrapling fetch_page --url "https://example.com"
```
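
When the call is driven from a script rather than typed by hand, the same command line can be assembled programmatically. The sketch below assumes only the `mcporter call scrapling <tool> --<param> <value>` shape shown above; `build_mcporter_command` and `run_tool` are hypothetical helper names, and `run_tool` requires `mcporter` to be on the `PATH`:

```python
import shlex
import subprocess


def build_mcporter_command(tool: str, **params: str) -> list[str]:
    """Build the argv list for `mcporter call scrapling <tool> --key value ...`."""
    cmd = ["mcporter", "call", "scrapling", tool]
    for name, value in params.items():
        cmd += [f"--{name}", value]
    return cmd


def run_tool(tool: str, **params: str) -> str:
    """Run the command and return stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_mcporter_command(tool, **params),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


cmd = build_mcporter_command("fetch_page", url="https://example.com")
print(shlex.join(cmd))  # shell-safe rendering of the argv list
```

Passing an argv list to `subprocess.run` instead of a shell string avoids quoting problems with URLs that contain `&` or `?`.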

## Execution vs Guidance

| Task | Tool | Example |
|------|------|---------|
| Fetch a page | **mcporter** | `mcporter call scrapling fetch_page --url URL` |
| Extract with CSS | **mcporter** | `mcporter call scrapling css_select --selector ".title::text"` |
| Which fetcher to use? | **This skill** | See "Fetcher Selection Guide" below |
| Anti-bot strategy? | **This skill** | See "Anti-Bot Escalation Ladder" |
| Complex crawl patterns? | **This skill** | See "Spider Recipes" |

## Fetcher Selection Guide

```
┌─────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│   Fetcher       │────▶│ DynamicFetcher   │────▶│ StealthyFetcher  │
│   (HTTP)        │     │ (Browser/JS)     │     │ (Anti-bot)       │
└─────────────────┘     └──────────────────┘     └──────────────────┘
     Fastest              JS-rendered               Cloudflare, 
     Static pages         SPAs, React/Vue          Turnstile, etc.
```

### Decision Tree
1. **Static HTML?** → `Fetcher` (plain HTTP, roughly 10-100x faster than a browser-based fetcher)
2. **Need JS execution?** → `DynamicFetcher`
3. **Getting blocked?** → `StealthyFetcher`
4. **Complex session?** → Use Session variants
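
The decision tree can be written as a small function. This is an illustrative sketch, not part of Scrapling: `choose_fetcher` and `FetcherChoice` are hypothetical names, the fetcher names are returned as strings, and the session question is reported as a flag rather than mapped to a specific class:

```python
from typing import NamedTuple


class FetcherChoice(NamedTuple):
    fetcher: str        # "Fetcher", "DynamicFetcher", or "StealthyFetcher"
    use_session: bool   # True when a Session variant is the better fit


def choose_fetcher(
    needs_js: bool = False,
    blocked: bool = False,
    complex_session: bool = False,
) -> FetcherChoice:
    """Apply the decision tree: blocking outranks JS, JS outranks plain HTTP."""
    if blocked:
        name = "StealthyFetcher"
    elif needs_js:
        name = "DynamicFetcher"
    else:
        name = "Fetcher"
    return FetcherChoice(name, complex_session)


print(choose_fetcher())                                # static page
print(choose_fetcher(needs_js=True))                   # SPA
print(choose_fetcher(blocked=True, complex_session=True))
```

Checking `blocked` first reflects the ordering in the diagram: the stealthy fetcher sits at the end of the chain, so it is chosen whenever blocking is observed, whether or not the page also needs JavaScript.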

### MCP Fetch Modes
- `fetch_page` — HTTP fetcher
- `fetch_dynamic` — Browser-based with Playwright
- `fetch_stealthy` — Anti-bot bypass mode
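
Because the three modes are ordered from cheapest to most capable, a caller can try them in sequence and stop at the first one that returns content. A minimal sketch under stated assumptions: the tool names are the ones listed above, `call_tool` is an injected callable (hypothetical) that returns the page body or `None` when the fetch fails or is blocked, and `fake_call` stands in for a real MCP call:

```python
from typing import Callable, Optional

# Cheapest first, matching the order of the list above.
MCP_TOOLS = ["fetch_page", "fetch_dynamic", "fetch_stealthy"]


def fetch_with_escalation(
    url: str,
    call_tool: Callable[[str, str], Optional[str]],
) -> tuple[str, str]:
    """Try each MCP tool in order; return (tool_name, body) for the first success."""
    for tool in MCP_TOOLS:
        body = call_tool(tool, url)
        if body:  # None or an empty string counts as a failed attempt
            return tool, body
    raise RuntimeError(f"all fetch modes failed for {url}")


def fake_call(tool: str, url: str) -> Optional[str]:
    """Stand-in for a real MCP call: only the browser-based modes succeed."""
    return None if tool == "fetch_page" else f"<html>{tool} ok</html>"


tool_used, body = fetch_with_escalation("https://example.com", fake_call)
print(tool_used, body)
```

Injecting `call_tool` keeps the escalation logic testable without a running MCP server; in practice it would wrap whatever client issues the `mcporter` call.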

## Anti-Bot Escalation Ladder

### Level 1: Polite HTTP
```python
...
```
