Overview
- Skill Key: cryptos3c/openclaw-scrapling
- Author: cryptos3c
- Source Repo: openclaw/skills
- Version: 1.0.0
- Source Path: skills/cryptos3c/openclaw-scrapling
- Latest Commit SHA: 7a9a8a23a41e360bb7dc56415757d1f51bfa48f9
Advanced web scraping with anti-bot bypass, JavaScript support, and adaptive selectors. Use when scraping websites with Cloudflare protection, dynamic content, or frequent UI changes.
Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC
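Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the scrapling skill. If it is already installed, install the scrapling skill directly.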
# Scrapling Web Scraping Skill

Use Scrapling to scrape modern websites, including those with anti-bot protection, JavaScript-rendered content, and adaptive element tracking.

## When to Use This Skill

- User asks to scrape a website or extract data from a URL
- Need to bypass Cloudflare, bot detection, or anti-scraping measures
- Need to handle JavaScript-rendered/dynamic content (React, Vue, etc.)
- Website requires login or session management
- Website structure changes frequently (adaptive selectors)
- Need to scrape multiple pages with rate limiting

## Commands

All commands use the `scrape.py` script in this skill's directory.

### Basic HTTP Scraping (Fast)

```bash
python scrape.py \
  --url "https://example.com" \
  --selector ".product" \
  --output products.json
```

**Use when:** Static HTML, no JavaScript, no bot protection

### Stealth Mode (Bypass Anti-Bot)

```bash
python scrape.py \
  --url "https://nopecha.com/demo/cloudflare" \
  --stealth \
  --selector "#content" \
  --output data.json
```

**Use when:** Cloudflare protection, bot detection, fingerprinting

**Features:**

- Bypasses Cloudflare Turnstile automatically
- Browser fingerprint spoofing
- Headless browser mode

### Dynamic/JavaScript Content

```bash
python scrape.py \
  --url "https://spa-website.com" \
  --dynamic \
  --selector ".loaded-content" \
  --wait-for ".loaded-content" \
  --output data.json
```

**Use when:** React/Vue/Angular apps, lazy-loaded content, AJAX

**Features:**

- Full Playwright browser automation
- Wait for elements to load
- Network idle detection

### Adaptive Selectors (Survives Website Changes)

```bash
# First time - save the selector pattern
python scrape.py \
  --url "https://example.com" \
  --selector ".product-card" \
  --adaptive-save \
  --output products.json

# Later, if website structure changes
python scrape.py \
  --url "https://example.com" \
  --adaptive \
  --output products.json
```

**Use when:** Website frequently redesigns, need robus...
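
The command modes documented above differ only in which flags are passed to `scrape.py`. The following is a minimal sketch of how an agent might assemble those commands programmatically. The helper `build_scrape_command`, its parameters, and the mode labels `"http"`, `"stealth"`, and `"dynamic"` are hypothetical names introduced here for illustration; only the flags themselves (`--url`, `--selector`, `--stealth`, `--dynamic`, `--wait-for`, `--adaptive-save`, `--adaptive`, `--output`) come from the documentation above. Whether the script accepts every combination of these flags is not confirmed by the source.

```python
import shlex


def build_scrape_command(url, selector=None, mode="http", wait_for=None,
                         adaptive=None, output=None):
    """Build an argv list for scrape.py (hypothetical helper, not part of the skill).

    mode:     "http" (default), "stealth", or "dynamic" -- labels assumed here.
    adaptive: None, "save" (adds --adaptive-save), or "reuse" (adds --adaptive).
    """
    if mode not in ("http", "stealth", "dynamic"):
        raise ValueError(f"unknown mode: {mode!r}")
    if adaptive not in (None, "save", "reuse"):
        raise ValueError(f"unknown adaptive setting: {adaptive!r}")

    cmd = ["python", "scrape.py", "--url", url]

    # Plain HTTP mode needs no extra flag; the other modes add one each.
    if mode == "stealth":
        cmd.append("--stealth")
    elif mode == "dynamic":
        cmd.append("--dynamic")

    # The documented adaptive re-run omits --selector, so it stays optional.
    if selector is not None:
        cmd += ["--selector", selector]
    if wait_for is not None:
        cmd += ["--wait-for", wait_for]

    if adaptive == "save":
        cmd.append("--adaptive-save")
    elif adaptive == "reuse":
        cmd.append("--adaptive")

    if output is not None:
        cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    basic = build_scrape_command("https://example.com", selector=".product",
                                 output="products.json")
    # shlex.join renders the list as a shell-safe command line for logging.
    print(shlex.join(basic))
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.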
# Scrapling Web Scraping Skill

Advanced web scraping for OpenClaw with anti-bot bypass and adaptive selectors.

## Features

✅ **Anti-Bot Bypass** - Automatically handles Cloudflare Turnstile, bot detection
✅ **JavaScript Support** - Scrape React, Vue, Angular apps with full browser automation
✅ **Adaptive Selectors** - Elements auto-relocate when websites redesign
✅ **Session Management** - Persistent cookies, login state across requests
✅ **Multiple Modes** - HTTP (fast), Stealth (anti-bot), Dynamic (full browser)
✅ **Flexible Output** - JSON, JSONL, CSV, Markdown, plain text

## Quick Start

### Install Skill

Via OpenClaw Gateway UI:

1. Navigate to Skills section
2. Click "Install Skill"
3. Select or upload `scrapling` skill
4. Wait for dependencies to install (~2-5 minutes for browsers)

Via CLI:

```bash
# Install dependencies
cd ~/.openclaw/skills/scrapling
pip install -r requirements.txt
scrapling install  # Downloads browsers (~500MB)
```

### Basic Usage

```bash
# Scrape a static site
python scrape.py --url "https://example.com" --selector ".product" --output products.json

# Bypass anti-bot protection
python scrape.py --url "https://protected-site.com" --stealth --selector ".content"

# Scrape JavaScript-rendered content
python scrape.py --url "https://spa-app.com" --dynamic --selector ".item"

# Adaptive mode (survives website changes)
python scrape.py --url "https://site.com" --selector ".product" --adaptive-save

# Later, even if site redesigns:
python scrape.py --url "https://site.com" --adaptive
```

## Examples

Check the `examples/` directory for:

- `basic.py` - Simple HTTP scraping
- `stealth.py` - Cloudflare bypass example
- `dynamic.py` - JavaScript-heavy sites
- `adaptive.py` - Adaptive selector demo

## Documentation

Full documentation in `SKILL.md` including:

- All command-line options
- Selector types (CSS, XPath)
- Output formats
- Session management
- Troubleshooting guide

## Requirements

- Python 3.10+
- ~500MB disk s...
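
Before running `scrapling install`, it can be useful to check the two requirements the README states explicitly: Python 3.10 or newer, and room for the roughly 500MB browser download. The sketch below is one way to do that with the standard library. The function `preflight`, its parameters, and the exact byte threshold (500 MiB) are assumptions made for illustration; the requirements list in the source is truncated, so any further requirements are not covered here.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)                        # "Python 3.10+" from the README
BROWSER_DOWNLOAD_BYTES = 500 * 1024 * 1024  # "~500MB"; exact figure is an assumption


def preflight(path=".", version=None, free_bytes=None):
    """Return a list of human-readable problems; an empty list means OK.

    version and free_bytes can be injected for testing; by default they are
    read from the running interpreter and the filesystem containing `path`.
    """
    if version is None:
        version = tuple(sys.version_info[:2])
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free

    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    if free_bytes < BROWSER_DOWNLOAD_BYTES:
        problems.append(
            f"about {BROWSER_DOWNLOAD_BYTES // (1024 * 1024)} MB free space "
            f"needed, found {free_bytes // (1024 * 1024)} MB"
        )
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks ready for installation.")
```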
openbotx
An open-source platform for orchestrating AI agents — secure, simple, and built for everyone. Multi-agent, real-time task board, web control panel, skills system, browser automation, multi-provider, scheduler, and more. One command to start. Everything from the browser. No coding required.
heyixuan2
Bambu Lab 3D printer control and automation. Activate when user mentions: printer status, 3D printing, slice, analyze model, generate 3D, AMS filament, print monitor, Bambu Lab, or any 3D printing task. Full pipeline: search → generate → analyze → colorize → preview → open BS → user slice → print → monitor. Supports all 9 Bambu Lab printers (A1 Mini, A1, P1S, P2S, X1C, X1E, H2C, H2S, H2D).
sealiu1997
OpenClaw skill for 知识星球 / ZSXQ digests with token-first auth and browser recovery.
openstockdata
OpenClaw Skill for stock data analysis
abczsl520
🌐 OpenClaw skill for Browser-Use — AI-powered browser automation for complex multi-step workflows (login, form filling, scraping, posting)
ashemag
OpenClaw skill to crosspost X/Twitter posts to Reddit via browser automation