Tophot Chinese

Skill: Today's Hot List Crawler and Content Fetcher (TopHub Spider)

Status: ACTIVE
Visibility: PUBLIC

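Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the Tophot Chinese skill. If it is already installed, install the Tophot Chinese skill directly.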

Overview

Skill Key: hanqiudeng/tophot-chinese
Author: hanqiudeng
Source Repo: openclaw/skills
Version: -
Source Path: skills/hanqiudeng/tophot-chinese
Latest Commit SHA: ae5fd01a3aa2df2a6e198051b7ac91a92105c10c

Extracted Content

SKILL.md excerpt

# Skill: Today's Hot List Crawler and Content Fetcher (TopHub Spider)

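## 1. Prerequisites and Installation
If the required dependencies are not yet installed on the system, the Agent should set up the environment as follows:

1. **Install the core libraries**: `pip install crawl4ai requests tqdm pypinyin` (use `pip3` if `pip` is unavailable)
2. **Initialize the browser engine**: `python -m playwright install chromium` (use `python3` if `python` is unavailable)

> **Note**: On some platforms, `python` in the commands below may need to be replaced with `python3`. The Agent should determine which one the current environment uses.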
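
The environment check described above can be sketched in Python. This is a minimal sketch and not part of the skill itself: the helper names `missing_modules` and `pick_python` are hypothetical, and it assumes each package's import name matches its pip name, which holds for the four packages listed above.

```python
import importlib.util
import shutil

# Packages named in the install command above.
# Assumption: the import name equals the pip package name.
REQUIRED = ["crawl4ai", "requests", "tqdm", "pypinyin"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pick_python(candidates=("python", "python3")):
    """Return the first interpreter name found on PATH, or None if none is found."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


if __name__ == "__main__":
    print("missing modules:", missing_modules(REQUIRED))
    print("interpreter:", pick_python())
```

If `missing_modules(REQUIRED)` returns a non-empty list, the Agent would run the `pip install` step. The name returned by `pick_python()` would replace `python` in the later commands.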

---

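## 2. Skill Description

This skill consists of two scripts with separate responsibilities:

| Script | Responsibility | Description |
| --- | --- | --- |
| `{baseDir}/scripts/tophub_spider.py` | **Fetch the hot list** | Pulls the hot list from tophub.today and generates `{title}.json` files (containing only title, description, and url; no body text is fetched) |
| `{baseDir}/scripts/fetch_site_content.py` | **Fetch body content** | Reads the url from each JSON file, fetches the page body as Markdown with crawl4ai, and saves it |

**Typical workflow**:
1. Run `{baseDir}/scripts/tophub_spider.py` to pull a site's hot list, which generates `{output_path}/{site_pinyin}/{title}.json`
2. Run `{baseDir}/scripts/fetch_site_content.py` on a single file or on an entire directory to fetch the body content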
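
To make the file layout from the workflow above concrete, here is a minimal sketch of how one list entry might be written to disk. It is an illustration under stated assumptions, not the code in `tophub_spider.py`: the function `save_entry`, the directory name `zhihu`, and the replacement of `/` in titles are all hypothetical. The pinyin directory name is passed in rather than computed.

```python
import json
import tempfile
from pathlib import Path


def save_entry(output_dir, site_pinyin, entry):
    """Write one hot-list entry to {output_dir}/{site_pinyin}/{title}.json."""
    # Assumption: '/' is replaced so the title can serve as a file name.
    safe_title = entry["title"].replace("/", "_")
    target = Path(output_dir) / site_pinyin / f"{safe_title}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Keep only the three fields the list step is documented to produce.
    record = {key: entry[key] for key in ("title", "description", "url")}
    target.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = save_entry(
            tmp,
            "zhihu",  # hypothetical pinyin directory name
            {"title": "Example", "description": "demo", "url": "https://example.com"},
        )
        print(path.relative_to(tmp))
```

The files contain no `content` field at this stage. In the workflow, the second script adds it later.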

---

## 3. Core Commands

### 3.1 {baseDir}/scripts/tophub_spider.py: Fetch the Hot List

Fetch the full hot list for a given site:
```bash
python {baseDir}/scripts/tophub_spider.py <site_name>
```

Fetch only the top N entries:
```bash
python {baseDir}/scripts/tophub_spider.py <site_name> --top <count>
```

Specify the output path:
```bash
python {baseDir}/scripts/tophub_spider.py <site_name> --output <path> --top <count>
```

List all available sites:
```bash
python {baseDir}/scripts/tophub_spider.py
```
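
An Agent that calls the script programmatically could assemble the four command variants above from one helper. The following is a minimal sketch: `build_spider_command` is a hypothetical name, and only the `--top` and `--output` flags shown above are used.

```python
import sys


def build_spider_command(script_path, site=None, top=None, output=None,
                         python=sys.executable):
    """Build the argv list for tophub_spider.py from the options shown above."""
    cmd = [python, script_path]
    if site is None:
        # No site name: per the commands above, the script lists available sites.
        return cmd
    cmd.append(site)
    if output is not None:
        cmd += ["--output", str(output)]
    if top is not None:
        cmd += ["--top", str(top)]
    return cmd


if __name__ == "__main__":
    # "zhihu" is a placeholder site name used only for illustration.
    print(build_spider_command("scripts/tophub_spider.py", "zhihu", top=10))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building a list rather than a shell string avoids quoting problems when the site name or path contains spaces or non-ASCII characters.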

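### 3.2 {baseDir}/scripts/fetch_site_content.py: Fetch Body Content

Fetch a single file (the result is written back to the `content` field of the original file):
```bash
python {baseDir}/scripts/fetch_site_content.py <file.json>
```

Fetch a single file (the result is saved to a specified file):
```bash
python {baseDir}/scripts/fetch_site_content.py <file.json> --output <output_file.json>
```

Batch fetch every file in a directory (each result is written back to its original file):
```bash
python {baseDir}/scripts/fetch_site_content.py <directory_path>
```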
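
The write-back behaviour described above can be sketched as follows. This is an assumption-laden illustration, not the code in `fetch_site_content.py`: `fill_content` and `fill_directory` are hypothetical names, the fetcher is passed in as a plain callable (the real script uses crawl4ai), and the directory scan is assumed to be non-recursive.

```python
import json
import tempfile
from pathlib import Path


def fill_content(json_path, fetch, output_path=None):
    """Read the url from an entry file, fetch its body, store it under "content"."""
    src = Path(json_path)
    record = json.loads(src.read_text(encoding="utf-8"))
    record["content"] = fetch(record["url"])
    # Default: write back to the original file, as in the first command above.
    dest = Path(output_path) if output_path else src
    dest.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return dest


def fill_directory(directory, fetch):
    """Apply fill_content to every .json file directly inside `directory`."""
    return [fill_content(p, fetch) for p in sorted(Path(directory).glob("*.json"))]


if __name__ == "__main__":
    def fake_fetch(url):
        # Stand-in for a real crawler call; returns Markdown-like text.
        return f"# Body of {url}"

    with tempfile.TemporaryDirectory() as tmp:
        entry = Path(tmp) / "example.json"
        entry.write_text(json.dumps({"title": "t", "description": "d",
                                     "url": "https://example.com"}), encoding="utf-8")
        fill_directory(tmp, fake_fetch)
        print(json.loads(entry.read_text(encoding="utf-8"))["content"])
```

Passing the fetcher as a parameter keeps the file handling testable without network access or a browser engine.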

Batch fetch and save the results to another directory:
```bash
python {baseDir}/s...
```
