Overview
- Skill Key: hanqiudeng/tophot-chinese
- Author: hanqiudeng
- Source Repo: openclaw/skills
- Version: -
- Source Path: skills/hanqiudeng/tophot-chinese
- Latest Commit SHA: ae5fd01a3aa2df2a6e198051b7ac91a92105c10c
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
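Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the Tophot Chinese skill. If it is already installed, install the Tophot Chinese skill directly.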
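# Skill: Today's Hot List Crawler and Content Fetcher (TopHub Spider)
## 1. Prerequisites and Installation
If the dependencies are not yet installed on the system, the Agent should set up the environment with the following steps:
1. **Install the core libraries**: `pip install crawl4ai requests tqdm pypinyin` (use `pip3` if `pip` is unavailable)
2. **Initialize the browser engine**: `python -m playwright install chromium` (use `python3` if `python` is unavailable)
> **Note**: On some platforms, `python` in the commands below may need to be replaced with `python3`. The Agent should determine which one the current environment uses.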
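Before running the install commands, an Agent might first check which packages are already importable. The following is a minimal sketch, assuming each package's import name matches the pip name listed above; the `missing_modules` helper is hypothetical and not part of the skill.

```python
import importlib.util

# Assumption: the import names are identical to the pip package names above.
REQUIRED_MODULES = ["crawl4ai", "requests", "tqdm", "pypinyin"]


def missing_modules(modules):
    """Return the names from `modules` that cannot be found for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        # Print a ready-to-copy install command for only the missing packages.
        print("Missing packages. Try: pip install " + " ".join(missing))
    else:
        print("All core packages are importable.")
```

This check only covers the Python packages; the Chromium browser still has to be installed with the Playwright command from step 2.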
---
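## 2. Skill Description
This skill consists of two scripts with separate responsibilities:
| Script | Responsibility | Description |
| --- | --- | --- |
| `{baseDir}/scripts/tophub_spider.py` | **Fetch the hot list** | Pulls the hot list from tophub.today and generates `{title}.json` files (containing only title, description, and url; article content is not fetched) |
| `{baseDir}/scripts/fetch_site_content.py` | **Fetch article content** | Reads the url from a JSON file, fetches the article as Markdown with crawl4ai, and saves it |
**Typical workflow**:
1. Use `{baseDir}/scripts/tophub_spider.py` to pull a site's hot list, which generates `{save_path}/{site_pinyin}/{title}.json`
2. Use `{baseDir}/scripts/fetch_site_content.py` to fetch article content for a single file or an entire directory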
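The two-step workflow above could be driven from Python roughly as follows. This is a sketch under stated assumptions, not the skill's own tooling: the helper names are hypothetical, the `--top` and `--output` flags are the ones documented in section 3.1, and the caller must supply the site directory because the way the spider derives the pinyin folder name is not specified here.

```python
import subprocess
import sys
from pathlib import Path


def spider_command(base_dir, site, top=None, output=None):
    """Build the argument list for tophub_spider.py (hypothetical helper)."""
    script = Path(base_dir) / "scripts" / "tophub_spider.py"
    cmd = [sys.executable, str(script), site]
    if output is not None:
        cmd += ["--output", str(output)]
    if top is not None:
        cmd += ["--top", str(top)]
    return cmd


def fetch_command(base_dir, target):
    """Build the argument list for fetch_site_content.py (hypothetical helper)."""
    script = Path(base_dir) / "scripts" / "fetch_site_content.py"
    return [sys.executable, str(script), str(target)]


def run_workflow(base_dir, site, site_dir, top=None, output=None):
    """Step 1: pull the hot list. Step 2: fetch content for the site directory.

    `site_dir` is the directory the spider wrote its JSON files to; it is
    passed in explicitly rather than guessed.
    """
    subprocess.run(spider_command(base_dir, site, top, output), check=True)
    subprocess.run(fetch_command(base_dir, site_dir), check=True)
```

Using `sys.executable` sidesteps the `python` versus `python3` question, since the scripts run under the same interpreter as the caller.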
---
## 3. Core Commands
### 3.1 {baseDir}/scripts/tophub_spider.py: Fetch the hot list
Fetch the full hot list for a given site:
```bash
python {baseDir}/scripts/tophub_spider.py <site_name>
```
Fetch the top N entries:
```bash
python {baseDir}/scripts/tophub_spider.py <site_name> --top <count>
```
Specify the save path:
```bash
python {baseDir}/scripts/tophub_spider.py <site_name> --output <path> --top <count>
```
List all available sites:
```bash
python {baseDir}/scripts/tophub_spider.py
```
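### 3.2 {baseDir}/scripts/fetch_site_content.py: Fetch article content
Fetch a single file (the result is written back to the `content` field of the original file):
```bash
python {baseDir}/scripts/fetch_site_content.py <file.json>
```
Fetch a single file (the result is saved to a specified file):
```bash
python {baseDir}/scripts/fetch_site_content.py <file.json> --output <output_file.json>
```
Fetch every file in a directory (results are written back to each original file):
```bash
python {baseDir}/scripts/fetch_site_content.py <directory_path>
```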
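After a batch run, it can be useful to see which files still have no article content. The sketch below assumes each JSON file holds a single object with `title`, `description`, and `url`, plus a `content` field once fetching succeeds, as described above; the `files_missing_content` helper is hypothetical and not part of the skill.

```python
import json
from pathlib import Path


def files_missing_content(directory):
    """Return the JSON files in `directory` that lack a non-empty `content` field."""
    pending = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            record = json.load(fh)
        # Assumption: one JSON object per file; other shapes are skipped.
        if isinstance(record, dict) and not record.get("content"):
            pending.append(path)
    return pending


if __name__ == "__main__":
    # "output/zhihu" is a placeholder directory used only for illustration.
    for path in files_missing_content("output/zhihu"):
        print("No content yet:", path.name)
```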
Fetch every file in a directory and save the results to another directory:
```bash
python {baseDir}/s...
```