Overview
- Skill Key: alexfeng75/rag-system-builder
- Author: alexfeng75
- Source Repo: openclaw/skills
- Version: -
- Source Path: skills/alexfeng75/rag-system-builder
- Latest Commit SHA: e7a0d5eab8134a6584af6eb3cbfbbd16714617c0
Build and deploy local RAG (Retrieval-Augmented Generation) systems with offline document processing, embedding models, and vector storage.
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
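Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the rag-system-builder skill. If it is already installed, install the rag-system-builder skill directly.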
# RAG System Builder Skill
Build complete local RAG systems that work offline with document ingestion, semantic search, and AI-powered Q&A.
## 🎯 What This Skill Does
This skill guides you through building a complete RAG system that:
- **Ingests documents** from multiple formats (TXT, PDF, DOCX, MD, HTML, JSON, XML)
- **Generates embeddings** using sentence-transformers (offline, no API needed)
- **Stores vectors** locally using FAISS for fast similarity search
- **Provides a Q&A interface** through a CLI and a web interface
- **Works completely offline** - no external API calls required
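To make the "similarity search" step concrete, here is a minimal pure-Python sketch of top-k retrieval by cosine similarity. It is a brute-force stand-in for what a FAISS index does far more efficiently, not the skill's own code; the function names `cosine_similarity` and `top_k` and the toy 3-dimensional vectors are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          vectors: Sequence[Sequence[float]],
          k: int = 5) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical data); real sentence embeddings have
# hundreds of dimensions.
doc_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
results = top_k([1.0, 0.1, 0.0], doc_vectors, k=2)
for index, score in results:
    print(f"doc {index}: similarity {score:.3f}")
```

Each returned index maps back to the stored text chunk, which is then shown to the user or passed to the Q&A step.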
## 📦 Prerequisites
```bash
# Python 3.8+ required
python --version
# Install dependencies
pip install sentence-transformers faiss-cpu click flask
```
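After installing, a quick check that the packages are importable can save debugging time later. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical, and the module names are the import names of the packages installed above (`faiss-cpu` is imported as `faiss`).

```python
import importlib.util
import sys
from typing import Dict, Sequence

# Import names for the pip packages listed in the prerequisites.
REQUIRED_MODULES = ("sentence_transformers", "faiss", "click", "flask")


def check_environment(modules: Sequence[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Map each module name to True if it can be found on the import path."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        print("Python 3.8+ is required, found", sys.version.split()[0])
    for name, available in check_environment().items():
        print(f"{name}: {'ok' if available else 'MISSING'}")
```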
## 🚀 Quick Start
### 1. Create Project Structure
```bash
# Create project directory
mkdir rag-system
cd rag-system
# Create main files
touch rag.py embeddings.py vector_store.py retriever.py config.py
```
### 2. Download Embedding Model
```bash
# Download sentence-transformers model locally
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='sentence-transformers/all-MiniLM-L6-v2', local_dir='./models/all-MiniLM-L6-v2')"
```
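Once the download finishes, the model can be loaded from the local directory instead of the Hugging Face Hub. The following is a sketch under stated assumptions: the helper names `local_model_ready` and `load_local_model` are hypothetical, and setting `HF_HUB_OFFLINE` is an optional precaution that only takes effect if it is set before `huggingface_hub` is first imported.

```python
import os
from pathlib import Path


def local_model_ready(model_dir: str) -> bool:
    """True if the model directory exists and contains at least one entry."""
    path = Path(model_dir)
    return path.is_dir() and any(path.iterdir())


def load_local_model(model_dir: str = "./models/all-MiniLM-L6-v2"):
    """Load a SentenceTransformer from disk, refusing to fall back to the network."""
    if not local_model_ready(model_dir):
        raise FileNotFoundError(
            f"No model files found in {model_dir!r}; run the download step first."
        )
    # Ask the Hugging Face libraries not to contact the Hub (assumption: this
    # runs before huggingface_hub has been imported elsewhere in the process).
    os.environ.setdefault("HF_HUB_OFFLINE", "1")
    # Imported lazily so the check above works even without the package installed.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_dir)
```

Calling `load_local_model()` before the download step raises `FileNotFoundError` rather than silently triggering a network request.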
### 3. Configure System
Create `config.py`:
```python
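import os
from dataclasses import dataclass


@dataclass
class Config:
    # Hugging Face model ID and the local directory it was downloaded to
    embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2"
    local_model_path: str = "./models/all-MiniLM-L6-v2"
    # Chunking parameters used when splitting documents
    chunk_size: int = 512
    chunk_overlap: int = 128
    # Vector store location and retrieval defaults
    vector_store_path: str = "vector_store"
    default_top_k: int = 5
    supported_formats: tuple = (".txt", ".pdf", ".docx", ".md", ".html", ".json", ".xml")


# Shared instance; embeddings.py imports it with `from config import config`
config = Config()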
```
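To illustrate how `chunk_size`, `chunk_overlap`, and `supported_formats` might be consumed during ingestion, here is a minimal sketch. It assumes chunk sizes are measured in characters (the source does not say whether characters or tokens are meant), and the helper names `chunk_text` and `find_documents` are hypothetical.

```python
from pathlib import Path
from typing import List, Sequence

# Values mirrored from the Config defaults above.
CHUNK_SIZE = 512
CHUNK_OVERLAP = 128
SUPPORTED_FORMATS = (".txt", ".pdf", ".docx", ".md", ".html", ".json", ".xml")


def chunk_text(text: str,
               chunk_size: int = CHUNK_SIZE,
               chunk_overlap: int = CHUNK_OVERLAP) -> List[str]:
    """Split text into character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def find_documents(root: str,
                   formats: Sequence[str] = SUPPORTED_FORMATS) -> List[Path]:
    """Recursively list files under root whose extension is supported."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in formats
    )


sample = "".join(chr(97 + i % 26) for i in range(1000))
print([len(c) for c in chunk_text(sample)])
```

With the defaults, each window starts 384 characters after the previous one, so neighboring chunks share 128 characters. The overlap keeps sentences that straddle a boundary retrievable from at least one chunk.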
### 4. Build Core Components
#### Embeddings Module (`embeddings.py`)
```python
import os
import numpy as np
from typing import List, Optional
from sentence_transformers import SentenceTransformer
from config import config


class EmbeddingModel:
    def __init__(self, model_name: Optional[str] = None):
        self.model_name = m...
```
# RAG System Builder Skill
Build complete local RAG (Retrieval-Augmented Generation) systems that work offline with document processing, semantic search, and AI-powered Q&A.
## 🎯 What This Skill Does
This skill provides step-by-step guidance for building a complete RAG system from scratch:
- **Document Ingestion**: Support for TXT, PDF, DOCX, MD, HTML, JSON, XML
- **Embedding Generation**: Using sentence-transformers (offline, no API needed)
- **Vector Storage**: Local FAISS index for fast similarity search
- **Q&A Interface**: CLI and optional web interface
- **Complete Offline**: No external API calls required
## 🚀 Quick Start
### 1. Install Dependencies
```bash
pip install sentence-transformers faiss-cpu click flask
```
### 2. Download Embedding Model
```bash
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='sentence-transformers/all-MiniLM-L6-v2', local_dir='./models/all-MiniLM-L6-v2')"
```
### 3. Build and Use
```bash
# Ingest documents
python rag.py ingest --docs-path ./my-documents
# Query documents
python rag.py query --query "What is machine learning?"
```
## 📦 What's Included
This skill provides:
1. **Complete Code Templates**
   - `rag.py` - CLI interface
   - `embeddings.py` - Embedding generation
   - `vector_store.py` - FAISS storage
   - `retriever.py` - Search functionality
   - `config.py` - Configuration
2. **Step-by-Step Instructions**
   - Project setup
   - Model downloading
   - Component implementation
   - Testing and deployment
3. **Usage Examples**
   - Basic workflow
   - Advanced usage
   - Troubleshooting guide
## 🎯 Use Cases
- **Document Q&A**: Ask questions about your documents
- **Knowledge Base**: Search through document libraries
- **Research Assistant**: Find relevant information quickly
- **Offline AI**: Work without internet connection
## 📚 Requirements
- Python 3.8+
- 2GB+ disk space for embedding model
- RAM depends on document size
## 🤝 Contributing
This skill is...