Overview
- Skill Key: edmonddantesj/token-guard
- Author: edmonddantesj
- Source Repo: openclaw/skills
- Version: -
- Source Path: skills/edmonddantesj/token-guard
- Latest Commit SHA: de07e1453a378ee315388f394f930d69e3efa0c2
TokenGuard — LLM API 429 Prevention Engine
Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC
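Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md and then install the Token Guard skill. If it is already installed, install the Token Guard skill directly.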
# TokenGuard — LLM API 429 Prevention Engine

<!-- 🌌 Aoineco-Verified | S-DNA: AOI-2026-0213-SDNA-TG01 -->

**Version:** 1.5.0
**Author:** Aoineco & Co.
**License:** MIT
**Tags:** rate-limit, 429, token-management, cost-optimization, llm-guard, high-performance

## Description

Prevents LLM API 429 (Rate Limit / Resource Exhausted) errors by intercepting requests before they're sent. Designed for users on free/low-cost API plans who need maximum intelligence per dollar.

**Core philosophy:** *"Intelligence is measured not by how much you spend, but by how little you need."*

## Problem

When using LLM APIs (especially Google Gemini Flash with 1M TPM limit):

- Large documents (docx, PDFs) can consume the entire minute quota in one request
- Failed requests still count toward token usage
- Retry loops after 429 errors waste more tokens → death spiral
- No built-in way to detect runaway/duplicate requests

## Features

| Feature | Description |
|---------|-------------|
| **Pre-flight Token Estimation** | Estimates token count before API call (CJK-aware, no tiktoken dependency) |
| **Real-time Quota Tracking** | Tracks per-model per-minute token usage with sliding window |
| **Smart Throttle** | Auto-waits when quota > 80%, blocks at > 95% |
| **Duplicate Detection** | Blocks identical requests within 60s window (3+ = runaway) |
| **Response Caching** | Caches successful responses for duplicate requests |
| **Auto Model Fallback** | Switches to cheaper/available model when primary is exhausted |
| **429 Error Parser** | Extracts exact retry delay from Google/Anthropic error responses |
| **Batch vs Mistake Detection** | Distinguishes intentional bulk processing from error loops |

## Supported Models

Pre-configured quotas for:

- `gemini-3-flash` (1M TPM)
- `gemini-3-pro` (2M TPM)
- `claude-haiku` (50K TPM)
- `claude-sonnet` (200K TPM)
- `claude-opus` (200K TPM)
- `gpt-4o` (800K TPM)
- `deepseek` (1M TPM)

Custom quotas can be added for any model.

## Usage ```...
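
The listing truncates the original usage section, so the following is only an illustrative sketch of how the Real-time Quota Tracking and Smart Throttle rows of the feature table could fit together. It is not TokenGuard's actual code: the `QuotaTracker` class, its method names, and the decision to count the pending request's estimated tokens toward the usage ratio are all assumptions. Only the one-minute sliding window and the 80% / 95% thresholds come from the description above.

```python
import time
from collections import deque


class QuotaTracker:
    """Hypothetical sliding-window tokens-per-minute tracker for one model."""

    def __init__(self, tpm_limit, window_seconds=60.0, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the window can be tested
        self.events = deque()       # (timestamp, tokens), oldest first
        self.used = 0               # tokens currently inside the window

    def _evict(self, now):
        # Drop usage records that have aged out of the sliding window.
        while self.events and now - self.events[0][0] >= self.window_seconds:
            _, tokens = self.events.popleft()
            self.used -= tokens

    def record(self, tokens):
        # Call after a request is sent; the description notes that failed
        # requests still count toward token usage, so record those too.
        now = self.clock()
        self._evict(now)
        self.events.append((now, tokens))
        self.used += tokens

    def check(self, estimated_tokens):
        # Assumption: the ratio includes the request about to be sent.
        self._evict(self.clock())
        ratio = (self.used + estimated_tokens) / self.tpm_limit
        if ratio > 0.95:
            return "block"   # blocks at > 95%
        if ratio > 0.80:
            return "wait"    # auto-waits when quota > 80%
        return "allow"


if __name__ == "__main__":
    # 1M TPM is the quota the listing quotes for gemini-3-flash.
    tracker = QuotaTracker(tpm_limit=1_000_000)
    tracker.record(790_000)
    print(tracker.check(5_000))     # projected 79.5% of quota
    print(tracker.check(50_000))    # projected 84% of quota
    print(tracker.check(200_000))   # projected 99% of quota
```

A deque keeps eviction cheap because records are appended in time order, so expired entries are always at the left end. A caller that receives `"wait"` would typically sleep until the oldest record leaves the window and then check again.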
openstockdata
OpenClaw Skill for stock data analysis
edholofy
University for AI agents. 92 courses, 4400+ scenarios, any model via OpenRouter. Auto-training loops generate per-model SKILL.md documents. Works with Claude Code, OpenClaw, Cursor, Windsurf. No fine-tuning required.
lethehades
macOS WPS Office workflow helper skill for safer document preparation, conversion, export, and compatibility guidance
capt-marbles
Generative Engine Optimization (GEO) for AI search visibility. Optimize content to appear in ChatGPT, Perplexity, Claude, and Google AI Overviews. Use when optimizing websites, pages, or content for LLM discoverability and citation.
camopel
Continuous financial news crawler for finviz.com with SQLite storage, article extraction, and query tool. Use when monitoring financial markets, building news digests, or needing a local financial news database. Runs as a background daemon or systemd service.
camopel
Free multi-engine web search via ddgs CLI (DuckDuckGo, Google, Bing, Brave, Yandex, Yahoo, Wikipedia) + arXiv API search. No API keys required. Use when user needs web search, research paper discovery, or when other skills need a search backend. Drop-in replacement for web-search-plus.