Token Guard

TokenGuard — LLM API 429 Prevention Engine

Status: ACTIVE
Visibility: PUBLIC

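Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the Token Guard skill. If it is already installed, install the Token Guard skill directly.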

Overview

Skill Key: edmonddantesj/token-guard
Author: edmonddantesj
Source Repo: openclaw/skills
Version: -
Source Path: skills/edmonddantesj/token-guard
Latest Commit SHA: de07e1453a378ee315388f394f930d69e3efa0c2

Extracted Content

SKILL.md excerpt

# TokenGuard — LLM API 429 Prevention Engine

<!-- 🌌 Aoineco-Verified | S-DNA: AOI-2026-0213-SDNA-TG01 -->

**Version:** 1.5.0  
**Author:** Aoineco & Co.  
**License:** MIT  
**Tags:** rate-limit, 429, token-management, cost-optimization, llm-guard, high-performance

## Description

Prevents LLM API 429 (Rate Limit / Resource Exhausted) errors by intercepting requests before they're sent. Designed for users on free/low-cost API plans who need maximum intelligence per dollar.

**Core philosophy:** *"Intelligence is measured not by how much you spend, but by how little you need."*

## Problem

When using LLM APIs (especially Google Gemini Flash, with its limit of 1M tokens per minute, TPM):
- Large documents (docx, PDFs) can consume the entire minute quota in one request
- Failed requests still count toward token usage
- Retry loops after 429 errors waste more tokens → death spiral
- No built-in way to detect runaway/duplicate requests
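
The arithmetic behind the first three bullets can be sketched in a few lines. This is an illustration only: it takes the claim that failed requests count toward usage at face value, and `tokens_spent` and the 400,000-token document size are hypothetical values chosen for the example, not part of TokenGuard.

```python
def tokens_spent(request_tokens: int, attempts: int) -> int:
    """Total tokens charged if every attempt, failed or not, is counted."""
    return request_tokens * attempts


TPM_LIMIT = 1_000_000   # per-minute quota quoted above for Gemini Flash
doc_tokens = 400_000    # hypothetical size of one large document

# A single request already takes 40% of the minute; blind retries multiply it.
for attempts in (1, 2, 3):
    spent = tokens_spent(doc_tokens, attempts)
    print(f"{attempts} attempt(s): {spent:,} tokens, over quota: {spent > TPM_LIMIT}")
```

Under these assumptions the third attempt alone pushes the minute past its quota, which is why retrying immediately after a 429 tends to make things worse.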

## Features

| Feature | Description |
|---------|-------------|
| **Pre-flight Token Estimation** | Estimates token count before API call (CJK-aware, no tiktoken dependency) |
| **Real-time Quota Tracking** | Tracks per-model per-minute token usage with sliding window |
| **Smart Throttle** | Auto-waits when quota > 80%, blocks at > 95% |
| **Duplicate Detection** | Blocks identical requests within 60s window (3+ = runaway) |
| **Response Caching** | Caches successful responses for duplicate requests |
| **Auto Model Fallback** | Switches to cheaper/available model when primary is exhausted |
| **429 Error Parser** | Extracts exact retry delay from Google/Anthropic error responses |
| **Batch vs Mistake Detection** | Distinguishes intentional bulk processing from error loops |
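
Four of the rows above (token estimation, quota tracking, throttling, duplicate detection) can be sketched as follows. This is a minimal sketch, not TokenGuard's actual code: the names `estimate_tokens`, `QuotaWindow`, and `DuplicateDetector` are hypothetical, the character-to-token ratios are an assumed heuristic, and the thresholds are assumed to apply to projected usage including the pending request.

```python
import hashlib
from collections import deque

# Assumed heuristic: 1 token per CJK character, 1 token per 4 other characters.
_CJK_RANGES = (("\u3040", "\u30ff"), ("\u4e00", "\u9fff"), ("\uac00", "\ud7a3"))


def estimate_tokens(text: str) -> int:
    """Rough token estimate without a tokenizer dependency."""
    cjk = sum(1 for ch in text if any(lo <= ch <= hi for lo, hi in _CJK_RANGES))
    return cjk + (len(text) - cjk + 3) // 4


class QuotaWindow:
    """Sliding 60-second window of token usage for one model."""

    def __init__(self, tpm_limit: int, window_s: float = 60.0):
        self.tpm_limit = tpm_limit
        self.window_s = window_s
        self.events = deque()  # (timestamp, tokens), oldest first

    def used(self, now: float) -> int:
        # Drop events that have aged out of the window, then sum the rest.
        while self.events and now - self.events[0][0] >= self.window_s:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def record(self, tokens: int, now: float) -> None:
        self.events.append((now, tokens))

    def decide(self, tokens: int, now: float) -> str:
        # Thresholds from the table: wait above 80%, block above 95%.
        ratio = (self.used(now) + tokens) / self.tpm_limit
        if ratio > 0.95:
            return "block"
        if ratio > 0.80:
            return "wait"
        return "allow"


class DuplicateDetector:
    """Flags identical prompts seen within the window; 3 or more is a runaway."""

    def __init__(self, window_s: float = 60.0, runaway_at: int = 3):
        self.window_s = window_s
        self.runaway_at = runaway_at
        self.seen = {}  # prompt hash -> list of timestamps

    def check(self, prompt: str, now: float) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        stamps = [t for t in self.seen.get(key, []) if now - t < self.window_s]
        stamps.append(now)
        self.seen[key] = stamps
        if len(stamps) >= self.runaway_at:
            return "runaway"
        return "duplicate" if len(stamps) > 1 else "new"


window = QuotaWindow(tpm_limit=1_000_000)
window.record(700_000, now=0.0)
print(window.decide(50_000, now=10.0))    # projected 75% of quota
print(window.decide(150_000, now=10.0))   # projected 85% of quota
print(window.decide(300_000, now=10.0))   # projected 100% of quota
```

Timestamps are passed in explicitly so the behavior is easy to test; in real use a caller would pass something like `time.monotonic()`.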

## Supported Models

Pre-configured quotas for:
- `gemini-3-flash` (1M TPM)
- `gemini-3-pro` (2M TPM)
- `claude-haiku` (50K TPM)
- `claude-sonnet` (200K TPM)
- `claude-opus` (200K TPM)
- `gpt-4o` (800K TPM)
- `deepseek` (1M TPM)

Custom quotas can be added for any model.
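
One way to picture the quota table, custom quotas, and model fallback is sketched below. The TPM figures are the ones listed above; everything else (`TPM_QUOTAS`, `add_custom_quota`, `pick_model`, the model name `my-local-model`, and the idea that the caller supplies the fallback order) is an assumption made for illustration, since the excerpt does not show TokenGuard's real interface.

```python
from typing import Dict, List, Optional

# Per-minute token quotas as listed in the section above.
TPM_QUOTAS: Dict[str, int] = {
    "gemini-3-flash": 1_000_000,
    "gemini-3-pro": 2_000_000,
    "claude-haiku": 50_000,
    "claude-sonnet": 200_000,
    "claude-opus": 200_000,
    "gpt-4o": 800_000,
    "deepseek": 1_000_000,
}


def add_custom_quota(quotas: Dict[str, int], model: str, tpm: int) -> None:
    """Register a quota for a model that is not pre-configured (hypothetical helper)."""
    if tpm <= 0:
        raise ValueError("tpm must be positive")
    quotas[model] = tpm


def pick_model(
    preferred: List[str],
    used: Dict[str, int],
    needed: int,
    quotas: Dict[str, int] = TPM_QUOTAS,
    block_ratio: float = 0.95,
) -> Optional[str]:
    """Return the first model in the caller's order that stays at or below the block threshold."""
    for model in preferred:
        if (used.get(model, 0) + needed) / quotas[model] <= block_ratio:
            return model
    return None  # every candidate is exhausted; the caller should wait


add_custom_quota(TPM_QUOTAS, "my-local-model", 100_000)  # hypothetical model name

used_this_minute = {"gemini-3-flash": 980_000}
print(pick_model(["gemini-3-flash", "deepseek"], used_this_minute, needed=50_000))
```

Here the primary model would exceed its quota, so the sketch falls through to the next entry in the caller's list.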

## Usage

```...

Analysis Signals

Dependencies

Requires gh, pip, python, and go.

External Services

anthropic, x