
Token Guard

TokenGuard — LLM API 429 Prevention Engine

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

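Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the Token Guard skill. If it is already installed, install the Token Guard skill directly.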

Overview

Skill Key: edmonddantesj/token-guard
Author: edmonddantesj
Source Repo: openclaw/skills
Version: -
Source Path: skills/edmonddantesj/token-guard
Latest Commit SHA: de07e1453a378ee315388f394f930d69e3efa0c2

Extracted Content

SKILL.md excerpt

# TokenGuard — LLM API 429 Prevention Engine

<!-- 🌌 Aoineco-Verified | S-DNA: AOI-2026-0213-SDNA-TG01 -->

**Version:** 1.5.0  
**Author:** Aoineco & Co.  
**License:** MIT  
**Tags:** rate-limit, 429, token-management, cost-optimization, llm-guard, high-performance

## Description

Prevents LLM API 429 (Rate Limit / Resource Exhausted) errors by intercepting requests before they're sent. Designed for users on free/low-cost API plans who need maximum intelligence per dollar.

**Core philosophy:** *"Intelligence is measured not by how much you spend, but by how little you need."*

## Problem

When using LLM APIs (especially Google Gemini Flash, with its limit of 1M tokens per minute, or TPM):
- Large documents (docx, PDFs) can consume the entire minute quota in one request
- Failed requests still count toward token usage
- Retry loops after 429 errors waste more tokens → death spiral
- No built-in way to detect runaway/duplicate requests
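
To make the first point concrete, here is a rough sketch of a pre-flight size check. The heuristic (about four characters per token for non-CJK text, about one token per CJK character) and the names `is_cjk`, `estimate_tokens`, and `fits_quota` are assumptions made for illustration; they are not taken from TokenGuard's source, and real tokenizers will give different counts.

```python
import math


def is_cjk(ch: str) -> bool:
    """Return True for characters in a few common CJK Unicode ranges."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul syllables
    )


def estimate_tokens(text: str) -> int:
    """Rough estimate: 1 token per CJK char, 1 token per 4 other chars (assumed)."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    return cjk + math.ceil(other / 4)


def fits_quota(text: str, tpm_limit: int, used: int = 0) -> bool:
    """Check whether sending `text` would stay within the per-minute limit."""
    return used + estimate_tokens(text) <= tpm_limit


if __name__ == "__main__":
    document = "x" * 4_000_004  # stands in for a large extracted docx or PDF
    print(estimate_tokens(document), fits_quota(document, 1_000_000))
```

Under this heuristic, a document of roughly four million characters is already enough to exceed a 1M TPM limit in a single request, which is the failure mode the list above describes.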

## Features

| Feature | Description |
|---------|-------------|
| **Pre-flight Token Estimation** | Estimates token count before API call (CJK-aware, no tiktoken dependency) |
| **Real-time Quota Tracking** | Tracks per-model per-minute token usage with sliding window |
| **Smart Throttle** | Auto-waits when quota usage is > 80%, blocks requests when it is > 95% |
| **Duplicate Detection** | Blocks identical requests within a 60s window (3 or more is treated as a runaway loop) |
| **Response Caching** | Caches successful responses for duplicate requests |
| **Auto Model Fallback** | Switches to cheaper/available model when primary is exhausted |
| **429 Error Parser** | Extracts exact retry delay from Google/Anthropic error responses |
| **Batch vs Mistake Detection** | Distinguishes intentional bulk processing from error loops |
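
The quota tracking and throttle rows can be sketched together as a sliding-window counter. This is a minimal illustration, not TokenGuard's implementation: the class name `QuotaTracker`, its methods, and the choice to include the pending request's estimated tokens in the usage ratio are all assumptions. Only the 60-second window and the 80% and 95% thresholds come from the table.

```python
import time
from collections import deque


class QuotaTracker:
    """Hypothetical per-model tracker of tokens used in the last `window_s` seconds."""

    def __init__(self, tpm_limit, window_s=60.0, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.window_s = window_s
        self.clock = clock          # injectable so the logic can be tested
        self.events = deque()       # (timestamp, tokens), oldest first
        self.used = 0               # running sum of tokens inside the window

    def _evict(self):
        """Drop entries that have fallen out of the sliding window."""
        cutoff = self.clock() - self.window_s
        while self.events and self.events[0][0] <= cutoff:
            _, tokens = self.events.popleft()
            self.used -= tokens

    def record(self, tokens):
        """Register tokens consumed by a request that was just sent."""
        self._evict()
        self.events.append((self.clock(), tokens))
        self.used += tokens

    def decide(self, estimated_tokens):
        """Return 'allow', 'wait' (> 80% of quota), or 'block' (> 95% of quota)."""
        self._evict()
        ratio = (self.used + estimated_tokens) / self.tpm_limit
        if ratio > 0.95:
            return "block"
        if ratio > 0.80:
            return "wait"
        return "allow"


if __name__ == "__main__":
    tracker = QuotaTracker(tpm_limit=1_000_000)
    tracker.record(700_000)
    print(tracker.decide(150_000))  # projected usage is 85% of the quota
```

Keeping a running sum alongside the deque means each decision only pays for the entries that have expired, not for a rescan of the whole window.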

## Supported Models

Pre-configured quotas for:
- `gemini-3-flash` (1M TPM)
- `gemini-3-pro` (2M TPM)
- `claude-haiku` (50K TPM)
- `claude-sonnet` (200K TPM)
- `claude-opus` (200K TPM)
- `gpt-4o` (800K TPM)
- `deepseek` (1M TPM)

Custom quotas can be added for any model.
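
As an illustration of how the quota table, custom quotas, and model fallback might fit together, consider the sketch below. The TPM figures are copied from the list above; the names `DEFAULT_TPM`, `with_custom_quotas`, and `pick_model`, the example model `my-local-model`, and the idea of a caller-supplied preference order are hypothetical and may not match the skill's real configuration format.

```python
# Tokens-per-minute limits as listed in this document.
DEFAULT_TPM = {
    "gemini-3-flash": 1_000_000,
    "gemini-3-pro": 2_000_000,
    "claude-haiku": 50_000,
    "claude-sonnet": 200_000,
    "claude-opus": 200_000,
    "gpt-4o": 800_000,
    "deepseek": 1_000_000,
}


def with_custom_quotas(custom):
    """Return a copy of the defaults with user-supplied limits added or overridden."""
    quotas = dict(DEFAULT_TPM)
    quotas.update(custom)
    return quotas


def pick_model(preferred, needed_tokens, used, quotas):
    """Return the first model in `preferred` with enough headroom, else None."""
    for model in preferred:
        if used.get(model, 0) + needed_tokens <= quotas[model]:
            return model
    return None


if __name__ == "__main__":
    quotas = with_custom_quotas({"my-local-model": 10_000})  # hypothetical model
    used = {"claude-sonnet": 195_000}                        # tokens used this minute
    print(pick_model(["claude-sonnet", "claude-haiku"], 10_000, used, quotas))
```

In this sketch the caller decides the fallback order, so "cheaper" is whatever the preference list encodes rather than something derived from pricing data.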

## Usage

```...
