TopRank Skills

Home / Claw Skills / 开发运维 / tandemn-tuna
Official OpenClaw rules 36%

tandemn-tuna

Deploy and serve LLM models on GPU. Compare GPU pricing. Launch vLLM on Modal, RunPod, Cerebrium, Cloud Run, Baseten, or Azure with spot instance fallback. OpenAI-compatible inference endpoint.

Stars

0

Installs

0

Status

ACTIVE

Visibility

PUBLIC

安装方式

直接复制以下提示词,发送给你的 AI 助手即可完成安装。

请先检查是否已安装 SkillHub 商店,若未安装,请根据 https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md 安装 SkillHub 商店,然后安装 tandemn-tuna 技能。 若已安装,则直接安装 tandemn-tuna 技能。

Overview

Skill Key
choprahetarth/tandemn-tuna
Author
choprahetarth
Source Repo
openclaw/skills
Version
0.0.1
Source Path
skills/choprahetarth/tandemn-tuna
Latest Commit SHA
1dbfdb80ef72bc7b3d587dcbe992adf80222fbf1

Extracted Content

SKILL.md excerpt

# Tuna — Deploy and Serve LLM Models on GPU Infrastructure

Tuna is a hybrid GPU inference orchestrator. It lets you deploy, serve, and manage LLM models (Llama, Qwen, Mistral, DeepSeek, Gemma, and any HuggingFace model) on serverless GPUs from **Modal, RunPod, Cerebrium, Google Cloud Run, Baseten, or Azure Container Apps**, with optional **spot instance fallback on AWS** via SkyPilot. Every deployment gets an **OpenAI-compatible `/v1/chat/completions` endpoint**.

The key idea: serverless GPUs handle requests immediately (fast cold start, pay-per-second) while a cheaper spot GPU boots in the background. Once spot is ready, traffic shifts there. If spot gets preempted, traffic falls back to serverless automatically. This gives you **3–5x cost savings** over pure serverless with zero downtime.

## Quick Start — Deploy a Model in 3 Commands

```bash
# 1. Install tuna
uv pip install tandemn-tuna

# 2. Deploy a model (auto-picks cheapest serverless provider for the GPU)
tuna deploy --model Qwen/Qwen3-0.6B --gpu L4 --service-name my-llm

# 3. Query your endpoint (shown in deploy output)
curl http://<router-ip>:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen3-0.6B", "messages": [{"role": "user", "content": "Hello!"}]}'
```

For serverless-only (no spot, no AWS needed):

```bash
tuna deploy --model Qwen/Qwen3-0.6B --gpu L4 --serverless-only
```

## All Commands

### `tuna deploy` — Launch a model on GPU

Deploy a model across serverless + spot infrastructure. This is the main command.

```bash
tuna deploy --model <HuggingFace-model-ID> --gpu <GPU> [options]
```

**Required arguments:**
- `--model` — HuggingFace model ID (e.g., `Qwen/Qwen3-0.6B`, `meta-llama/Llama-3-70b`)
- `--gpu` — GPU type (e.g., `T4`, `L4`, `L40S`, `A100`, `H100`, `B200`)

**Common options:**
- `--service-name` — Name for the deployment (auto-generated if omitted)
- `--serverless-provider` — Force a specific provider: `modal`, `runpod`, `cloudrun`, `baseten`...

Related Claw Skills