Overview
- Skill Key
- boombignose/phaya
- Author
- boombignose
- Source Repo
- openclaw/skills
- Version
- -
- Source Path
- skills/boombignose/phaya
- Latest Commit SHA
- 14b6c3eecd86e09afe4155334c072aeba6602827
Use the Phaya SaaS backend to generate images, videos, audio, music, and run LLM chat completions via simple REST API calls. Use when the user wants to generate media, call AI models, or use the Phaya API for image/video/audio/text generation.
Stars
0
Installs
0
Status
ACTIVE
Visibility
PUBLIC
直接复制以下提示词,发送给你的 AI 助手即可完成安装。
请先检查是否已安装 SkillHub 商店,若未安装,请根据 https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md 安装 SkillHub 商店,然后安装 phaya 技能。 若已安装,则直接安装 phaya 技能。
# Phaya Media API
Phaya is a FastAPI backend that brokers AI media generation across KIE.ai (Sora 2, Veo 3.1, Seedance, Kling, Seedream, Suno), Google Gemini TTS, and OpenRouter LLMs.
## Auth
All endpoints require a Bearer token or API key:
```
Authorization: Bearer <your_api_key>
```
Get your profile and credit balance:
- `GET /api/v1/user/profile` — full profile
- `GET /api/v1/user/credits` → `{ "credits_balance": 84.90, ... }`
**Rate limit:** 60 requests/minute per API key.
## Credit System
Every generation costs credits deducted on job creation; auto-refunded on failure.
| Credits | Service |
|---------|---------|
| 0.5 | image-to-video (FFmpeg local), Sora 2 character creation |
| 1.0 | text-to-image (Z-Image) |
| 1.5 | Seedream 5.0 |
| 2–4 | Nano Banana 2 (1K/2K/4K resolution) |
| 3.0 | Text-to-music (Suno) |
| 2–35 | Seedance 1.5 Pro (resolution × duration × audio) |
| 8.0 | Sora 2 video |
| 1.21–1.82/sec | Kling 2.6 motion control (720p/1080p) |
| 15.0 | Veo 3.1 fast (`veo3_fast`) |
| 50.0 | Veo 3.1 quality (`veo3`) |
## Job / Polling Pattern
Every generation is async. Create endpoints return `job_id` immediately; poll the status endpoint.
```
POST /api/v1/<service>/create → { "job_id": "uuid" }
GET /api/v1/<service>/status/{job_id} → { "status": "...", "<media>_url": "..." }
```
**Status values:**
- Image/music endpoints: `PENDING`, `QUEUED`, `PROCESSING`, `COMPLETED`, `FAILED`
- Speech/subtitle endpoints: `PENDING`, `PROCESSING`, `COMPLETED`, `FAILED`
- Video/download endpoints: `processing`, `completed`, `failed`, `cancelled`
**Response URL field by media type:**
| Media type | Response field |
|------------|---------------|
| Images | `image_url` |
| Videos | `video_url` |
| Audio / music | `audio_url` (music also returns `audio_urls[]`) |
| Sora 2 character | `character_id` (a string ID, not a URL) |
Poll every 3–5 seconds until the terminal status is reached.
## Quick Start
### 1. Generate an image (text-to-image)
```python
imp...
# Phaya Media API
Generate images, videos, audio, and AI chat completions via the Phaya SaaS backend — all through a single, authenticated REST API.
## What It Does
This skill teaches an AI agent how to use the Phaya backend to:
- **Generate images** — Z-Image (1 credit), Seedream 5.0 (1.5 credits), Nano Banana 2 (2–4 credits)
- **Generate videos** — Sora 2 i2v/t2v (8 credits), Veo 3.1 (15–50 credits), Seedance 1.5 Pro (3–35 credits), Kling 2.6 motion control (1.21–1.82 credits/sec), FFmpeg local (0.5 credits)
- **Generate audio/music** — Suno music via KIE.ai (3 credits), Google Gemini TTS (token-based)
- **Run LLM chat** — Phaya-GPT at `/api/v1/phaya-gpt/chat/completions`, plus a full OpenAI-compatible proxy at `/v1/chat/completions`
- **Media utilities** — Thai subtitle generation (Whisper + FFmpeg), yt-dlp video download, FFmpeg merge/overlay/transcribe
## Supported Services
| Category | Service | Provider |
|----------|---------|----------|
| Image | Z-Image | KIE.ai |
| Image | Seedream 5.0 | ByteDance via KIE.ai |
| Image | Nano Banana 2 | KIE.ai |
| Video | Sora 2 (i2v + t2v + character) | KIE.ai |
| Video | Veo 3.1 | Google via KIE.ai |
| Video | Seedance 1.5 Pro | ByteDance via KIE.ai |
| Video | Kling 2.6 motion control | KIE.ai |
| Video | FFmpeg Ken Burns zoom | Local |
| Music | Suno (V3–V5) | KIE.ai |
| Speech | Gemini TTS | Google |
| LLM | Phaya-GPT | OpenRouter |
| Embeddings | Qwen3 Embedding 8B | OpenRouter |
| Subtitles | Whisper Large v3 + PyThaiNLP | Together AI |
| Download | yt-dlp (YouTube, TikTok, etc.) | Local |
## Requirements
- A valid Phaya API key (obtain from your account at the Phaya host)
- `httpx` (Python) or `curl` for making API calls
- No local GPU or model weights required — all generation is done server-side
## Authentication
```
Authorization: Bearer <your_api_key>
```
Check your credit balance:
```
GET /api/v1/user/credits → { "credits_balance": 84.90, "credits_balance_formatted": "84.90 เครดิต" }
```
Rate lim...
human-pages-ai
Search and hire real humans for tasks — photography, delivery, research, and more
zseven-w
Reusable skill templates for OpenClaw AI agents. Templates for API integration, data processing, web scraping, CLI tools, and file processing.
capt-marbles
Attio CRM integration for managing companies, people, deals, notes, tasks, and custom objects. Use when working with Attio CRM data, searching contacts, managing sales pipelines, adding notes to records, creating tasks, or syncing prospect information.
capt-marbles
Web scraping and crawling with Firecrawl API. Fetch webpage content as markdown, take screenshots, extract structured data, search the web, and crawl documentation sites. Use when the user needs to scrape a URL, get current web info, capture a screenshot, extract specific data from pages, or crawl docs for a framework/library.
caqlayan
Tweet Processor Skill
carlosarturoleon
Connect to Windsor.ai MCP for natural language access to 325+ data sources including Facebook Ads, GA4, HubSpot, Shopify, and more.