gemini-computer-use

Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.

Status: ACTIVE
Visibility: PUBLIC

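Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the gemini-computer-use skill. If it is already installed, install the gemini-computer-use skill directly.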

Overview

Skill Key: am-will/gemini-computer-use
Author: am-will
Source Repo: openclaw/skills
Version: -
Source Path: skills/am-will/gemini-computer-use
Latest Commit SHA: 0b71494672d1670f439f834868755bcc5da6421a

Extracted Content

SKILL.md excerpt

# Gemini Computer Use

## Quick start

1. Copy the env template, set your API key in it, and source the file:

```bash
cp env.example env.sh
$EDITOR env.sh
source env.sh
```

2. Create a virtual environment and install dependencies:

```bash
python -m venv .venv
source .venv/bin/activate
pip install google-genai playwright
playwright install chromium
```

3. Run the agent script with a prompt:

```bash
python scripts/computer_use_agent.py \
  --prompt "Find the latest blog post title on example.com" \
  --start-url "https://example.com" \
  --turn-limit 6
```

## Browser selection

- Default: Playwright's bundled Chromium (no env vars required).
- Choose a channel (Chrome/Edge) with `COMPUTER_USE_BROWSER_CHANNEL`.
- Use a custom Chromium-based executable (e.g., Brave) with `COMPUTER_USE_BROWSER_EXECUTABLE`.

If both are set, `COMPUTER_USE_BROWSER_EXECUTABLE` takes precedence.
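
The precedence rule above can be sketched as a small helper that turns the two environment variables into keyword arguments for Playwright's `chromium.launch()`. This is an illustration only: the function name `browser_launch_kwargs` is hypothetical, and the actual script may resolve these variables differently.

```python
import os


def browser_launch_kwargs(env=None):
    """Build kwargs for Playwright's chromium.launch() from the env vars.

    Hypothetical helper: it mirrors the documented precedence, not the
    script's actual implementation.
    """
    env = os.environ if env is None else env
    executable = env.get("COMPUTER_USE_BROWSER_EXECUTABLE", "").strip()
    channel = env.get("COMPUTER_USE_BROWSER_CHANNEL", "").strip()

    if executable:
        # A custom executable takes precedence when both variables are set.
        return {"executable_path": executable}
    if channel:
        # Playwright channel names, for example "chrome" or "msedge".
        return {"channel": channel}
    # Neither variable set: fall back to Playwright's bundled Chromium.
    return {}
```

With Playwright's sync API, the result could be passed straight through, for example `p.chromium.launch(**browser_launch_kwargs())` inside a `with sync_playwright() as p:` block.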

## Core workflow (agent loop)

1. Capture a screenshot and send the user goal + screenshot to the model.
2. Parse `function_call` actions in the response.
3. Execute each action in Playwright.
4. If an action carries a `safety_decision` of `require_confirmation`, prompt the user and execute it only after they approve.
5. Send `function_response` objects containing the latest URL + screenshot.
6. Repeat until the model returns only text (no actions) or you hit the turn limit.
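
The control flow of these steps can be sketched independently of any SDK. In the sketch below, the model call, the Playwright executor, the screenshot capture, and the confirmation prompt are injected as plain callables. All names here (`ModelTurn`, `run_agent_loop`, `ask_model`, `execute_action`, `observe`, `confirm`) are hypothetical and are not taken from `scripts/computer_use_agent.py`. Stopping the run when the user declines is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ModelTurn:
    """Hypothetical, simplified view of one model response."""
    text: str = ""
    # Each action is a dict like {"name": str, "args": dict}.
    actions: list = field(default_factory=list)


def run_agent_loop(ask_model, execute_action, observe, confirm, goal, turn_limit=6):
    """Run the screenshot -> function_call -> action -> function_response cycle."""
    responses = None
    for _ in range(turn_limit):
        # Step 1: capture the current state and send it with the goal.
        observation = observe()  # e.g. {"url": ..., "screenshot": ...}
        turn = ask_model(goal, observation, responses)

        # Step 6: a text-only reply means the model is done.
        if not turn.actions:
            return turn.text

        responses = []
        for action in turn.actions:  # Step 2: parsed function calls
            decision = action["args"].get("safety_decision", {}).get("decision")
            # Step 4: ask before running anything flagged for confirmation.
            if decision == "require_confirmation" and not confirm(action):
                # Assumption: a declined action ends the run.
                return "stopped: user declined " + action["name"]
            execute_action(action)  # Step 3: perform the action in the browser
            # Step 5: report the latest URL and screenshot for this action.
            responses.append({"name": action["name"], **observe()})
    return "stopped: turn limit reached"
```

Keeping the callables injectable makes the loop easy to dry-run with a scripted fake model before pointing it at a real browser.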

## Operational guidance

- Run in a sandboxed browser profile or container.
- Use `--exclude` to block risky actions you do not want the model to take.
- Keep the viewport at 1440x900 unless you have a reason to change it.
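
One reason the viewport size matters: Google's Computer Use documentation describes the model as returning coordinates on a normalized 0-999 grid, which the client scales to real pixels. The sketch below assumes that convention and the 1440x900 viewport recommended above. The helper name `to_pixels` is hypothetical.

```python
VIEWPORT_WIDTH, VIEWPORT_HEIGHT = 1440, 900


def to_pixels(x, y, width=VIEWPORT_WIDTH, height=VIEWPORT_HEIGHT):
    """Scale normalized (0-999) coordinates to viewport pixels.

    Assumes the model's coordinates are normalized to a 1000-step grid.
    """
    return int(x / 1000 * width), int(y / 1000 * height)


to_pixels(500, 500)  # (720, 450), the center of a 1440x900 viewport
```

If the viewport is changed, the same dimensions need to be used both when launching the page and when scaling coordinates, otherwise clicks land in the wrong place.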

## Resources

- Script: `scripts/computer_use_agent.py`
- Reference notes: `references/google-computer-use.md`
- Env template: `env.example`

Related Claw Skills

openbotx

★ 83

An open-source platform for orchestrating AI agents — secure, simple, and built for everyone. Multi-agent, real-time task board, web control panel, skills system, browser automation, multi-provider, scheduler, and more. One command to start. Everything from the browser. No coding required.

sealiu1997

zsxq-digest

★ 8

OpenClaw skill for 知识星球 / ZSXQ digests with token-first auth and browser recovery.

abczsl520

browser-use-skill

★ 3

🌐 OpenClaw skill for Browser-Use — AI-powered browser automation for complex multi-step workflows (login, form filling, scraping, posting)

chizhongwang

veriglow-agent-map-skill

★ 1

Teach AI agents to discover hidden APIs, data functions & browser automation recipes for any website. Works with Claude Code, ClawHub, Cursor & any AgentSkills-compatible agent.

ashemag

reddit-crosspost

★ 1

OpenClaw skill to crosspost X/Twitter posts to Reddit via browser automation

canbirlik

claw-browser

★ 0

A visual, human-like web browser for OpenClaw agents. Supports reading, screenshots, and visible mode.

Analysis Signals

Dependencies

playwright gh bun pip python go google-genai