desktop-computer-automation
maintained by web-infra-dev
Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural language commands. Operates entirely from screenshots: no DOM or accessibility labels required. Can interact with all visible elements on screen regardless of technology stack.
⚠️ WARNING: This skill takes over the user's real mouse and keyboard. The user cannot use their computer while automation is running.
- For web apps, prefer the "Browser Automation" skill instead. It runs in a headless browser and does NOT interfere with the user's mouse/keyboard.
- Only use this skill for desktop-native applications (Electron, Qt, native macOS/Windows/Linux apps) that cannot be tested in a browser.
Triggers: open app, press key, desktop, computer, click on screen, type text, screenshot desktop, launch application, switch window, desktop automation, control computer, mouse click, keyboard shortcut, screen capture, find on screen, read screen, verify window, close app, minimize window, maximize window, test des
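The routing guidance in the description (web apps go to the browser skill, desktop-native apps go to this one) can be sketched as a small decision helper. This is only an illustration: `TargetKind`, `Choice`, `chooseSkill`, and the skill identifier `"browser-automation"` are hypothetical names invented here, not part of Midscene or of either skill.

```typescript
// Hypothetical sketch: these types and this function are illustrative only
// and do not come from Midscene or from the skill itself.
type TargetKind = "web" | "electron" | "qt" | "native";
type SkillName = "browser-automation" | "desktop-computer-automation";

interface Choice {
  skill: SkillName;
  // true when the skill drives the user's real mouse and keyboard
  takesOverInput: boolean;
}

function chooseSkill(target: TargetKind): Choice {
  if (target === "web") {
    // Web apps: a headless browser leaves the user's input devices alone.
    return { skill: "browser-automation", takesOverInput: false };
  }
  // Electron, Qt, and native apps: screenshot-driven desktop control.
  return { skill: "desktop-computer-automation", takesOverInput: true };
}

console.log(chooseSkill("web"));
console.log(chooseSkill("qt"));
```

Carrying `takesOverInput` alongside the choice lets a caller warn the user before starting a run that will block their mouse and keyboard.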
Key Features
- Comprehensive skill evaluation and performance tracking
- Community-driven ratings and reviews
- Easy integration with Claude Code
- Regular updates and maintenance
Quick Start
TopRank Skills install web-infra-dev/computer-automation