
browser-automation Skill: Let Your Agent Control a Browser — I Used It to Buy Train Tickets
Complete guide to the browser-automation skill for OpenClaw: Playwright-powered web control, configuration, common bugs, and real-world usage at SFD Lab with 15 agents.
📋 Lab Verification Report
Last Week I Let an Agent Buy Train Tickets. It Worked.
Here is how it started: SFD Lab's 12306 auto-ticket script broke right before a holiday weekend. Franky needed a seat home, and by 1 AM it still had not come through. I started thinking about a more flexible approach — one that does not rely on any specific API, just controls a real browser the way a human would.
So I installed the browser-automation skill, spent two hours debugging, and it worked. These are my full notes on what I learned.
What browser-automation Actually Does
One sentence: it lets your OpenClaw agent directly control a Chrome browser — open pages, click buttons, fill forms, take screenshots, and extract content. All of it.
Under the hood it is Playwright, but you do not write Python code. The skill wraps the common operations and lets you describe what you want in plain language. The agent generates the script, you review it, then run it.
Good use cases: ticket queuing, automated login flows, scheduled screenshot monitoring, bulk form submissions, web scraping that does not depend on API format staying stable.
Installation and Configuration
Install the skill:
clawhub install browser-automation
Then install the Playwright browser engine — this step gets skipped constantly and causes hours of confusion:
playwright install chromium
On Mac you might also need:
brew install --cask chromium
Config file lives at ~/.openclaw/skills/browser-automation/config.yaml. The settings that matter:
browser: chromium
headless: true
timeout: 30000
screenshot_on_error: true
viewport:
  width: 1280
  height: 800
For local development, flip headless to false. Watching the browser actually do things saves a lot of time when something goes wrong.
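To make the settings concrete, here is a sketch of how they map onto Playwright's launch and context options. The function names are mine, not the skill's internals, and `open_page` assumes the sync Playwright API:

```python
def playwright_options(cfg):
    """Translate the skill's config.yaml keys into Playwright kwargs."""
    launch = {"headless": cfg.get("headless", True)}
    context = {"viewport": {"width": cfg["viewport"]["width"],
                            "height": cfg["viewport"]["height"]}}
    page_timeout_ms = cfg.get("timeout", 30000)
    return launch, context, page_timeout_ms

def open_page(url, cfg):
    # Deferred import so the helper above works even without Playwright installed.
    from playwright.sync_api import sync_playwright
    launch, context_opts, timeout_ms = playwright_options(cfg)
    with sync_playwright() as p:
        browser = p.chromium.launch(**launch)
        context = browser.new_context(**context_opts)
        page = context.new_page()
        page.set_default_timeout(timeout_ms)  # applies the config timeout to all waits
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

The `screenshot_on_error` flag has no direct Playwright equivalent; the skill handles that itself (see the screenshots bug below for where the files end up).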
Real Usage: Monitoring a Page for Changes
I built a simple price monitor: every 5 minutes, open a target page, read a specific element, compare to the last value, send a Telegram notification if it changed.
Combined with the cron skill, the whole setup takes about 10 minutes. You describe the task in natural language, the agent generates the Playwright script, you approve and run it. No boilerplate from scratch.
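A minimal sketch of what the generated script amounts to. The URL, selector, and helper names here are hypothetical, and the Telegram call is omitted:

```python
def fetch_price(url, selector):
    """Open the page in headless Chromium and read one element's text."""
    from playwright.sync_api import sync_playwright  # deferred import
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.wait_for_selector(selector)  # don't scrape before it renders
        text = page.inner_text(selector)
        browser.close()
        return text.strip()

def detect_change(last, current):
    """Return a notification message if the value changed, else None."""
    if last is not None and current != last:
        return f"Price changed: {last} -> {current}"
    return None
```

The cron skill just calls `fetch_price`, feeds the result to `detect_change`, and fires the Telegram notification when it gets a message back.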
Bugs I Hit (This Is the Useful Part)
Bug 1 — Scraping before the page finishes rendering
Playwright waits for the DOM load event, but modern frontends render asynchronously: the data arrives after the DOM loads, filled in by JavaScript. Fix: wait for the specific element to appear, or add a page.wait_for_timeout(2000) call as a quick workaround.
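The fix looks something like this (a sketch; `read_when_ready` is my name, not a skill API):

```python
def read_when_ready(page, selector, timeout_ms=10000):
    # Wait for the element JavaScript will eventually insert,
    # instead of trusting that "DOM loaded" means "data is there".
    page.wait_for_selector(selector, timeout=timeout_ms)
    return page.inner_text(selector).strip()
    # Quick-and-dirty alternative: page.wait_for_timeout(2000)
```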
Bug 2 — Login state lost on every run
Each new browser instance starts fresh with no cookies. To persist sessions, save state after manual login:
context.storage_state(path="session.json")
Then load it on subsequent runs: browser.new_context(storage_state="session.json")
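Put together, a session-reuse sketch (the file name and helper names are my own, assuming the sync Playwright API):

```python
from pathlib import Path

SESSION = Path("session.json")

def new_context_with_session(browser):
    # Reuse cookies/localStorage from a previous login if we have them.
    if SESSION.exists():
        return browser.new_context(storage_state=str(SESSION))
    return browser.new_context()

def save_session(context):
    # Call once after logging in manually (headless: false helps here).
    context.storage_state(path=str(SESSION))
```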
Bug 3 — Headless browser detected and blocked
Some sites detect and reject headless browsers. Add a realistic user-agent string, or install playwright-stealth and enable it in the config. Not always necessary but worth knowing it exists.
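A minimal sketch of the user-agent fix. The UA string below is just an ordinary desktop Chrome UA, nothing special; the point is that headless Chromium's default UA contains "HeadlessChrome", which naive bot detectors match on:

```python
# Assumption: any recent desktop Chrome UA string works here.
DESKTOP_UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
              "AppleWebKit/537.36 (KHTML, like Gecko) "
              "Chrome/120.0.0.0 Safari/537.36")

def stealthier_context(browser):
    # Replace the default "HeadlessChrome" UA with a normal-looking one.
    return browser.new_context(user_agent=DESKTOP_UA)
```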
Bug 4 — Screenshots not where you expect them
The skill saves screenshots to /tmp/ba_screenshots/ by default, not the working directory. When something errors out, check there first. I spent 30 minutes looking in the wrong place.
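A small helper I now use to grab the newest screenshot after a failure. The default path matches the skill's dump location described above; the helper itself is my own:

```python
from pathlib import Path

def latest_screenshot(directory="/tmp/ba_screenshots"):
    # Sort by modification time and return the most recent capture,
    # or None if nothing has been written yet.
    shots = sorted(Path(directory).glob("*.png"),
                   key=lambda p: p.stat().st_mtime)
    return shots[-1] if shots else None
```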
How We Use It at SFD Lab
Three browser-automation tasks run continuously at SFD Lab. First: check ClawHub for new skills every morning at 8:55 AM, push findings to our research agent if anything new appears. Second: monitor competitor pricing pages hourly, alert our finance agent on any change. Third: daily screenshot of our own site saved to archive for visual diff against yesterday.
That third one sounds unnecessary until it saves you. One time a bad nginx config change broke the layout. The error monitoring did not catch it. The screenshot diff did.
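That visual diff can start out embarrassingly simple. The sketch below is a hash compare rather than a real perceptual diff (which would tolerate antialiasing noise), but a digest mismatch is enough to flag gross layout breakage for human review:

```python
import hashlib
from pathlib import Path

def digest(png_path):
    # Byte-identical renders hash the same; any rendering change flips it.
    return hashlib.sha256(Path(png_path).read_bytes()).hexdigest()

def layout_changed(today_png, yesterday_png):
    return digest(today_png) != digest(yesterday_png)
```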
SFD Editor note: The real value of browser-automation is not automation — it is resilience. APIs change formats constantly. Websites rarely fully restructure overnight. Driving a real browser is more stable than maintaining a fleet of API scrapers, and when things break you have a screenshot to look at.
⚙️ Installation and Setup
clawhub install browser-automation
After installing, enable the skill in your agent configuration and restart the agent for it to take effect.