A real browser, self-hosted
Scrapes run on actual Chromium through your own browser node — JavaScript, SPAs and anti-bot pages render the way a user sees them. No emulated DOM, no flaky headless tricks.
Selectors when you can, AI when you can’t — one endpoint, every site. Run it on real, self-hosted Chromium and get JSON back inline.
1,000 free credits / month · no card required
Scrapes run on actual Chromium through your own browser node — JavaScript, SPAs and anti-bot pages render the way a user sees them. No emulated DOM, no flaky headless tricks.
Hand-write a typed action DSL for pages you know, or pass a plain-English query and let the model emit a validated plan. Same endpoint, same schema, one mental model.
The first AI run generates an extraction plan; every run after reuses it — keyed per URL, healed when a page drifts. You pay the model once, then scrape at selector speed.
Every engine runs the same action schema. Credits are charged per URL, weighted by what the run actually costs us to run.
html 1cr Static HTML over plain fetch — the cheap floor.
ai-html 3cr Free-form query resolved on the static DOM.
browser 5cr Real Chromium for JS, SPAs & anti-bot pages.
ai 5cr English query, auto-routed to a live browser.
ai-browser 10cr AI planning on a fully rendered page.
map 1cr Discover a site’s URLs via sitemap & robots.
Real pages fight back. ScrapeSilo absorbs the failure modes so your extraction code stays boring.
The browser engine drives real Chromium, so client-rendered pages return the DOM a user actually sees.
A stealth browser plus a proxy hop means most bot checks see a genuine session, not a headless tell.
Route through a residential pool, target a country with a two-letter code, or bring your own proxy.
Send up to 50 URLs per call; each one succeeds or fails on its own, so a single bad page never sinks the batch.
Capture a logged-in session once, then replay its cookies and storage on every later scrape.
Extract text, attributes, tables, clean Markdown, JSON paths or screenshots — typed, not scraped-and-prayed.
Deterministic CSS/XPath. Cheapest, fastest, exact.
curl -X POST https://api.scrapesilo.com/scrape \
-H "Authorization: Bearer sf_live_…" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"engine": "html",
"actions": {
"title": "css=h1@text",
"body": "css=p@text"
}
}'import requests
resp = requests.post(
"https://api.scrapesilo.com/scrape",
headers={"Authorization": "Bearer sf_live_…"},
json={
"url": "https://example.com",
"engine": "html",
"actions": {
"title": "css=h1@text",
"body": "css=p@text"
}
},
)
print(resp.json())const res = await fetch("https://api.scrapesilo.com/scrape", {
method: "POST",
headers: {
Authorization: "Bearer sf_live_…",
"Content-Type": "application/json",
},
body: JSON.stringify({
"url": "https://example.com",
"engine": "html",
"actions": {
"title": "css=h1@text",
"body": "css=p@text"
}
}),
});
const data = await res.json();type ScrapeResponse = { url: string; data: unknown; tookMs: number };
const res = await fetch("https://api.scrapesilo.com/scrape", {
method: "POST",
headers: {
Authorization: "Bearer sf_live_…",
"Content-Type": "application/json",
},
body: JSON.stringify({
"url": "https://example.com",
"engine": "html",
"actions": {
"title": "css=h1@text",
"body": "css=p@text"
}
}),
});
const { data } = (await res.json()) as ScrapeResponse; {
"url": "https://example.com",
"data": {
"title": "Example Domain",
"body": "This domain is for use in examples…"
},
"tookMs": 312
} Describe it in English. We generate the plan and cache it.
curl -X POST https://api.scrapesilo.com/scrape \
-H "Authorization: Bearer sf_live_…" \
-H "Content-Type: application/json" \
-d '{
"url": "https://news.ycombinator.com",
"engine": "ai",
"query": "top 10 stories: title, link, points, author"
}'import requests
resp = requests.post(
"https://api.scrapesilo.com/scrape",
headers={"Authorization": "Bearer sf_live_…"},
json={
"url": "https://news.ycombinator.com",
"engine": "ai",
"query": "top 10 stories: title, link, points, author"
},
)
print(resp.json())const res = await fetch("https://api.scrapesilo.com/scrape", {
method: "POST",
headers: {
Authorization: "Bearer sf_live_…",
"Content-Type": "application/json",
},
body: JSON.stringify({
"url": "https://news.ycombinator.com",
"engine": "ai",
"query": "top 10 stories: title, link, points, author"
}),
});
const data = await res.json();type ScrapeResponse = { url: string; data: unknown; tookMs: number };
const res = await fetch("https://api.scrapesilo.com/scrape", {
method: "POST",
headers: {
Authorization: "Bearer sf_live_…",
"Content-Type": "application/json",
},
body: JSON.stringify({
"url": "https://news.ycombinator.com",
"engine": "ai",
"query": "top 10 stories: title, link, points, author"
}),
});
const { data } = (await res.json()) as ScrapeResponse; {
"url": "https://news.ycombinator.com",
"data": [ /* 10 story objects */ ],
"tookMs": 2847,
"generatedActions": {
"title": "css=.titleline > a@text",
"link": "css=.titleline > a@href",
"points": "css=.score@text",
"author": "css=.hnuser@text"
},
"aiIterations": 1
} Real Chromium for JS-heavy, anti-bot, SPA pages.
curl -X POST https://api.scrapesilo.com/scrape \
-H "Authorization: Bearer sf_live_…" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example-spa.com",
"engine": "browser",
"actions": {
"title": "css=h1@text",
"content": "css=main@markdown"
},
"options": {
"waitFor": "networkidle",
"blockAds": true
}
}'import requests
resp = requests.post(
"https://api.scrapesilo.com/scrape",
headers={"Authorization": "Bearer sf_live_…"},
json={
"url": "https://example-spa.com",
"engine": "browser",
"actions": {
"title": "css=h1@text",
"content": "css=main@markdown"
},
"options": {
"waitFor": "networkidle",
"blockAds": True
}
},
)
print(resp.json())const res = await fetch("https://api.scrapesilo.com/scrape", {
method: "POST",
headers: {
Authorization: "Bearer sf_live_…",
"Content-Type": "application/json",
},
body: JSON.stringify({
"url": "https://example-spa.com",
"engine": "browser",
"actions": {
"title": "css=h1@text",
"content": "css=main@markdown"
},
"options": {
"waitFor": "networkidle",
"blockAds": true
}
}),
});
const data = await res.json();type ScrapeResponse = { url: string; data: unknown; tookMs: number };
const res = await fetch("https://api.scrapesilo.com/scrape", {
method: "POST",
headers: {
Authorization: "Bearer sf_live_…",
"Content-Type": "application/json",
},
body: JSON.stringify({
"url": "https://example-spa.com",
"engine": "browser",
"actions": {
"title": "css=h1@text",
"content": "css=main@markdown"
},
"options": {
"waitFor": "networkidle",
"blockAds": true
}
}),
});
const { data } = (await res.json()) as ScrapeResponse; {
"url": "https://example-spa.com",
"data": {
"title": "Pricing — Acme",
"content": "# Pricing\n\nSimple, usage-based…"
},
"tookMs": 1840,
"antibot": "passed"
} One endpoint, a lot of jobs — every one of these is the same /scrape call with a different plan.
Turn any page into clean Markdown or typed JSON your model can embed — no boilerplate, no broken tags.
Track competitor prices, stock and listings across thousands of product URLs in one batched call.
Pull search results, rankings and SERP features at scale to feed dashboards, models and reports.
Resolve company and contact pages into structured firmographics for your CRM or GTM stack.
Map a whole site, then extract every page into structured records in one coordinated pass.
Give Claude or any MCP client live, key-scoped read access to the web — no glue code.
Fire a one-off scrape with full engine and option control, right from the browser.
Watch a browser session stream in real time as it runs.
Every run recorded with request, result and per-URL errors.
Recorded sessions saved as Playwright traces you can scrub through.
Call scrape, get_execution and list_executions straight from Claude or any MCP client.
Built-in proxy pool with country targeting, or bring your own.
Per-plan limits with clean backpressure — never a silent stampede.
Extract text, attributes, HTML or clean Markdown — many elements at once.
Point any MCP client at your account and the model can scrape pages, list runs and inspect results with your key — no glue code.
See the MCP docsPOST /mcp/scrape
Authorization: Bearer sf_live_…
tools:
• scrape
• list_executions
• get_execution Credits reset monthly. Per-credit cost drops as you scale — and AI plan caching means you’re mostly paying selector prices anyway.
Kick the tyres.
$0
For steady pipelines.
$49/mo
Most teams land here.
$149/mo
High-volume extraction.
$499/mo
Per-URL credit cost:
html 1ai-html 3browser 5ai 5ai-browser 10
One unit of scrape cost, charged per URL and weighted by engine: HTML 1, AI-HTML 3, browser 5, AI 5, AI-browser 10. A multi-URL request costs the sum. Credits reset monthly and don’t roll over.
Start with html — it’s the cheapest and works for static pages. Reach for browser when a page needs JavaScript or trips anti-bot. Use ai / ai-html when you’d rather describe the data than write selectors.
The browser engine drives real Chromium through a self-hosted node with a proxy hop and ad/tracker blocking, so most bot checks see a genuine browser.
Yes. Every account exposes an MCP endpoint — POST /mcp/scrape — so Claude Code and other MCP clients can scrape, list and inspect executions with your API key.
Yes. Billing is handled by Stripe; manage or cancel from the billing portal. The Free plan is never charged.
No. Pass a plain-English query in AI mode and we generate a validated action plan, return it in the response, and cache it — so you can lift it into deterministic selectors whenever you want.
Spin up in minutes with 1,000 free credits — no card, every engine, full console.