
POST /map

/map discovers URLs for a site by reading its sitemap.xml and robots.txt and following nested sitemaps. It does not fetch or extract page content. Use it to build a work-list, then feed those URLs to /scrape.

POST /map
Authorization: Bearer <token>
Content-Type: application/json
{
  url: string | string[];      // one site, or up to 25 sites
  depth?: number;              // nested-sitemap recursion depth, 0–10
  limit?: number;              // max links to return, 1–5000
  ignoreSitemap?: boolean;     // skip sitemap.xml, crawl from robots/links only
  allowSubdomains?: boolean;   // include URLs on sibling subdomains
  include?: string;            // regex: only return links that match
  exclude?: string;            // regex: drop links that match
}

{
  tookMs: number;
  sitemaps: Array<{
    domain: string;
    url: string | string[] | null;   // sitemap location(s) used, null if none found
    links: string[];
  }>;
}
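The documented limits (up to 25 sites, depth 0–10, limit 1–5000) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `MapRequest` mirrors the request schema above, while `validateMapRequest` is a hypothetical helper name. It assumes that JavaScript's `RegExp` is a close enough stand-in for whatever regex dialect the server uses for `include` and `exclude`, which the schema does not specify.

```typescript
// Request body shape, copied from the schema above.
interface MapRequest {
  url: string | string[];
  depth?: number;
  limit?: number;
  ignoreSitemap?: boolean;
  allowSubdomains?: boolean;
  include?: string;
  exclude?: string;
}

// Hypothetical helper: returns a list of problems, empty when the body
// satisfies the documented ranges.
function validateMapRequest(req: MapRequest): string[] {
  const problems: string[] = [];

  // A single string counts as one site.
  const urls = Array.isArray(req.url) ? req.url : [req.url];
  if (urls.length < 1 || urls.length > 25) {
    problems.push("url: expected 1 to 25 sites");
  }

  // Written as a negated range check so that NaN is rejected too.
  if (req.depth !== undefined && !(req.depth >= 0 && req.depth <= 10)) {
    problems.push("depth: expected a value from 0 to 10");
  }
  if (req.limit !== undefined && !(req.limit >= 1 && req.limit <= 5000)) {
    problems.push("limit: expected a value from 1 to 5000");
  }

  // Assumption: the patterns compile as JavaScript regular expressions.
  for (const key of ["include", "exclude"] as const) {
    const pattern = req[key];
    if (pattern === undefined) continue;
    try {
      new RegExp(pattern);
    } catch {
      problems.push(`${key}: not a valid regular expression`);
    }
  }

  return problems;
}
```

A caller would send the request only when the returned array is empty. The server remains the authority on what it accepts, so this check only catches obvious mistakes early.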
curl -sS -X POST https://api.scrapesilo.com/map \
  -H "Authorization: Bearer sf_…" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com", "limit": 100, "include": "/blog/" }'
{
  "tookMs": 412,
  "sitemaps": [
    {
      "domain": "example.com",
      "url": "https://example.com/sitemap.xml",
      "links": ["https://example.com/blog/a", "https://example.com/blog/b"]
    }
  ]
}
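Because the response groups links per site and the `url` field can be a string, an array, or null, a small amount of normalization helps before the links go to /scrape. The sketch below is one way to do that under stated assumptions: `MapResponse` mirrors the response schema, while `toWorkList` and `sitemapUrls` are hypothetical helper names. Deduplicating links across sites is a choice made here, not documented API behavior.

```typescript
// Response shape, copied from the schema above.
interface MapResponse {
  tookMs: number;
  sitemaps: Array<{
    domain: string;
    url: string | string[] | null;
    links: string[];
  }>;
}

// Hypothetical helper: flattens every site's links into one list,
// keeping first-seen order and dropping exact duplicates.
function toWorkList(res: MapResponse): string[] {
  const seen = new Set<string>();
  for (const site of res.sitemaps) {
    for (const link of site.links) {
      seen.add(link);
    }
  }
  return Array.from(seen);
}

// Hypothetical helper: normalizes the string | string[] | null union
// into a plain array of sitemap locations.
function sitemapUrls(site: MapResponse["sitemaps"][number]): string[] {
  if (site.url === null) return [];
  return Array.isArray(site.url) ? site.url : [site.url];
}
```

Applied to the example response above, `toWorkList` would return the two /blog/ links in order, and `sitemapUrls` would return a one-element array holding the sitemap.xml location.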

/map is also exposed as the map MCP tool.