POST /map

/map discovers the URLs of a site by reading its robots.txt and sitemap.xml and following nested sitemaps. It does not fetch or extract page content. Use it to build a work list, then pass those URLs to /scrape.

Request

POST /map
Authorization: Bearer <token>
Content-Type:  application/json

{
  url:              string | string[]; // one site, or up to 25 sites
  depth?:           number;            // nested-sitemap recursion depth, 0–10
  limit?:           number;            // max links to return, 1–5000
  ignoreSitemap?:   boolean;           // skip sitemap.xml, crawl from robots/links only
  allowSubdomains?: boolean;           // include URLs on sibling subdomains
  include?:         string;            // regex; only return links that match
  exclude?:         string;            // regex; drop links that match
}
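
The limits in the request schema can be checked on the client before a call is made. The following is a minimal sketch, not part of the API: validateMapRequest is a hypothetical helper, the whole-number requirement for depth and limit is an assumption (the schema only says "number"), and JavaScript's RegExp is used as a stand-in because the regex dialect the server applies is not documented here.

```typescript
// Request shape copied from the schema above.
interface MapRequest {
  url: string | string[];
  depth?: number;
  limit?: number;
  ignoreSitemap?: boolean;
  allowSubdomains?: boolean;
  include?: string;
  exclude?: string;
}

// Hypothetical helper: returns a list of problems, empty when none are found.
function validateMapRequest(req: MapRequest): string[] {
  const problems: string[] = [];

  // A single string counts as one site; an array may hold up to 25.
  const urls = Array.isArray(req.url) ? req.url : [req.url];
  if (urls.length < 1 || urls.length > 25) {
    problems.push("url: expected between 1 and 25 sites");
  }

  // Assumption: depth and limit must be whole numbers within the documented ranges.
  const inRange = (n: number, min: number, max: number): boolean =>
    Number.isInteger(n) && n >= min && n <= max;

  if (req.depth !== undefined && !inRange(req.depth, 0, 10)) {
    problems.push("depth: expected an integer from 0 to 10");
  }
  if (req.limit !== undefined && !inRange(req.limit, 1, 5000)) {
    problems.push("limit: expected an integer from 1 to 5000");
  }

  // Assumption: a pattern that JavaScript cannot compile is unlikely to be valid server-side.
  for (const key of ["include", "exclude"] as const) {
    const pattern = req[key];
    if (pattern === undefined) continue;
    try {
      new RegExp(pattern);
    } catch (_err) {
      problems.push(`${key}: not a valid regular expression`);
    }
  }

  return problems;
}
```

A request that passes this check can still be rejected by the server; the sketch only catches the documented bounds early.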

Response

{
  tookMs: number;
  sitemaps: Array<{
    domain: string;
    url:    string | string[] | null;  // sitemap location(s) used, null if none found
    links:  string[];
  }>;
}
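
Since a request may cover several sites and the url field of each entry can be a string, an array, or null, a caller will usually want to normalize the response before using it. The sketch below shows one way to do that; toWorkList and sitemapLocations are hypothetical helper names, and removing duplicate links is a choice made here, not documented behavior of /map.

```typescript
// Response shape copied from the schema above.
interface MapResponse {
  tookMs: number;
  sitemaps: Array<{
    domain: string;
    url: string | string[] | null;
    links: string[];
  }>;
}

// Hypothetical helper: merge the links of every site into one list,
// keeping first-seen order and dropping duplicates.
function toWorkList(res: MapResponse): string[] {
  const seen = new Set<string>();
  for (const site of res.sitemaps) {
    for (const link of site.links) {
      seen.add(link);
    }
  }
  return Array.from(seen);
}

// Hypothetical helper: normalize the url field to an array,
// so null becomes [] and a single string becomes a one-element list.
function sitemapLocations(url: string | string[] | null): string[] {
  if (url === null) return [];
  return Array.isArray(url) ? url : [url];
}
```

The list returned by toWorkList is what would then be passed, URL by URL, to /scrape.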

Example

curl -sS -X POST https://api.scrapesilo.com/map \
  -H "Authorization: Bearer sf_…" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com", "limit": 100, "include": "/blog/" }'

{
  "tookMs": 412,
  "sitemaps": [
    {
      "domain": "example.com",
      "url": "https://example.com/sitemap.xml",
      "links": ["https://example.com/blog/a", "https://example.com/blog/b"]
    }
  ]
}
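
The same call can be made from code. The following TypeScript sketch mirrors the curl example above; buildMapCall, postMap, and FetchLike are hypothetical names, and treating any non-2xx status as a failure is an assumption, since error responses are not described here.

```typescript
const MAP_ENDPOINT = "https://api.scrapesilo.com/map";

// Minimal body type for this sketch; see the request schema for the full option list.
type MapBody = { url: string | string[]; [option: string]: unknown };

interface MapCall {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Structural type for a fetch-compatible function, so the caller supplies the HTTP client.
type FetchLike = (
  url: string,
  init: MapCall["init"],
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the request: same method, headers, and JSON body as the curl example.
function buildMapCall(token: string, body: MapBody): MapCall {
  return {
    url: MAP_ENDPOINT,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Send the request and return the parsed JSON response.
// Assumption: a non-2xx status means the call failed.
async function postMap(fetchImpl: FetchLike, token: string, body: MapBody): Promise<unknown> {
  const call = buildMapCall(token, body);
  const res = await fetchImpl(call.url, call.init);
  if (!res.ok) {
    throw new Error(`/map request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

On a runtime with a global fetch (for example Node 18 or later), this could be called as postMap(fetch, token, { url: "https://example.com", limit: 100, include: "/blog/" }).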

/map is also exposed as the map MCP tool.