ScrapeSilo

Free tool

Sitemap Finder

Discover where a site keeps its XML sitemap — checked at /sitemap.xml and via robots.txt — and preview the URLs it declares, up to 500.

5 free runs per day · no account needed

Frequently asked questions

How does it find the sitemap?

It checks the conventional /sitemap.xml location and parses robots.txt for Sitemap: directives, then reads the sitemap (including sitemap-index files) and lists the URLs it declares.
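
The discovery steps described above can be sketched roughly as follows. This is only an illustrative outline using the Python standard library, not the tool's actual implementation. The function names (sitemap_candidates, parse_sitemap, collect_urls) and the caller-supplied fetch callable are hypothetical, and the 500 limit simply mirrors the preview cap mentioned on this page.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def sitemap_candidates(base_url, robots_txt):
    """Candidate sitemap URLs: /sitemap.xml plus any robots.txt Sitemap: lines."""
    found = [urljoin(base_url, "/sitemap.xml")]
    for line in robots_txt.splitlines():
        line = line.strip()
        if line.startswith("#"):  # skip comment lines
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    # Deduplicate while preserving order.
    return list(dict.fromkeys(found))


def _local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_sitemap(xml_text):
    """Return (root kind, loc values); kind is 'urlset' or 'sitemapindex'."""
    root = ET.fromstring(xml_text)
    locs = []
    for entry in root:          # <url> or <sitemap> entries
        for child in entry:     # direct children only, so image:loc etc. are skipped
            if _local(child.tag) == "loc" and child.text:
                locs.append(child.text.strip())
    return _local(root.tag), locs


def collect_urls(sitemap_url, fetch, limit=500):
    """Follow sitemap-index files and gather page URLs, up to `limit`.

    `fetch` is an assumed callable that takes a URL and returns the XML text.
    """
    urls, queue, seen = [], [sitemap_url], set()
    while queue and len(urls) < limit:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        kind, locs = parse_sitemap(fetch(current))
        if kind == "sitemapindex":
            queue.extend(locs)  # child sitemaps to read next
        else:
            urls.extend(locs)   # page URLs declared by this sitemap
    return urls[:limit]
```

Passing fetch in as a parameter keeps the sketch free of any particular HTTP client; in practice it would wrap whatever client you use and handle errors and timeouts.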

What if no sitemap is found?

The tool tells you so — which usually means the site either has none or keeps it at a non-standard path not referenced from robots.txt. Both are worth fixing if you want search engines to crawl efficiently.

Why only 500 URLs?

The free tool previews up to 500 discovered URLs to keep results fast. The full map endpoint (free on sign-up) goes to 5,000 per request with include/exclude filters.

More free tools

Meta Tag Checker

See exactly what search engines and social networks read off a page: title, meta description, canonical URL, robots directives, plus every og: and twitter: tag.

Heading Outline Extractor

See a page’s heading hierarchy the way crawlers and screen readers do: every H1, H2 and H3 in document order, with structure problems easy to spot.

Link Extractor

List every hyperlink on a page with its anchor text and rel attributes, deduplicated, resolved to absolute URLs, and split into internal and external.

Image Alt Text Checker

Audit every image on a page: which ones are missing alt text, which are intentionally decorative (alt=""), and what screen readers and image search actually get.

Structured Data Extractor

See every schema.org JSON-LD block a page ships, parsed, pretty-printed and labelled by type, with broken JSON flagged. This is the raw input to Google’s rich results.

Hreflang Checker

List every <link rel="alternate" hreflang> annotation on a page: which languages and regions it targets, whether x-default is set, and duplicate locales that confuse crawlers.

Social Profile Finder

Paste any site and get the social profiles it links to, grouped by platform and deduplicated, with share buttons filtered out. Handy for lead research and brand audits.

Email Extractor

Pull every email address a page exposes, in visible text or mailto: links, deduplicated and ready to copy. One page per run.

Word Counter

Count a page’s visible words the way a crawler sees them: total words, estimated reading time, and the most frequent keywords with density percentages.

Need JavaScript rendering or AI extraction?

This tool reads server-rendered HTML. The full API adds a real Chromium engine, AI extraction from a plain-English query, and 1,000 free credits a month.

Start free