How to create an XML sitemap for your website

An XML sitemap gives search engines a direct list of the public pages a website wants crawled and reconsidered for indexing. It is especially useful when a site is new, large, or updated unevenly, because it helps crawlers discover important URLs without depending only on internal navigation.

The sitemap is a UTF-8 XML file published at a stable URL, usually near the site root, with one fully qualified canonical URL per entry. Search engines can find it through a Sitemap: line in robots.txt, and Google Search Console can then show whether the file was fetched, parsed, and matched to the correct property.

The file only helps when it reflects the site's real preferred URLs. Redirected pages, duplicate parameter URLs, blocked paths, login-only pages, soft-error pages, and noindex URLs should stay out. When a CMS already generates a clean sitemap, reuse that output rather than publishing a second, competing file. Google has also retired the old sitemap ping endpoint, so the practical workflow is to publish a valid sitemap, reference it in robots.txt, and monitor it in Search Console.

Steps to create an XML sitemap for your website:

  1. Start from the site's canonical public URLs and leave out redirected, blocked, duplicate, noindex, or login-only pages.

    Only include URLs that should be indexed and that normally return 200 OK to crawlers.
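    As a sketch of that filtering rule, the snippet below keeps only canonical, indexable 200 OK pages from a candidate list. The page records and field names are illustrative assumptions, not the output of a real crawl.

    ```python
    # Minimal sketch: filter candidate URLs down to sitemap-worthy entries.
    # The page records and their fields are illustrative, not from a real crawl.

    def sitemap_candidates(pages):
        """Keep only canonical, indexable pages that return 200 OK."""
        keep = []
        for p in pages:
            if p["status"] != 200:          # drop redirects, soft errors, 404s
                continue
            if p["noindex"] or p["requires_login"]:
                continue
            if p["canonical"] != p["url"]:  # drop duplicate/parameter variants
                continue
            keep.append(p["url"])
        return keep

    pages = [
        {"url": "https://www.example.com/", "status": 200, "noindex": False,
         "canonical": "https://www.example.com/", "requires_login": False},
        {"url": "https://www.example.com/old/", "status": 301, "noindex": False,
         "canonical": "https://www.example.com/new/", "requires_login": False},
        {"url": "https://www.example.com/account/", "status": 200, "noindex": True,
         "canonical": "https://www.example.com/account/", "requires_login": True},
    ]
    print(sitemap_candidates(pages))  # only the canonical 200 OK home page survives
    ```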

  2. Create a UTF-8 sitemap file with a root urlset element and one url block for each canonical page.
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2026-04-15</lastmod>
      </url>
      <url>
        <loc>https://www.example.com/contact/</loc>
        <lastmod>2026-04-10</lastmod>
      </url>
    </urlset>

    loc is required; lastmod helps only when it matches the page's last significant change; Google ignores changefreq and priority.
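    As a sketch, a file like the one above can be generated with Python's standard library rather than written by hand; the URLs and dates here are the illustrative ones from the example.

    ```python
    # Minimal sketch: build a sitemaps.org urlset with the standard library.
    import xml.etree.ElementTree as ET

    NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
    ET.register_namespace("", NS)  # serialize with a default xmlns, no prefix

    def build_sitemap(entries):
        """entries: iterable of (loc, lastmod) pairs; lastmod may be None."""
        urlset = ET.Element(f"{{{NS}}}urlset")
        for loc, lastmod in entries:
            url = ET.SubElement(urlset, f"{{{NS}}}url")
            ET.SubElement(url, f"{{{NS}}}loc").text = loc
            if lastmod:
                ET.SubElement(url, f"{{{NS}}}lastmod").text = lastmod
        # xml_declaration=True emits the <?xml ...?> prolog (Python 3.8+)
        return ET.tostring(urlset, encoding="unicode", xml_declaration=True)

    xml = build_sitemap([
        ("https://www.example.com/", "2026-04-15"),
        ("https://www.example.com/contact/", "2026-04-10"),
    ])
    print(xml)
    ```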

  3. Publish the file at a stable host-root HTTPS location so it can describe the whole host.
    https://www.example.com/sitemap.xml

    A sitemap stored below the host root can only list URLs from that directory path or below, so the root location is the safest default for most sites.
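    The path-scope rule can be sketched as a small check: a sitemap may only list URLs on the same scheme and host, at or below its own directory. This is an illustration of the rule, not how crawlers implement it.

    ```python
    # Minimal sketch of the sitemap path-scope rule.
    from urllib.parse import urlsplit

    def in_scope(sitemap_url, page_url):
        """True when the sitemap is allowed to list page_url."""
        s, p = urlsplit(sitemap_url), urlsplit(page_url)
        if (s.scheme, s.netloc) != (p.scheme, p.netloc):
            return False  # different scheme or host is always out of scope
        base = s.path.rsplit("/", 1)[0] + "/"  # directory containing the sitemap
        return p.path.startswith(base)

    # A root-level sitemap covers the whole host:
    print(in_scope("https://www.example.com/sitemap.xml",
                   "https://www.example.com/blog/post/"))       # True
    # A sitemap under /blog/ cannot list URLs outside that path:
    print(in_scope("https://www.example.com/blog/sitemap.xml",
                   "https://www.example.com/contact/"))         # False
    ```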

  4. Split the sitemap into multiple files and submit a sitemap index when the published set would exceed 50,000 URLs or 50 MB uncompressed.
    https://www.example.com/sitemap_index.xml

    Keep the child sitemap files on the same site and list them from a root-level sitemap index.
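    A sitemap index is itself a small XML file that lists the child sitemap files. A minimal sketch, with illustrative child file names:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>https://www.example.com/sitemap-pages.xml</loc>
        <lastmod>2026-04-15</lastmod>
      </sitemap>
      <sitemap>
        <loc>https://www.example.com/sitemap-posts.xml</loc>
        <lastmod>2026-04-10</lastmod>
      </sitemap>
    </sitemapindex>
    ```

    Each child file stays under the 50,000 URL and 50 MB limits on its own, and the index is what gets referenced in robots.txt and submitted in Search Console.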

  5. Add the sitemap URL to the site's robots.txt file with a fully qualified Sitemap: line.
    Sitemap: https://www.example.com/sitemap.xml
  6. Submit the full sitemap URL in the Google Search Console Sitemaps report for the matching site property.

    Search Console tracks only sitemaps submitted through the report or its API, even when Google can already discover the same sitemap from robots.txt.

    Do not rely on the old sitemap ping endpoint because Google deprecated it; use robots.txt discovery and Search Console submission instead.

  7. Check the published sitemap URL directly and confirm that it returns readable XML without redirects or access barriers.
    $ curl -sS https://www.example.com/sitemap.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2026-04-15</lastmod>
      </url>
    ##### snipped #####

    A direct fetch should return the sitemap itself rather than an HTML page, a redirect chain, a login prompt, or a 404 response.

  8. Monitor the submitted sitemap in Search Console until the status reaches Success or the reported fetch and parsing errors are resolved.

    Common failures include a blocked sitemap URL in robots.txt, a wrong property version such as http versus https or www versus non-www, invalid dates, and URLs that point to redirects instead of the final canonical page.