Dynamically generate sitemap.xml

sitemap.xml is a top-level document on your website that exists “for webmasters to inform search engines about pages on their sites that are available for crawling.” Google, not surprisingly, has its own documentation on how to improve your site’s visibility using sitemap.xml.

Typically sitemap.xml is a static file that is written by hand. On large sites it makes more sense to generate it dynamically, and one way to do this is to generate it on demand using a servlet. Here is my simple solution. I did not include the implementation of outputPages(), since that will be specific to each application server’s DB hierarchy or web server’s file structure.

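import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class SiteMap extends HttpServlet {

  protected static final String MIME_TYPE_XML = "application/xml";

  // XML tags
  protected static final String SITE_MAP_XML_INFO = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
  // assumed value: the standard sitemaps.org 0.9 namespace
  protected static final String SITE_MAP_BEGIN =
      "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">";
  protected static final String SITE_MAP_END = "</urlset>";

  protected static final String LOC_BEGIN = " <loc>";
  protected static final String LOC_END = "</loc>";
  protected static final String PRIORITY_BEGIN = " <priority>";
  protected static final String PRIORITY_END = "</priority>";
  protected static final String URL_BEGIN = "<url>";
  protected static final String URL_END = "</url>";

  @Override
  public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {

    // set content type to be XML, encoded as declared in the XML header
    response.setContentType(MIME_TYPE_XML);
    response.setCharacterEncoding("UTF-8");

    // get writer
    PrintWriter out = response.getWriter();

    // output header
    out.println(SITE_MAP_XML_INFO);
    out.println(SITE_MAP_BEGIN);

    // output pages
    outputPages(request, out);

    // output end
    out.println(SITE_MAP_END);
  }

  // application specific: call outputPage() once per page (implementation not included)
  protected void outputPages(HttpServletRequest request, PrintWriter out) {
  }

  // write one <url> entry; urlStart is the scheme and host, uri the site-relative path
  protected void outputPage(String uri, String priority, PrintWriter out, String urlStart) {
    out.println(URL_BEGIN);
    out.println(LOC_BEGIN + urlStart + uri + LOC_END);
    out.println(PRIORITY_BEGIN + priority + PRIORITY_END);
    out.println(URL_END);
  }
}
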
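Since outputPages() is left out above, here is a rough sketch of what one might look like when the pages are already known as a simple map of URI to priority. The class name SiteMapPagesSketch, the render() helper, the map of pages, and the example.com host are all hypothetical and not part of my servlet; a real implementation would walk your database or file structure instead. The sketch also entity-escapes each URL, which the sitemap protocol requires for characters such as &.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

public class SiteMapPagesSketch {

  // Entity-escape the five characters that must not appear raw in a sitemap URL.
  // The ampersand is replaced first so the other entities are not escaped twice.
  public static String escapeXml(String s) {
    return s.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;")
            .replace("'", "&apos;");
  }

  // Hypothetical stand-in for outputPages(): writes one <url> entry per page.
  // The map key is the site-relative URI, the value is its priority.
  public static void outputPages(Map<String, String> pages, String urlStart, PrintWriter out) {
    for (Map.Entry<String, String> page : pages.entrySet()) {
      out.println("<url>");
      out.println(" <loc>" + escapeXml(urlStart + page.getKey()) + "</loc>");
      out.println(" <priority>" + page.getValue() + "</priority>");
      out.println("</url>");
    }
  }

  // Convenience helper (an assumption, for trying the sketch without a servlet container).
  public static String render(Map<String, String> pages, String urlStart) {
    StringWriter buffer = new StringWriter();
    PrintWriter out = new PrintWriter(buffer);
    outputPages(pages, urlStart, out);
    out.flush();
    return buffer.toString();
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the entries in insertion order.
    Map<String, String> pages = new LinkedHashMap<>();
    pages.put("/", "1.0");
    pages.put("/search?q=java&page=2", "0.5");
    System.out.print(render(pages, "http://www.example.com"));
  }
}
```

In the servlet itself the same loop would call outputPage() for each entry rather than printing the tags directly.
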
Then you configure web.xml to use the SiteMap servlet.
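
As an illustration, a mapping along these lines inside the web-app element of web.xml should route requests for /sitemap.xml to the servlet. This is only a sketch: the com.example package and the servlet name are hypothetical, so substitute whatever package your SiteMap class actually lives in.

```xml
<servlet>
  <servlet-name>SiteMap</servlet-name>
  <!-- hypothetical package; use the real fully qualified class name -->
  <servlet-class>com.example.SiteMap</servlet-class>
</servlet>

<servlet-mapping>
  <servlet-name>SiteMap</servlet-name>
  <url-pattern>/sitemap.xml</url-pattern>
</servlet-mapping>
```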


