Create Sitemap Using Java

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling.

A Sitemap is an XML file that lists the URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is relative to other URLs on the site) so that search engines can crawl the site more intelligently.
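
To make the format concrete, here is a minimal sketch that builds a one-entry sitemap by hand with only the standard library. The element names (urlset, url, loc, lastmod, changefreq, priority) come from the Sitemap 0.90 protocol; the class name SitemapXmlSketch, its helper methods, and the example.com URL are placeholders made up for illustration and are not part of any library.

```java
import java.util.Locale;

class SitemapXmlSketch {

    // Escapes the characters that XML reserves inside element text.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Builds one <url> entry. Only <loc> is required by the protocol;
    // lastmod, changefreq and priority are optional hints for crawlers.
    static String urlEntry(String loc, String lastmod, String changefreq, double priority) {
        return "  <url>\n"
             + "    <loc>" + escapeXml(loc) + "</loc>\n"
             + "    <lastmod>" + lastmod + "</lastmod>\n"
             + "    <changefreq>" + changefreq + "</changefreq>\n"
             + "    <priority>" + String.format(Locale.ROOT, "%.1f", priority) + "</priority>\n"
             + "  </url>\n";
    }

    // Wraps the given entries in the <urlset> root element.
    static String sitemap(String... entries) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String entry : entries) {
            sb.append(entry);
        }
        sb.append("</urlset>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical URL, used only to show the escaping of '&'.
        String entry = urlEntry("http://www.example.com/catalog?item=1&view=full",
                                "2012-02-24", "weekly", 0.8);
        System.out.print(sitemap(entry));
    }
}
```

In practice you would let a library such as SitemapGen4j produce this file; the sketch is only meant to show what ends up inside it.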

Web crawlers usually discover pages from links within the site and from other sites. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site. Sitemap 0.90 has wide adoption, including support from Google, Yahoo!, and Microsoft.

A sitemap in this format can also be used to notify Google when the pages on your site are updated.

Note: add sitemapgen4j-1.0.1.jar to your classpath.

SitemapGen4j Features

SitemapGen4j is a library to generate XML sitemaps in Java.

  • Add any number of URLs
  • Write gzipped output
  • Set the lastmod option
  • Set the priority option
  • Set the changefreq option
  • Configure the date format
  • Configure a sitemap index file
  • Validate your sitemaps against the official XML Schema Definition (XSD)
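
The date format matters because the Sitemap protocol expects lastmod values in W3C Datetime format, for example a plain date or a full timestamp with a time zone. As a rough illustration of those two precisions, here is a small sketch using java.time from the standard library. It does not use SitemapGen4j's own date-format option, and the class name LastModFormatSketch and its method names are made up for this example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class LastModFormatSketch {

    // Day precision, e.g. 2012-02-24
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ISO_LOCAL_DATE.withZone(ZoneOffset.UTC);

    // Second precision with a zone designator, e.g. 2012-02-24T10:15:30Z
    private static final DateTimeFormatter SECOND =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssXXX").withZone(ZoneOffset.UTC);

    static String dayPrecision(Instant lastModified) {
        return DAY.format(lastModified);
    }

    static String secondPrecision(Instant lastModified) {
        return SECOND.format(lastModified);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(dayPrecision(now));
        System.out.println(secondPrecision(now));
    }
}
```

Day precision is usually enough for pages that change rarely; the longer form is more useful when you also set changefreq to something like hourly.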

Create Sitemap Using SitemapGen4j


import java.io.File;
import java.net.MalformedURLException;
import java.util.Date;

import com.redfin.sitemapgenerator.ChangeFreq;
import com.redfin.sitemapgenerator.WebSitemapGenerator;
import com.redfin.sitemapgenerator.WebSitemapUrl;

public class SitemapGenerator {

    public static void main(String[] args) throws MalformedURLException {
        // The base URL must match the start of every URL added below,
        // otherwise addUrl() rejects the URL.
        // C:\sitemap is the output directory (assumed to already exist).
        WebSitemapGenerator wsg = WebSitemapGenerator
                .builder("https://javamagic.wordpress.com", new File("C:\\sitemap"))
                .gzip(true) // only if you need gzipped output
                .build();

        // Configure a URL with lastmod=now, priority=1.0, changefreq=hourly
        WebSitemapUrl url = new WebSitemapUrl.Options(
                "https://javamagic.wordpress.com/2012/02/24/create-pdf-with-itext-java-tutorial/")
                .lastMod(new Date())
                .priority(1.0)
                .changeFreq(ChangeFreq.HOURLY)
                .build();

        // You can add any number of URLs here
        wsg.addUrl(url);
        wsg.addUrl("https://javamagic.wordpress.com/2011/12/16/prototype-pattern/");

        // Write the sitemap file(s) to the output directory
        wsg.write();
    }
}
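
Because the output above is gzipped, you cannot just open it in a text editor to check it. Here is a small standard-library sketch that decompresses a gzipped sitemap and prints it. The class name GzipSitemapReader is made up for this example, and the default path C:\sitemap\sitemap.xml.gz is an assumption about where the generator writes its file; pass the real path as the first argument if yours differs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;

class GzipSitemapReader {

    // Decompresses a gzipped text file and returns its content as a String.
    static String read(Path gzFile) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)), StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; override it with a command-line argument.
        Path gzFile = Paths.get(args.length > 0 ? args[0] : "C:\\sitemap\\sitemap.xml.gz");
        System.out.println(read(gzFile));
    }
}
```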

Hope you will like this. Cheers… 🙂