Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Research study of Sitemaps

Monday, April 27, 2009 at 1:20 PM

We've been tracking the growth of Sitemaps on the web. It's been just two years since Google, Yahoo and Microsoft co-announced the Sitemaps directive in robots.txt, and it is already supported on many millions of websites, including educational and government websites! At the WWW'09 conference in Madrid, Uri Schonfeld presented his summer internship work studying Sitemaps from a coverage and freshness perspective. If you're interested in how some popular websites are using Sitemaps, and how Sitemaps complement "classic" web crawling, take a look:


At Google, we care deeply about increasing the coverage and freshness of the content we index. We are excited about open standards that help webmasters open up their content automatically to search engines, so users can find relevant content for their searches.
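
For readers who have not set up a Sitemap yet, here is a minimal sketch of the two pieces involved: a Sitemap XML file and the Sitemap line in robots.txt that points to it. The helper names (build_sitemap, robots_txt_line), the example.com URLs and the dates are illustrative assumptions, not anything from the study; the XML namespace and the "Sitemap:" directive come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return Sitemap XML for an iterable of (url, lastmod) pairs.

    lastmod is a YYYY-MM-DD string. Hypothetical helper for illustration.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


def robots_txt_line(sitemap_url):
    """Return the robots.txt directive that advertises a Sitemap."""
    return f"Sitemap: {sitemap_url}"


# Example pages; the URLs and dates are placeholders.
pages = [
    ("https://www.example.com/", "2009-04-27"),
    ("https://www.example.com/news/latest.html", "2009-04-26"),
]

sitemap_xml = build_sitemap(pages)
robots_line = robots_txt_line("https://www.example.com/sitemap.xml")
```

In this sketch, sitemap_xml would be saved as sitemap.xml on the site and robots_line added to robots.txt. Regenerating the file whenever pages change keeps the lastmod values current.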

The comments you read here belong only to the person who posted them. We do, however, reserve the right to remove off-topic comments.

11 comments:

James said...

Even as a statistician, I had to look at the graph a couple times before I fully grasped the point. A couple of sentences explaining the relevance would be useful to most readers.

Jenn said...

This is great stuff!

I have been pushing my clients to create and submit sitemaps since 2005. Now we have even more reason for them to do so.

yishai said...

My site was hacked, and when I go to request a review in Google Webmaster Tools, I get a message saying "an internal error has occurred, please try again later." Please help!!

Rags said...

I have to agree with James: that chart looks like a paintball target.

TechAlex said...

Chart could definitely be more clear, good to see more on sitemaps though.

picobreezer said...

i've been looking at the graph for a few mins now, and i'm still not getting it. can someone explain?
thanks

vinod kumar said...

I could not understand this graph; please explain it as well.

James said...

Here's my take on how to read this.

- The bigger the circle, the more pages were indexed on that day.
- As you go from left to right, you see how much content was added via regular crawling.
- As you go from top to bottom, you see how much content was added via sitemaps.
- The dark blue circles are sites that do not have XML sitemaps.
- The diagonal line of multi-colored circles from the bottom left to the top right of the graph corresponds to pages that are listed in sitemaps. The line running through the center of these circles has a slope of one.

Let's look at an example. At approximately day 22, there's a large influx of content represented by the yellow circle partly hidden behind two mint green circles. That yellow circle represents pages that live on sites that do not have sitemaps. If you follow an imaginary line from the center of that yellow circle horizontally across the graph, you will see several blue circles that stretch to the end of the study at ~day 110. Every blue circle on that line represents a page or pages that were crawled on day 22 because of a sitemap submission. The natural spider crawl did not reach some of those pages until nearly three months later.

Put another way, a large blue circle that is at day 100 on the x-axis but day 22 on the y-axis represents the following:

- On day 22, the page was indexed via an XML sitemap submission.
- On day 100, that same page was finally crawled via the regular spider activities that Google employs.

Translation: If you have timely content that needs to be read as quickly as possible, use an XML sitemap and update it regularly.

Sam said...

If you click on the link above the chart you can get a full-blown explanation in the PDF version. It's a long explanation ;)

J D Web Designs said...

I am glad that I was not the only one confused when I tried to make out the graph. The explanations here made it a little easier, but I am still not 100% clear on all of it.

Google Webmaster Central said...

Hi everyone,

Since more than a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.

Thanks and take care,
The Webmaster Central Team