Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

An Update on Sitemaps at Google

Thursday, June 11, 2009 at 4:26 AM

Did you know that the number of website hosts that have been submitting Sitemap files has almost tripled over the last year? It's no wonder: the secret is out - as a recent research study showed, Sitemaps help search engines find new and changed content faster. Using Sitemaps doesn't guarantee that your site will be crawled and indexed completely, but it certainly helps us understand your website better.

Together with the Webmaster Tools design update, we've been working on Sitemaps as well:
  • Google and the other search engines that are part of Sitemaps.org now support up to 50,000 child Sitemaps per Sitemap index file (instead of the previous 1,000). With up to 50,000 URLs per Sitemap file, this allows large sites to submit a theoretical maximum of 2.5 billion URLs with a single Sitemap index URL (oh, and if you need more, you can always submit multiple Sitemap index files).
  • The Webmaster Tools design update now shows you all Sitemap files that were submitted for your verified website. This is particularly useful if you have multiple owners verified in Webmaster Tools or if you are submitting some Sitemap files via HTTP ping or through your robots.txt file.
  • The indexed URL count in Webmaster Tools for your Sitemap files is now even more precise.
  • For the XML developers out there, we've updated the XSD schemas to allow Sitemap extensions. The new schema helps webmasters to create better Sitemaps by verifying more features. By validating Sitemap files with the new schema, you can be more confident that the Sitemap files are correct.
  • Do I need to mention that Sitemap file processing is much faster than ever before? We've drastically reduced the average time from submitting a Sitemap file to processing it and showing some initial data in Webmaster Tools. 
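
As a rough illustration of the Sitemap index limit mentioned in the list above, here is a minimal Python sketch that builds a Sitemap index file and enforces the 50,000-child limit. The function names, the constants, and the example.com URLs are assumptions made for this example and are not part of any Google tool; the figure of 50,000 URLs per Sitemap file is the per-file limit from the Sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Limits: child Sitemaps per index file (from the post) and
# URLs per Sitemap file (Sitemaps.org protocol limit).
MAX_CHILD_SITEMAPS = 50_000
MAX_URLS_PER_SITEMAP = 50_000


def max_urls_per_index():
    """Theoretical maximum number of URLs reachable from one index file."""
    return MAX_CHILD_SITEMAPS * MAX_URLS_PER_SITEMAP


def build_sitemap_index(child_sitemap_urls):
    """Return a Sitemap index document (as a string) listing the given
    child Sitemap URLs. Hypothetical helper, for illustration only."""
    if len(child_sitemap_urls) > MAX_CHILD_SITEMAPS:
        raise ValueError(
            f"{len(child_sitemap_urls)} child Sitemaps exceeds the limit "
            f"of {MAX_CHILD_SITEMAPS}; split into multiple index files"
        )
    # Setting xmlns as a plain attribute keeps the output free of prefixes.
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in child_sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs; substitute the locations of your own Sitemap files.
    children = [
        "http://www.example.com/sitemap1.xml",
        "http://www.example.com/sitemap2.xml",
    ]
    print(build_sitemap_index(children))
    print(max_urls_per_index())  # 50,000 * 50,000 = 2,500,000,000
```

A site with more child Sitemaps than the limit allows would need to split them across several index files, as the post suggests.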


For more information about using Sitemaps, make sure to check out our blog post about frequently asked questions on Sitemaps and our Help Center. If you have any questions that aren't covered here, don't forget to search our Help Forum and start a thread in the Sitemaps section for more help.

The comments you read here belong only to the person who posted them. We do, however, reserve the right to remove off-topic comments.

35 comments:

mysteriouslion said...

We (muabannhadat.com.vn) uploaded a new sitemap that followed all the rules 2 weeks ago and immediately saw a dramatic drop in the number of indexed pages. Sadly, it is still decreasing despite our desperate efforts. If only we hadn't uploaded the sitemap! :(

Tony Ruscoe said...

"The indexed URL count in Webmaster Tools for your Sitemap files is now even more precise."

That's not what I'm seeing. Since the UI update, a couple of mine are now showing as Pending with no indexed URLs even though there was data before the update.

Perhaps a related issue is that I'm getting the error "Unsupported file format" reported for a couple of them even though the same sitemaps have been perfectly fine in the past.

Here are the sitemaps reporting an error:

http://www.haldane100.com/sitemap.xml
http://www.wigstonmethodistchurch.org.uk/sitemap.xml

Is there a bug or do I have a genuine problem?

Note: There's a bug when viewing the sitemap details page and changing the domain using the dropdown in the top right, i.e. it doesn't work.

effisk said...

If someone at Google could look into that problem that would be great:
http://www.google.com/support/forum/p/Webmasters/thread?tid=316784b51106f373&hl=en

spicynugget said...

I noticed that too for the sitemap
http://www.legacyadvisorgroup.com/sitemap.xml

... "We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit."

I am sure the issue will be resolved automatically very soon.

hgil said...

Please re-implement: What the Googlebot sees. Internally and Externally - just as before. Thank you!

Sagar Kamdar said...

@hgil, it was not removed:

To see frequently occurring words on your site, and get insight into how your content appears to Google, click Your site on the web, and then click Keywords.

To see how other sites link to you, click Links to your site (under Your site on the web), and then click Anchor text.

We no longer list content information, such as distribution of file type or encoding information.

Martin Webster said...

For all you Windows people there is a freeware command line program that generates a Google XML sitemap by looking at the files in your website. You can add filters and specify priority, change frequency of specific files if required.

Download this useful utility from:
http://www.logicmighty.co.uk/Home/GoogleSitemapGenerator.aspx

Defunkid said...

Sitemap strategy for large sites
http://dynamical.biz/blog/seo-technical/sitemap-strategy-large-sites-17.html

hgil said...

@sagar

yes it's true functionality was removed:

Before we had 2 sets of data:

keywords from other sites linking to you

keywords from inside your website

there were 2 tables, one adjacent to the other

that was the most precious and useful thing.

now "keywords" refers to one set of data, which could be anything (what googlebot sees outside, inside, a mix?) .. but there is no way of telling now for sure.

yes, it was removed

there is only ONE set of keywords data .. before there was two!

the keywords as now presented are nowhere near as useful as before.

i find it sad, because the old way was much better .. why change a good thing? i will never know

Sagar Kamdar said...

@hgil, it was not removed. The keywords on your site we found are under the Keywords menu option.

For the common phrases coming from the links to your site, you go to the "Links to your site" functionality and click on "Anchor Text".

jittithap naudom said...

Jun 13 2009,08:21AM
Please accept our best wishes.

hgil said...

@ Sagar

Why are you denying something that was obviously removed?

Please don't distract the issue.

I am not talking about anchor text, I am talking about prioritised *keywords* from external links.

If you are the team leader of the development program, then you are perfectly aware that there were 2 tables before:

Keywords from your site
Keywords from external links

So there were two adjacent tables. Neatly organised, side by side, for easy comparison and analysis.

But now we have only ONE table.

Everyone here knows something was removed.

Just say - yes it was removed, because we no longer want to supply you with that extra information.

But please please don't insult our intelligence by saying that the new update contains the same data as before - because it clearly does not.

You are obfuscating the issue here.

Of course, as the team developer you are allowed to do whatever you want, we in turn have no option but to comply.

But kindly offer some honesty here.

Just acknowledge the fact and say: Yes, the extra data was removed, because we no longer feel it serves us and our business model to supply you with the extra information.

And as honest feedback from a web designer who strives to deliver the best experience for both my clients and the visitors to the sites I design, I will simply state this in return:

I very much prefer the older version, with better laid out and extra important information.

It was better, before.

I truly miss the TWO sets of keywords data, neatly set side by side:

1. keywords found within the site
2. keywords (NOT keyphrases) from external links.

GaryTheScubaGuy said...

Hi Guys,

At SMX London I heard someone mention that you guys are going to release geotargeting within xml sitemaps - any word on this?

GaryTheScubaGuy said...

What Webmaster Central - what an idiot - spamming Webmaster Central - LMAO

anamika said...

http://www.malayflorist.com
Send gifts to Malaysia, Online delivery of flowers to Malaysia, gift to Malaysia, chocolates, cakes, watches, teddy, sweets, fresh fruits, dry fruits.
Anniversary, birthday, wedding gifts, cakes to Malaysia, Same day delivery to Malaysia, Gift Shop.

dougie said...

Interesting article. Thanks for sharing!

www.somethingdesigner.co.uk

www.somethingsensual.co.uk

Hube said...

What happened to the external links?

On the start page, I get the info that there are about 439.436 external links on my site.

When I page them, I read
1 - 100 of 205.976 external links.

When I download them, I can count
734526 external links ....

Before upgrading, there were about 70000 links in WMT for this site...

Sagar Kamdar said...

@hube, can you provide a URL to your site so we can investigate it further?

Thanks
--Sagar

e4hats said...

How can you download indexed urls? We have more than 11,000 urls submitted but only 7000 are indexed. We want to figure out why by analyzing indexed urls.

Europe Trip 2005 said...

What is going on with sitemaps? For the last 2 days, there has been no update to my sitemaps that have been there for over a year. They have been pinged, resubmitted, and still say pending. Additionally, there is no longer any option other than general and mobile submission. What happened to video sitemaps?

Mayra said...

I keep getting this erroneous response...

URL restricted by robots.txt
We encountered an error while trying to access your Sitemap.

Problem detected on:
Jun 14, 2009

I already resubmitted my sitemap after the GWT change.

Europe Trip 2005 said...

Mine was working fine after the update, but for the last 2 days there has been no status on crawl and no URLs, and it still says pending. I also tried to verify another site with a meta tag, which is in place, and got a failure message. There is clearly something wrong with the system. ANY UPDATES GOOGLE?

Jonathan Simon said...

@Europe Trip 2005 - If you post your sites' URLs in the Webmaster Help Forum we'll take a closer look at what's going on with your Sitemaps and the verification issue you mention.

Europe Trip 2005 said...

@jonathan, so there isn't a system wide issue happening right now? I received an error that said it was reported to Google. I forget the exact text.

Unrelated, my Google chat is no longer working via Gmail apps. Weird stuff.

I'll submit something to the forums, but I figured it was something happening more widely and was just going to wait...

Europe Trip 2005 said...

@jonathan, just went to the forums to report it and saw this;

"Some websites have reported issues with some Webmaster Tools Gadgets, Sitemaps submissions or the Change of Address tool. We are aware of these issues and are looking into them. You do not need to take any action. This will have no effect on your site's performance in search results"

Of course, that isn't exactly true, as it is not updating the index with newer posts that are submitted via sitemaps. All my sitemaps that were previously working now have the time clock and show status pending - now for 3 days.

I am also unable to verify another domain that is new despite the meta value tag being added and verified via view source.

Do you still want/recommend that I post something? Thanks

Jonathan Simon said...

@Europe Trip 2005 - Yes the Forum is the best place for posting the verification and Sitemaps issues you mention. If you post your sites' URLs you'll get helpful responses from Forum members that can help you determine if these issues are specific to your sites or not.

Europe Trip 2005 said...

@jonathan - OK, thanks. I posted my help in the forum here - http://www.google.com/support/forum/p/Webmasters/thread?tid=5ed7c2ab08b80fb4&hl=en

Hopefully someone gets back to me. I haven't had much luck in the past on these forums. Thanks again for your help. Things still seem broken for me.

PokerDude said...

I also received an error for my sitemap.xml on http://pokersiteonlineguide.com, even after I changed the schema to 0.9 and ran a strict validation.

Europe Trip 2005 said...

This seems to be a problem all over again - no updates since July. I've even added new sitemaps and there is no progress - http://www.google.com/support/forum/p/Webmasters/thread?tid=48a0756227327e29&hl=en

Mr. Christopher said...

What do you do if you have videos hosted somewhere else and need to list them in a sitemap? I'm trying to create a sitemap of just the videos, but they're not on my domain and I can't verify the video site as mine in Webmaster Tools for the "cross referencing". My player is in one location on one URL and the video that plays just pulls from the external site. Can't find an answer anywhere.

Valentina said...

Hi, I'm an Italian user... I don't speak English very well...
I have a problem with my sitemap: I've already updated it three times, but for the first time Google hasn't refreshed the homepage's cached copy, not since the 18th of April.
What can I do? I've modified Google's scan speed... could this be the problem? I know, I'm a beginner, please be patient! Could someone help me, please? I would be really grateful!
Thank you...

Web Directory said...

Initially I did not notice that my website (http://www.diolt.com) had an error in its sitemap; after that I submitted a corrected sitemap. My question is: if there is an error in a sitemap, will this affect ranking?

Mr. Christopher said...

Web Directory: No. Your ranking won't be affected by sitemap errors, but you need to fix them since you're assisting Google with crawling your website. Google looks at the content of the actual pages it sees when considering where to rank you for relevancy. There are a lot of good free sitemap makers out there that you can use to update your information. Check out http://www.auditmypc.com/xml-sitemap.asp for a free sitemap maker to resolve the errors.

Google Webmaster Central said...

Hi everyone,

Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.

Thanks and take care,
The Webmaster Central Team