Friday, July 13, 2007 at 2:24 PM
Given helpful suggestions from our discussion group, we've improved feedback for sitemaps in Webmaster Tools. Now, minor problems in a sitemap will be reported as "warnings," and will appear instead of, or in addition to, more serious "errors." (Previously all problems were listed as errors.) Warnings allow us to provide feedback on portions of your sitemap that may be confusing or inaccurate, while saving the real "error" alarm for problems that make your sitemap completely unreadable. We hope the additional information makes it even easier to share your sitemaps with Google.
The new set of warnings includes many problems that we had previously classified as errors, including the "incorrect namespace" and "invalid date" examples shown in the screenshot above. We also crawl a sample of the URLs listed in your sitemap and report warnings if the Googlebot runs into any trouble with them. These warnings might suggest a widespread problem with your site that warrants further investigation, such as a stale sitemap or a misconfigured robots.txt file.
Please let us know how you like this new feedback. Tell us what you think via the comments below, or in the discussion group. We also appreciate suggestions for additional warnings that you would find useful.
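To make the warning/error split described in this post concrete, here is a minimal Python sketch of a local sitemap check. It is not how Webmaster Tools works internally; the function names `check_sitemap` and `is_valid_date`, the choice of the sitemaps.org 0.9 namespace as the only "correct" one, and the short list of accepted date formats are all assumptions made for illustration. Unparseable XML is treated as an error, while a wrong namespace or a malformed `<lastmod>` value is only a warning.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumption: only the sitemaps.org 0.9 namespace counts as correct here.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Assumption: a small subset of the W3C Datetime formats.
DATE_FORMATS = ("%Y", "%Y-%m", "%Y-%m-%d", "%Y-%m-%dT%H:%M:%S%z")


def is_valid_date(text):
    """Return True if text matches one of the accepted date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def check_sitemap(xml_text):
    """Return (errors, warnings) for a sitemap given as a string."""
    errors, warnings = [], []
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The file cannot be read at all: this is a real error.
        errors.append(f"unreadable XML: {exc}")
        return errors, warnings

    # ElementTree stores namespaced tags as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, _, local_name = root.tag[1:].partition("}")
    else:
        namespace, local_name = "", root.tag

    if local_name != "urlset":
        errors.append(f"unexpected root element: {local_name!r}")
        return errors, warnings
    if namespace != SITEMAP_NS:
        warnings.append(f"incorrect namespace: {namespace!r}")

    # Check every <lastmod> value, whatever namespace it is in.
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "lastmod":
            value = (elem.text or "").strip()
            if not is_valid_date(value):
                warnings.append(f"invalid date: {value!r}")
    return errors, warnings
```

A sitemap with a mistyped namespace and a date such as `13/07/2007` would come back with two warnings and no errors, so the rest of the file could still be used.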


22 comments:
Do these new warnings and errors cover anything for Blogger sitemap feeds?
Blogspot robots.txt files now include sitemap autodiscovery, but the path to that file (Sitemap: http://{blogname}.blogspot.com/feeds/posts/default?orderby=updated) cannot be added manually as a sitemap to allow for monitoring of Googlebot activity.
The default feeds on my blog (Atom feeds) are redirected to Feedburner and the links are tracked, making them appear to be off-site.
Only Google would deny the removal of the URL of an old directory and then post URLs for pages under that directory. Hello!! The directory no longer exists. That means the pages no longer exist. Seems a little bit pre-101 to me.
It might also be useful to show a warning listing any pages that Googlebot found on the site that are not defined in the sitemap.
Pretty nice. Thank you for improving this tool.
What makes a sitemap unreadable?
That is great! That tool is one of the most useful tools I have used!
Jonathan
http://www.photosforsouls.com/sitemap.html
I've got an "error" in my webmaster account related to robots.txt:
User-agent: *
Disallow: /search
Disallow: /
This robots.txt is blocking my whole Blogger blog.
I cannot find an answer anywhere. Could you please advise what is going on?
Thank you
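The effect of the rules quoted in this comment can be checked locally. The sketch below uses Python's standard `urllib.robotparser`; the helper name `is_allowed` and the example.blogspot.com URLs are placeholders chosen for illustration. Python's parser applies the first matching rule, which is not necessarily how Googlebot resolves conflicts, but for these rules the outcome should not depend on that: `Disallow: /` matches every path.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_text, url, agent="Googlebot"):
    """Return True if robots_text lets the given agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


# The rules quoted in the comment above.
blocking_rules = """\
User-agent: *
Disallow: /search
Disallow: /
"""

# The same file without the final line.
search_only_rules = """\
User-agent: *
Disallow: /search
"""

post_url = "http://example.blogspot.com/2007/07/post.html"
label_url = "http://example.blogspot.com/search/label/photos"

print(is_allowed(blocking_rules, post_url))      # False: "Disallow: /" covers everything
print(is_allowed(search_only_rules, post_url))   # True: only /search is blocked
print(is_allowed(search_only_rules, label_url))  # False: label pages live under /search
```

With only `Disallow: /search` in place, ordinary posts stay crawlable while label pages are reported as blocked, which would be consistent with the label URLs mentioned in later comments.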
I've got a major problem with my site not being indexed. It used to be properly indexed, but since yesterday I have been seeing an error saying the robots.txt file blocked the crawling of some of our major pages. The funny thing is there is no robots.txt file on our site. Please explain.
I think everything is great. I have used this service for some time now and it really helps. I have one question that arose just recently: all of a sudden I get 97 URLs not crawled by robots, and they are all label pages. How can that be if I haven't changed the robots.txt, mainly because I can't in Blogger? Could you help me with that? My site is http://barcelonaphotoblog.blogspot.com
I am reading the comments above mine and see that this is a general bug as of this week. Is anyone tuning that up in the bot's algorithms? I also noticed the disallow line with the /search stuff. That does not sound right. Labels are a major source of links counted by search engines, and I suppose someone is trying to cut down the number. What else can it be?
I see we still have to actually click into any domain to see if it's actually error free. That's dumb and unproductive.
At this point, the dashboard has three columns - Domain name, Sitemap and Verified.
C'mon Google, make my life simpler and give me back my old column called "Errors" giving me a heads up when something's wrong. THEN I'll click in to investigate the issue.
Either that or at least alert me to errors via the message center.
The Dashboard USED to show when you had an error on a specific domain.
All you had to do was glance at the Dashboard view every morning, and if there were any errors, you were alerted instantly, without having to click into each domain.
This seems to me to be an easy fix, to put things back the way they used to be, especially with all the extra white space available on that Dashboard page.
Do you have to have a sitemap? How common are errors? I don't know that this matters much, as I'm not sure about adding it anyway.
The sitemaps console has had my verification marked as pending for like 3 days. Plus it keeps reporting system errors.
I'm having the same problem: the sitemap has been in a pending state for a few days now, when it used to take a few minutes.
I have to agree with Scott here. I don't like the new system. Before, I would just go to my dashboard and in 1 click I was able to see all problems with my URL (404 error, etc...). It was a great way for me to see I had to check a link. Now I don't see that anymore. I know I have some bad URLs on my site but I haven't received any warning! The old way was MUCH better!
For the "Sitemap:" keyword in "/robots.txt", why are relative URLs now considered invalid (as of about August 15, 2007)? Sitemaps.org says that the data for this entry "should" (not "MUST") be a full URL, but by not requiring it, relative URLs are permitted (even if discouraged). Googlebot seems to have no problem computing a proper request from a relative URL for a sitemap for sites I have never submitted to the webmaster tools (i.e. autodiscovered only). So why does the "robots.txt" analysis section of the webmaster tools flag relative URLs for sitemaps as errors when they are permitted by the specification?
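For readers following this question, here is a small Python sketch of how a crawler could turn a relative `Sitemap:` value into a full URL by resolving it against the location of robots.txt. It only illustrates the resolution step the commenter describes and says nothing about what Googlebot or the robots.txt analysis tool actually does. The function name `sitemap_urls` and the example.com addresses are hypothetical.

```python
from urllib.parse import urljoin


def sitemap_urls(robots_url, robots_text):
    """Return absolute sitemap URLs from the Sitemap: lines of a robots.txt."""
    urls = []
    for line in robots_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and key.strip().lower() == "sitemap" and value:
            # urljoin leaves absolute URLs unchanged and resolves
            # relative ones against the robots.txt location.
            urls.append(urljoin(robots_url, value))
    return urls


robots_text = """\
User-agent: *
Disallow: /private
Sitemap: /sitemap.xml
Sitemap: http://example.com/news-sitemap.xml
"""

print(sitemap_urls("http://example.com/robots.txt", robots_text))
# ['http://example.com/sitemap.xml', 'http://example.com/news-sitemap.xml']
```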
The warnings are great, really, but if you're new to this like I am, it would be beneficial if there were a small link beside the error that says "click here to see help files for how to correct this," or something to that effect. I had to find the little help button at the top of the page, then do a search on the error specified, then pick from the many results, some of which answered my question and some of which did not.
Initially after the first crawl of my site, I looked at the report and saw no problems. Only after clicking on all the different tabs and categories did I find my robots.txt problem. Maybe a bit more warning than that tiny yellow sign would be helpful.
My sitemap gets an error: "unsupported file format". The sitemap's extension is .xml. Does the message mean the file extension is wrong, or does it mean that the content is wrong?
A few days ago, we started to get the unsupported format error for our images sitemaps. But nothing has changed about our sitemaps recently. How can we tell why the parser is no longer correctly handling our sitemaps?
Thanks for providing us with such warnings before they attack us. The default feeds for my blog are moved to Feed pade and the links are captured, making them appear to be off-site.
Thanks once again!
Mysql examples
http://mysqlexamples.blogspot.com
lol great, I love this... it would be useful to list any pages that Googlebot found on the site that are not defined in the sitemap. Something like that...
Hi everyone,
Since several months have passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Help Group.
Thanks and take care,
The Webmaster Central Team