Wednesday, March 04, 2009 at 4:40 PM
Webmaster Level: Beginner to Intermediate
Every now and then in the webmaster blogosphere and forums, this issue comes up: when a webmaster performs a [site:example.com] query on their website, the number of indexed results differs from what is displayed in their Sitemaps report in Webmaster Tools. Such a discrepancy may smell like a bug, but it's actually by design. Your Sitemap report only reflects the URLs you've submitted in your Sitemap file. The site operator, on the other hand, takes into account whatever Google has crawled, which may include URLs not included in your Sitemap, such as newly added URLs or other URLs discovered via links.
Think of the site operator as a quick diagnosis of the general health of your site in Google's index. Site operator results can show you:
- a rough estimate of how many pages have been indexed
- one indication of whether your site has been hacked
- whether you have duplicate titles or snippets

Your Sitemap report provides more granular statistics about the URLs you submitted, such as the number of indexed URLs vs. the number submitted for crawling, and Sitemap-specific warnings or errors that may have occurred when Google tried to access your URLs.
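
To make the difference between the two numbers concrete, here is a minimal sketch in Python. It is an illustration only, not how Webmaster Tools computes its statistics. The example Sitemap, the `indexed` set, and the helper names `sitemap_urls` and `compare` are all assumptions made up for this example; in practice you would not have a complete list of indexed URLs to hand.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical Sitemap file with three submitted URLs.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about</loc></url>
  <url><loc>http://www.example.com/contact</loc></url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a Sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare(submitted, indexed):
    """Summarize how a submitted URL set relates to an indexed URL set."""
    return {
        # What a Sitemap-style report is about: only the URLs you submitted.
        "submitted": len(submitted),
        "submitted_and_indexed": len(submitted & indexed),
        # What a site:-style count is about: everything indexed,
        # whether or not it appears in the Sitemap.
        "indexed_total": len(indexed),
        "indexed_not_in_sitemap": sorted(indexed - submitted),
    }


if __name__ == "__main__":
    submitted = sitemap_urls(EXAMPLE_SITEMAP)
    # Hypothetical set of indexed URLs: two come from the Sitemap,
    # two were discovered via links.
    indexed = {
        "http://www.example.com/",
        "http://www.example.com/about",
        "http://www.example.com/blog/new-post",
        "http://www.example.com/tag/foo",
    }
    print(compare(submitted, indexed))
```

In this made-up case the Sitemap-style view is "2 of 3 submitted URLs indexed", while the site:-style view is "4 URLs indexed". Both are correct; they simply count different sets.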

Feel free to check out our Help Center for more on the site: operator and Sitemaps. If you have further questions or issues, please post to our Webmaster Help Forum, where experienced webmasters and Googlers are happy to help.
Posted by Charlene Perez


28 comments:
Thanks for this info. I have been wondering as well and you have surely enlightened me. It pays to subscribe to your RSS. Thanks!
What is the reason why the # of URLs in the sitemap vs. the # of URLs indexed is so different in some cases? As long as the URLs in the sitemap are valid and have unique content, shouldn't they all be indexed?
Thanks!
I have also noticed that the number of pages indexed does not reflect the number of pages shown when using the site: modifier in a search.
I often tell people that pages show up when they are relevant to what Google believes is the topic of the website. Even so, I have seen some pretty blank pages and pages that have no relevance at all showing up on the odd occasion.
Rather than summarising why the results differ, I think it would be much more useful to educate those who perform SEO on a daily basis, so that they have something to report back to their customers, who I can understand are probably getting very angry with the lack of knowledge and information.
*Edited typo*
I have a similar issue with the link operator (link:). The operator returns far fewer results than Webmaster Tools. Why would they be any different?
Hi
Is there any way to bring a delisted site back on Google?
Good news. How about robots.txt?
Great information, thanks for sharing.
Hi. How do I remove all of my websites that are in Google?
Thanks for the information. I am new to all of this and am trying to learn as much as possible.
How about showing pages that are indexed but not in the sitemap? Or even pages that are found but are not in the sitemap? This will help us identify spam.
We'd probably also need a way in Webmaster Tools to indicate that those pages are legitimate. Sometimes you may only put part of a site in a sitemap, or you may have user-contributed info that you may not put into a sitemap.
My site has had a problem for a few months that I cannot solve. The site is OK and follows the Google webmaster guidelines.
I'm even dreaming about it (haha), it's that complicated. I hope that one day Google will review my site.
Link to my site: santaisabelonline.com.br
Where can I see my summary? I have already searched around and found nothing. My website is simplefuelwater.com
How do I remove a link set by a spam site from my Google Webmaster Tools "Not Found" diagnosis? I have a site at 1mainstreet(dot)com, and some spam site linked to my site in a kind of weird way. Just notice the two posts under the 25th December date on pages http://tinyurl.com/bjlsrx and http://tinyurl.com/bylc4p.
Now the diagnosis in Webmaster Tools says that this page is not found. I cannot do anything about another site that links that way. I like it when the tool's stats show 0 in the diagnosis tab, so I would like this to be removed, but I simply cannot find a way to go about it. Any help would be appreciated.
All righty. This is useful from a couple different angles:
1. It sets our staff's mind(s?) a little more at ease about Google's actions and intentions (yeah, we *know* you guys would never do anything evil, but it's always nice to get more evidence of that).
2. It gives us an actual explanation to provide to clients who notice the discrepancy (stuttering and saying, "uhhh ... I'll get back to you on that" never seems to impress them).
Hi.
Sometimes, using the site: operator, I get a high number of URLs indexed, like 1,200,000, which makes sense. But most of the time I get a low number, 625,000. Does this have to do with the multiple data centers not being updated?
Thanks!
I'm still trying to figure out how to submit a sitemap for a blogger blog.. :/ but I'll figure it out soon I'm sure :) Thanks for this post tho.. I learned why it's important.
I can't get my LiveJournal sitemap to work. It reports "URL not allowed", because all my posts are immediately under the root of the site (e.g. foo.com/a.html) while the RSS/Atom feeds are under a subdir (e.g. foo.com/data/rss).
Any way to remedy this, knowing that I have no control over how livejournal works?
Sitemaps? I don't know what this webmaster's tool is about. It doesn't work at all. I submitted a sitemap. Result?
I have 0 visits from Google search. That's a zero!
For this blog : http://www.leblogdelamirabelle.net
I have a Google rank of 3 and lots of quality backlinks.
When I type "blog de la mirabelle" my blog doesn't show up anymore!!
But according to this tool everything's fine.
Of course Google ignores all my mails. It's time Google gets competition since it's not working and doesn't respect users enough to deal with the problem.
I am staying busy now upgrading business broker web sites with old, corrupt and/or non-standard code, and optimizing as I go. Some are riddled with page errors and MS styles.
I have heard that too much improvement in a site's search friendliness, TOO QUICKLY, can raise a false flag.
Some of the sites I have worked on are total wrecks (one was not displaying in Firefox and IE8), with no text tags, keywords, descriptions, robots tags and on and on - so the legitimate changes I make are drastic, compared to the state they were in.
What can I do to keep from triggering some kind of sanction?
I'm the first commenter on this post and I just want to tell you that I have submitted sitemaps and I see a lot of results when I search through my site now. Yippee!
Your Daily Word
That's great, as it shows all the PDFs as well as the HTML pages.
Crooks Design – East Anglia Graphic Design
I have submitted my sitemap for my website on March 31. On April 13 my sitemap was crawled with 47 URLs and for a SECOND time my sitemap was crawled today on April 19 with 56 URLs and NONE of my pages are indexed. Am I doing something wrong? Anyone that knows the answer to this, please reply to this feed.
This post makes 0 sense, implying that the sitemap is a "guide" and the site operator tells you what they crawled and have in their index.
If that is the case, what is the reason for the INDEXED field in sitemaps? Why give us a number, specifically saying the * number of urls * indexed out of this many crawled, based on the sitemap I've submitted.
If I have 50,000 links submitted, and I see 15,000 indexed.. then I do a site: and get 5,000 what the hec??
It just doesn't make any damn sense Google! I'm simply trying to find a reliable way to calculate my site's inclusion ratio. Why do you have to beat around the bush. The Webmaster Tools was a big improvement.. (over.. nothing which existed before?) but you still are not addressing our needs.
I need help. I'm not sure if this is the right place to post this, but I have submitted a sitemap in Google Webmaster Tools and the number of indexed pages is 344 lower than before it was submitted. Is this because the newly submitted sitemap is being crawled from 0 again, so it has only got so far through it up to now?
When I use the site: operator to see how many pages are indexed from my web page, I get a huge difference between google.co.uk and google.com.mx. For example, in the UK the number is 300,000 and in MX the number is 38,000. Why?
Hi,
I have uploaded an XML sitemap for a site, and Webmaster Tools says it has been indexed.
However, when I search for it with site:newSite.com, it does not match any documents.
Can you help, please?
Thanks
B
It's good info, thanks.
Hi everyone,
Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.
Thanks and take care,
The Webmaster Central Team