Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Behold Google index secrets, revealed!

Tuesday, July 24, 2012 at 3:56 PM

Webmaster level: All

Since Googlebot was born, webmasters around the world have been asking one question: Google, oh, Google, are my pages in the index? Now is the time to answer that question using the new Index Status feature in Webmaster Tools. Whether one or one million, Index Status will show you how many pages from your site have been included in Google’s index.

Index Status is under the Health menu. After clicking on it you’ll see a graph like the following:

[Image: Index Status graph of total indexed pages]

It shows how many pages are currently indexed. The legend shows the latest count and the graph shows up to one year of data.

If you see a steadily increasing number of indexed pages, congratulations! This should be enough to confirm that new content on your site is being discovered, crawled and indexed by Google.

However, some of you may find issues that require looking a little bit deeper. That’s why we added an Advanced tab to the feature. You can access it by clicking on the button at the top, and it will look like this:

[Image: Index Status graph, Advanced tab]

The advanced section will show not only totals of indexed pages, but also the cumulative number of pages crawled, the number of pages that we know about which are not crawled because they are blocked by robots.txt, and also the number of pages that were not selected for inclusion in our results.
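
To see why a URL might end up in the blocked-by-robots.txt count, you can test URLs against a robots.txt file locally. The following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt rules and the example.com URLs are made up for illustration, and Googlebot's own parser may behave differently in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs to check against the rules above.
urls = [
    "http://www.example.com/",
    "http://www.example.com/private/report.html",
]
print(blocked_urls(ROBOTS_TXT, urls))
```

Running a list of known URLs through a check like this gives a rough idea of which pages the crawler is being told to skip.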

Note that the counts are always totals. For example, if the count for indexed pages on June 17th is 92, that means a total of 92 pages are indexed at that point in time, not that 92 pages were added to the index on that day. For sites with a long history in particular, the count of pages crawled may be much larger than the number of pages indexed.
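
If you want to know how many pages were added or dropped between two points on the graph, you have to subtract consecutive totals yourself. Here is a small sketch of that arithmetic. Only the figure of 92 on June 17th comes from the example above; the other dates and counts, and the daily_changes helper, are hypothetical.

```python
def daily_changes(totals):
    """Turn (date, total_indexed) pairs, in date order, into (date, change) pairs."""
    changes = []
    for (_, previous), (day, current) in zip(totals, totals[1:]):
        changes.append((day, current - previous))
    return changes


# Hypothetical totals; only the 92 on June 17th is taken from the post.
totals = [("2012-06-15", 88), ("2012-06-16", 90), ("2012-06-17", 92)]
print(daily_changes(totals))
```

A negative change between two points would mean pages dropped out of the index during that period.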

All this data can be used to identify and debug a variety of indexing-related problems. For example, if some of your content no longer appears on Google and you notice a sudden drop in the graph of pages indexed, that may indicate that you introduced a site-wide error when using the robots meta tag with content="noindex", and that Google is now excluding your content from search results.
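
One way to look for an accidental site-wide noindex is to scan your page templates for the robots meta tag. The sketch below uses Python's standard html.parser module. The has_noindex helper and the sample markup are illustrative assumptions, and the check does not cover noindex directives sent through HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Sets self.noindex when a robots or googlebot meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").lower()
        if name in ("robots", "googlebot"):
            directives = [part.strip() for part in content.split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html):
    """Return True if the HTML source carries a noindex robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


# Hypothetical page source, for illustration only.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

If a shared header template returns True here, every page built from it is asking to be left out of the index.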

Another example: if you change the URL structure of your site and don’t follow our recommendations for moving your site, you may see a jump in the count of “Not selected”. Fixing the redirects or rel=”canonical” tags should help get better indexing coverage.
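
After a URL restructuring, it can help to list the pages whose rel="canonical" points at a different URL, since those pages are declaring another URL as the preferred one. The sketch below extracts the canonical link with Python's standard html.parser module. The canonical_mismatches helper, its input format and the example.com URLs are assumptions made for illustration; it does not follow redirects or fetch anything over the network.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def canonical_mismatches(pages):
    """Map each URL whose canonical points elsewhere to that canonical URL.

    `pages` is a dict of URL -> HTML source (a hypothetical input format).
    """
    mismatches = {}
    for url, html in pages.items():
        parser = CanonicalParser()
        parser.feed(html)
        if parser.canonical and parser.canonical != url:
            mismatches[url] = parser.canonical
    return mismatches


# Hypothetical pages, for illustration only.
pages = {
    "http://www.example.com/new/page":
        '<link rel="canonical" href="http://www.example.com/new/page">',
    "http://www.example.com/old/page":
        '<link rel="canonical" href="http://www.example.com/new/page">',
}
print(canonical_mismatches(pages))
```

A long list of unexpected mismatches after a migration is a hint that the canonical tags were not updated along with the URLs.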

We hope that Index Status will bring more transparency into Google’s index selection process and help you identify and fix indexing problems with your sites. And if you have questions, don’t hesitate to ask in our Help Forum.

Posted by the Webmaster Tools Team

73 comments:

BrettMove said...

Awesome! Just awesome!

Takeshi said...

Ok, this is amazing, but where is the export function??

Jennifer Alviso said...

Please give us a way to export the data. Thanks :)

Menashe Avramov said...

Now provide us the same tool for the unindexed pages, to help webmasters get ready for potential problems.

Richard Wan said...

Nice for dynamically generated pages from webmasters...

Naat Niit Nuut said...

Awesome! Thanks :)

webdesignarticle said...

This was a much-awaited feature. Thanks for getting it done. But I suggest that this feature should have a data export option.

Pierre Bouchard said...

Amazing feature! I hope the API data will be refreshed correctly every night, unlike the crawl errors data available via the API. It's important!

Javi said...

The colors are not very well chosen... green should be "indexed" = good, and red should be for BAD (not indexed, blocked...)

Jitendra Indave said...

This is what I was looking for...thanks blogger team..

Krishna Salvi said...

This option is very helpful for webmasters. If Google Webmaster Tools added one more feature to export the data, webmasters could manage their websites very well.

I hope Google adds this feature in the near future!!

Kevin Gallagher said...

This is all great but is there anywhere where you can see which ones are indexed and which are not?

TOP PAGE SEO said...

A massive leap forward in transparent information from Google, Thanks a lot

Keith Horwood said...

Nice one big G.

Would be great to compare against another managed url for Migrations. As others are saying here, would be great to highlight some unindexed pages, and to be able to export this.

Javi said...

It seems that crawled and not indexed include urls with parameters (sort=asc affId= for example). Only that would explain those high numbers. Any clue?

blog said...

I think it will be very helpful for webmasters to determine the number of indexed pages as well as the pages that are out of the index.

Krinal Mehta said...

How can a rel=”canonical” tag help in redirect ?

Krinal Mehta said...

How can a rel=”canonical” tag help in setting up the redirects ?

asri yatno said...

Google really is great

Federico Sasso said...

Great!
When will it be available via the GData API?

David Kartuzinski said...

I am not sure how you want Google to tell you which pages are not indexed - since knowing they exist - would mean that they're indexed. Anyway, maybe I am missing something, but this tool simply tells us which pages Google is aware of.
-DK

araghu said...

Thank you very much for your help! Note of webmaster

Seo Services And Packages said...

Here we can see only the number of URLs. Can we also see the pages themselves, so that we can request Google to index them? If that is already in Webmaster Tools, then please let me know.

Amardeep Singh said...

That's really a great help for webmasters and SEO guys to identify the number of pages that are indexed, blocked, or affected by other crawling errors.

Thanks for bringing more transparency to GWT.

Amardeep Singh

David Radford said...

This is a really nice feature - but it would be really good if you could drill-down into the 'not selected' data to get a ratio of re-directs vs. duplicate content as Wordpress (for example) creates a ton of redirects for permalinks etc but in the grand scheme of things this is less of a concern than knowing how much of your content is not showing because Google thinks it is duplicate content...

sc said...

very useful, thank you!

kumar said...

It's a real help for webmasters...

Mahrus Cell said...

di podolin ....

Meryem Psychic said...

Please give webmasters URL stats/details for the total indexed, ever crawled, not selected, and blocked by robots pages. It will help them properly fix the ignored/corner case pages. It might also help people suffering unnecessarily from a Panda penalty. Websites that are spammed without the webmaster's knowledge would also gain from the detailed data. Yes, it might help spammers as well in some cases. But in the end we need to put the interests of innocent people first. So, considering that, please release this option as soon as possible.

Rajiv said...

really helpful information thanks for sharing. :D

JB said...

Really useful. Next step, give some example URLs that are indexed, blocked etc

Margarita said...

This is AWESOME! Thank you, so much. :)

adamparnala said...

This is an amazing development! Very nice. =)

Javin Paul said...

Thanks for adding this detail, it will certainly help webmasters around the world who previously relied on site: and URL searches. By the way, author status is still not showing correct details, looking forward to seeing it fixed.
Javin

Alex said...

The new features will save webmasters from checking again and again for indexed pages. Now they can see their indexed pages in one place. A great tool for webmasters. Thanks a lot for adding new features under the Health tab

Bibiano Wenceslao said...

Thanks for the transparency. For everyone else who needed clarification about this new feature, read: https://support.google.com/webmasters/bin/answer.py?hl=en&answer=2642366

Abdelrahman Ellithy said...

this is a great update
the single most important diagnostic in WMT is now easier to monitor

Eric : Blog De Manila said...

I just hope we can see which pages were indexed or not.

Hassan Awan said...

Hello Google guys, can we also have blocked URL in:
Index Status> Blocked by robots

Bikky Sah said...

Sir I am new

Please tell me about this index image

Here is the Link - Image of Index Status

dalgicpompa said...

awesome action. thanks for article

revolution said...

Great tool. Is there any way to check which pages are not indexed, the actual URLs?

Mike Miller said...

For sites that have sub domains, this new tool is not useful.

I did a search for site:www.mysite.com and found that I have 164,000 pages indexed. I compared that to GWT and found that I have 1.37mil pages indexed. Why such a difference? Well its probably because I have several subdomains and this report is aggregating the results together.

Please split this out or allow us to select a filter that says "exclude subdomains" since technically this is a completely different site.

metamercadeo said...

I still see that Googlebot can be cheated about the number of indexed pages. For example, the WordPress plugin called search terms tagging 2 is used to get a whole lot more pages indexed than actually exist on the website

Michiel Van Kets said...

that's indeed very handy, after all the pages only count as long they're indexed.

Birkan said...

This is a really nice feature - but it would be really good if you could drill-down into the 'not selected' data to get a ratio of re-directs vs. duplicate content as Wordpress (for example) creates a ton of redirects for permalinks etc but in the grand scheme of things this is less of a concern than knowing how much of your content is not showing because Google thinks it is duplicate content..

E M said...

Can you show us the URL's that havent gotten indexed, and if that's a large data set, can you show us the URLs for pages that are indexed? We need more transparency in order to make the site better.

design said...

Great share, i am just hoping that next time, in this index search term, Canonical index urls should be covered, then it would be easier to filter that on which pages you set canonical and on which pages u didn't set!!

Peter Lauge said...

The green line (Not Selected) What can I use it for, when I do not get all the URLs behind it?

Yes, then I know I have some DC problems, but not where.

:-)

Adrien Ménard said...

I daily use for my website a SEO software called Botify. Web crawl and logs analysis are merged in a webinterface that lists indexed and unindexed pages.

Unknown said...

Really, An awesome Thing

Aditya Solanki said...

awesome

Steve Smith said...

Such a great feature is added to the webmaster tools it will help us to know about our indexed pages in a short period of time !!

JohnMc said...

Thank you G. This new tool helps to evolve the relationship between Google and webmasters. I hope that in the future you will also let us see or download a list of exactly which pages are "Not selected" and not being indexed, as well as those that are indexed and those blocked by robots. :-J

Christopher Drinkut said...

Your side column link is not working...

New to Webmaster Central?
>>Learn more<< about Google Webmaster Tools.

FYI

Super Coder said...

Not good, because what do we do after finding out how many URLs are still pending? We don't have any way to find out which URLs aren't indexed yet.

4 Out of 10
Poor

craig Hoggins said...

Index check shows 0 pages indexed, yet when I check sitemap stats it shows I have all my pages indexed. I'm confused.

Why am I getting this conflicting data

Alexandrina said...

Thanks for information,it will help to track changes.

jaqkar said...

What JohnMc said. Would be great to find out what pages are not selected.

Satish Sharma said...

For one of my websites, "Total Indexed" is nearly double "Ever Crawled". That should not be possible. Can you please explain how it could happen?

Darren said...

This post will help us a lot but how can i get the data for not selected URLs.

victoria tran said...

Thanks when I know how to see the site.

Vietnam Blogger said...

oh, it is a post good and useful...

Boris said...

How is it possible that "Not selected" is much more than "Total indexed"?

Admin PakAdsense4u said...

Hello sir! i hope you are fine
I have a blog which is 1 year old now. I looked at it after a long time and was shocked to see that Google's robot has blocked 200+ pages of my website. I saw this through Webmaster Tools. Is there any way I can unblock them? I am very confused, please help.

Jadla Walid said...

Google takes more than a week to recognize a new post

Author, Scott Peters said...

I just looked at my graph for my blogger site, and it says 74 pages indexed, and 108 pages blocked by robots. Help! The yellow line is off the charts! I don't have a custom robot text in my blogger page. How can I fix this?

Soneet Aggarwal said...

drsoneet.com need to re indexed. So what should be done for that?

What is google's policy or mechanism for submitting a site for re indexing.

Herald Seo said...

Awesome! It is really helpful for me.