Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Better details about when Googlebot last visited a page

Tuesday, September 05, 2006 at 7:34 AM

Most people know that Googlebot downloads pages from web servers to crawl the web. Fewer people know that if Googlebot requests a page with an If-Modified-Since header and gets a 304 (Not Modified) response, Googlebot doesn't download the contents of that page. This reduces the bandwidth consumed on your web server.
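
As a rough illustration of the server side of that exchange, here is a minimal Python sketch of how a web server might decide between a 200 and a 304. The function name `conditional_get` and its parameters are hypothetical, chosen for this example only; real web servers and frameworks implement this logic for you, and nothing here describes Googlebot's internals.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(if_modified_since, last_modified, body):
    """Return (status, headers, body) for a GET request.

    if_modified_since: raw If-Modified-Since header value, or None if absent.
    last_modified: timezone-aware UTC datetime of the page's last change.
    body: the page contents as bytes.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: ignore the condition
        if since is not None and since.tzinfo is None:
            # Treat a date without zone information as UTC (an assumption).
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if since is not None and last_modified.replace(microsecond=0) <= since:
            return 304, headers, b""  # unchanged: send no body

    return 200, headers, body  # changed or unconditional: send the page


page_changed = datetime(2006, 4, 12, 20, 2, 6, tzinfo=timezone.utc)
status, headers, sent = conditional_get(
    "Wed, 12 Apr 2006 20:02:06 GMT", page_changed, b"<html>...</html>"
)
print(status, len(sent))
```

With the 304 branch, the response carries headers only, which is where the bandwidth saving comes from.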

When you look at Google's cache of a page (for instance, by using the cache: operator or clicking the Cached link under a URL in the search results), you can see the date that Googlebot retrieved that page. Previously, the date we listed for the page's cache was the date that we last successfully fetched the content of the page. This meant that even if we visited a page very recently, the cache date might be quite a bit older if the page hadn't changed since the previous visit. This made it difficult for webmasters to use the cache date we display to determine Googlebot's most recent visit. Consider the following example:
  1. Googlebot crawls a page on April 12, 2006.
  2. Our cached version of that page notes that "This is G o o g l e's cache of http://www.example.com/ as retrieved on April 12, 2006 20:02:06 GMT."
  3. Periodically, Googlebot checks to see if that page has changed, and each time receives a Not Modified response. For instance, on August 27, 2006, Googlebot checks the page, receives a Not Modified response, and therefore doesn't download the contents of the page.
  4. On August 28, 2006, our cached version of the page still shows the April 12, 2006 date -- the date we last downloaded the page's contents, even though Googlebot last visited the day before.
We've recently changed the date we show for the cached page to reflect when Googlebot last accessed it (whether the page had changed or not). This should make it easier for you to determine the most recent date Googlebot visited the page. For instance, in the above example, the cached version of the page would now say "This is G o o g l e's cache of http://www.example.com/ as retrieved on August 27, 2006 13:13:37 GMT."
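
The difference between the old and new cache dates can be modeled with two timestamps per page. The following Python sketch is purely illustrative: the class `CacheEntry`, its fields, and the `new_behavior` flag are hypothetical names invented for this example and say nothing about how Google actually stores this information.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CacheEntry:
    # When the page contents were last downloaded (a 200 response).
    last_content_fetch: Optional[datetime] = None
    # When the page was last successfully checked (200 or 304).
    last_visit: Optional[datetime] = None

    def record_crawl(self, when: datetime, status: int) -> None:
        if status == 200:
            self.last_content_fetch = when
            self.last_visit = when
        elif status == 304:
            # Not Modified: no download, but the visit still counts.
            self.last_visit = when

    def displayed_date(self, new_behavior: bool = True) -> Optional[datetime]:
        # Old behavior: date of the last content download.
        # New behavior: date of the last access, changed or not.
        return self.last_visit if new_behavior else self.last_content_fetch


entry = CacheEntry()
entry.record_crawl(datetime(2006, 4, 12, 20, 2, 6), 200)
entry.record_crawl(datetime(2006, 8, 27, 13, 13, 37), 304)
print("old:", entry.displayed_date(new_behavior=False))
print("new:", entry.displayed_date())
```

Run against the example above, the old behavior reports the April 12 download and the new behavior reports the August 27 visit.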

Note that this change will be reflected for individual pages as we update those pages in our index.

7 comments:

robvh said...

I checked one of my websites over 2 months (Feb. and March 2007) and reviewed all Googlebot/2.1 visits in access.log: 9/day, 415 Googlebot hits in total on 30 web pages. I compared them to the cache dates. The cache dates change completely at random. A page visited 20 times had its cache date changed once. For 3 pages, the cache dates had not changed after 2 months and about 15 visits per page. Other pages' cache dates changed 4 or 5 times. Content changes did not influence anything. So I'm sorry, but so far I don't believe a word of what is written here.

Wonderer said...

I foolishly put my full name on a couple of sites. I have since been in touch with them and they have removed my name; however, it still exists in the Google search engine and leads to those sites... I am told it depends on when Google updates its cache... How often is this done, and can I make a personal request?

aaront said...

Can anyone explain why the last retrieved date shown for my website has gone back to an older date? I can't work out why Google was showing the last retrieved date as the 6th Feb for the last few days and then today shows the 3rd Feb?! How could this happen?

vanhouseip said...

Hi,
My question is: why is the cached page of a site not showing, even though the page is indexed and the Cached link appears in Google results?

Shakira_1441 said...

How can we ask Google to take down our cached pages? For example, we have a personal page with personal information, like an address or phone number. That page got cached, but we have recently updated it....

Heera said...

Why would the cache command all of a sudden show:

Your search - cache:www.domainname.com - did not match any documents

This is so for our domain and works fine with others.

Google Webmaster Central said...

Hi everyone,

Because quite a while has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Help Group.

Thanks and take care,
The Webmaster Central Team