Tuesday, June 08, 2010 at 5:00 PM
(Cross-posted on the Official Google Blog)
Today, we're announcing the completion of a new web indexing system called Caffeine. Caffeine provides 50 percent fresher results for web searches than our last index, and it's the largest collection of web content we've offered. Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was ever possible before.
Some background for those of you who don't build search engines for a living like us: when you search Google, you're not searching the live web. Instead you're searching Google's index of the web which, like the list in the back of a book, helps you pinpoint exactly the information you need. (Here's a good explanation of how it all works.)
So why did we build a new search indexing system? Content on the web is blossoming. It's growing not just in size and numbers but with the advent of video, images, news and real-time updates, the average webpage is richer and more complex. In addition, people's expectations for search are higher than they used to be. Searchers want to find the latest relevant content and publishers expect to be found the instant they publish.
To keep up with the evolution of the web and to meet rising user expectations, we've built Caffeine. The image below illustrates how our old indexing system worked compared to Caffeine:
[Image: how the old indexing system worked compared to Caffeine]
Our old index had several layers, some of which were refreshed at a faster rate than others; the main layer would update every couple of weeks. To refresh a layer of the old index, we would analyze the entire web, which meant there was a significant delay between when we found a page and when we made it available to you.
With Caffeine, we analyze the web in small portions and update our search index on a continuous basis, globally. As we find new pages, or new information on existing pages, we can add these straight to the index. That means you can find fresher information than ever before — no matter when or where it was published.
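The contrast between refreshing a whole layer and updating continuously can be illustrated with a toy inverted index. This is only a minimal sketch under stated assumptions, not a description of how Caffeine is actually built: the function names `rebuild_index` and `add_page`, the whitespace tokenizer, and the sample URLs are all hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words (a deliberately naive tokenizer)."""
    return text.lower().split()


def rebuild_index(pages):
    """Batch style: re-analyze every known page to produce a fresh index."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def add_page(index, pages, url, text):
    """Incremental style: update the index for one new or changed page."""
    # Drop postings that belonged to the old version of the page, if any.
    for word in tokenize(pages.get(url, "")):
        index[word].discard(url)
    pages[url] = text
    for word in tokenize(text):
        index[word].add(url)


# Hypothetical sample data.
pages = {"a.example/post": "caffeine indexing system"}
index = rebuild_index(pages)

# A newly published page becomes searchable without re-analyzing the rest.
add_page(index, pages, "b.example/news", "fresh news about caffeine")
print(sorted(index["caffeine"]))
```

In the batch version the cost of a refresh grows with the size of the whole collection, while the incremental version only touches the words of the page that changed, which is the property the paragraph above describes.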
Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles.
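The iPod comparison can be checked with a little arithmetic. The sketch below uses the two figures quoted above; the device height of 4.1 inches is an assumption that does not appear in the post, so treat the stack length as a rough sanity check rather than an exact figure.

```python
# Figures quoted in the post.
storage_gb = 100_000_000   # "nearly 100 million gigabytes"
ipod_count = 625_000       # "625,000 of the largest iPods"

# Assumption (not stated in the post): each device is about 4.1 inches tall.
IPOD_HEIGHT_INCHES = 4.1
INCHES_PER_MILE = 63_360

# Capacity each device would need to hold its share of the index.
gb_per_ipod = storage_gb / ipod_count

# Length of the devices laid end-to-end, converted to miles.
stack_miles = ipod_count * IPOD_HEIGHT_INCHES / INCHES_PER_MILE

print(f"{gb_per_ipod:.0f} GB per iPod")
print(f"{stack_miles:.1f} miles end-to-end")
```

Under these assumptions each device would need to hold 160 GB, and the stack comes out slightly above 40 miles, consistent with the "more than 40 miles" in the post.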
We've built Caffeine with the future in mind. Not only is it fresher, it's a robust foundation that makes it possible for us to build an even faster and more comprehensive search engine that scales with the growth of information online, and delivers even more relevant search results to you. So stay tuned, and look for more improvements in the months to come.


116 comments:
Great update to the index
YAY!
All very cool, until you get someone complaining about a post we deleted 160+ days ago which is still in Google under their name. Freshness everywhere pls.
This is a good system. Can we expect our newly published posts to be indexed in Google more quickly than before?
I love the graphics. Caffeine looks a bit chaotic though, eh.
Very interesting. Can you please elaborate a bit more on the new system internals? Please!
This is a great name; I hope it will be in great form, as we have become accustomed to.
that is awesome! cant wait
re: I love the graphics. Caffeine looks a bit chaotic though, eh.
Why is it that lines look "organized" and curves look chaotic? Too much 3-D animation eh? to me looks like the cosmos vs a railroad track. lol
Will this change the googlebot useragent?
Awesome post, very informative on how things work. Can you clarify what percentages (%) of the index are photos, content, and real-time updates?
Looking forward to a better search experience. Thanks.
How do you predict this will affect the problem of SEO Poisoning websites that quickly climb the ranks of popular search terms?
So what's the significance of the inverse square law in Caffeine?
Significant update!!!
Does that mean, for example, that Google would update meta descriptions far more frequently? I am not talking about blog, tweet or FB updates; I am talking about static content, like in an e-shop. If I put the price of an item in my meta description, is there a chance that it would be indexed much faster when I change the price? Does that affect rich snippet updates as well?
that sounds really exciting as an seo!
does this mean that the push for QDF is going to continue and that it is now easier to rank well for fresh content instead of just building backlinks (http://www.paygseo.co.uk)?
SEO consultants have to work harder from now on! :)
This is interesting..
I am just a heavy coffee drinker and I got this here via twitter by the way..
It's definitely much better than the previous index. We can see improvements straight away: for example, blog posts are being indexed right after they are published, news is much fresher, etc. Good work!
finally the Caffeine!
this means quicker and quicker. for "fresher" sites/blogs they get to top the serp quicker than before.
correct me if i was wrong :)
'So stay tuned, and look for more improvements in the months to come.' Hmm... wonder what else in the pipeline?
looking forward to the new caffeine magic !
Yes, it is certain that all SEOs will have to work harder. This feature is very cool and good for the ordinary user, who wants only relevant and up-to-date information for their search. Good job... But now, how does this change affect the search results?
Yes, this can be very helpful for social media sites and bookmarking sites, because the content on these sites is updated frequently.
Let's see how fast it works. It seems it'll prioritize social media for ranking and indexing. It also somehow beneficial for SEO consultant like us as it proves again that SEO is not a one time deal ;-)
What is going on at the Official Google Webmaster Central Blog? Useless spam comments are being approved. Even if the links are nofollow, this is pure spamming... So how are our blogs and sites safe from these spammers?
This is a blog for public announcements, not for website promotion...
Google should take immediate action against these spammers.
Fresh and informative blogs and sites will stand first now. :) So be update
Cool, really curious to see how this is going to work!
You are right madhav. Spam comments should be removed. What the hell google's spam controllers are doing!!!
Could you submit some ip addresses where caffeine is working now? and please give more information when it will be available globally. regards.
So if Google indexes web pages in short portions, then the websites like forums, blogs, twitter, facebook etc will get indexed faster than other normal website.
Superb news. Let's see how it affects my blog rankings.
http://www.smartbloggerz.com
Thnks for information..
We Love Google :))
great information - look forward to working with it. :-)
How does Caffeine behave with the Google Sitemap Generator installed on the host?
I've been using the Caffeine beta for a while now and must say I prefer the depth of search options available.
Great News Google!
http://www.smartbloggerz.com
I don't get it, you are saying that it's the largest collection of web content ever offered, but in the last few weeks millions of pages have been dropped from the index.
I am wondering how this will affect those of us who are Etsy sellers. http://MadArtjewelry.etsy.com
I love the new search engine. Yes, it is a bit different but it certainly seems to work faster. Good job!
While I'm delighted in the caffeine enhancements, I find it interesting that the perpetual spammy top 2-3 SERP positions for my top search words (time clock software) for our Virtual TimeClock product haven't changed.
Either the black hat SEO guys are on top of caffeine or the full benefits have yet to arrive.
What about Google Groups search?
Google rocks! Can't wait for Caffeine to kick in :)
Carrie Grimes... Anyone ever call you Grimey?
On-topic : This is good news. I am interested in playing around with this new engine and finding new ways to test SEO.
I have some articles that are not indexing on google, can someone tell me who I can contact with questions?
thanks.
Kelly
Web 3 in the making...new challenges for SEO indeed. My clients will benefit for sure.
Interesting.
Google now giving more fresh and relevant results.
Ok, time to check the results... did our sites go up, or go down. *LOL*
Okay, at least i'm getting some information. Thanks for posting and letting us webmasters in on whats going on. Really appreciate it.
It's interesting. I hope the new system will give better results than the previous one.
Looking forward to get more experience....
Hmm, a robust foundation for the future? There's live real time direct search lurking around the corner, so just re-vamping the dinosaur won't save you from going where others have gone before. The internet itself will be searchable, not a private index ... :D
-luzie-
Great and awesome; the world always admires Google technology...
I have an idea. Can we have a software that we install on our server or local computer which allows us to index our site when & as often as we want & how much to index. It then sends the raw info to Google. This takes the strain away from Google's computer, but ensures that our site is up to date & correct with Google.
The new Google is great, but I was just wondering why it doesn't see all the updates to my site http://www.iberestudios.com !! We worked really hard to get really fresh content. I also wonder why Google still looks at old content when we changed the whole site structure six months ago.
Thanks
Excellent work guys. Keep it up.
Thank you for this update South Africa will be green IT by www.greenitweb.co.za
Freshness is good and changes are good. BUT, SMBs have a hard time keeping track of it all. See how SMBs can make Caffeine work for them: http://www.zebworks.com/zeblog/google-caffeine-chaos-whats-an-smb-to-do
I wonder how this will affect blog search or blog indexing? Faster recognition of daily updates?
Index it very quickly and content is more effective. Congratulations!
But where is the road of SEO.
I have received a couple of Tweets from my friends regarding a problem with Google search and caffeine with inappropriate and irrelevant content showing up as the first result!!! can u please check it. I have posted the link here.
http://www.google.com/search?hl=en&source=hp&q=symbiosis+distance+mba&aq=f&aqi=g-c1g7g-m1&aql=&oq=&gs_rfai=
Was wondering what the soft 404's were for. Had six show up for my article directory.
Very interesting! Let's see what the impact on SEO will be...
Under Caffeine, will new pages index faster?
My client hosts their website with UniteU and is integrating their website inventory with the Retail Pro inventory management system. This integration will change the URL structure breaking all current URL's of department and product pages indexed by Google. The UniteU Account Manager says that they cannot do anything to redirect visitors who click on an indexed link containing the old URL to the new URL. So the link will take them to a "sorry, page not found" message on their site.
What are your thoughts?
that illustration of how your old indexing system worked compared to Caffeine just cracks me up!
Great graphic!
Thanks for the information. I think this article is very useful.
I saw my restaurants in cape town website jump from page 18 to 1 then back down to 4 thanks to this new indexing system.
does caffeine consider your web history when it gives you search results??
It seems like new blog posts can be found quickly on Google, but the 'old-fashioned' websites and shops are getting a lower rank. It is a good thing that fresh information is ranked first; however, it is a pity that good information sites or shops will get a lower rank...
It's a great resource of update. that sounds really exciting as an off page seo
great stuff better for users as well as for new publishers
this is good stuff, but where are the details?
Incredible... up-to-date info
How do they assign a page rank to a fresher webpage as compared to an older page in Caffeine indexing?
Fabulous c:)
what about seo services now
Great news!! Looking forward to get more details about the core process.
Does SEO still factor into this or not?
I'm already noticing blog posts indexing faster. Nice job Google.
Does keyword density play a factor in indexing now?
I'm all excited with this new indexing system. Hope google would have a video on this, to help us spread the message to others.
I guess, this new update in the Search Algorithms is a vital strategic point of difference against other Search Engine Competitors! ;)
Well Done, Google! :)
Yours Forever,
SwayamDas2010.
Does this mean at last I can be on page 1
http://www.pension-transfers-qrops.com/
Interesting. Now we will get fresh results. Thanks to Google for such a great service
Great step from Google. I dream that one day a search engine can explore all the websites in the world without spending time on indexing and ranking. It's far from reality, but the Caffeine system is one step toward making it real.
Mike, Dailywebarticles.com
this is good news, hopefully the new system works well.
Great news for Bloggers! Gives us a better chance vs' the Established sites in our niche :)
crawling and indexing... just it.
Finally. A definitive answer.
A great update for fresh results, and advantageous. But previously the cached copy of one of my sites was refreshed every 5 days, and now it hasn't changed since Caffeine was completed. If I add new content to my page, when will it be cached and shown in search results?
Great.........Keep it up...Google
=====================
http://www.toohow.blogspot.com/
http://toohow.blogspot.com
Wow.. it's really great improvement!!
I have been waiting 2 weeks for my new site junglenotes.com to be indexed!
But with this new search index - caffeine, I'm really excited!!
Hope we can enjoy it sooner!
Thanks Google!
I love Googlebot when it comes to my site again and again!
So basically, how do I have my website http://betfootballspreads.com appear in google search engines? Is it something that I have to 'add url' or will it automatically catch it?
Hmm, sounds all cool, but nobody in the US can find our website GoingBusiness.com when they search for its topic, businesses for sale. Instead it shows up in India, and we're a US-based company with proof right on our About Us page.
Bing and Yahoo both have us covered in the right location (US). We don't rank great on them, but I guess something is better than nothing.
Too bad that link spammers can still get their site pagerank 5, but good SEO doesnt get you any higher than a 3 with months of work :(
Amazing again! So what is next? I know Google; its basic advantage is that it never sleeps!
Wow, that's great for information and real-time search, but it requires a new line of thought from online marketing professionals around the world.
I think Google is reading JavaScript and showing it in search results.
It indexed unwanted URLs found in JavaScript code, and Webmaster Tools shows them as 404/500 page errors.
What do you say?
really a very good and much needed update
Sir,
Can you kindly give me this information:
how can I have my website http://prithwiraj-jha.weebly.com
shown by using google search engines?
I had already indexed the site with http://www.google.com/addurl about 2 months ago, but still my site does not appear through google search............. can someone please help me out???
Yours faithfully,
Prithwiraj Jha
So... how can my website get indexed by Google?
Will this change the page rank?
I'm looking for more information on how to get listed on google maps. Can anyone direct me to a page that can help?
Great speed for web 2.0 post and media.
Congrats
Too much caffeine? This new search index seems to have indexed an unrelated Greek PDF as our homepage. Here is an unbelievable cached copy of our site www.datarecovery.co.nz
http://webcache.googleusercontent.com/search?hl=en&q=cache%3Ahttp%3A%2F%2Fwww.datarecovery.co.nz
This unrelated content has killed our site's ranking.
i hope this works out for all the 'honest' webmasters out there :-)
I'm always waiting for Googlebot to index my website www.saklilezzetler.com
I know that Google always tries to improve itself, and I'm sure that the new system will achieve its goals.
I hope this system will be better for webmasters.
I posted a page 10 days ago. It got crawled 5 days ago, but it is not yet indexed. Is Caffeine working?
Thanks Google for this new crawler
Caffeine may work for bloggers, but SUCKS for ecommerce users/owners. Now a simple search for a product we've had for years returns HIGHLY irrelevant results. What a joke. The only one benefiting from this is Google. Their CPC revenues will go through roof. Wonderful job Google ... NOT.
Hope it will be better. Good effort from the Google team. I appreciate it.
Really great update. Now the indexing process is so brilliant and fast.
This gives us more motivation to keep providing the latest information for our readers.
Hi everyone,
Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.
Thanks and take care,
The Webmaster Central Team