Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Using RSS/Atom feeds to discover new URLs

Thursday, October 29, 2009 at 5:50 PM

Webmaster Level: Intermediate

Google uses numerous sources to find new webpages, from links we find on the web to submitted URLs. We aim to discover new pages quickly so that users can find new content in Google search results soon after it goes live. We recently launched a feature that uses RSS and Atom feeds for the discovery of new webpages.

RSS/Atom feeds have been very popular in recent years as a mechanism for content publication. They allow readers to check for new content from publishers. Using feeds for discovery allows us to get these new pages into our index more quickly than traditional crawling methods. We may use many potential sources to access updates from feeds, including Google Reader, notification services, or direct crawls of feeds. Going forward, we might also explore mechanisms such as PubSubHubbub to identify updated items.
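To make the idea of feed-based discovery concrete, here is a minimal sketch of how a consumer might pull page URLs out of an RSS 2.0 or Atom feed. It is only an illustration and says nothing about how Google actually processes feeds; the function name `extract_urls` and the sample feeds are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by Atom 1.0 documents.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def extract_urls(feed_xml: str) -> List[str]:
    """Return the page URLs listed in an RSS 2.0 or Atom feed (hypothetical helper)."""
    root = ET.fromstring(feed_xml)
    urls = []

    # RSS 2.0 layout: <rss><channel><item><link>URL</link></item></channel></rss>
    for link in root.iterfind("./channel/item/link"):
        if link.text and link.text.strip():
            urls.append(link.text.strip())

    # Atom layout: <feed><entry><link href="URL" rel="alternate"/></entry></feed>
    # A missing rel attribute is treated as "alternate", per the Atom spec.
    for entry in root.iterfind(f"./{ATOM_NS}entry"):
        for link in entry.iterfind(f"{ATOM_NS}link"):
            if link.get("rel", "alternate") == "alternate" and link.get("href"):
                urls.append(link.get("href"))

    return urls


# Made-up sample feeds for illustration only.
SAMPLE_RSS = """<rss version="2.0"><channel><title>Example</title>
<item><title>Post</title><link>http://example.com/post-1</link></item>
</channel></rss>"""

SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
<entry><title>Post</title><link href="http://example.com/post-2"/></entry>
</feed>"""

if __name__ == "__main__":
    print(extract_urls(SAMPLE_RSS))
    print(extract_urls(SAMPLE_ATOM))
```

Each item or entry in a feed carries a link to the page it describes, which is why a feed works as a compact list of recently published URLs.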

In order for us to use your RSS/Atom feeds for discovery, it's important that crawling these files is not disallowed by your robots.txt. To find out if Googlebot can crawl your feeds and find your pages as fast as possible, test your feed URLs with the robots.txt tester in Google Webmaster Tools.
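If you want a quick local check in addition to the tester in Webmaster Tools, the sketch below uses Python's standard `urllib.robotparser` module to test a feed URL against robots.txt rules. The helper name `feed_is_crawlable`, the example rules, and the example URLs are assumptions for illustration. Python's parser may also resolve conflicting Allow/Disallow rules differently from Googlebot, so treat the result as a rough indication only.

```python
from urllib.robotparser import RobotFileParser


def feed_is_crawlable(robots_txt: str, feed_url: str,
                      user_agent: str = "Googlebot") -> bool:
    """Check whether robots.txt rules allow user_agent to fetch feed_url.

    Hypothetical helper: takes the robots.txt content as a string so the
    check runs without any network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, feed_url)


# Made-up rules that block everything under /feeds/ for all crawlers.
EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /feeds/\n"

if __name__ == "__main__":
    print(feed_is_crawlable(EXAMPLE_ROBOTS, "http://example.com/feeds/posts/default"))
    print(feed_is_crawlable(EXAMPLE_ROBOTS, "http://example.com/atom.xml"))
```

With these example rules, a feed under `/feeds/` is blocked while `/atom.xml` stays crawlable, so only the latter could be used for discovery.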

The comments you read here belong only to the person who posted them. We do, however, reserve the right to remove off-topic comments.

22 comments:

Risma2006 said...

This could be the first. I've been using RSS/Atom feeds since I made my blog. The feature is easy to use and a delight. Thanks

dvdroest said...

Interesting, RSS is certainly an easy digestible format. I have 2 things that I find very interesting here and perhaps you guys can answer this.

1: What does this mean for sites that have RSS, will the web pages themselves still be fully crawled and how important will the RSS data be compared to the document itself?

2: What fields of the RSS feeds will Google read and how do these relate to the meta description and title tags for the web pages themselves?

el7cosmos said...

I'm gonna try it...

David Herron said...

This is so weird - I'd been assuming all along y'all would consult RSS/Atom feeds for new content. It's such an obvious thing to do and that's what they were invented for. What took you so long? (wink)

Michael Martinez said...

So now that you've incentivized the link spammers to redouble their efforts to plague blogs with crappy comments, will the other hand at Google give Blogspot users the ability to update the robots.txt file so that we CAN disallow the atom/RSS feeds?

Ideas At Random said...

Do you think Google will ever start indexing the rel="tag" ? It would be cool to search your site or blog by tag.
http://microformats.org/wiki/rel-tag

Mark Essel said...

Brilliant.
A smart practice websites can partake in, upload content to local super hubs for easy indexing (pubsubhubbub, RSScloud)

sunil said...

Nice information about feeds, but I am still having problems getting my blog indexed: http://vegfruitcarving.blogspot.com

Ash said...

I don't get this. I assumed that Google has access to Feedburner, which knows all my updates as they happen.

OIC, not all blog owners know about Feedburner.

Still, this is a weird bit of news.

kK said...

feedburner works great.. and i have done feed submissions before and they seem to have triggered something that got my blog posts/pages indexed within 24 hours..

amazingly fast

inkdroid.org said...

Anyone know if GoogleBot supports following links to other feed pages by grokking Atom Feed Paging and Archiving?

stellaronlinemedia said...

RSS/Atom feeds generally contain details already present on the website, so how can this be useful for discovering new URLs?

shaun said...

Thanx for the valuable information. RSS is certainly an easy digestible format. I have 2 things that I find very interesting here.. keep posting. Will be visiting back soon.

Jaypee of enjayneer.com said...

i sure learned a lot from this article.. thanks.

cornyprincesston said...

Related to RSS indexing by Google I have a suggestion :
Under "show options" in Google search there should be an option to retrieve only links to RSS feeds for the particular searched term. This will greatly help in finding RSS feeds related to topic of interest and help in filtering/finding useful content.

Tenta said...

Not sure if this is a good idea... google has indexed my rss feeds, and for some reason googlebot thought that some html tags used for formatting the feed are text, so when I look at webmaster tools, the keyword with most significance is 'strong'... weird isn't it?

Tastro.org said...

How do I ping an RSS feed?

david lawton said...

isn't it a bad practice to allow crawling of feed in robots because of duplicate content? (especially if using browserfriendly for feedburner, so the page is rendered as html)

gary said...

thanks all useful stuff.

However if using atom.xml?redirect=false&start-index=X&max-results=100

it kindly submits to google, but when should i expect to see them indexed? 24 hours - 24 days or what?

anyone any ideas?

gary said...

my other question relates to atom and rss.

Should i prefer one or the other, or do i need both?

Any ideas?

Lukman said...

i used to use rss/atom feeds, and now i will be using them again, since they are being indexed by googlebot.

Google Webmaster Central said...

Hi everyone,

Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.

Thanks and take care,
The Webmaster Central Team