Tuesday, November 06, 2007 at 4:52 PM
Many webmasters have discovered the advantages of using Ajax to improve the user experience on their sites, creating dynamic pages that act as powerful web applications. But, like Flash, Ajax can make a site difficult for search engines to index if the technology is not implemented carefully. As promised in our post answering questions about Server location, cross-linking, and Web 2.0 technology, we've compiled some tips for creating Ajax-enhanced websites that are also understood by search engines.
How will Google see my site?
One of the main issues with Ajax sites is that while Googlebot is great at following and understanding the structure of HTML links, it can have a difficult time finding its way around sites which use JavaScript for navigation. While we are working to better understand JavaScript, your best bet for creating a site that's crawlable by Google and other search engines is to provide HTML links to your content.
Design for accessibility
We encourage webmasters to create pages for users, not just search engines. When you're designing your Ajax site, think about the needs of your users, including those who may not be using a JavaScript-capable browser. There are plenty of such users on the web, including those using screen readers or mobile devices.
One of the easiest ways to test your site's accessibility to this type of user is to explore the site in your browser with JavaScript turned off, or to view it in a text-only browser such as Lynx. Viewing a site as text-only can also help you identify other content which may be hard for Googlebot to see, including images and Flash.
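As a complement to the manual check above, a small script can list the links that exist as plain HTML in a page's source. The sketch below is only a rough approximation under stated assumptions: the function name `extractStaticLinks` and the sample markup are made up for illustration, the regular expression is not a full HTML parser, and nothing here describes how Googlebot actually processes a page.

```javascript
// Rough approximation of "what can be reached through plain HTML links".
// This is NOT how any search engine works; it only lists href values in markup.
function extractStaticLinks(html) {
  const links = [];
  // Match <a ... href="..."> or <a ... href='...'> (case-insensitive).
  const pattern = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const href = match[2].trim();
    // Skip empty links, fragment-only links, and javascript: pseudo-links,
    // since none of them lead anywhere without script support.
    if (href === "" || href.startsWith("#") || /^javascript:/i.test(href)) {
      continue;
    }
    links.push(href);
  }
  return links;
}

// Hypothetical sample markup: only the first link offers a static URL.
const samplePage = `
  <a href="ajax.htm?foo=32" onClick="navigate('ajax.html#foo=32'); return false">foo 32</a>
  <a href="#" onClick="navigate('ajax.html#foo=33'); return false">foo 33</a>
  <a href="javascript:navigate('ajax.html#foo=34')">foo 34</a>
`;
console.log(extractStaticLinks(samplePage)); // logs only "ajax.htm?foo=32"
```

If a section of your site never shows up in a list like this, it is probably reachable only through JavaScript, which is a good hint to test it by hand with scripting disabled.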
Develop with progressive enhancement
If you're starting from scratch, one good approach is to build your site's structure and navigation using only HTML. Then, once you have the site's pages, links, and content in place, you can spice up the appearance and interface with Ajax. Googlebot will be happy looking at the HTML, while users with modern browsers can enjoy your Ajax bonuses.
Of course you will likely have links requiring JavaScript for Ajax functionality, so here's a way to help Ajax and static links coexist:
When creating your links, format them so they'll offer a static link as well as calling a JavaScript function. That way you'll have the Ajax functionality for JavaScript users, while non-JavaScript users can ignore the script and follow the link. For example:
<a href="ajax.htm?foo=32" onClick="navigate('ajax.html#foo=32'); return false">foo 32</a>
Note that the static link's URL has a parameter (?foo=32) instead of a fragment (#foo=32), which is used by the Ajax code. This is important, as search engines understand URL parameters but often ignore fragments. Web developer Jeremy Keith labeled this technique as Hijax. Since you now offer static links, users and search engines can link to the exact content they want to share or reference.
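For sites with many such links, the same idea can be applied from a script instead of repeating an inline handler on every link. The following is a minimal sketch, not a definitive implementation: `toFragmentUrl` and `hijackLinks` are hypothetical helper names, `navigate` stands for your own Ajax loading function, and the sketch assumes the Ajax version of a page uses the same file name as the static one, with the parameters moved into the fragment.

```javascript
// Convert a static link such as "ajax.htm?foo=32" into the fragment form
// "ajax.htm#foo=32" that the Ajax code in this sketch is assumed to use.
function toFragmentUrl(staticHref) {
  const queryStart = staticHref.indexOf("?");
  if (queryStart === -1) return staticHref; // no parameters, nothing to convert
  return staticHref.slice(0, queryStart) + "#" + staticHref.slice(queryStart + 1);
}

// Attach Ajax behavior to links that already work as plain HTML.
// `links` is any iterable of anchor elements; `navigate` is the site's own
// Ajax loader (hypothetical here, named after the inline example).
function hijackLinks(links, navigate) {
  for (const link of links) {
    link.onclick = function (event) {
      navigate(toFragmentUrl(link.getAttribute("href")));
      if (event && event.preventDefault) event.preventDefault();
      return false; // keep the browser from following the static link
    };
  }
}

// In a browser this might be wired up as follows (the "ajax" class is an
// assumption made for this sketch):
// hijackLinks(document.querySelectorAll("a.ajax"), navigate);
```

Because the href attribute is left untouched, visitors without JavaScript and search engines still see and follow the static URL.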
While we're constantly improving our crawling capability, using HTML links remains a strong way to help us (as well as other search engines, mobile devices and users) better understand your site's structure.
Follow the guidelines
In addition to the tips described here, we encourage you to also check out our Webmaster Guidelines for more information about what can make a site good for Google and your users. The guidelines also point out some practices to avoid, including sneaky JavaScript redirects. A general rule to follow is that while you can provide users different experiences based on their capabilities, the content should remain the same. For example, imagine we've created a page for Wysz's Hamster Farm. The top of the page has a heading of "Wysz's Hamster Farm," and below it is an Ajax-powered slideshow of the latest hamster arrivals. Turning JavaScript off on the same page shouldn't surprise a user with additional text reading:
Wysz's Hamster Farm -- hamsters, best hamsters, cheap hamsters, free hamsters, pets, farms, hamster farmers, dancing hamsters, rodents, hampsters, hamsers, best hamster resource, pet toys, dancing lessons, cute, hamster tricks, pet food, hamster habitat, hamster hotels, hamster birthday gift ideas and more!
A more ideal implementation would display the same text whether JavaScript was enabled or not, and in the best scenario, offer an HTML version of the slideshow to non-JavaScript users.
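One way to keep the content identical in both cases is to put every slide in the HTML and let the script decide only which one is shown at a time. The sketch below illustrates that idea under stated assumptions: `showSlide` is a hypothetical helper, and the slides are assumed to be ordinary elements (for example, one list item per hamster photo and caption) that are all visible when JavaScript is off.

```javascript
// Assumes the HTML already lists every slide, so the content is the same
// with or without JavaScript. `slides` is an array-like of elements; the
// function only changes each element's style.display.
function showSlide(slides, index) {
  const count = slides.length;
  if (count === 0) return -1; // nothing to show
  // Wrap around so "next" on the last slide returns to the first,
  // and "previous" on the first slide goes to the last.
  const current = ((index % count) + count) % count;
  for (let i = 0; i < count; i++) {
    slides[i].style.display = i === current ? "" : "none";
  }
  return current;
}

// In a browser this might be used as follows (the selector is an assumption):
// const slides = document.querySelectorAll("#slideshow li");
// let position = showSlide(slides, 0);
// setInterval(function () { position = showSlide(slides, position + 1); }, 5000);
```

Since the script only hides and reveals slides that are already in the page, turning JavaScript off simply shows all of them at once, with no extra text appearing.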
This is a pretty advanced topic, so please continue the discussion by asking questions and sharing ideas over in the Webmaster Help Group. See you there!


23 comments:
"Turning JavaScript off on the same page shouldn't surprise a user with additional text reading:..."
The reason that some might do this is to simply make the Non JavaScript version more informative if the user is not seeing the benefits of the Javascript design extras.
Sometimes, this includes images, animations or embedded hosted Flash - so one might compensate for the void by making the text more compelling and descriptive.
Again, the goal is not to replicate - but to compensate. So making the non Javascript version equally as effective may require adding related 'extras'.
I like the way of creating a link with an href and an onclick. But the major problem still remains: people who deep-link to ajax.htm#foo=32 have really made a deep link to ajax.htm.
So almost every incoming link will point to ajax.htm instead of ajax.htm?foo=32.
And therefore: a site built completely in Ajax will never be optimal for search engines.
Just a clarification: the statement "There are plenty of such users on the web, including those using screen readers ..." is partially false. Screen readers do support JavaScript; it basically depends on how you use it.
I think this is a very poignant post. The important thing to remember is that the aim of Web 2.0 and AJAX is about enhancing what you've got rather than making a flashy website with swooshes.
AJAX != DHTML. Or it shouldn't be.
A wise man should remember that we aren't going to change the way someone designs. Someone who relies upon Dreamweaver's graphical interface and likes downloading 'plugins' left right and centre is likely to remain the same. Someone who designs thoughtfully, slowly, and believes strongly about not alienating any of the potential audience is likely to get the right idea about AJAX
In the last two months there have been critical problems with Google Sitemap. In Google Groups an active discussion is going on about the "Network unreachable, verify your site please" issue.
It would be nice if Google developers could look into the problem and resolve it. It causes daily pain for webmasters.
Thank you in advance!
When I was finishing up the Google Search API-powered 404 Plugin for WordPress, I thought about what would happen to all the web robots, and to a lesser extent the people with JavaScript disabled, when they landed on my error page.
The solution I came up with is simple but pretty slick. I use the server's request_uri variable to prepopulate the JavaScript variables, and also display alternate information using PHP.
So by doing it this way a user would never even guess they were missing out on the beautiful 404 Not Found error page returned by Google.
If I use an Ajax template or script on my Blogger blog, will it affect the search engine rankings of my site http://www.wwwportal.blogspot.com?
I'm curious about "display: none;" causing problems as well. Take, for example, a JS drop-down menu that modifies the CSS dynamically, hiding and showing things. The HTML href links are all there in the source, as this post suggests, but are they visible to Googlebot? I suppose just taking the "turn JavaScript off" advice would answer that question in a roundabout way. Perhaps making all DOM objects visible and using JS to set "display: none" would work better than embedding the style in the HTML or CSS file. That way it should still be readable without JS support. I guess I answered my own question. Good post.
Google is indexing my site by my IP address. I have no idea how this is happening. How can I stop this and make all future requests for the IP address return a 301 redirect to my domain (blog.josh420.com)?
Please email me josh420@josh420.com
Kind regards...
At least your site is being indexed as opposed to ours that still maintains this ridiculous nonsense of a penalty.
When is Google going to adopt class="robots-nocontent"? We really need this!
Something isn't clear to me.
Say that when I use this code:
href="ajax.htm?foo=32" and onClick="navigate('ajax.html#foo=32'); return false" in the anchor tag
I want navigate('ajax.html#foo=32') to load something in the page and not go to a different URL than the current one. Doesn't the fact that I use the href attribute make the browser just go to ajax.htm?
If so, it's not solving anything, because ajax.htm may be something meant only for Ajax use and not regular browsing.
Am I missing anything?
I am getting indexed at a furious pace, almost too much. I feel that it may be my competition doing this, knowing full well that it throws me back in the standings and then prevents me from reaching the front pages again until many weeks have elapsed. Is there a way to determine if I am a victim or if this is just a natural state of affairs?
That's terrific, thanks. I will try the Hijax technique on my sites.
I'm curious why google develops a pure AJAX Toolkit "GWT" which offers absolutely zero means to make an application written with it visible to search engines, not even to Google.
How does the Google spider identify which keywords are relevant and place my website in first position for a given keyword?
www.wwwportal.blogspot.com
I have seen in my webmaster account the keywords for which my blog is placed in search engines.
How can I specify a particular keyword that can bring more visitors to my blog?
www.indianglitz.blogspot.com
Well, I think Googlebot is already scavenging my JavaScript a bit too much for links to crawl.
In particular, it is finding bits of JS code like
new Ajax.Request('someAjaxLinkURL', {parameters:Form.serialize(form), method:'post',evalScripts:true});
And trying to do a GET on that URL.
The problem is some of those URLs only accept POST, and expect mandatory parameters from the form, so my error log is being filled with exceptions.
How can I instruct the bot not to follow these URLs? Is there something similar to the NoFollow attribute/CSS class?
Could Googlebot be made smart enough to look at the "method:'post'" piece and not crawl the URL? That would probably require it to recognize the Ajax toolkit being used, and I know it would be more difficult to implement, but I can dream, can't I?
Thanks
Thanks for sharing this information. Can anyone elaborate on the hamster example of keeping content the same? I could not understand the meaning clearly.
What about the case where creating "hijax" links leads to the creation of duplicate content?
For example, say there is a product called foo that is available in 2 colors, red and black.
One way is to use JavaScript and show details for both colors in place. This way, the search engine gets to see only one color.
Instead, if I created 2 links one for red and one for black, then the search engine and the users will see the 2 links.
This is good for the users as it gives progressive enhancement - but "may" be treated as duplicate content by the search engines.
Yes, the content is duplicate as 95% of the resulting page may be the same.
What is the recommendation for such a case?
Will creating more pages lead to dilution of PageRank, as there may be multiple URLs like /show?prod=foo&color=red and /show?prod=foo&color=black instead of /show?prod=foo#red?
Any clarity on this will be greatly appreciated.
Is this a possible solution for the hash/history problem?
1. Use "hijax" links leading the crawler to href=product.php?id=23 and users to onclick=ajax(#id=23). If I got it right, this will make the crawler visit product.php, while users will enjoy the Ajax page as usual, loaded from index.php.
2. In product.php, reveal all the data as is, but check it with a web validator. At the end of product.php, redirect the page to index.php#id=23 with a meta refresh. This would let a browsing user who follows the link from Google see the whole page, while the crawler would see the page's data.
Is this correct?
@Nami
I do not see why you need step2.
Avoid the meta refresh where possible.
Ideally, if you have designed your site well, product.php?id=123 and index.php#id=123 should look, feel, and behave the same (or rather, similar) with or without JavaScript (at least the critical functionality should be presented and behave the same).
In other words, on product.php?id=123 you would render a page similar to index.php with all the critical functionality (possibly the checkout and add to cart, if it is an e-commerce application), but I would not mind if the hide/show link for a certain description is not there and the description is shown fully. This is purely a product choice.
If you attain the goal of making your site fully functional with or without JavaScript (similar behavior where possible), you wouldn't need to worry about Google sending visitors to product.php?id=123, as it would be as functional as index.php#id=123.
You could also design it such that JavaScript running on product.php can make it look like index.php (assuming index.php has more stuff than just product details), which means all product pages behave like index.php.
This should not be tough if you plan for it.
Hi
and if I do this:
href="ajax.htm?foo=32"
onClick="navigate('anotherAjax.html#foo=40'); return false">foo 32
what happens?
Hi everyone,
Since several months have passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Help Group.
Thanks and take care,
The Webmaster Central Team