Friday, July 22, 2011 at 8:15 AM
Webmaster level: Advanced
You may have noticed that the Parameter Handling feature disappeared from the Site configuration > Settings section of Webmaster Tools. Fear not; you can now find it under its new name, URL Parameters! Along with renaming it, we refreshed and improved the feature. We hope you’ll find it even more useful. Configuration of URL parameters made in the old version of the feature will be automatically visible in the new version. Before we reveal all the cool things you can do with URL parameters now, let us remind you (or introduce, if you are new to this feature) of the purpose of this feature and when it may come in handy.
When to use
URL Parameters helps you control which URLs on your site should be crawled by Googlebot, depending on the parameters that appear in these URLs. This functionality provides a simple way to prevent crawling duplicate content on your site. Now, your site can be crawled more effectively, reducing your bandwidth usage and likely allowing more unique content from your site to be indexed. If you suspect that Googlebot's crawl coverage of the content on your site could be improved, using this feature can be a good idea. But with great power comes great responsibility! You should only use this feature if you're sure about the behavior of URL parameters on your site. Otherwise you might mistakenly prevent some URLs from being crawled, making their content no longer accessible to Googlebot.

A lot more to do
Okay, let’s talk about what’s new and improved. To begin with, in addition to assigning a crawl action to an individual parameter, you can now also describe the behavior of the parameter. You start by telling us whether or not the parameter changes the content of the page. If the parameter doesn’t affect the page’s content then your work is done; Googlebot will choose URLs with a representative value of this parameter and will crawl the URLs with this value. Since the parameter doesn’t change the content, any value chosen is equally good. However, if the parameter does change the content of a page, you can now assign one of four possible ways for Google to crawl URLs with this parameter:
- Let Googlebot decide
- Every URL
- Only crawl URLs with value=x
- No URLs
Of the four crawl options listed above, “No URLs” is new and deserves special attention. This option is the most restrictive and, for any given URL, takes precedence over settings of other parameters in that URL. This means that if the URL contains a parameter that is set to the “No URLs” option, this URL will never be crawled, even if other parameters in the URL are set to “Every URL.” You should be careful when using this option. The second most restrictive setting is “Only URLs with value=x.”
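The precedence rule described above can be sketched roughly as follows. This is a toy model in Python, not Googlebot's actual logic: the parameter names (`color`, `sort`, `item`), the action labels, and the `should_crawl` helper are all hypothetical, and "Let Googlebot decide" is simplified to "never blocks".

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical per-parameter settings; names and values are illustrative only.
SETTINGS = {
    "color": ("no_urls", None),
    "sort": ("only_value", "price"),
    "item": ("every_url", None),
}

DEFAULT = ("let_googlebot_decide", None)


def should_crawl(url, settings=SETTINGS):
    """Return True if this toy model would allow crawling the URL."""
    params = dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))

    # "No URLs" is the most restrictive: one such parameter blocks the URL,
    # whatever the other parameters are set to.
    for name in params:
        action, _ = settings.get(name, DEFAULT)
        if action == "no_urls":
            return False

    # "Only URLs with value=x" comes next: a parameter that is present
    # must carry the configured value.
    for name, value in params.items():
        action, wanted = settings.get(name, DEFAULT)
        if action == "only_value" and value != wanted:
            return False

    # "Every URL" and (in this simplified sketch) "Let Googlebot decide"
    # never block a URL.
    return True
```

Under these assumed settings, `should_crawl("http://example.com/list?item=1&color=red")` is rejected even though `item` is set to "Every URL", while a URL with no configured parameters at all passes through untouched.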
Feature in use
Now let’s do something fun and exercise our brains on an example.
- - -
Once upon a time there was an online store, fairyclothes.example.com. The store’s website used parameters in its URLs, and the same content could be reached through multiple URLs. One day the store owner noticed that too many redundant URLs could be preventing Googlebot from crawling the site thoroughly. So he sent his assistant CuriousQuestionAsker to the Great WebWizard to get advice on using the URL parameters feature to reduce the duplicate content crawled by Googlebot. The Great WebWizard was famous for his wisdom. He looked at the URL parameters and proposed the following configuration:

| Parameter name | Effect on content? | What should Googlebot crawl? |
|---|---|---|
| trackingId | None | One representative URL |
| sortOrder | Sorts | Only URLs with value = ‘lowToHigh’ |
| sortBy | Sorts | Only URLs with value = ‘price’ |
| filterByColor | Narrows | No URLs |
| itemId | Specifies | Every URL |
| page | Paginates | Every URL |
The CuriousQuestionAsker couldn’t avoid his nature and started asking questions:
CuriousQuestionAsker: You’ve instructed Googlebot to choose a representative URL for trackingId (value to be chosen by Googlebot). Why not select the Only URLs with value=x option and choose the value myself?
Great WebWizard: While crawling the web Googlebot encountered the following URLs that link to your site:
- fairyclothes.example.com/skirts/?trackingId=aaa123
- fairyclothes.example.com/skirts/?trackingId=aaa124
- fairyclothes.example.com/trousers/?trackingId=aaa125
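
One way to see the trade-off between a representative value and a hand-picked one is to check which pages remain reachable for the three URLs above. The sketch below is only an illustration: the `pages_reached` helper is hypothetical, the `http://` scheme is added so the URLs parse, and "one representative URL" is modeled simply as "every page keeps at least one URL".

```python
from urllib.parse import urlsplit, parse_qsl

# The three example URLs, with an assumed http:// scheme added for parsing.
URLS = [
    "http://fairyclothes.example.com/skirts/?trackingId=aaa123",
    "http://fairyclothes.example.com/skirts/?trackingId=aaa124",
    "http://fairyclothes.example.com/trousers/?trackingId=aaa125",
]


def pages_reached(urls, param, fixed_value=None):
    """Return the set of paths that stay crawlable in this toy model.

    fixed_value=None models "one representative URL" per page.
    Otherwise only URLs whose `param` equals fixed_value are kept;
    URLs that do not carry `param` at all are left unaffected.
    """
    reached = set()
    for url in urls:
        parts = urlsplit(url)
        query = dict(parse_qsl(parts.query))
        if fixed_value is None or param not in query or query[param] == fixed_value:
            reached.add(parts.path)
    return reached


representative = pages_reached(URLS, "trackingId")
fixed = pages_reached(URLS, "trackingId", fixed_value="aaa125")
```

In this model `representative` contains both `/skirts/` and `/trousers/`, whereas `fixed` contains only `/trousers/`, because no skirts URL in the sample happens to carry `trackingId=aaa125`.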
CuriousQuestionAsker: What about the sortOrder parameter? I don’t care if the items are listed in ascending or descending order. Why not let Google select a representative value?
Great WebWizard: As Googlebot continues to crawl it may find the following URLs:
- fairyclothes.example.com/skirts/?page=1&sortBy=price&sortOrder=’lowToHigh’
- fairyclothes.example.com/skirts/?page=1&sortBy=price&sortOrder=’highToLow’
- fairyclothes.example.com/skirts/?page=2&sortBy=price&sortOrder=’lowToHigh’
- fairyclothes.example.com/skirts/?page=2&sortBy=price&sortOrder=’highToLow’
- fairyclothes.example.com/skirts/?page=1&sortBy=price&sortOrder=’lowToHigh’
- fairyclothes.example.com/skirts/?page=2&sortBy=price&sortOrder=’highToLow’
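
To illustrate why mixing sort orders across pages can hide items, here is a small worked example. The catalogue of six prices, the page size of three, and the helper names are all made up for the sketch; only the `lowToHigh` and `highToLow` values come from the URLs above.

```python
# Hypothetical catalogue: six skirts identified by price, three per page.
PRICES = [10, 20, 30, 40, 50, 60]
PAGE_SIZE = 3


def listing_page(page, sort_order):
    """Return the items shown on a 1-based page for the given sort order."""
    ordered = sorted(PRICES, reverse=(sort_order == "highToLow"))
    start = (page - 1) * PAGE_SIZE
    return ordered[start:start + PAGE_SIZE]


def items_seen(crawled):
    """Collect every item visible across the crawled (page, sortOrder) pairs."""
    seen = set()
    for page, sort_order in crawled:
        seen.update(listing_page(page, sort_order))
    return seen


# Same sort order on both pages: every item is visible.
consistent = items_seen([(1, "lowToHigh"), (2, "lowToHigh")])

# Page 1 ascending, page 2 descending: both pages show the cheapest items.
mixed = items_seen([(1, "lowToHigh"), (2, "highToLow")])
```

With a consistent order all six items are seen. With the mixed pair, page 1 and page 2 both list the items priced 10, 20 and 30, so the three most expensive items never appear on a crawled page.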
CuriousQuestionAsker: How about the sortBy value?
Great WebWizard: This is very similar to the sortOrder parameter. You want the crawled URLs of your listing to be sorted consistently throughout all the pages, otherwise some of the items may not be visible to Googlebot. However, you should be careful which value you choose. If you sell books as well as shoes in your store, it would be better not to select the value ‘title’ since URLs pointing to shoes never contain ‘sortBy=title’, so they will not be crawled. Likewise, setting ‘sortBy=size’ works well for crawling shoes, but not for crawling books. Keep in mind that a parameter's configuration applies across the whole site.
CuriousQuestionAsker: Why not crawl URLs with parameter filterByColor?
Great WebWizard: Imagine that you have a three-page list of skirts. Some of the skirts are blue, some of them are red and others are green.
- fairyclothes.example.com/skirts/?page=1
- fairyclothes.example.com/skirts/?page=2
- fairyclothes.example.com/skirts/?page=3
- fairyclothes.example.com/skirts/?page=1&filterByColor=blue
- fairyclothes.example.com/skirts/?page=2&filterByColor=blue
- - -
If your site has URL parameters that are potentially creating duplicate content issues, then you should check out the new URL Parameters feature in Webmaster Tools. Let us know what you think, or if you have any questions, post them to the Webmaster Help Forum.


46 comments:
Good stuff! Thanks.
now that is really loads of info..tweeted and plused
Is the "Great WebWizard" Matt Cutts?
When a URL is no longer crawled, because it's filtered by one of the URL parameters, is it also removed from the index?
Is this example assuming that the SortBy parameter is required for functionality of the example site?
If you had the option of catalog/ and catalog/?SortBy=Price why wouldn't you pick to crawl "No URLs"
To Peter: short answer is yes. Although the goal is that with the right configuration, the useful content contained in that url can be found in other crawled&indexed urls.
To Jeremy: if SortBy is not required for the functionality of the site, AND the search engine can discover all urls with similar content but without SortBy (such as your example catalog/), indeed "No URLS" could be a better option. If webmasters are not sure about any of the two assumptions above, "Only URLS with value=" is a more safe setting.
@ningning To Jeremy: indeed "No URLs" could be the right setting if the contents from all urls with "SortBy" can be found with urls without "SortBy", AND that the urls without "SortBy" are properly linked and can be discovered by Googlebot. If webmaster is not sure about either of the assumptions, "Only URLs with value =" is a safer bet.
@ningning Thank you. I'm actually already returning noindex on those pages. I'm hoping this could / would remove them faster as they should not have been indexed in the first place -- by mistake the canonical link was left out, so now we have tons and tons of indexed links that should have been.
Correction: should "not" have been...
@ningning To Peter. No. "No URLs" cannot be used to remove a URL quickly from the index. It is not fast, and there is no absolute guarantee.
Very good timing guys.
It would be useful to also see an option to actually remove a parameter or ignore its existence completely. See for instance Peter's example. The sort parameter is completely optional and could be left out.
But maybe you're not entirely sure Google also found that other link, or you'd like Google to treat any variant of a certain parameter as if it didn't have that parameter at all. The latter is different from ignoring the URL completely, and it will still reduce the total number of URLs crawled.
My site working from 2003. Now my site has 500 000 pages. This is not a store. I do not sell nothing from my site. After that news I must delete my site from internet because I can follow to this instructions! Google, please stop to trying to be monopolizing the Internet.
My site working since 2003 and has more 500 000 pages. This is not a store. My site has free unique programs which visitors of my site like too much. Following this instruction I must close my site because I can not do what Google's team tell here. Also I can follow most of new guidelines of Google. Google even required that kind of CMS I have to use on my site. My site use bitrix but Google's robots can not indexing right the site with bitrix.
Google, please, STOP monopolizing the Internet. Internet is not your private property!
good stuff and nice information
Now, the Google Webmasters has provided the real tool for dynamic websites, it seems that now Google has started things about dynamic websites. By defining Parameters, Search Engine would accept the pages more quickly and provide the good results to the internet users. I had changed my website parameters twice in last 3 years but now with new tool, I can stop Search Engine to stop displaying my Old Parameters pages to the users. Both the Owner of website and user would be benefited with this feature. I would also like to suggest webmasters to define the parameters to understand the dynamic websites. For instance, if dynamic website are searching for something then describe the parameter as "search" or "searchterm" and for sorting the results as "sort" to track the product use the parameter as "productid" or "fileid" and for categories use of parameter "cat" or "catz" and by defining the such parameters this would help the search engine to understand the dynamic websites more easily. Last recommendation to add the parameter for site category in url pattern such as example.com/search.php?news=&search=&google
example.com/search.php?legal=&search=guilty
I would like link to give 6/10 for new feature but still more to be done for dynamic websites.
What if I don't do anything, does it affect anything on my site?
@ningning To Clicker: No it does not.
Thanks
@ningning to Corona: I am not sure that I understand your comment. If the concern is that parameter configuration seems very difficult or inconvenient, then there is no need to worry. For most sites, Googlebot does a pretty good job by default.
@corona. What you say doesn't make sense. It doesn't matter what google does, your website won't be affected. A different thing, is that google provides you the free service to let others know about your website. You should be thankful.
My cms uses the same page to show the news feed, the ICAL feed and the printable version:
/page.html?show=RSS
/page.html?show=ICAL
/page.html?show=printable
What's the best solution? Exclude everything, exclude just the printable one (is the same page without graphics), don't exclude anything?
Right as rain!!
If I create a parameter for my manufacturer pages, but the only way those are designated is through a /m- (sitename.com/m-manufacturerx), would I just use /m- for the parameter rules?
@ningning to CB: m-manufacturerx in your example is not a URL parameter. Your site is using the URL path to encode parameters. This is difficult for Google to interpret, and this parameter configuration tool cannot help in such a case.
@ningning to Gerryino. I don't understand the question very well. Do you mean that for EACH page content in your site, there are always three variants of the page with 3 urls?
Can you provide 3 full urls as example?
Can anyone explain where those parameters came from?
I have a page that has been indexed by Google for a number of years without any problems.
Several days ago it was flagged for duplicate content, meta description, title, etc.
Original, uploaded page: www.example.com/abc.html
Mystery page with added parameters: www.example.com/abc.html?iframe=true&width=100%&height=100%
Who added those parameters, where is that page located, and how do I delete it?
Another handy tool; now there will be no complexity in maintaining URLs.
Does this feature mean that the SEO value of the dynamic URL will pass on to the static version i.e. in simple words does this feature act as a canonical tag????
A good thing that the URL parameters got more options. In the company we manage several unit websites at example.com/unitname individually via webmaster tools. If the main company website at example.com is setting the URL parameters will they override our unit settings?
Example: In the unit we set 'name' to every URL but at company level 'name' is set no URLs. Will the company level setting for the root domain cascade down to all units and override individual settings?
Seriously I don't understand which is the real added value of having indexed the pages ordered by prices from higher to lower.
But thanks for the new feature. That was a real limitation.
@ningning to Andre: You raised a good question. The configuration on the main site (example.com/) DOES NOT override configurations on the child site (example.com/unitname). But if the main site configures some parameter that the child site does not configure, it will take effect on both the main and the child site.
Hello.
1st of all Great Post and really Great new Feature.
Still I have one question. Let's say my site has, for some items/periods of time, a limited inventory and (following your example) the page that shows the cheapest items and the page that shows the priciest items are basically the same: same page with the same content, only arranged differently.
How can i prevent a duplicated content scenario?
What happens if not, but some content (60-70%) is the same?
I think many sites will have this problem because there are many different site searches that will provide a very similar pages but different URLs.
For example: searching for the most popular blue shirt and searching for the priciest blue shirt... etc.
thanks. :)
@ningning to Igal: If I understand you correctly, you are comparing two urls:
1. example.com/search?category=shirts&color=blue&orderby=price&ordering=HighToLow
2. example.com/search?category=shirts&color=blue&orderby=popularity&ordering=HighToLow
These two urls should produce 100% same content (across all pages if there is more than one page).
You can configure "orderby" parameter to use only one value (e.g. popularity) and configure "ordering" parameter to "HighToLow".
thanks for info
confused
Thank you for the nice insight, but I can't get hold of the utility in the webmasters tools.
Very good post however there is still one part of the feature that doesn't quite work and that is setting the parameter as not affecting the content on the page.
On my site we have internal linking with no tracking information on them, however from external sites we have added a parameter so we can see who the referencing site/company is, this means we end up with two versions of url indexed www.example.com/pageA.asp and www.example.com/pageA.asp?tc=PO
The setting only allows you to have one representative URL indexed, and not to specify that no URLs with this parameter should be indexed. You can make this choice if you say it affects the page content, but not if it doesn't. Is there any way to achieve this without specifying the wrong effect of the parameter?
Hi, a silly question. If I choose to crawl URLs with only "x" value in "y" parameter.. what about URLs that do not make use of that parameter? Are they still going to be crawled?
To @Yael, the answer is yes, urls without the parameter are not affected by the setting for a parameter.