Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Farewell to soft 404s

Tuesday, August 12, 2008 at 2:54 PM

We see two kinds of 404 ("File not found") responses on the web: "hard 404s" and "soft 404s." We discourage the use of so-called "soft 404s" because they can be a confusing experience for users and search engines. Instead of returning a 404 response code for a non-existent URL, websites that serve "soft 404s" return a 200 response code. The content of the 200 response is often the homepage of the site, or an error page.

How does a soft 404 look to the user? Here's a mockup of a soft 404: This site returns a 200 response code and the site's homepage for URLs that don't exist.



As exemplified above, soft 404s are confusing for users, and furthermore search engines may spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site's crawl coverage—because of the time Googlebot spends on non-existent pages, your unique URLs may not be discovered as quickly or visited as frequently.
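One way to check whether a site serves soft 404s is to request a URL that almost certainly doesn't exist and look at the final status code. The following is only a minimal sketch of that idea, not a description of how Googlebot detects soft 404s. The function names are hypothetical, and grouping 410 with 404 is an assumption made for this example.

```python
import urllib.error
import urllib.request
import uuid


def classify_missing_url_status(status: int) -> str:
    """Classify the final status code returned for a URL known not to exist."""
    # Assumption for this sketch: 410 (Gone) is grouped with 404.
    if status in (404, 410):
        return "hard 404"
    # A success code for a non-existent URL is what the post calls a soft 404.
    if 200 <= status < 300:
        return "soft 404"
    return "other"


def probe_site(base_url: str) -> str:
    """Request a random path that almost certainly does not exist on base_url."""
    missing_url = base_url.rstrip("/") + "/" + uuid.uuid4().hex
    try:
        # urlopen follows redirects, so this is the status of the final response.
        with urllib.request.urlopen(missing_url, timeout=10) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still available.
        status = error.code
    return classify_missing_url_status(status)
```

Calling `probe_site("http://www.example.com")` on your own site (the URL here is a placeholder) returns "soft 404" when the server answers a made-up path with a 200 response.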

What should you do instead of returning a soft 404?
It's much better to return a 404 response code and clearly explain to users that the file wasn't found. This makes search engines and many users happy.

Return 404 response code



Return clear message to users



Can your webserver return 404, but send a helpful "Not found" message to the user?
Of course! More info as "404 week" continues!
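As a rough illustration of that combination, here is a minimal sketch using Python's built-in `http.server`. The `PAGES` dictionary, the handler name, the `make_server` helper, and the HTML text are all hypothetical; a real site would configure this in its webserver or CMS instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content for this sketch: path -> HTML body.
PAGES = {"/": "<h1>Home</h1>"}

# A helpful message for users, sent together with the 404 status code.
NOT_FOUND_HTML = (
    "<h1>Page not found</h1>"
    "<p>Sorry, that page doesn't exist. Try the <a href=\"/\">homepage</a>.</p>"
)


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in PAGES:
            self._send(200, PAGES[self.path])
        else:
            # Hard 404: friendly content for users, 404 status for crawlers.
            self._send(404, NOT_FOUND_HTML)

    def _send(self, status, html):
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8000):
    """Build the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), SiteHandler)
```

Running `make_server().serve_forever()` and requesting any unknown path should show the friendly page in a browser while the response status remains 404.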


21 comments:

Akuma said...

I serve 410 for pages that are removed. Is that the right thing to do for Google, or do you prefer 404s?

himerus said...

I'm just glad that some "good" content management systems handle this properly.

--18:01:39-- http://xxxxxxxx.com/dumb-page-that-is-not-here
Resolving himerus.com...
Connecting to himerus.com...
Connected.
HTTP request sent, awaiting response...
404 Not Found
18:01:39 ERROR 404: Not Found.

This sample response was from a Drupal install on my own blog.

Storm Website Design said...

I too am interested in how Google would prefer us to handle removed pages. For instance, our site now uses .php instead of .html and the old pages have been removed. This has generated a few 404 errors in Webmaster Tools, although we have updated and verified our XML sitemap.

Hainesy said...

What happens if the bad URL returns a 302 temporary redirect to another page that serves a 404 error code? Will Googlebot understand this?

Ian M said...

akuma - Matt Cutts said 410s are treated like 404s.

Web Design said...

I'm sorry but I just don't agree that a hard 404 is beneficial to a user at all.

If someone browses to a page that doesn't exist and you give them a custom error page with suggested links, a search function and an explanation, that's pretty much as good as it gets. Granted, some people just redirect to their homepage, and this should certainly be discouraged.

As for Googlebot, how about introducing a meta tag that lets us tell it a page is a custom error page? Surely this could be used to the same effect.

You're saying the user would be better off hitting a dead end and having to press "back", which, let's face it, not all users know how to do.

I know you are presenting solutions and it's likely a compromise can be reached on subjects like this, but saying the user prefers a dead end is just plain wrong.

Joel said...

@webdesign

I think Google aren't saying that you shouldn't send the useful error handler page. All they are saying is that you should send a 404 error header, instead of a 200 success response.

Comprehensible - but not exactly news.

barryhunter said...

@Web Design

I think you have misunderstood the intention. The 'hard' just means returning an HTTP status code of 404. This is to tell bots it's a Not Found. The page the user sees is the same custom message you want to use. All Google are suggesting is to not return this custom page, homepage or whatever with a 200 OK status.

@Hainesy I too would be interested in an answer to this. I often see (when writing link checkers) that some sites redirect to a 'not found' page. For some reason this usually results in a 200 OK, but at the least it should be a redirect to a 404 Not Found page.

sam said...

Custom 404 pages and soft 404s are two different things. The header must return the 404 response; however, you can have anything on the page. I have read that nofollow tags are a good thing on the links (for instance, we generate a site map for the 404 page to help the human navigator). You have a lot of options with custom 404s under Apache; with M$ it's not so easy. When we switched over to a new CMS, the URL strings that followed the same structure as the old site generated 404s, and the pages that did not exist in the database kept returning 200. I caught this through analytics. I had to prove to the hosting company that the 200 responses were not valid. They did not see the big deal.

paisley said...

umm.. you have an error in the post above.

contact me at my gmail address so i can explain it to you instead of posting it publicly please.

sam said...

It certainly would not hurt my feelings if you posted it here. Have I been unclear about something?

My reference to a "site map" is to the HTML version, not the XML file that Google introduced for the bots in the last year or two. That would have been better named "URL MAP". Site maps have been around a lot longer than Google ;)
If it's technical: one can run Apache on M$, yes, but why? I was referring to IIS and its means of redirecting vs. Apache and its means of redirecting. Completely different animals. After that we are splitting hairs :)

Maile Ohye said...

@hainesy, we treat redirects (e.g. 301, 302) to 404 as a 404.

@storm website design, it's fine to have 404s listed in Webmaster Tools as long as you believe they should, in fact, 404. The information listed in Webmaster Tools is helpful for troubleshooting purposes, especially if your site is not crawled as expected.

As an aside, if you have the time or inclination, you can review the 404s in Webmaster Tools to find old html pages that correspond to specific php files. These 404s can instead be 301'd to their appropriate new php URL -- potentially helping users find the exact URL they hoped to see.

paisley said...

@sam.. lol. no dude.. not you.. Google.

Storm Website Design said...

Thanks for the response Maile Ohye.

Sol said...

If anyone wants to exchange links

www.enlacexchange.com

Web Design said...

@barryhunter/Joel

I see, that is fair enough then. I think the pictures threw me out a little :)

I still think justifying it with users is wrong though, how will they know?

I am all for helping Googlebot though; there should always be a synergy between webmasters and search engines.

Swanky said...

Thanks for the information. :)
it helped me a lot for my site.

- Swanky (http://www.project-bb.org/)

Spanish speaker said...

There are times when it is impossible to return a 404, such as when you use a caching service (Akamai) that always gives a 302 (Moved Temporarily) even though the page does not exist. How can we solve this?

EnglishFirst said...

I second Spanish Speaker. I'm having the same problem. If someone types or clicks a non-existent URL, they are 302-redirected to the real 404 page. Is this actually okay?

mbjr said...

I used to have some sites where 404 pages were shown as 302 redirects. Besides the fact that it looked cool and we kinda tricked googoo back then, there were some key points to consider.

You probably want to keep your error pages static, as you may lose control of your application server under certain circumstances, and if so, not even your error pages are served.

Your dynamic page may provide way more information about the error, i.e. last searches, closest matches, alternative suggestions that are impossible to do with a static one, hence the requirement for alternative error handling. And well, here we are at soft 404s.

I believe such a setup is meant not to misguide the user but to provide a more friendly environment. No one likes errors.

It would be better to have specific rules to follow than to end up on the other side of the horse completely by punishing web peeps who decided to use soft 404s.

Maile Ohye said...

Hi everyone,

Since some time has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Help Forum.

Thanks and take care,
The Webmaster Central Team