Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Keeping comment spam off your site and away from users

Friday, September 26, 2008 at 2:26 PM

So, you've set up a forum on your site for the first time, or enabled comments on your blog. You carefully craft a post or two, click the submit button, and wait with bated breath for comments to come in.

And they do come in. Perhaps you get a friendly note from a fellow blogger, a pressing update from an MMORPG guild member, or a reminder from your Aunt Millie about dinner on Thursday. But then you get something else. Something... disturbing. Offers for deals that are too good to be true, bizarre logorrhean gibberish, and explicit images you certainly don't want Aunt Millie to see. You are now buried in a deluge of dreaded comment spam.

Comment spam is bad stuff all around. It's bad for you, because it adds to your workload. It's bad for your users, who want to find information on your site and certainly aren't interested in dodgy links and unrelated content. It's bad for the web as a whole, since it discourages people from opening up their sites for user-contributed content and joining conversations on existing forums.

So what can you, as a webmaster, do about it?

A quick disclaimer: the list below is a good start, but not exhaustive. There are so many different blog, forum, and bulletin board systems out there that we can't possibly provide detailed instructions for each, so the points below are general enough to make sense on most systems.

Make sure your commenters are real people
  • Add a CAPTCHA. CAPTCHAs require users to read a bit of obfuscated text and type it back in to prove they're a human being and not an automated script. If your blog or forum system doesn't have CAPTCHAs built in, you may be able to find a plugin like reCAPTCHA, a project which also helps digitize old books. CAPTCHAs are not foolproof, but they make life a little more difficult for spammers. You can read more about the many different types of CAPTCHAs, but keep in mind that just adding a simple one can be fairly effective.

  • Block suspicious behavior. Many forums allow you to set time limits between posts, and you can often find plugins to look for excessive traffic from individual IP addresses or proxies and other activity more common to bots than human beings.
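
As an illustration of the "time limits between posts" idea, here is a minimal Python sketch. The class name `PostRateLimiter`, the 30-second default, and the use of an IP address as the client identifier are all assumptions made for this example; most forum packages expose the same idea as a built-in setting instead.

```python
import time
from typing import Dict, Optional


class PostRateLimiter:
    """Reject a post that arrives too soon after the same client's last accepted post."""

    def __init__(self, min_interval_seconds: float = 30.0) -> None:
        self.min_interval_seconds = min_interval_seconds
        # Maps a client identifier (here: an IP address string) to the
        # time of its last accepted post.
        self._last_post: Dict[str, float] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        # 'now' can be injected for testing; otherwise use a monotonic clock.
        current = time.monotonic() if now is None else now
        last = self._last_post.get(client_id)
        if last is not None and current - last < self.min_interval_seconds:
            return False  # too fast: more typical of a bot than a person
        self._last_post[client_id] = current
        return True
```

This keeps state in memory only, so it is a sketch of the logic rather than something to deploy as-is; a real site would likely persist the timestamps and expire old entries.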

Use automatic filtering systems
  • Block obviously inappropriate comments by adding words to a blacklist. Spammers obfuscate words in their comments so this isn't a very scalable solution, but it can keep blatant spam at bay.

  • Use built-in features or plugins that delete or mark comments as spam for you. Spammers use automated methods to besmirch your site, so why not use an automated system to defend yourself? Comprehensive systems like Akismet, which has plugins for many blog and forum systems, and TypePad AntiSpam, which is open-source and compatible with Akismet, are easy to install and do most of the work for you.

  • Try using Bayesian filtering options, if available. Training the system to recognize spam may require some effort on your part, but this technique has been used successfully to fight email spam.
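
To give a feel for what "training" a Bayesian filter means, here is a minimal naive Bayes sketch in Python. It is a toy under stated assumptions, not how Akismet or any other named product works: the class name, the tokenizer, and the add-one smoothing are choices made for this example.

```python
import math
import re
from collections import Counter
from typing import Dict, List


class NaiveBayesSpamFilter:
    """Toy two-class naive Bayes text classifier with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = {"spam": Counter(), "ham": Counter()}
        self.doc_counts: Dict[str, int] = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text: str) -> List[str]:
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text: str, label: str) -> None:
        # label must be "spam" or "ham"
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def _log_score(self, tokens: List[str], label: str) -> float:
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        total_words = sum(self.word_counts[label].values())
        # log prior + sum of smoothed log likelihoods
        score = math.log(self.doc_counts[label] / total_docs)
        for token in tokens:
            count = self.word_counts[label][token]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def is_spam(self, text: str) -> bool:
        if min(self.doc_counts.values()) == 0:
            raise ValueError("train with at least one spam and one ham example first")
        tokens = self.tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")
```

The effort mentioned above is the `train` step: each comment you mark as spam or not spam shifts the word statistics the filter relies on.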

Make your settings a bit stricter
  • Nofollow untrusted links. Many systems have a setting to add a rel="nofollow" attribute to the links in comments, or do so by default. This may discourage some types of spam, but it's definitely not the only measure you should take.

  • Consider requiring users to create accounts before they can post a comment. This adds steps to the user experience and may discourage some casual visitors from posting comments, but may keep the signal-to-noise ratio higher as well.

  • Change your settings so that comments need to be approved before they show up on your site. This is a great tactic if you want to hold comments to a high standard, don't expect a lot of comments, or have a small, personal site. You may be able to allow employees or trusted users to approve posts themselves, spreading the workload. 

  • Think about disabling some types of comments. For example, you may want to disable comments on very old posts that are unlikely to get legitimate comments. On blogs you can often disable trackbacks and pingbacks, which are very cool features but can be major avenues for automated spam.
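
Two of the settings above are simple enough to sketch in Python for systems that lack them. The function names, the 365-day cutoff, and the way links are rendered are assumptions for illustration only; check whether your software already offers equivalent options before writing your own.

```python
from datetime import datetime, timedelta
from html import escape


def render_comment_link(url: str, text: str) -> str:
    """Render a commenter-supplied link with rel="nofollow", escaping both parts."""
    return f'<a href="{escape(url, quote=True)}" rel="nofollow">{escape(text)}</a>'


def comments_open(published_at: datetime, now: datetime, max_age_days: int = 365) -> bool:
    """Return True while a post is still young enough to accept new comments."""
    return now - published_at <= timedelta(days=max_age_days)
```

Note that this sketch only escapes the URL; it does not check the URL scheme, which a real system should also validate.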

Keep your site up-to-date
  • Take the time to keep your software up-to-date and pay special attention to important security updates. Some spammers take advantage of security holes in older versions of blogs, bulletin boards, and other content management systems. Check the Quick Security Checklist for additional measures.

You may need to strike a balance on which tactics you choose to implement depending on your blog or bulletin board software, your user base, and your level of experience. Opening up a site for comments without any protection is a big risk, whether you have a small personal blog or a huge site with thousands of users. Also, if your forum has been completely filled with thousands of spam posts and doesn't even show up in Google searches, you may want to submit a reconsideration request after you clear out the bad content and take measures to prevent further spam.

As a long-time blogger and web developer myself, I can tell you that a little time spent setting up measures like these up front can save you a ton of time and effort later. I'm new to the Webmaster Central team, originally from Cleveland. I'm very excited to help fellow webmasters, and have a passion for usability and search quality (I've even done a bit of academic research on the topic). Please share your tips on preventing comment and forum spam in the comments below, and as always you're welcome to ask questions in our discussion group.

The comments you read here belong only to the person who posted them. We do, however, reserve the right to remove off-topic comments.

62 comments:

Chris said...

Interesting article, with some very useful suggestions. I've got a question for the author though. You say you're from Cleveland, is that Cleveland Ohio? If so, I think it's the first time I've seen an American use the word "dodgy". It's been in common use in the UK for decades but British slang doesn't travel across the Atlantic as quickly as US slang in the opposite direction. Is dodgy becoming mainstream in the US? PS This might not be the most relevant post but I promise it's not "comment spam"!

kkll2 said...

Spammers generally don't obfuscate keywords in comment spam, because they want these keywords to be readable by Google.

There's an open-source Akismet-like filter:
http://code.google.com/p/sblam/

Noodle said...

I've been using a JavaScript technique to stop comment spam, with an amazingly high success rate (I've had one spam get through since January).

I've posted an entry about it here:
http://neilang.com/entries/a-better-technique-to-stop-spam/

Gareth said...

Javascript methods have the disadvantage of potentially preventing non-javascript users (like those running NoScript) from being able to post. OK, so you might only have one spam get through, but how many did you block and how many legitimate comments got stopped too?

We monitor all the comments on our site with no captcha and no javascript validation, and the most successful prevention by far is a honeypot trap.

It's simply a field with a common name (e.g. 'comments') whose only purpose is for spambots to fill out. It's labelled with "Do not fill out this field" and hidden with CSS, but to the spambot it's just another junk receptacle. Naturally any form submission with anything in this field is put aside to be manually checked.

Potentially there are a couple of issues with this approach:

1) Any 'autofill' browser functionality could trigger this trap; however, we've found that autofill rarely applies to textareas, and not to 'comments' fields.

2) It's trivial for anyone specifically targeting your site to circumvent this measure. At that point you have to start thinking about CAPTCHAs and the like, but I can't imagine most sites on the internet are open to that level of attack.

However, we've had zero false positives in our case, and provably no legitimate comments have been blocked.
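
A minimal sketch of the server-side half of the honeypot check described above, in Python. The field name 'comments' comes from the description; the function name, the return values, and the choice to ignore whitespace-only input are assumptions for illustration.

```python
from typing import Mapping


def triage_submission(form: Mapping[str, str], honeypot_field: str = "comments") -> str:
    """Set aside any submission whose hidden honeypot field was filled in.

    Real visitors never see the field (it is hidden with CSS), so a
    non-empty value suggests an automated form filler.
    """
    value = form.get(honeypot_field, "")
    if value.strip():
        return "held_for_review"  # put aside to be checked manually
    return "accepted"
```

For example, `triage_submission({"message": "Nice post", "comments": ""})` returns `"accepted"`, while a submission with text in `comments` is held for manual review rather than deleted outright.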

Noodle said...

@gareth The honeypot and javascript methods are very similar. Both methods rely on the browser rendering the form differently for spambots than for real users. I can see benefits and issues with both methods.

With my implementation the user can be notified that they need javascript enabled before their submission, as well as after a failed submission. If you are concerned with legitimate comments being lost, like in your method, you can log the failed submissions for manual checking later.

Another benefit from the javascript method is that it can be applied to any form on your site without having to alter the server-side form processing script(s).

The honeypot method relies on the spambot to submit a value for the obscured field. If spambots begin to randomly vary their submission values to specifically target honeypots, they could succeed in posting.

I don't think either method is 100% foolproof. An issue with the javascript method is that it is only effective while spambots don't process embedded javascript.

Perhaps for best security without using captchas, you could use a combination of both methods.

Kipper said...

I hate form spam. Something I've tried and am having success with at the moment is an ajax-style form, whereby a user fills it in as normal, but rather than redirecting the whole page (and writing to the database in between), the submit button just triggers a background call to do the same work. In doing so, it also adds another secret variable that's not available from within the form itself (like isPosted = 1), just to be sure that nobody is calling my script directly. It could be circumvented, but you'd need to be human to do that, or a very clever bot that can read and understand code!

Obviously this approach is going to deny those not using javascript, but I don't think its non-use is as widespread today as it used to be.

Yoli said...

Interesting article, and happy 10th anniversary to everyone at Google.

Megan said...

Another thing to keep an eye out for is posts that look legit but seem a little bit off on further examination. Spammers are sometimes adept at making their comments look legitimate.

For example, you might see comments like this:

"Hi, great post, thanks"

or

"Thanks for the great info"

Or similar, general comments that could apply to any blog post. These are often spam.

One thing I've noticed on my forum lately is people copying and pasting content from other sites and posting it as their own. If a post seems suspicious, try doing a Google search for the text that was posted (in quotes).

I think automated systems like Akismet are often the best solution for a blog. They work well and they don't inconvenience your regular users.

Geert said...

Use Mollom to filter the content of the comment post. Mollom responds 'ham' or 'spam' and depending on their response you either allow or disallow the content on your website. There also is an in-between option that allows the webmaster to moderate. Plugins for Drupal, Wordpress, PHP, Ruby, Java, .NET ...
Cool!!

Borneoguy.com said...

CAPTCHA is still the best method for me, but I reckon that won't stop manual spammers, only bots ...

The said...

Is anyone gonna update us on the use of the word dodgy in the U.S? You'd think at least one American could reply to this question.

Rich Motivation said...

Yes, I've heard American guys use "dodgy," particularly guys from the Northeastern part of the US.

Mary

Chaoley said...

Here's an ingenious solution that uses a points system to filter out spam submitted via comment forms. Note that the author also turns off comments on a post after a certain amount of time.

http://snook.ca/archives/other/effective_blog_comment_spam_blocker/

David Burns said...

In the past I have heard 'dodgy' quite a bit. I guess it's not as 'in' as it used to be. Now it seems it may be an 'in' slang for your country.

ITSL Technologies said...

I am a little confused by your post. Is this suggestion for URL redirecting problems? If yes: I have a site whose design and look and feel will change in a few days. Can you help me make sure I don't run into this kind of problem, or that URL redirecting won't affect my site?

Bloggercito said...

Come on, Google bears the main responsibility for all spam comments, because it pushes people to get backlinks.
I know you can give me a four-hour explanation and theory, but maybe you should use those four hours to open your eyes and see the REAL world.
Ask yourself why people are spending so much time trying to get backlinks to their sites.
Google has turned the internet into a link-trading market.
Most links on the web are not real, organic ones.
Even the big sites now use nofollow EVEN for REAL AND ORGANIC links.
As I said, open your eyes and look at the real thing.

Richard said...

It still seems to me, at least in niches like mine, that you can get organic backlinks with a phone call.

My niche: technology and churches

Samelove said...

Very useful information. Thank you.

jcore said...

I think people need to be a little more lax with the nofollow attribute.
Everyone does this; it is like saying don't bother putting effort into a good post because it will probably be picked up as spam anyway.

I am also pleased to say I have had great results on experiments running just nofollow.

I think search engines are not giving rel="nofollow" as much gravity, externally that is.

But who am I? I just proved everyone's theory WRONG!

People still click on links, am I right?

Kevin said...

It is true that Google is probably the number one reason for comment spam.

Melesha said...

Kind of incredible that google would be talking about people spamming blogs. There is only one reason to spam blog comments.

Aaron said...

I don't see what's wrong with link trading if people are actually writing meaningful comments. Captchas get rid of bots, and it's not hard to see the manual crap people write. My Utah real estate blog gets lots of asinine comments but they aren't hard to delete if they aren't contributing anything...

Ben said...

Good post. I had a lot of trouble with spam on my blog a while back. If it wasn't for this widget I finally found, I would have been deleting comments for years!

Tramite said...

I hate form spam. Something i've tried and am having success with at the moment is an ajax style form, whereby a user fills in as normal

mike said...

This is a wonderful post. Very few blogs have articles like this. I think there should be a better program out there than Akismet.

maty said...

Make it so your forum only allows posting URLs to members who have proven their worth - for example by posting valid comments on several previous occasions.

Most genuine comments don't need a URL. Most spammers have to have them, so that's the point to choke them off.
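
A minimal sketch of this "earn the right to post URLs" rule, in Python. The URL pattern, the function name, and the threshold of five previously approved posts are assumptions for illustration; the pattern is deliberately simple and will not catch every way of writing a link.

```python
import re

# Deliberately simple: matches http(s):// links and bare www. addresses.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def may_post(comment_text: str, approved_post_count: int, min_approved_posts: int = 5) -> bool:
    """Allow URL-free comments from anyone; allow URLs only from proven members."""
    has_url = URL_PATTERN.search(comment_text) is not None
    return (not has_url) or approved_post_count >= min_approved_posts
```

A new member can still take part in the discussion, but a comment containing a link is refused (or could be queued for moderation) until the member has enough approved posts.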

Aamir Attaa said...

very useful post - i had serious issues resolving SPAM concerns, i am well placed to perform better now.

Şahan said...

Hello

Thanks for the essay.
Can I put this essay on my general information site?

Thank you again...

bali said...

Can you recommend a good captcha program for my website http://www.ubudhotelabali.com
because right now my captcha script is not working. Need help.

carpro said...

Interesting article.

idea stack said...

CAPTCHAs sometimes don't load properly. Anyway, this method is a good thing.

genuineseo said...

Nice article, very useful information.

Clayton said...

Great tips, I'll start incorporating this just in case people start actually coming to my blog! ;)

John_Musca said...

Captchas only work for robots...

John_Musca said...

Google does push it for backlinks....so yes, to the one who commented they are responsible.

ketut said...

Nice article, very useful information.

Rachelle said...

I very much appreciated your writings and sharing your thoughts to everyone. I expect more articles to read for the next time.

Regards,
Noah Group

One Simple Tech said...

I enjoyed reading this article, good stuff.

Kevin Urbina said...

I love reading advice on how to properly use dofollow pages for building backlinks and then see the author has invoked the nofollow plugin.

felix said...

nice article..

Frank said...

I think CAPTCHA is the best method (for me) but that won't stop manual spammers.

javagems.info said...

Well, I think it's OK to have spam comments on our blogs, because I think they are trying to get popular and I'll be happy to help them.
But of course, I hate robot spammers.
If it's a manual spammer, I don't think it matters.

jon said...

Wow what a fantastic article, I have recently done my very first blog mayans2012 and so far have only had some random spam however I will look to implement this if it gets any worse.

flights said...

IMHO, captcha is still the best method for me, but I reckon that won't stop manual spammers, only bots. BTW, this is a cool and interesting article. Thanks.

Max said...

Thanks for the great post. I think blog comments are a big one for advertising, but I didn't know about spam comments, so thanks for the great post.

Robert said...

Spammers are taking advantage of dofollow sites to create backlinks to their own sites. That's why we have generated images, so that bots, or rather spambots, can't get through, but they still can.

Matthew said...

I think part of the problem is the fact that Google puts WAY too much emphasis on backlinks, which it obviously will given its history. This causes people to spend WAY too much time trying to "create backlinks" to get their site ranked rather than worrying about content!

Howard said...

Thanks for the list. I'm sure I will run into this issue down the road.

Please check out my weight loss site at http:www.yourfittestyear.com

www.sensuous-sextoys.co.uk said...

hmmmmmmmm

attayaya said...

I don't like spam comments.
I hope they don't do that again.

veeresh said...

javascript users (like those running NoScript) from being able to post. OK, so you might only have one spam get through, but how many did you block and how many legitimate comments got stopped too? Find More and more Coupon and Coupon Codes, To get More Information Visit Deals365.usdeals365.us

david said...

This is a good article on getting rid of spam comments.

Mary said...

The best way to check is to see if the comment is relevant to the article. Sorry, I'm repeating myself I wanna make sure my comment goes through.

Tom said...

It's better to check manually as well. Some spam can bypass your CAPTCHA system. That just kinda sucks.

GimmeThatTrack said...

Spammers will randomly place their keywords in their comments even when it makes no sense at all for them to be there. It's funny how they get away with this.

koyo said...

Well, is there any way that we can report thousands of hacked email accounts to their concerned service providers like gmail, hotmail?

I do know some people who are doing this stuff and they hack hundreds of email accounts per day with automated system using phishing! How can we fight against such bad guys?

Ayia Napa Nightlife said...

Great article :) A lot of useful suggestions which I will use in the future. Thanks

Steve Last said...

This is a very useful article, and I use Akismet all the time in my blog. I have it set to only publish comments after I have manually checked them.

I think that it filters out all but those comments that are acceptable or marginally acceptable, just leaving me the comments that need manually checking for inclusion.

I don't think it can get any better than that. After all, you should be interested in what your comments say, and will want to read them all anyway, and I have gained many an idea for new posts, and even whole new categories from reading my comments.

jishan said...

This is great to hear from your side. I really want to appreciate you for your thinking. Life is difficult enough without having to deal with someone’s selfishness and meanness! I thought there must be some positive approach to this awful thing. I just wish they wouldn’t leave so much destruction in their paths! I guess we just have to do what is necessary for us to survive. Good luck to you.

Awadesh said...

Here's an ingenious solution that uses a points system to filter out spam submitted via comment forms. It is true that Google is probably the number one reason for comment spam.

Jacks said...

I find that spam-bots sometimes can circumvent even CAPTCHA so I have resorted to manually checking all comments on my Wordpress blog.

Google Webmaster Central said...

Hi everyone,

Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.

Thanks and take care,
The Webmaster Central Team