fox@fury
The craftsman's approach to fighting commentspam
Monday, Apr 11, 2005
Spamming (commentspamming, email spamming, junk-mailing, etc.) is all about economies of scale. Commentspamming itself has exploded in the last two years because commentspammers have identified the proper URLs and methods for sending to different types of blogging software. If you use Movable Type, for example, your comment entry form probably has the parameters of the default installation, so it's trivial for someone to write software where they simply add your site's URL to their target list and their spamming software does the rest, adding comment after comment after comment.

Though I wrote Fury's blogging software from scratch, when I decided to add comments four years ago I took a public domain PHP text-file solution and plugged it into my own code. Fast and easy, to be sure, but once every blog became a target it was only a matter of time before spammers wrote a plug-in for my little comment program. A year ago I never got commentspam, and six months ago I would get about 10 spam comments a day. Last Friday I got 960 commentspams, and that's pretty normal: I regularly get about a thousand a day. It's time to do something about it.

As I said, it's all about economies of scale. It's in a spammer's interest to go after the biggest targets, the lowest-hanging fruit. The first step toward freeing yourself from commentspam while still maintaining an open commenting system is to write that system yourself. Don't use the same fields as everyone else and a spammer will have to do extra work to make their spamengine work with your li'l ol' blog. For most spammers it's simply not worth the effort.
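To make the "don't use the same fields as everyone else" idea concrete, here is a minimal sketch in Python. It is not the code running on Fury; the field names `visitor_name` and `visitor_note` and the function `parse_comment` are made up for illustration. The only point is that a bot posting the generic field names it learned from some stock installation never produces a valid comment.

```python
# Hypothetical, site-specific field names (assumptions for this sketch).
# Maps each form field name to the key used internally for the comment.
CUSTOM_FIELDS = {
    "visitor_name": "author",
    "visitor_note": "text",
}


def parse_comment(form):
    """Return a comment dict if the form uses this site's field names,
    or None if any expected field is missing or empty."""
    comment = {}
    for field, key in CUSTOM_FIELDS.items():
        value = form.get(field, "").strip()
        if not value:
            # A generic spam tool posting other field names ends up here.
            return None
        comment[key] = value
    return comment
```

A submission keyed on `author` and `text` is dropped, while one keyed on `visitor_name` and `visitor_note` goes through. A spammer who reads the form can still adapt, which is why this is only a first step.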

I could see a cottage industry of coders who write anti-comment-spam solutions for weblogs. In a way it's a magnificent throwback to the days before standardized parts; for maximum effectiveness, each solution should be different from those before it. Taking it a step further, they should be different in different ways. It's not enough to just change field names, because sooner or later someone will make a program that recognizes the rest of your code and deduces those field names on a per-site basis. It's not enough to introduce a CAPTCHA (a picture containing a word that the human has to enter) because if enough sites do it, it's worth a spammer's while to make an AI that can read those CAPTCHAs (and they have).

What's needed is good old-fashioned craftsmanship. It also happens to be a great way to hone one's coding skills. I started down that path tonight by making a small change to my own comments engine. I don't expect this patch will keep the spammers at bay forever, and I'd be disappointed if it did, because then I'd lose the opportunity to keep innovating against them.

The trouble with writing this post, of course, is that it's a gauntlet of sorts. By declaring that I will out-innovate the spammers, I dare them to turn their attention to me, which makes my work that much harder. To that, I have only three things to say:

  1. Commentspamming a Google employee's blog is about as smart as abusing the sheriff's daughter.
  2. The spamfighting enhancements I incorporate into this site will be implemented on this site only, so the effort you expend to defeat them will buy you the opportunity to spam this one site and no others.
  3. All comments on Fury are served on pages that are blocked from all of the web crawlers via the robots.txt specification. Even if you do manage to get commentspam through, it's not going to do you any good. You may not care about a few dead ends when spamming half a million blogs, but it's counterproductive to defeat a system when you have nothing to gain in the first place.
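
The third point can be checked mechanically. The sketch below uses Python's standard `urllib.robotparser` module to confirm that a robots.txt rule keeps crawlers off comment pages. The `/comments/` path, the sample rules, and the helper `crawler_may_fetch` are assumptions for illustration, not Fury's actual layout or robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is told to stay out of /comments/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""


def crawler_may_fetch(robots_txt, path, agent="ExampleBot"):
    """Return True if a well-behaved crawler may fetch `path` under the
    given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

With these rules, `crawler_may_fetch(ROBOTS_TXT, "/comments/123")` is False while an ordinary post URL is still allowed, so links planted in comments never reach a search index that honors robots.txt.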

In the last 90 minutes this change has blocked 67 commentspams. It's a good start.

©2012 Kevin Fox