fox@fury
The craftsman's approach to fighting commentspam
Monday, Apr 11, 2005
Spamming (commentspamming, email spamming, junk-mailing, etc.) is all about economies of scale. Commentspamming itself has exploded in the last two years because commentspammers have identified the proper URLs and methods for sending to different types of blogging software. If you use Movable Type for example, your comment entry form probably has the parameters of the default installation, so it's trivial for someone to write software where they simply add in your site's URL to their target list and their spamming software does the rest, adding comment after comment after comment.

Though I wrote Fury's blogging software from scratch, when I decided to add comments four years ago I took a public domain PHP text-file solution and plugged it in to my own code. Fast and easy to be sure, but once every blog became a target it was only a matter of time before spammers wrote a plug-in for my little comment program. Where a year ago I never got commentspam and six months ago I would get about 10 spam comments a day, last Friday I got 960 commentspams, and that's pretty normal. I regularly get about a thousand a day. It's time to do something about it.

Like I said before, it's all about economies of scale. It's in a spammer's interest to go after the biggest targets, the lowest hanging fruit. The first step toward freeing yourself from commentspam while still maintaining an open commenting system is to write one yourself. Don't use the same fields as everyone else and a spammer will have to do extra work to make their spamengine work with your li'l ol' blog. For most spammers it's simply not worth the effort.

I could see a cottage industry of coders who write anti-comment-spam solutions for weblogs. In a way it's a magnificent throwback to the days before standardized parts; for maximum effectiveness, each solution should be different from those before it. Taking it a step further, they should be different in different ways. It's not enough to just change field names, because sooner or later someone will make a program that recognizes the rest of your code and deduces those field names on a per-site basis. It's not enough to introduce a CAPTCHA (picture containing a word that the human has to enter) because if you do it often enough it's worth a spammer's while to make an AI that can deduce your CAPTCHAs (and they have).
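
To make the field-name idea concrete, here's a minimal sketch in Python. It is not the code running on Fury, which this post doesn't describe. It assumes a hypothetical per-site secret (SITE_SECRET) and a hypothetical per-form token, and derives the form's field names from them, so a generic spam script posting the stock "author" and "body" fields gets rejected. A scraper that parses the form can still read the names, so this only raises the cost for a spammer; it doesn't end the game.

```python
import hashlib
import hmac

# Hypothetical per-site secret; every site would pick its own.
SITE_SECRET = b"change-me-per-site"


def field_name(logical_name: str, form_token: str) -> str:
    """Derive the HTML field name used for one logical field on one rendered form."""
    message = f"{form_token}:{logical_name}".encode()
    digest = hmac.new(SITE_SECRET, message, hashlib.sha256).hexdigest()
    return "f_" + digest[:12]


def decode_submission(posted: dict, form_token: str,
                      logical_names=("author", "body")):
    """Map posted fields back to logical names.

    Returns None if any expected derived field name is missing, which is
    what happens when a script posts default field names instead.
    """
    decoded = {}
    for name in logical_names:
        key = field_name(name, form_token)
        if key not in posted:
            return None
        decoded[name] = posted[key]
    return decoded
```

The form template would call field_name() when rendering each input, and the comment handler would call decode_submission() with the same token before saving anything.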

What's needed is good old-fashioned craftsmanship. In a way, it's a great way to hone one's coding skills. I started down that path tonight by making a small change to my own comments engine. I don't expect this patch will keep the spammers at bay forever, and I'd be disappointed if it did, because then I'd lose the opportunity to keep innovating against them.

The trouble with writing this post, of course, is that it's a gauntlet of sorts. By declaring that I will out-innovate, my dare turns their attentions to me and makes my work that much harder. To that, I only have three things to say:

  1. Commentspamming a Google employee's blog is about as smart as abusing the sheriff's daughter.
  2. The spamfighting enhancements I incorporate into this site will only be implemented on this site and no others, so the efforts you expend to defeat them will only afford you the opportunity to spam this one site, and no others.
  3. All comments on Fury are served on pages that are blocked from all of the web crawlers via the robots.txt specification. Even if you do manage to succeed in commentspamming, it's not going to do you any good. You may not care about a few dead ends when spamming half a million blogs, but it's counterproductive to defeat a system when you have nothing to gain in the first place.

In the last 90 minutes this change has blocked 67 commentspams. It's a good start.
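
The third point above leans on robots.txt, so here's a small sketch of how a rule like that can be checked with Python's standard urllib.robotparser module. The rules and the /comments/ path are assumptions for illustration; they are not Fury's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every crawler out of the comment pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A well-behaved crawler that honors these rules never indexes a comment page, so a link dropped there earns the spammer no search ranking.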

Scoble gets taken to task for Gmail-bashing
Sunday, Apr 10, 2005
Via Google Blogoscoped I read Robert Scoble's diatribe about how Gmail wasn't scalable, using Orkut and an out of context quote about invitation-only services taken when Gmail was 10 days old. In the post Scoble, who is normally far smarter than this, claims Gmail is less valuable to advertisers than Hotmail because Hotmail has 100 million active users, and if audience size doesn't matter then why do we have Nielsen ratings?

The answer, of course, is that advertisers care about cost over revenue, and when they can't get that, they care about cost-per-click, and when they can't get that they care about cost-per-impression, and when they can't get that they care about cost-per-estimated-audience. Nielsen demographics give a coarse guess as to who is seeing your ad. In contrast, keyword-targeted online advertising gives you precise ROI data that lets you hone your advertising campaign better than blasting animated gifs to users chained to their antiquated, undersized and visually frenetic Hotmail accounts.

I was going to explain this in Scoble's comments, but lo and behold the rest of the Internet beat me to it. And people say I whine. (Well, I do, but that's not the point.)

The potential of Maps
Friday, Apr 08, 2005
Maps are usually implemented as a destination instead of an inline service: look for the data you're looking for, then get the map of where it is at the end. Some very clever people have hacked Google Maps to get around this pattern of thinking, and in so doing have made the coolest thing I've seen on the web this month, or maybe longer.

Craigslist Housing - Google Maps mash-up. I can only imagine the changes this site will have in the way people think about maps.

What is your essential mac software?
Wednesday, Apr 06, 2005
Every now and then I get a new mac and I have to load it up with a standard repertoire of software before I can use it the same way as I use my other computers.

Usually I add the pieces here and there over several weeks, but I'm trying to standardize it now, possibly burning all the software onto a single DVD for easy installing. Here are my absolutely essential isn't-my-mac-without-it list of programs:

  • Standard 10.most-recent suite of OS X apps
  • Stuff on the OS X Dev Tools DVD
  • Photoshop (used to be 7, now CS, soon to be CS2)
  • BBEdit 8
  • Transmit 3 (ftp)
  • Firefox
  • MySQL (PHP is already in the OS X installation)
  • CocoaMySQL (I wish they'd update it, but it's free!)
  • Omnigraffle Pro (diagramming and flowcharting)
  • Either Microsoft Office or iWork
  • Snapz Pro (Screenshot-taker)
  • Quicksilver (quick-access launcher)
  • Subversion (version control software)

That's it. That's literally all I need to get along swimmingly on any mac and do all the things I want to do. There's nothing else anyone would need in order to create world-class web applications, or if there is I haven't found it. If I were locked in a room for a year with a machine loaded thusly and access to ports 22 (ssh) and 80 (web), I'd be wildly productive. Okay, maybe more productive without port 80 (damn you port 80...)

What's your essentials list? (Okay, okay. Windows folks too...)

Nostalgia from 100 miles up
Tuesday, Apr 05, 2005
Today's meme, begun by Matt Haughey, is the Memorymap. An emergent collaboration between Google Maps and Flickr, people are taking satellite snapshots of their hometowns and annotating spots with memories from their past.

Brilliant.

Google Gulp on Ebay
Friday, Apr 01, 2005
Only hours after Google announces (ahem) "Google Gulp", Gulp caps are already finding their way on to Ebay!

Stick a fork where, Ben?
Thursday, Mar 31, 2005
(Updated on 3/31/05 at 11:20pm)

Today's Guardian features a story by Ben Hammersley about Yahoo being the new Google. To 'prove' his point he pulls out a data point here and a data point there and says 'voila! Q.E.D.' The trouble is that his datapoints are pretty selective, and in some cases are flat-out incorrect.

Ben chooses one moment in time where he perceives Yahoo to have barely edged ahead and calls the game over ('three-nil'). Even if I agreed with his analysis, in the words of Yogi Berra "We didn't lose, the game just ended too early."

Ben says "Google's [search] API was a thing of beauty when it launched [three years ago]" but has been overtaken by Yahoo's [search] API, which was launched last month. Even if you ignore Google's five other APIs, it's disingenuous to fire the final buzzer just because the other guys scored a three-pointer.

For five years Yahoo made relatively minor changes to their maps service, and last month Google came out with an entirely new offering. A little later Yahoo adds traffic data to their maps and yes, it's a useful feature. Ben is quick to note how Yahoo is a 'leader' because they just launched their labs competitor, research.yahoo.com, yet he completely ignores the fact that Google Maps itself is a labs offering. Is it fair to compare a six year old product with a one month old labs release and declare a winner?

Closer to my own heart though, Ben says, "Google's webmail product, Gmail, caused a fuss by offering accounts capable of storing a gigabyte of mail, four times that of Yahoo Mail. No problem, said Yahoo last week, Yahoo mail users can have a gigabyte too." Whatnow? When Gmail launched (a year ago tomorrow) Yahoo mail gave users four megabytes and Gmail represented a 250x increase, not the other way around. I'm proud that Gmail caused the other guys to raise their falsely-limited storage sizes but c'mon. Yahoo announced that they'll finally match the competition's storage offering, a year after the competition's launch, and Ben scores this as a 'Yahoo win'? Have you used Gmail, Ben? Is storage size the differentiator?

Next on the block is Blogger vs Yahoo 360: "Google's purchase of Blogger gave them a place at the blogger's table, but it has done little with it. Yahoo's blogging tool, Yahoo 360, launches this month, allegedly fully integrated with the rest of the content they produce." Ben, Yahoo bought Geocities for 3.6 billion dollars six years ago, and you fault Blogger for not advancing? Also, I have to ask if Ben's used Yahoo 360. Not to degrade the product, I have friends who worked very hard on it and I think they made something cool and pretty. But it's a social networking service. Go ahead and compare it to Orkut and I'll happily listen, but a Blogger replacement it is not.

I accept the point about Flickr though. I think Picasa is slick as all hell, but I've never seen someone get communities as right as Flickr has. I wish those guys were sitting down my hall.

The game is a long one, my friends. In the end of course the users win either way. Like two farmhands vying for the love of a girl, no matter which one emerges victorious the lady's got a lot of flowers. Now if you'll excuse me, I've got to tend my greenhouse.


Update: Yahoo! Mail quotas won't be upped to a gigabyte for another 4-6 weeks. How big will Gmail be by then?

Gmail storage: To Infinity and Beyond!
Thursday, Mar 31, 2005
Gmail turns 1 today and as April Fools jokes go it was a doozy! Now Gmail can not only cry, but it can walk and say Mama and Dada with the best of 'em.

It's also growing like the Dickens. Yesterday you had a gigabyte. How much will you have tomorrow?

Great, now my identity's been stolen.
Tuesday, Mar 29, 2005
That'll teach me to have applied to grad school three years ago using my real social security number.

I'm trying to find some witty way to correlate this with my recent enjoyment of the song 'Centerfold' (My blood runs cold / my memory has just been sold) but I'm failing miserably.

AOLiza evil? Not hardly. Check out Laura Pahl.
Tuesday, Mar 29, 2005
To anyone who thought AOLiza was an evil way of messing with people who randomly IM you, it's nothing compared to what's going to happen to Laura Pahl.

Update: Turns out the thing was a hoax, an April Fools joke they say, though I say no, since it's not yet April first. Ah well. Caught me!


Update to the update: Okay, purportedly not a hoax after all, but a story with some form of a conclusion now. Witness the drama, and the value of having a nice mom.

  
aboutme

Hi, I'm Kevin Fox.
I've been blogging at Fury.com since 1998.

pastwork

I've led design at Mozilla Labs, designed Gmail 1.0, Google Reader 2.0, FriendFeed, and a few special projects at Facebook.

©2012 Kevin Fox