2005-11-16

On the rel=nofollow splogging deterrent

Splogging is bad. This far we all agree. To some extent, Blogger and several other blog tools / providers try to combat the harm it does to the web experience, and to the Google PageRank metric. This measure of relevance would otherwise be very cheaply influenced by a spam blog wanting to climb the PageRank ladder, just by spamming links to itself in blog comments all over the web. The CAPTCHA word verification used in this blog and many others is one preventive measure, and when that fails, the links the spammers add are tagged with a rel=nofollow attribute, which tells Google not to raise the PageRank of the link target the way a link from my blog otherwise would have. A full stop to link love.

I'm not going to say that rel=nofollow is a bad solution, period, though I agree to a large extent. Not having this damage control, when CAPTCHA fails, would add further to the sploggers' incentive of posting their junk wherever they can, which is of course a bad thing (see, I already firmly decided that we agreed on that, you might recall). I think rel=nofollow has its purpose, but I would also like to share link love where link love is due, and I would like it to be a very easy thing to do -- just a click away in some comment moderation view. Today, Blogger unconditionally adds the rel=nofollow attribute to all links in your comments, providing neither the option not to, nor the tools to manage this on a post by post basis.

Writing good comments that tie together relevant links across the web shouldn't be wasted effort. The few humans directly addressed by a comment might well follow the links, unaware of the invisible search engine deterrents sprayed across them, but the link targets gain no PageRank from it.

I'll settle for something which solves this problem for me and other Blogger users who feel the same way I do, and I think I already have an embryo of an idea growing. In researching unrelated topics (how to inline the "post a comment" form in your own Blogger page layout, at Jasper de Vries' Browservulsel and also at the Singpolyma technical blog) I happened upon an article about how to edit Blogger comments posted to your blog, post factum -- and not only your own, but everybody else's, too. Interesting. Each comment on your Blogger blog is technically a blog post, just like your own, and can thus be edited in the Blogger post editor, if you have the rights to do so. Which you do, when the comment is posted to your blog.

History falsification issues aside, this is a really good thing. It means not only that you can go back and correct spelling mistakes you made without posting a new comment and removing the old one, or that you can tidy up posts by others, but also that we can fix the rel=nofollow issue when we get some really good links. That works because the HTML we get to edit is not the HTML the commenter wrote when submitting; it is the post-processed HTML that eventually ends up on your blog. I tried, and yes, editing out the rel=nofollow attribute is done in a snap. Works like a charm.
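
To make that manual edit concrete, here is a minimal sketch in TypeScript of what "editing out the rel=nofollow attribute" amounts to on the post-processed comment HTML. The function name stripNofollow is made up for illustration, and the regular expressions assume reasonably tidy markup; a real script would more likely let the browser's DOM do the parsing.

```typescript
// Sketch only: remove the "nofollow" token from the rel attribute of every
// <a> tag in an HTML fragment. stripNofollow is a hypothetical helper name,
// not part of Blogger or Greasemonkey.
function stripNofollow(html: string): string {
  // Visit each opening <a ...> tag; everything else is left untouched.
  return html.replace(/<a\b[^>]*>/gi, (tag: string) =>
    tag.replace(
      // rel="...", rel='...' or an unquoted rel=value, with its leading space.
      /\s+rel\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i,
      (_attr: string, dq?: string, sq?: string, bare?: string) => {
        const value = dq || sq || bare || "";
        // Keep any other link types (e.g. "external"); drop only nofollow.
        const kept = value
          .split(/\s+/)
          .filter((t) => t !== "" && t.toLowerCase() !== "nofollow");
        // Drop the whole attribute if nothing is left in it.
        return kept.length > 0 ? ` rel="${kept.join(" ")}"` : "";
      }
    )
  );
}

// Example: the kind of link a comment ends up with after post-processing.
const before = '<a href="http://example.com/" rel="nofollow">a good link</a>';
console.log(stripNofollow(before));
// prints: <a href="http://example.com/">a good link</a>
```

Other rel values are preserved on purpose, so a link marked rel="external nofollow" would come out as rel="external" instead of losing the attribute altogether.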

It's not the one-click-away story in a comments overview I'd like it to be, though. Jasper has made a handy Greasemonkey script to add comment edit links (and if you read the article on how to edit comments, you also know that you could edit your Blogger template to provide these links yourself), which of course helps. But I'm pondering the next generation of precision automation: doing away with just the rel=nofollow attribute at the click of a button, without loading any other page in doing so (because otherwise it would not just be clickety-clickety-click when you have three good comments on a post, but also lots of waiting and clicking back, rinse, repeat).

I envision a nifty little Greasemonkey hack which tracks down the comments that have links still marked "Google hates your guts", and adds a little heart icon (<3) to each, which, when clicked, fires away an XMLHttpRequest behind the scenes that edits this comment and, when it has succeeded, removes the heart from the comment (since there is no longer any need to add any link love there). Style points awarded for pulsating the heart while this is happening behind the scenes.
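
The page-independent parts of that idea can be sketched already. What follows is a rough TypeScript outline, not the finished hack: the BlogComment shape and the names hasNofollowLink, commentsNeedingLove and heartOpacity are all hypothetical, and the DOM wiring and the actual XMLHttpRequest to Blogger's editor are deliberately left out, since those details are not worked out here.

```typescript
// Hypothetical shape for a comment scraped from the page; not a Blogger type.
interface BlogComment {
  id: string;   // whatever identifies the comment for editing (assumed)
  html: string; // the post-processed comment body
}

// True if any <a> tag in the fragment has a rel attribute listing "nofollow".
function hasNofollowLink(html: string): boolean {
  const tags: string[] = html.match(/<a\b[^>]*>/gi) || [];
  return tags.some((tag) => {
    const m = /\srel\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i.exec(tag);
    if (!m) return false;
    const value = m[1] || m[2] || m[3] || "";
    return value.toLowerCase().split(/\s+/).indexOf("nofollow") !== -1;
  });
}

// The comments that should get a heart icon: those still carrying nofollow.
function commentsNeedingLove(comments: BlogComment[]): BlogComment[] {
  return comments.filter((c) => hasNofollowLink(c.html));
}

// Opacity of the heart while the request is in flight: a cosine pulse that
// swings between fully opaque and 30% once per period.
function heartOpacity(elapsedMs: number, periodMs: number = 1000): number {
  return 0.65 + 0.35 * Math.cos((2 * Math.PI * elapsedMs) / periodMs);
}

// Example: only the first comment still needs some link love.
const sample: BlogComment[] = [
  { id: "c1", html: '<a href="http://example.com/" rel="nofollow">good</a>' },
  { id: "c2", html: '<a href="http://example.org/">already loved</a>' },
  { id: "c3", html: "no links at all" },
];
console.log(commentsNeedingLove(sample).map((c) => c.id));
```

Once a comment has been saved without its nofollow tokens, it no longer shows up in commentsNeedingLove, which is the cue for removing its heart.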

And, as it happens, I already tried out how that can be done in a previous, less useful hack of mine where I submit images to ImageShack, which is probably done a lot better with their own Firefox context-menu extension. But it was a useful learning experience, nevertheless, and I did write that image pulsation code. :-)

There, I think that pretty much sums it up. And while this would technically qualify for my Some assembly required blog, I wanted to focus a bit more on the discussion side to it, so I'll just post a back referral and a summary there. Feel free to join in; this will be a fun and worthy hack.

3 comments:

  1. I get something like one spamment per post (since I don't have CAPTCHA activated, I guess). So far, I've tried to just delete them as soon as I see them, which, I hope, is before they're sucked up by Google.

  2. Most likely; I have not seen any since I turned it on, a few weeks ago.

    Somehow I maybe managed not to mention that Blogger applies the rel=nofollow attribute to all our blogs, so whether you're quick enough or not, they won't get any PageRank effects, either way.

    (I added a note on this now, so Hans was not being inattentive above.)

  3. It's definitely a script I would use if someone got it to work. Good link love is very important and actually helps combat spam to some extent, by at least making sure the good resources show up in the search engines.


