
Trackback is dead. Are Comments dead too?

I think it’s time we faced the fact that Trackback is dead. We should state up front – the aspirations behind Trackback were admirable. We should reassert that we understand that there is a very real need to find mechanisms to knit together the world of webloggers and to allow conversations across multiple weblogs to operate effectively. We must recognise that Trackback was one of the first and most important attempts to work in that area. Nevertheless, we have to face the fact – Trackback is dead.

It has been killed by spam and by spammers – by the sheer horror of ping after ping pushing mother/son incest and bestiality links. It has been killed by the exploitation of human beings quite prepared to desecrate the work of tens of thousands of people in order that they should scrabble together a few coins. It has been killed by the experience of an inbox overwhelmed by the automated rape of our creative endeavours.

In a way it should have been predictable from the beginning – we should probably all have spotted that functionality that allows individuals to place links on other people’s sites could be exploited by spammers. Some people did spot these problems, but even they had no sense of the scale. Their responses were – at best – muted. But now I think we have to accept that the evidence is in. The situation is clear and it is not good. We’re engaged in an arms race with the worst kind of people, an arms race that has raged across other communications media and which we show no sign of winning. For me, the negative experience of dealing with trackbacks has long since overwhelmed the benefits they bring. For these reasons, I’m turning off all incoming Trackbacks on plasticbag.org from this moment on.

Of course the problem isn’t restricted to Trackbacks. The systems we’re using to manage comments on our sites are probably under even more strain from spammers. The only reason I’m prepared to put up with this in the short term is because the comments seem to be more useful to more people at the moment. But I’m clear in my mind – we’re rapidly approaching a crisis here too, and it is likely to be one that ends in the abandonment of comment systems as well.

And how to solve this problem? I don’t think it’s a matter of iterative improvements. I don’t think this problem can be solved by engaging in the arms race. MTBlacklist has saved my life, but it’s a patch, not a fix. No, any solution for this problem will be conceptually distinct from our current approaches. It could be a centralised approach – letting professionals manage the data that links our communities together. It could be a radically decentralised one beyond what we’re working with at the moment. I don’t know for certain. But I think we should be looking back to the origins of the weblog and seeing how things operated then.

Originally there was no weblog spam and yet conversation and discussion still existed. If an individual posted something and another individual wanted to respond to it, they simply wrote a post on their own site linking to the original. This environment was entirely free of spam. It was completely clean. I can’t help thinking that maybe we need to start thinking in terms of approaches like that – where there is no automated functionality that could be robotically exploited. Or perhaps we should be looking in other directions – how can we abstract out the kind of social networks that lie behind Flickr to structures that we could overlay across the internet as a whole? A question I think we should be asking is how could we build services that let you decide precisely which groups of people should be able to see, link to, ‘trackback’ or comment on the work you do in a decentralised, disaggregated way?

But this is to get ahead of ourselves. Today, we are here to mourn the passing of a great friend and a solution designed for happier and less cynical times. Trackback, I come to praise and bury you. May you rest in peace…

54 replies on “Trackback is dead. Are Comments dead too?”

I know you’re not going to want to let this descend into a you-should-use-this-plugin-it-really-rocks discussion, but I’ve found that using SpamLookup has completely removed spam from my comments and trackbacks. Of course, very few people actually read my blog, and even fewer comment on it, so it could conceivably be blocking legitimate commenters and I’ve never known about it. Continuing your metaphor, then, it may well end up being Trackback’s iron lung – it keeps it alive, but at the cost of, you know, actually having a life…

Trackback is dead. Are comments dead too? Yes, and yes. Finally!
Just move to taggregation and get over it. We’re going to have to hold hands with the big boys — Google: no-follow, Technorati: rel=tag, more to follow no doubt — or fight it out in the trenches with the spammers.

I have only one problem with killing all these spam-magnets:
Stats continue to show that only 1 out of 5 blog readers is a blogger. (This number has held steady according to Pew’s last two reports on the subject.)
That means 4 out of 5 potential readers are not able to write a post referencing your post. It’s so imperious to say, as some bloggers do, “just write a post if you want to comment” (oh, and by the way giving me link love and increasing my own “influence” while offering nothing in return.) It’s even more imperious to say “oh, just start a blog, and write a post if you want to comment!”
I have blogs on 4 different tools (iBlog, MT, TypePad and Blogger.) Only the MT blog is inundated with comment spam. Maybe the problem is with them.

I love SpamLookup but it’s a little too aggressive killing off my legitimate trackbacks.
The Comments are dead/Trackbacks are dead discussions come and go, but you’re right, the arms race against the spammers gets a little tiresome.

I use TrackBacks internally on The Sand Trap (http://thesandtrap.com/) to automate the handling of “related articles” and it works wonderfully. We have very few spam problems. A blacklist that blocks as few as 100 words (most of which are variations, in fact) is highly effective.
TrackBacks dead? No, but on life support. Comments – not by a long shot.

Trackback spam and other forms of comment spam are definitely a growing problem. They can be managed pretty well by blacklist/graylist usage, but there are still problems to be considered. Just because you block the spam from showing up on your blog doesn’t mean that the spam doesn’t cause problems, as I’ve recently noted on my blog (Spammers should all DIE DIE DIE).

It really pisses me off at a deep level that the spammers keep pissing in the well like this.

I’m so so so not looking for discussion about plugins and solutions. The very premise is clearly now flawed. There’s nothing we can do about it. Now I’m looking for different solutions beyond it. Come on guys, think a bit harder!

I could be being really naive – but couldn’t a solution be to just “technorati” a post to see who’s talking about it? It is a verb to “technorati”, right? You could even perhaps create a script to define a subset of people whom you ‘trust’ to pull in links referencing your post. But no radical new approach from me Tom, sorry. Stop blogging and write letters, lots of letters, perhaps?

“very premise clearly now flawed” – didn’t see that in your post, just that you were mightily fed up with it. “*Practical implementation* clearly now flawed”, yes, unarguably, but I don’t see any argument for the fundamental premise being flawed in the above, just a few melodramatic lines pronouncing it dead because it’s boring to manage.

Not that it isn’t a great idea to think harder about more imaginative ways to do the same thing, of course, and I’d prefer to read about new ways to link discourse across many sites than lots of “I-use-this-plugin” banter.

[tangentially: I previewed this about seven times just to watch the links change colour…]

Before blogs, there were mailing lists. And I miss the sense of community, focus and closeness they had. The ability to self publish in blogs is wonderful, but in the process we’ve spread the conversations in mailing lists all over the web without really providing the tools to bring it all back together in one place.
The blog comment and trackback are essentially a write only medium. Like the blog you get to put your message in front of the blog author and the people who come after you, but you’re very unlikely to see any dialog that it then generates. I find this sad. That’s not a dialog, but a whole series of interlocking monologues.

Rich – the premise being flawed is the part that says that “we should probably all have spotted that functionality that allows individuals to place links on other people’s sites” was going to be exploited by spammers. It’s clear to me that anything that has that kind of functionality is going to result in an arms race between spammers and webloggers. That’s why I was proposing different models in the paragraph that starts: “And how to solve this problem? I don’t think it’s a matter of iterative improvements. I don’t think this problem can be solved by engaging in the arms race. MTBlacklist has saved my life, but it’s a patch, not a fix. No, any solution for this problem will be conceptually distinct from our current approaches.”

Robert makes a good point that we should just move to “taggregation.” I agree with that statement to some degree. The system would be pretty cool, but what prevents spammers from hitting Technorati and destroying that setup? Nothing, right now. Until they cook something up over at Technorati to baffle the spammers, the blog community should just rely on Google’s PageRank to lead the way. At least that gives us some ammo in the war on spam. It’s no Technorati, but at least it’s more or less clean.

Tom – thanks for clarifying. I was engaging with it at a slight angle, I think, mistaking one premise for another (the general premise that entries from different weblogs should be able to reference each other in the course of a discussion was the one I didn’t see proved flawed in your post, presumably for the very good reason that you were discussing another, more practical, premise: that if, while attempting to facilitate this referencing in the course of a discussion, you let people put links on your page, you invite all kinds of company to the party)

And absolutely, yes, radical new models needed urgently.

I take it you know about MTKeystrokes?:
http://overstated.net/projects/mt-keystrokes/
It’s an MT plugin that counts the keystrokes entered in the TEXTAREA for the Movable Type comments form, and sends that number along with the comment. The plugin then makes sure the number matches the number of characters in the actual comment. Since most spammers write scripts that simply paste in a chunk of ad-ridden text, only one keystroke is counted – but the text amounts to more than one keystroke, so the comment is not published.
Of course, there’s nothing to stop the spammer writing a script that fakes the keystrokes too – but it would take much more effort.
But I think the solution needs to be smarter than this – it needs to be something that takes a slightly craftier approach.
Perhaps the plugin could also figure out the time difference between the comment section being accessed, and the time it took to post the comment too, or figure out if a human actually typed the words by counting the average time between keystrokes?
Whatever it does, it needs to create that elusive combination of factors that a computer simply can’t figure out or fake.
I remember reading somewhere about the fallibility of Yahoo’s verification system, where the user is required to input a number or a code in order to progress – a code which is based on a randomly generated wobbly image – but these systems are computer-based, and thus can be beaten by a computer.
What comment systems really need is a script that places very simple, unobtrusive obstacles in the path of a spammer which test for the existence of a human being – such as working out a simple puzzle or answering a simple logic question.
Even simpler – why not have a script that creates a randomly-generated query string in the URL of the comment link when it is accessed? That way, the spammer would have to know the exact unique query string every time they wanted to post a comment on an individual entry, which they wouldn’t be able to do via an automated process, since the number wouldn’t be generated until the link had been accessed. I dunno…
There’s another big problem with the Movable Type comments system which Six Apart haven’t addressed: its ubiquity. If I was a writer of comment spam scripts, I’d be rubbing my hands at all those ripe MT blogs out there – since just about everyone who uses MT shares the same scripts, filenames, code and design layout as every other MT user. It’s a spammer’s dream come true.
Remember that comment spam is automated – so the more automated obstacles placed in the spammer’s path, the harder it’ll be to crack.
The one other big weapon in the fight against evil-doers everywhere is confusion. If the spammer thinks their crafty scripts are working, he/she won’t change their methods.
If they think they’re being thwarted, they’ll devote time and effort to working around the obstacles.
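Two of the ideas in the comment above, a randomly generated value attached to the comment form and a check on how long the visitor spent before posting, can be combined in one rough sketch. Everything here is an assumption for illustration: the function names, the secret key and the time limits are hypothetical, and this is not how Movable Type or MTKeystrokes works. It also does not stop a script that fetches the form first and then waits; it only raises the cost of each spam comment.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-site secret
MIN_SECONDS = 5      # assumed: anything faster looks scripted
MAX_SECONDS = 3600   # assumed: tokens expire after an hour


def issue_token(entry_id, now=None):
    """Create a signed token when the comment form is served.

    entry_id is assumed to be a number or a string without colons.
    """
    issued = int(time.time() if now is None else now)
    nonce = secrets.token_hex(8)  # random part, differs on every request
    payload = f"{entry_id}:{issued}:{nonce}"
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"


def check_token(token, entry_id, now=None):
    """Accept a comment only if its token is genuine and the timing is plausible."""
    parts = token.split(":")
    if len(parts) != 4:
        return False
    tok_entry, issued, nonce, sig = parts
    payload = f"{tok_entry}:{issued}:{nonce}"
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # forged or altered token
    if tok_entry != str(entry_id):
        return False  # token was issued for a different entry
    elapsed = (time.time() if now is None else now) - int(issued)
    return MIN_SECONDS <= elapsed <= MAX_SECONDS
```

The token would travel in the comment link or a hidden form field, and the comment script would call check_token before accepting the post.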

“The very premise is clearly now flawed. There’s nothing we can do about it. Now I’m looking for different solutions beyond it. Come on guys, think a bit harder!”
I agree. Unauthenticated communications protocols are fundamentally flawed and can’t be used for useful communication, regardless of what advances are made in feedback management or filtering. However, I clearly missed your “email is dead” post on that matter. How did you get by after you abandoned email? 🙂

I’ve just realised something that might help add to the debate that “comments might also be dead”…
When I make a comment on a weblog entry, I rarely read the comments that preceded my own.
However, there are circumstances when I will read the preceding comments:
1. If the number of comments is small enough to warrant a quick scan – say 2, or 3.
2. If the comments mentioned in 1 are not excessively long.
3. If I have already made a comment, I will go back at a later date and read the comments that are subsequently posted beneath my original comment, to see if anyone has picked up on my reply.

Anil wrote, “However, I clearly missed your “email is dead” post on that matter. How did you get by after you abandoned email?”. The difference between email and Trackback/comment-functionality is that email is, for many people, an essential means of communication. I can’t function without email, but I think I’ll manage without trackback :). Let’s face it, for most people weblogging is a leisure activity, but comment and Trackback spam can make it feel like work. So, the easy solution is to turn it off, but in doing so you lose part of your connectedness to the larger weblog community. Not a good thing I say.

My guess is that we’ll see the emergence of several large and broad communities that are comprised of a large number of very distinct (and smaller) communities (eg. LiveJournal). These large and small communities will need to have public fora where users are enabled to reflect on each other’s thoughts and ideas, as well as connect them to “what’s hot” in the larger web community. I do see taggregation playing a large role in this, but also (administrated) group blogs, message boards etc.

I keep changing my comments page name, and use mt-blacklist. Nothing is easy, I hate spammers with a vengeance and delight in smacking them out of the comments permanently, blocking IPs and the like. But non-tech users should not be subject to the torrent of spam that they have to manage one way or another. I take solace in the chance that many of these spammers will end up paying dearly for contravention of current and yet-to-come laws.
Until then, we can only get on with it as best we can and hope our blogs aren’t permanently tainted with trash. For me, that means checking comments every few hours. Hell, I hardly blog more than once every few days!

I auto close comments after 7 days, which seems to negate 99% of the comment spam. I had to turn off trackbacks, but I honestly don’t miss them. Technorati usually finds people that link to me within a day or two.
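The auto-close rule is simple enough to show in a few lines. This is only a sketch of the idea, assuming the weblog software can supply each entry's posting date; the function name is made up for illustration.

```python
from datetime import datetime, timedelta

COMMENT_WINDOW = timedelta(days=7)  # the 7-day window mentioned above


def comments_open(posted_at, now):
    """Return True while the entry is still inside its comment window."""
    return now - posted_at < COMMENT_WINDOW
```

The comment script would refuse any submission for which comments_open returns False, so old entries stop being targets.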

I do agree that plugins are quick fixes that don’t address the vulnerabilities that our blogs have. They become like an addiction, requiring attention, energy and time.
Now, the blog ecosystem is an evolving one, and spammers are our parasites – there has got to be a decent solution, an elegant one, which prevents such a thing from appearing on our blogs. Taking the analogy a little further, what about a symbiotic relationship with a despamming service? Tagging? Secure comments? A mix of all of these? I am tired of plugging the dam so as to forestall the flood of comments – but I am wary of centralized anything.
We need to evolve past the stage of manual fixes, and find an elegant solution.

I have to disagree completely. Yes it’s a hard problem but 1. Captchas work pretty well and don’t bug the user very much for comments, and 2. we just need better trackback systems that also have a secondary confirm stage probably also using captchas.
Keep in mind, Captchas can most likely easily keep ahead of the state of AI for many years to come.
Am I missing some fatal flaw with this approach? Doesn’t seem all that hard to me. And I certainly don’t want to see commercial apps like Technorati take their place. Way too many flaws with that approach. To me that just seems like an excuse for the big commercial bloggers to ignore their audience.
http://en.wikipedia.org/wiki/Captcha

Random thoughts:
– I’m obviously biased, since my world pretty much revolves around blog interaction, but a blog without comments is virtually pointless.
– Comments don’t necessarily need to contain anonymously contributed links. And with some small tweaks in implementation, neither does Trackback.
– Tiered comment security systems can help, leaving the door open to casual contribution without inviting in spammers.

Automatic link detection and aggregation and taggregation are not safe from spammers: just think of linking to blogs from a site of your own that is structured as a blog, with spam stuff as its content. Blogs are easy to get nowadays, so you just spam by writing pseudo-blogs and linking and pinging …

I have disabled trackbacks on my blog as well. I hardly get any comments, let alone legitimate trackbacks, and that’s less time I have to spend picking the weeds out of the garden, so to speak.

Cory Doctorow says All Complex Ecosystems Have Parasites
And spam is just one of many parasites the blogosphere is dealing with…
My own response is to consider this as a run-of-the-mill matter of dealing with the Technical Arteriosclerosis in any complex system.

One bright light I have found in this respect is in the reaction to RSS that has its expression in the development of the Atom publishing format and protocol. The elements are as follows: early diagnosis of the clogged arteries, development of a clean and tight specification, running code, a lightweight touch in anticipation of system evolution and, most of all, architectural leverage like I keep harping on.

I think with respect to trackback we have diagnosed the clogged arteries… now for the systematic application of clean and tight specs… I leave to others the matter of running code and the lightweight touch.

All spam stems from the same root, the lack of accountability for actions. We know what the solution is (enforcing accountable identities for senders) and we have several different technical means for getting there (x509 certificates would probably be easiest since the support infrastructure is already available on all major platforms).
However we lack the political will to create the hierarchical authority that would make this enforceable on a global basis…

One solution that comes to mind, that I haven’t seen anywhere, is to allow moderation of how links in trackbacks and comments are displayed/rendered. If any user can add to a collective opinion about the links in comment, it is fairly likely that harmful links would show up as text or whatever and useful links would show up as clickable links. Just make sure that you get something like ~1 vote per visitor and prevent automation somehow and you should be all set. Links could then even be direct. I’m not sure how easy the unique vote problem is, but it sure seems like it has been solved in lots and lots of poll software. The only other problem that comes to mind is how interactive the particular community is…

I’ve turned off trackbacks on all the blogs I maintain except for my own personal blog and that’s only still on because I help test ExpressionEngine. I agree that trackbacks have become more of a burden than a benefit, but comments still remain very important to me.
That said, I’m surprised no one has pointed out that the most effective way to combat comment spam is to require registration in order to post comments. You’re already using TypeKey here. One setting change would probably all but eliminate comment spam for you. I also allow non-registered comments, but require a captcha be entered for any non-member comments. As a result I think I get perhaps one or two comment spams in as many weeks and rarely more than four or five at once. I’ve had whole months go by without any comment spam.

I like trackbacks. I hate people who send 1200 trackback pings to my little webserver in 45 minutes. At one every 2 seconds, each starting a 5.5MB Perl process with a 120 second timeout period, they overwhelm the 250 MB of real memory free on my box and start it paging and stop it from responding.
My solution was a simple 640 byte shell script that grepped the active process list for ‘mt-tb.cgi’ and if there were more than 10 running, killed them.
It’s a harsh form of throttling and it’s a band-aid for the system, but it turned my trackback spam from a denial of service into a nuisance. Yes, a ‘real’ trackback during a spam attack is lost. I’m willing to pay that price.
YMMV, but between this, MT-Keystrokes and MT-blacklist, I’ve been able to keep ahead of the spammers.
My script is here: kill-tb.bash. You’ll want to tweak it to meet your needs in terms of sleep time, actual shell commands for your shell, user and process to check, and thresholds.
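The linked script is not reproduced here. As a rough illustration of the same throttling idea, the following is a separate Python sketch, not the commenter's code. It assumes a POSIX ps command, that the check runs as the user who owns the trackback processes, and that "killed them" means every matching process rather than only the excess; the function names are hypothetical.

```python
import os
import signal
import subprocess

PROCESS_NAME = "mt-tb.cgi"  # process name from the comment above
THRESHOLD = 10              # limit from the comment above


def matching_pids(ps_output, name=PROCESS_NAME):
    """Return PIDs whose command line mentions `name`, given `ps -e -o pid,args` output."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0].isdigit() and name in fields[1]:
            pids.append(int(fields[0]))
    return pids


def pids_to_kill(ps_output, name=PROCESS_NAME, threshold=THRESHOLD):
    """Return every matching PID if more than `threshold` are running, else nothing."""
    pids = matching_pids(ps_output, name)
    return pids if len(pids) > threshold else []


def throttle_once():
    """One pass: list processes and kill the trackback handlers if over the limit."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    for pid in pids_to_kill(result.stdout):
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process exited on its own in the meantime
```

Calling throttle_once in a loop with a short sleep, or from cron, would give the harsh but effective behaviour described: a genuine trackback that arrives mid-attack is lost along with the spam.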

I’ve had excellent results with stopping spam through the following methods:

Reverse auto-discovery – when a trackback comes in, go to the referring site and make sure it actually links to you. This obviously doesn’t stop someone from putting up their own blog and sending trackbacks, and I’ve seen someone who does this. But it does screen out a lot of junk.

Automated script detection – Bad Behavior screens out virtually every spambot known to man, before they have a chance to scrape for links or anything else.

Blackhole lists – blocking open HTTP proxy servers screens out a lot of spam as well.
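The first method in the list above, checking that the pinging page really links back, might look something like this. It is a sketch under assumptions: the page HTML is taken as already fetched (for example with urllib.request), the URL comparison is deliberately crude, and the names are invented for illustration rather than taken from any plugin.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def _normalise(url):
    """Compare on host and path only, ignoring case of host and trailing slashes."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path.rstrip("/"))


def page_links_to(html, page_url, my_url):
    """Return True if the page at page_url contains a real link to my_url."""
    collector = _LinkCollector()
    collector.feed(html)
    target = _normalise(my_url)
    return any(
        _normalise(urljoin(page_url, href)) == target
        for href in collector.hrefs
    )
```

As the comment notes, this does not stop a spammer who sets up a page that genuinely links to the target; it only screens out pings from pages that never mention it.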

I use WordPress and don’t seem to have any spam posting problems. I also limit comments to one URL, so all those poker guys get wasted.
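A one-URL limit like this can be approximated with a simple count. The sketch below is not WordPress's implementation; the pattern and names are assumptions chosen for illustration, and the pattern only recognises links that start with http://, https:// or www.

```python
import re

# Assumed pattern: each run of non-space text starting with a scheme or "www." counts as one URL.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
MAX_URLS = 1  # the one-URL limit mentioned above


def too_many_urls(comment_text, max_urls=MAX_URLS):
    """Return True if the comment contains more URLs than allowed."""
    return len(URL_PATTERN.findall(comment_text)) > max_urls
```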

Tom –
While I know you weren’t looking at plug-ins and such, I’ve found a plug-in for WordPress that actually distinguishes between legitimate trackbacks and spambots at the HTTP-Request stage called Bad Behavior. It doesn’t prevent spam altogether, and it doesn’t solve the used bandwidth problem completely, but it does keep the trackbacks themselves open to legit pings. If this plug-in, which can run on any PHP-enabled server, or something like it, became widespread among the sphere, the efficacy of spamming might just diminish to the point that the lazier spammers would quit.
(And no, I’m not profiting from touting this in any way. It’s just saved my blog from an onslaught of spam of about 1,100 spams a day for the last 5 days. Since I installed it, NOTHING has gotten past, and no legitimate trackbacks have been caught.)

I’ve never seen this happen, but that’s because I haven’t looked:

If I were to set up a blog which extolled the benefits of “s3x-pi11”, would it be legitimate?

Next I go and find other blogs of interesting things. I write a valid comment on my blog regarding what was discussed on those other blogs and send trackbacks to them.

When these trackbacks are followed to reach my blog, the user will be presented with mostly spam-like content, but relevant content as well. Is that ok?

Or, perhaps I write in my excerpt to the “victim” about how “x” relates to my spam-topic-of-choice.

If I have a point, it’s that some blogs are spam anyway. They won’t all relate to what you’re concerned with. Not everyone has a hippy-like “let nature take its course” view of technology (though I certainly do).

If I were going to suggest a solution, I’d say

  1. keep trackbacks,
  2. check all trackback links to ensure that a link to my blog does exist,
  3. filter using terms I want to (drugs, pills, clowns, anything I hate),
  4. test that the excerpt contains terms from my original post (excerpt is relevant),
  5. get the title for the trackback link from the web page at the trackback URL, and
  6. when displaying trackbacks, keep them separate from regular comments.

Regarding the last item, I usually include my links at the bottom of my post; I think I’d prefer to keep my trackback links there too.
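Steps 3 and 4 of the list above, filtering on unwanted terms and testing that the excerpt is relevant to the original post, could be sketched roughly as follows. The term lists, the threshold and the function names are all hypothetical, word overlap is a very blunt measure of relevance, and the result of the link check in step 2 is simply passed in as a flag.

```python
import re

BANNED_TERMS = {"pills", "poker", "casino"}  # hypothetical personal blacklist
STOPWORDS = {"the", "and", "that", "this", "with", "from", "have", "for"}
MIN_SHARED_TERMS = 2  # assumed threshold for "the excerpt is relevant"


def _terms(text):
    """Lower-case words of three or more characters, minus a few stopwords."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def accept_trackback(excerpt, original_post, links_back):
    """Apply steps 2 to 4: link present, no banned terms, excerpt shares terms with the post."""
    if not links_back:  # step 2, checked elsewhere and passed in
        return False
    excerpt_terms = _terms(excerpt)
    if excerpt_terms & BANNED_TERMS:  # step 3
        return False
    shared = excerpt_terms & _terms(original_post)  # step 4
    return len(shared) >= MIN_SHARED_TERMS
```

A spammer who copies a sentence from the original post into the excerpt would pass step 4, which is the weakness the comment itself points out when it imagines an excerpt written to relate "x" to the spam topic.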

I don’t think comments are dead. When I go on the internet and find an interesting comment, I respond to it. I believe that if comments were dead we wouldn’t all be talking about it…
so come on guys

A blog without the ability for people to join the conversation ISN’T A BLOG. It’s just a web page, simply a broadcast-only method of communication little different than a commercial or brochure.

I’m afraid that’s just rubbish. I’ve had a weblog for seven years, and you couldn’t even HAVE comments on weblogs for a good couple of years, and Blogger didn’t have them on by default for much longer than that. The way we used to have conversations with each other was between weblogs – I talk from my soapbox, you talk from yours and we have a conversation in the middle.

All forms of spam are a nightmare. Lots of my work is in the email realm and the ISPs have been in a continuing struggle with the wits of the spammers.
The email world is employing two strategies:
1. Authentication – tricky, but possible, and raises all sorts of questions about who identifies a legitimate person, especially in the blogosphere which is so fragmented.
2. Reputation – rapidly taking over from authentication as the key way to trap spammers. Building a pattern of positive posts, or I guess trackbacks, over time, essentially providing a whitelist for trackbacks and comments.
Of course, wherever there’s an algorithm, there’s someone looking to game it. Ah the joys of the Interweb.

Stop trying to kill the spam and kill the Spammers.
The people paying them have to start taking some responsibility. These days there are enough legitimate publishers that the spammers are no longer required.
I suggest forming a coalition of surfers to find and report all Spammers and that Pay per … advertisers should be fined for each one found on their network and the fine be paid as a reward to whoever reported them! It would make the net a nicer place, now I have to go feed my FLYING PIGS.
