A weblog by Tom Coates concerning future media, social software and the web of data
Quote of the month: "This is not a brothel, there are no prostitutes here"

The Balkanisation of Blogdex...

Posted July 29, 2003 8:53 AM.

The last couple of days have seen Daypop and Blogdex Top 40s that are totally overwhelmed by political articles from the States. If it wasn't for the fact that many of these articles are concerned with the war in Iraq, you could be excused for thinking that nothing else was happening in the world at all - even perhaps that there was no world outside the US.

Three years ago - back in the days of Beebo.org's metalog - it was quickly observed that the various aggregation sites on the internet had a reinforcing effect on people's browsing - that when they started, the popular links were getting two or three links a day, but that a month later they were getting up to ten or twelve. People linked to good things that they were exposed to - and they decided that aggregators represented an efficient way of finding those good things, prefiltered on the basis of popularity by the community at large. The effect? Sites that appeared on these aggregators got a significant amount of extra traffic, links and exposure. There's significant value in this mechanism - it produces a manageable number of links each day that an individual has a chance of being able to read. It also provides a sense of the overall community of webloggia and what they care about.

The problem comes when these aggregators don't have enough granularity. Let me put it this way - Blogdex, Daypop, Popdex, Technorati and the like are no longer simple reflectors of a community's activities - they are also one of our community's best mechanisms for news discovery. To some extent they're gradually becoming one of the most significant ways we find out what's going on in the world around us.

Unfortunately it also means that the country with the most weblogs sets the international community's agenda. There are only two obvious results of this - (i) that these aggregators will become (or have already become) less interesting or useful to people who don't live in America or (ii) that the international community becomes used to the hideous under-representation of their own local news and debate. It used to be said that America had no idea of what happened outside its own borders. Can we really be working towards a new way of distributing and discovering media that means the rest of the world has no idea what happens outside America's boundaries either?

There are a couple of ways that we could address this problem. Firstly there's sampling - we could create a version of Blogdex that doesn't work purely on the basis of popularity, but samples geo-coded weblogs from across the world in such a way that we are presented with a balanced world-wide view of what's important. It's a nice idea, but I think it's impractical - for a start the linguistic barriers would make it less useful for many of us, and there would also be an infinity of ways of determining sampling rates across the world, none of which would likely seem 'fair' or 'clear' to people.
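
To make the sampling-rate objection concrete, here is a rough sketch of just one possible weighting scheme, in which every country gets the same total say however many weblogs it has. Everything in it is an assumption made for illustration: the function name, the (country, links) input format and the sample URLs are invented, and no aggregator is known to work this way.

```python
from collections import defaultdict

def balanced_scores(weblogs):
    """Score links so that each country carries equal total weight.

    `weblogs` is an iterable of (country, links) pairs (a hypothetical
    format). A link's score is the sum, over countries, of the fraction
    of that country's weblogs citing it.
    """
    blogs_per_country = defaultdict(int)
    citing = defaultdict(lambda: defaultdict(int))  # link -> country -> count
    for country, links in weblogs:
        blogs_per_country[country] += 1
        for link in set(links):  # count each weblog once per link
            citing[link][country] += 1

    scores = {
        link: sum(n / blogs_per_country[c] for c, n in by_country.items())
        for link, by_country in citing.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Invented sample: four US weblogs, one British, one French.
SAMPLE = [
    ("US", ["http://example.org/us-story", "http://example.org/world-story"]),
    ("US", ["http://example.org/us-story"]),
    ("US", ["http://example.org/us-story"]),
    ("US", ["http://example.org/us-story"]),
    ("GB", ["http://example.org/world-story"]),
    ("FR", ["http://example.org/world-story"]),
]

for link, score in balanced_scores(SAMPLE):
    print(f"{score:.2f}  {link}")
```

On raw popularity the US story wins four links to three; under this particular weighting the order flips. Weighting by population, by language or by number of readers would each give a different answer again, which is exactly the difficulty.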

No - the most practical way of approaching this problem is to find mechanisms which allow us to balkanise our aggregators - slice their responses - on the basis of metadata. There are many ways of geocoding weblogs in such a way that aggregators could have a sense of your nationality, location, language, time-zone and the like. And above and beyond such meta-tagging there are dozens of directories that include information based around clumping weblogs around interest groups and/or site locations. So I'm putting out a call now for someone to balkanise Blogdex. I want to be able to see the most popular links generated by people in my country - wherever the links themselves are based. I want to be able to slice these links in different ways, to see popular links mentioned on all English language sites (for example) or just those within the European Union. In fact I'd like to be able to see what gay webloggers are reading too. And people within my age group. All of this stuff should be possible, one way or another. I'd build it myself, if I had the expertise required... Can't someone help me out?
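
As a very rough sketch of what slicing on metadata might look like, the following ranks links by how many weblogs cite them, but only counts weblogs whose metadata match the requested criteria. The Weblog record, its country and language fields, the top_links function and the sample data are all hypothetical, invented for this illustration; a real aggregator would need to get that metadata from geocoding, directories or self-registration.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Weblog:
    """Hypothetical record for a weblog and the metadata known about it."""
    url: str
    country: str
    language: str
    links: tuple = ()

def top_links(weblogs, n=40, **criteria):
    """Return the n most-cited links among weblogs matching every criterion.

    Criteria are attribute=value pairs, e.g. country="GB". The location
    of the linked page is irrelevant; only the citing weblog is filtered.
    """
    counts = Counter()
    for blog in weblogs:
        if all(getattr(blog, key) == value for key, value in criteria.items()):
            counts.update(set(blog.links))  # one vote per weblog per link
    # Most cited first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Invented sample data.
BLOGS = [
    Weblog("a.example", "GB", "en",
           ("http://news.example/uk-story", "http://news.example/us-story")),
    Weblog("b.example", "GB", "en", ("http://news.example/uk-story",)),
    Weblog("c.example", "US", "en", ("http://news.example/us-story",)),
    Weblog("d.example", "US", "en", ("http://news.example/us-story",)),
    Weblog("e.example", "FR", "fr", ("http://news.example/fr-story",)),
]

print(top_links(BLOGS))                 # everyone
print(top_links(BLOGS, country="GB"))   # weblogs in one country
print(top_links(BLOGS, language="en"))  # all English-language weblogs
```

The same filter would extend to any other field a directory happened to record, such as an interest group or an age band, simply by adding it to the record.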

Comments

Please stay on-topic, informative and polite. I reserve the right to remove comments for whatever vague capricious reasons seem reasonable at the time.

Wonderful idea. Marvelous. But, just like the French willingness to cordon off their culture from America - and Americanisms - the Imperial arbiters of cool always find a way by sheer force of personality and reinforcement; unless we decide to implement these tools ourselves.

I like the idea, it would be great. I'm just a pessimist. Good luck in finding your enabler. By the way, poor choice of words, 'balkanisation', I see all sorts of nasty visions.... ethnic cleansing of links, shouts of Anti-American slant, with other -isms and -ises thrown in for good measure.


Posted by: Gummi at July 29, 2003 2:27 PM

Yeah - but it's alliterative.

Posted by: Tom Coates at July 29, 2003 2:36 PM

Well, it's a -verse. Therein lies my petty gripe. It *might* obscure and confound what you're asking for.

Posted by: Gummi at July 29, 2003 4:23 PM

few quick thoughts

GeoURL might help, course I don't use it and am not even sure if I got the name right...

Doesn't having a site with a generic .org address hurt the prospects of ever really getting such a Balkanization?

There does seem to be a granularity of sorts emerging in the larger listings of absolute popularity. IE the Truth Laid Bear Ecosystem [ http://truthlaidbear.com/ecosystem.php ] weighs heavily towards US politics.

Is geography really the best divider in this day and age? How about balkanizing by interest?

Posted by: Abe at July 29, 2003 6:16 PM

Tom - not sure about this one. Although I appreciate that having localised blogdex type aggregators might be useful for the reasons you mention, and probably relatively easy for any one of the existing aggregators to achieve (by asking those weblogs registered or registering with them to specify a country), is this something we want? - to have the blogosphere delineated along narrow national lines - especially if, like me, you'd love the "emergent democracy" or "second superpower" hypotheses to evolve into a tangible reality...

Posted by: Robin at July 29, 2003 6:47 PM

Hasn't Google done something like this already with their News page? It's not perfect, but it works fine. Surely balkanised weblog/news aggregation shouldn't be that difficult?

Posted by: MacDara at July 29, 2003 7:38 PM

Actually scratch that, I just saw the gaping hole in my argument that no doubt you lot'll all pick up on like vultures. I shouldn't think out loud so much ;o)

Posted by: MacDara at July 29, 2003 7:41 PM

Tom, you could always spearhead such a project yourself. You could set it up on sourceforge.net and simply project manage it. Developers can write any old code you want, it's the people with ideas that are really needed. I'd strongly urge you to consider it, and if you do, drop me a line.

Posted by: Drew at July 29, 2003 8:37 PM

This seems to assume that (a) webloggers are significant and/or representations component of their country's populations, and (b) most webloggers use tools like daypop, popdex, etc to determine what's important in the world. We know that (a) isn't true. Webloggers are a statistically insignificant segment of the general population. And I strongly suspect that (b) is also not true. That aside, the idea of population-specific aggregators is an interesting one. Have you seen what Micah Alpern has done with his "trusted blog" search tools? http://www.alpern.org/weblog/php/blogsearch/writeup.html

Posted by: Liz at July 30, 2003 4:42 PM

Ugh. Wish I could edit my comments. That should be "significant and/or representative components".

Posted by: Liz at July 30, 2003 4:44 PM

Not only that--I've always thought you should be able to pick the subset of blogs you want sampled. If I only want to filter 10 specific blogs, let me. To separate the CSS/design sites from the music sites from the political sites, etc. A blogdex for every subculture and interest. Or it could be done categorically (if people could be consistent about categories).

Posted by: omit at July 30, 2003 11:28 PM

Well I'm not sure that it does assume either of those things! We're talking about webloggers finding out information that is pertinent and interesting to them through the filtering mechanism of popularity among their peers! That seems to me to be a quite plausible thing to investigate. I would think that a large number of webloggers do glance at Blogdex, Daypop and the like in order to find out what's going on around the world of webloggers.

Posted by: Tom Coates at July 30, 2003 11:31 PM

If you look at Drupal, they've got a category based RSS reader. I've got it running on my site at...
http://www.bbcity.co.uk/?q=import/bundles

Something like that could be the basis of such a project - start adding Google-esque PageRanks and community trust rankings (à la Technorati, Daypop etc.). Make it so that it's more "relational" - you could categorise the person (eg. 18-year-old slobby intellectual art student) and the content (eg. crap morose anti-American cynicism). Then you could mix and match the content and the person - I want to hear "upbeat feminist mumblings" from "Chinese Second Hand Car Dealers living in Southend".

Either that or, on a simpler scale, a wide ranging encouragement for people to start up vertical aggregators.

Posted by: Tom Morris at July 31, 2003 12:51 AM

I agree with Liz, predictably. However, I still think it's a good idea; the North American slant has a lot to do with the sheer number of indexed weblogs over there, as Tom pointed out. The inherent bias of these tools turned me off them completely -- especially the link reinforcement. If anything, I sample a select group of sites with RSS feeds and avoid Blogdex et al.; it is a little obvious. In sum, it should be done or at least attempted just to see the result.

Posted by: Gummi at July 31, 2003 12:57 AM

[trying to ignore the sh*ty spam from so-called Jamie above]

Liz, I agree that webloggers are not representative of an entire country's population, but I disagree with your immediate conclusion on (a). I believe that among webloggers there is a representative set of trend setters and opinion leaders from whom you can draw, well, significant trends and opinions. Let's do a big simplification and say those are the A-list bloggers. Regarding (b), I agree with Tom that those tools have a great influence on what "bubbles up" on the weblogs and, in a typical feedback loop (A-list sends a signal which gets transmitted by those tools and amplified by their users), the A-list bloggers will have *huge* influence on what percolates through the blogosphere. Now if a majority of those A-list bloggers in the English-speaking web are American, you can easily draw a conclusion.

Gummi: in French, balkanisation in the context that Tom described makes perfect sense to me.

Posted by: François at July 31, 2003 11:51 PM

(argh. keep forgetting that tabbing from the homepage field takes me up to the top of the page. :P )

Tom, I was responding primarily to the line reading "Let me put it this way - Blogdex, Daypop, Popdex, Technorati and the like are no longer simple reflectors of a community's activities - they are also one of our community's best mechanisms for news discovery. To some extent they're gradually becoming one of the most significant ways we find out what's going on in the world around us." While that may be true for some, I know it's not true for me, or for the majority of bloggers I know. That's not to say there's not value in what you suggest...just that what you describe isn't necessarily as universally problematic as you implied.

Posted by: Liz at August 1, 2003 1:48 AM

What is going on? Has someone hacked the article or something? The original article has been replaced by some crap about jamie leigh, whoever that is.

Posted by: BenM at August 1, 2003 10:51 AM

It certainly looks like a hack; the same content is on FridayFive.org at the moment. Looks to be a security hole in Movable Type?

Posted by: Seldo at August 1, 2003 12:23 PM

Whoa, what was that... it made it into the RSS feed!
Tom -- any idea?

Posted by: Justin at August 1, 2003 7:27 PM

We're aware of this problem (the Balkanisation, not the hacking - sorry) at blogosphere.us and we're trying to find ways to break down the "whole" blogosphere into more manageable (and interesting) chunks. Our first experiment has been to run the text from weblogs through TextCat and then aggregate across the language of the weblog. Unfortunately, it only produces interesting results for Arabic/Farsi and French (this is probably a limitation of TextCat running on HTML, but we're working on enhancing the numbers a bit). At any rate, I apologize for the self-link, but I felt it was necessary to point out that there are aggregators out there working to solve this sort of thing.

Posted by: Nick at August 1, 2003 10:59 PM

I don't know what caused the problem. Certainly it started as comment spam. I went into Movable Type to delete it, and it seems to have replaced the current post with itself. That seems to be either a bug in MT or a piece of code inserted into the comment designed to bugger things up. Either way - thank god for Google, who had a cached copy of the page already. So it was pretty easy to go and copy and paste it back into the site...

Posted by: Tom Coates at August 2, 2003 11:20 AM

Nice piece Tom. I've had a number of different people email asking for country-specific versions of Blogdex. While I'd love to offer a sort of international version, my initial experimentation into automatic classification came up bust. Without a large enough sample for each language, and with so many bi- and tri-lingual weblogs out there, the edges just aren't so well defined. One of the major problems with the prominence of English weblogs is simply that countries which have not reached critical mass are filled with webloggers reading and quoting the English hegemony. You're right in thinking that metadata is the easiest solution -- tack another one onto the list of things Atom can solve if implemented fully.

Posted by: cameron at August 2, 2003 8:59 PM

Many interesting thoughts above! Due to the low numbers of blogs outside the US I think there could be a case made for establishing non-US versions of Blogdex, etc... at least until individual nations have built up their blogging communities, and merit their own dedicated services. Such non-US services would also be simpler to establish and act as more of a honeypot for US readers interested in a wider range of issues and sites than the big blogs tend to cover (US politics, tech, personalities, blogs, etc).

Posted by: Matt Prescott at August 6, 2003 4:27 AM


© 1999-2007 Tom Coates