Archive for January 15th, 2008
Clustered river of news
RSS readers have, over time, become fairly full-featured pieces of software in their own right. Most now provide the standard set of features (OPML import/export, categories, river of news and search) irrespective of their avatar, online or offline, and I have pretty much grown used to depending on my reader of choice, Google Reader, to satisfy the need to read my feeds.
That said, there is one feature I’d really love to have in my RSS reader: clustering of feed items as an additional way to categorise data, alongside the current methods of categories and tags. Think of it as a cross between your RSS reader and Google News/Techmeme. Would it not be nice to have your own little personal Google News or Techmeme built from the sources that you have picked, rather than be led by what Gabe or the kind folks at Google News may have seeded their websites with?
There are, though, a couple of problems that could make this impossible:
Processing: Any algorithm that finds similarities in text is computationally intensive, even when the data set is limited. Such algorithms can usually be made to scale when the size of the data set is reasonably fixed, but given how widely RSS subscription lists vary in size, it would be a royal pain to find an algorithm that scales both effectively and efficiently.
Entropy: Traditional similarity-matching approaches work best when they cover a single domain, so that an apple would mean apple the fruit rather than Apple the company. The entropy in the data set needs to be reasonably low for the algorithm to function well, and learning systems also need to be taught with training data, which may not be possible in this case.
Link Match: What we are then left with is to attack the problem purely by tracking outgoing links. Thankfully, this is far less computationally intensive than pure text analysis. The accuracy and utility of this approach may not be stunning, but it would certainly be good enough for the immediate purpose: a reasonable way of classifying what my subscription list is talking about.
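As a rough illustration of the link-match idea, here is a minimal Python sketch. It is only one possible reading of the approach, not a tested design: it assumes each feed item is available as an HTML string, treats two items as related if they share at least one outgoing link, and merges related items transitively with a union-find structure. The names `normalize`, `outgoing_links` and `cluster_by_links` are made up for this example, and the URL normalisation is deliberately crude.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlsplit


def normalize(url):
    """Crude normalisation (an assumption of this sketch): drop the scheme,
    a leading 'www.', the query string, the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


class LinkExtractor(HTMLParser):
    """Collects the normalised href of every <a> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                link = normalize(value)
                if link:  # skip links that normalise to nothing, e.g. "#top"
                    self.links.add(link)


def outgoing_links(html):
    """Return the set of normalised links found in one feed item."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def cluster_by_links(items):
    """items maps an item id to its HTML body.
    Returns a list of sets of item ids, largest cluster first."""
    parent = {item_id: item_id for item_id in items}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # link -> first item found citing it
    for item_id, html in items.items():
        for link in outgoing_links(html):
            if link in first_seen:
                # Two items cite the same link: merge their clusters.
                parent[find(item_id)] = find(first_seen[link])
            else:
                first_seen[link] = item_id

    groups = defaultdict(set)
    for item_id in items:
        groups[find(item_id)].add(item_id)
    return sorted(groups.values(), key=len, reverse=True)
```

Each item is parsed once and each link costs roughly one dictionary lookup, so the work grows with the number of links rather than with the number of item pairs, which is the appeal over text similarity. One caveat of this sketch: because merging is transitive, a link that nearly everyone cites would pull unrelated items into one large cluster, so a real reader would probably want to ignore links cited by too many items.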
Related articles:
RSS Clustering: A Unique Approach for Managing Your RSS Feeds
A Novel Clustering-based RSS Aggregator
Nearest Neighbors and Similarity Search by Yury Lifshits
Four Music Picks
Sunday was one of those days again when I went over to the music shop and the book store to replenish my music and book collections. To my surprise, quite a few of my recent finds have been my old friend, straight-out rock, rather than what has fascinated me in the past couple of years: trance, electronica and house.
The following are my favourite picks from the lot:
Foo Fighters: Echoes, Silence, Patience & Grace was the first of the rock lot that I had picked up in a long time. I will happily admit that the reason I picked it up was the music video for “The Pretender,” which is one hell of an explosive track. What I did not expect, though, was for the album to turn out to be as good as it did, with some six tracks worth listening to over and over again.
Most of the songs in the album start rather softly before exploding right in your face when you least expect it. The production quality is quite clean and polished, with the riffs flowing out fast and clean in what is an outstanding commercial hard rock album. Then again the problem with Foo Fighters is what Nikhil said about them recently: Foo Fighters have songs, they don’t have albums. When you see Dave Grohl, you don’t see an enigmatic rock star who has a larger-than-life presence: you see something that the cat dragged in.
Thankfully, his looks and his presence have no bearing on the quality of the music, and that is what makes it an awesome album.
Arctic Monkeys: Favourite Worst Nightmare is more like listening to someone who has a change of personality 90 seconds into every song. Most of the songs take unpredictable twists and turns out of the blue like they got hit by the realization that they needed to put ‘X’ into a song that was going down ‘Y’ and two minutes later it has to go down ‘Z’.
Thus we have riffs and progressions that take off in the middle of nowhere, speed up, slow down and go quiet on you without any prior notice, all of which is laced with a very quirky sense of humour and a thumping fat bass line that makes the tracks all the more bouncy and thump-some.
Wolfmother: If the Arctic Monkeys have trouble sticking to a single personality through their songs, Wolfmother’s self-titled album is a picture of singular purpose and poise in belonging to the dirty, unclean, rock-your-balls-off riffs of the 1970s. There is little pretense here. You will find no reason to blame the band for over-engineering the sound, and it is an absolute delight to get to listen to something that sounds like a rusty tractor running wild rather than the usual purring perfection of albums these days.
That said, I still think parts of “Woman” sound a lot like Deep Purple’s Black Night.
Neil Young: Chrome Dreams 2 is a classic. It takes balls to make an album like this in this day and age. It is an album that meanders through all the varieties of lovely music Neil has gifted us with through his long career, and it fittingly starts with the softish “Beautiful Bluebird”.
While “Boxcar” and “Ordinary People” take a different, rockier, turn, things get considerably riffier and harder with “Dirty Old Man.”
Everyone must have a copy of this album, as there is a fairly good likelihood that you’ll dust it off someday and appreciate it the way it should be appreciated.