“Blogging Persai” is the title of the blog run by the Persai guys. If you needed an indication of how this post is going to proceed, a major hint would be that I was sorely tempted to title it “Flogging Persai.” For a bunch of guys who, during their Uncov.com days, were extremely trigger-happy about stamping almost everything with the dreaded “FAIL,” it is rather interesting that their own product is nothing short of a half-baked proof of concept, cobbled together for reasons that don’t go beyond, well, the fact that it can be done.
Persai, according to the founders, is an ad-supported content recommendation system. Over time, the guys have crawled a truckload of RSS feeds (there used to be a blog entry that said as much; it is no longer on the blog, but Sam Ruby has the list here), indexed and classified them, and this in turn powers the recommendation system. You can subscribe to “interests” (known as keywords to the rest of humanity) and get sources thrown at you that the system thinks are relevant to you. While you can’t do much else with the sources, since Persai does not have a built-in feed reader, you can reject sources. And that is all there is to see in Persai. Well, at least for now.
Use Case: Recommendation systems have not traditionally fared too well on the internet. Previous players like Greg Linden’s Findory did a lot more than Persai does even today, and still did not do well. In fact, Findory, rather sadly, shut shop recently. The only recommendation system that works (and it does so in a stealthy manner) is Google News, which works because it doesn’t blatantly involve you in the recommendation process.
Once you find content on Persai, there is not much to do with it. Fulfillment is a term that is at best very vague on Persai. You can, as they claim, track the topics, but those links lead out of the website anyway. Individual interests have RSS feeds that you can subscribe to, but you can already do that with Google News Alerts and other products. I doubt anyone is going to use Persai just to have search-term-driven RSS feeds.
Accuracy: The approach that Persai has taken to classification relies on training data. This approach works well on data sets similar to the training data, but the moment you deviate from that similarity, the entropy will be of a magnitude that sends the classifier on a wild goose chase. As expected, this has an adverse impact on the accuracy of the results. For instance, one of my interests, “mameo,” throws back results in which none of the first five has anything to do with Mameo. I could, of course, reject these sources and help improve Persai, but why would anyone do that when there are other avenues that provide much more accurate results?
Speed: To do classification, Persai is already using Hadoop’s MapReduce. MapReduce does an amazing job of processing huge chunks of data in a distributed manner (freshly crawled data to be indexed and classified, in this case), but it may only help Persai to a certain extent. The reason is simple: if they process interests as unique to each user, it just won’t scale. There will be numerous threads doing classification for the same interests, since each user’s copy is treated as unique.
And if the interests are not tracked as a unique item per user, it can play havoc with the results with different users rejecting different sources for different reasons. Of course, there are workarounds for it by using a mix of both approaches (classify as non-unique, filter on display by excluding user-specific rejection criteria), but in the end it ends up being a hack.
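To make the mixed approach concrete, here is a minimal sketch of what I mean. It assumes nothing about how Persai actually works: the names `shared_results`, `rejections`, `reject` and `results_for`, along with the example feed URLs, are all hypothetical. The idea is simply that each interest is classified once for everybody, and a user's rejections are applied only when the results are displayed.

```python
from collections import defaultdict

# Hypothetical shared index: interest -> ranked list of sources.
# Classified once for all users, not once per user.
shared_results = {
    "python": ["a.example/feed", "b.example/feed", "c.example/feed"],
}

# Hypothetical per-user state: (user, interest) -> set of rejected sources.
rejections = defaultdict(set)


def reject(user, interest, source):
    """Record that this user does not want this source for this interest."""
    rejections[(user, interest)].add(source)


def results_for(user, interest):
    """Return the shared results minus whatever this user has rejected."""
    rejected = rejections.get((user, interest), set())
    return [s for s in shared_results.get(interest, []) if s not in rejected]
```

One user's rejections never touch the shared list, so another user's results stay intact. The catch, and the reason I call it a hack, is that the rejections also never feed back into the classifier.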
In any case, the approach produces tremendously outdated results. Some of the interests have really old articles on top. This could also be due to the fact that the sources are manually added to the system, which means that the quality and spread of the sources will depend on the bias of the person who is selecting them. Moreover, sites without RSS feeds will not be able to get into Persai at all.
Splogs: Possibly the group that will be over the moon about Persai is the thugs who run splogs. With Persai it becomes ridiculously easy to set up automated blogs based on topics and, honestly, I see more people using Persai for this than for anything else.
Considering that Persai is still in beta, I would not give it the “FAIL” rating, but I would certainly give it the “FRAIL” rating. I hope it becomes a much better product by the time it comes out of private beta.
The blog has been suffering from semi-neglect due to the usual refrains: work and life. While I will be pushing out a few drafts that have been in the deep freeze for a while, it will take a while longer to organise things so that I can start blogging again regularly.
There are events and then there are not-so-ordinary events that give us hints, even in their disassociation, about the direction that technological (or any other type, for that matter) developments will head.
In the past week we have seen three such events – Microsoft’s formal overture towards Yahoo!, Facebook’s less-than-stellar numbers and Twitter’s ongoing saga in trying to keep a web-scale messaging framework up and running – that give us tasty hints as to where we may be headed.
The simpler, shorter version of the Microsoft – Yahoo! story is that companies that do business in the old school way – a manner similar to a behemoth, clumsy and ugly in gait – are history on the internet. Lock-in of the user and his/her data to platforms or products is a strategy that is history. It is only a stellar product that will keep companies alive in the future. And neither Microsoft nor Yahoo! has built an in-house, web-scale hit product in recent times.
The feeling that keeps coming back to my mind is that Microsoft and Yahoo! will be one of those weddings that look perfect as a mental image (for the shareholders and business wonks), but end up being an absolute nightmare in practice. There is a staggering amount of redundancy (for almost every Yahoo! product you can think of, there is a competing one from MSN/Live.com) and the integration will also be rotten in terms of platforms and cultures.
Even if you set aside the strong stench of desperation in the move, the fact remains that these are two companies that are struggling to catch the imagination of the younger, upcoming generation. By the time the dust settles on this one, much confusion will have ensued, which will tick off the loyal users who make up the vast majority of the numbers that make the deal look exciting.
That said, it is indeed sad to see an internet icon like Yahoo! in the position that it finds itself in now. And in that state of distress lies a lesson for everyone who makes a living off the internet – don’t take anything for granted. Earlier, a company’s lifecycle – from inception to success to demise – used to take decades; now the same is being compressed into ten years.
It is a theme that I will never tire of telling everyone I know: being nimble is a priceless asset in doing business now – nurture it, grow it and covet it with as much care as you covet your bottom line.