Blue Screen Of Duds

Where the alter ego of codelust plays

Archive for February 2008

Decentralized Social Data Framework: A Modest Proposal

with 3 comments

Twitter being down is no longer funny, nor is it even news anymore, and the same goes for Twitter-angst, where loyal users fret and fume about how often the service is down. One of the interesting suggestions to come out of this is to create a decentralized version of Twitter – much along the lines of IRC – to bring about much better uptime for the beleaguered child of Obvious Inc.

I would take the idea a lot further and argue that all social communication products should gradually turn into aggregation points. What I am proposing is a new social data framework – let us call it HyperID, since it would borrow heavily from the ideas and concepts behind OpenID – which social media websites would subscribe to, push data into and pull data from.

Essentially, this would involve publishing the user’s social graph as the universal starting point for services and websites to subscribe to, rather than the current approach, where everyone struggles to aggregate disparate social graphs as the end point of all activities. Ergo, we are currently addressing the wrong problem in the wrong place.

The current crop of problems will only be addressed when we stop pulling data into aggregators and start pushing data into service and messaging buses. Additionally, since this data is replicated across all subscriber nodes, it should also provide us with much better redundancy.
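
To make the bus idea concrete, here is a minimal Python sketch of the push model – the class names, node names and methods are placeholders of my own making, not any existing API. An update is pushed once to the user’s bus, which fans it out to every subscriber node, so each node ends up holding its own replica:

    class SocialDataBus:
        def __init__(self):
            self.subscribers = []  # subscriber nodes, e.g. twitter, pownce

        def subscribe(self, node):
            self.subscribers.append(node)

        def push(self, update):
            # Fan the update out; every node keeps its own copy, which
            # is where the redundancy comes from.
            for node in self.subscribers:
                node.receive(update)

    class SubscriberNode:
        def __init__(self, name):
            self.name = name
            self.store = []  # local replica of the user's data

        def receive(self, update):
            self.store.append(update)
            print(f"{self.name} stored: {update}")

    bus = SocialDataBus()
    bus.subscribe(SubscriberNode("twitter"))
    bus.subscribe(SubscriberNode("pownce"))
    bus.push("Joe User posted a status update")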

Problem Domain 

Identity: Joe User on Twitter may not always be the same as Joe User on Facebook. This is a known problem that makes discovery of content, context and connections tricky and often downright inaccurate. Google’s Social Graph API is a brave attempt at addressing this issue using XFN and FOAF, but it won’t find much success because it is initiated at the wrong end, and also because it is at best an educated guess – and you don’t make educated guesses with people’s personal data or connections.
 
Disparate services: Joe User may only want to blog and not use photo sharing on the same platform, unlike Jane User who uses an entire gamut of services. In an even worse scenario, if Jane User wants to use blogs on a particular service provider (say, Windows Live Spaces) and photo sharing on another (Flickr, for instance), she will have to build and nurture different trust systems, contacts and reputation levels.

Data retention: Yes, service providers are now warming up to the possibility of letting users pull their data out, but what you get is often stripped of metadata and of the data accrued over time (comments, tags, categories and so on). Switching providers often leaves you having to do the same work all over again.

Security: Social information aggregators currently collect and save information by asking you for your usernames and passwords on other services. This is not a sane way to work (the phishing risk is extremely high), and it is downright illegal at times, when it involves HTML scraping and unauthorized access.

Proposed solution

[Figure: HyperID layout]

Identity, identity, identity: Start with OpenID as the base of HyperID. Users will be uniquely addressable by means of URLs: Joe User can always be associated with his URL (http://www.joeuser.com/id/), independent of the services he has subscribed to. Connections made by Joe User will also resolve to other OpenIDs. In one sweep, you no longer have to scrape, crawl or guess to figure out someone’s connections.
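
As a rough sketch of how resolution could work – the /social endpoint and the JSON shape here are pure assumptions for illustration; the actual vocabulary is taken up below:

    import json
    import urllib.request

    def fetch_connections(openid_url):
        # Hypothetical endpoint: a user's graph lives under their own URL.
        with urllib.request.urlopen(openid_url.rstrip("/") + "/social") as resp:
            graph = json.load(resp)
        # Every connection resolves to another OpenID URL, so there is
        # no scraping, crawling or guessing involved.
        return graph.get("connections", [])

    # for friend in fetch_connections("http://www.joeuser.com/id/"):
    #     print(friend)  # e.g. http://www.janeuser.com/id/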
 
Formalize a social (meta)data vocabulary: Existing syndication formats like RSS and Atom are usually used to publish text content. There are extensions of these formats, like Media RSS from Yahoo!, but none of them address the social data domain.

Of the existing candidates, the Atom Publishing Protocol seems the most amenable to an extension of this kind, covering the most common social data requirements. Additional, site-specific extensions can be added by means of custom namespaces that define them.
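To give a flavour of what such an extension could look like, here is a Python sketch that builds an Atom entry carrying a connection in a custom namespace – the namespace URI and element names are invented for illustration, not part of any spec:

    import xml.etree.ElementTree as ET

    ATOM = "http://www.w3.org/2005/Atom"
    HS = "http://example.org/hyperid/social"  # invented namespace URI

    ET.register_namespace("", ATOM)
    ET.register_namespace("hs", HS)

    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = "http://www.joeuser.com/id/"
    ET.SubElement(entry, f"{{{ATOM}}}title").text = "Joe User"

    # Social data carried in the custom namespace:
    conn = ET.SubElement(entry, f"{{{HS}}}connection", rel="friend")
    conn.text = "http://www.janeuser.com/id/"

    print(ET.tostring(entry, encoding="unicode"))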

You host your own social graph: With a common vocabulary, pushing, pulling and subscribing to data across different providers and subscribers should become effortless. This would also mean that you can, if you want to, host your own social graph (http://www.janeuser.com/social) or leave it to service providers who will do it for you. I know that SixApart already does this in part with the Action Streams plugin, but it is still a pull rather than a push service.

Moreover, we could extend the autodiscovery convention used for RSS feeds and use it to point to the location of the social graph, which is a considerably better and easier solution than the one proposed by Google’s Social Graph API.
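
Discovery could then work exactly the way feed autodiscovery works today: a link element in the page head. A short Python sketch – the rel value "social-graph" is entirely my own invention:

    from html.parser import HTMLParser

    class SocialGraphDiscovery(HTMLParser):
        def __init__(self):
            super().__init__()
            self.graph_url = None

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "link" and attrs.get("rel") == "social-graph":
                self.graph_url = attrs.get("href")

    page = ('<html><head><link rel="social-graph" '
            'href="http://www.janeuser.com/social"></head></html>')
    parser = SocialGraphDiscovery()
    parser.feed(page)
    print(parser.graph_url)  # http://www.janeuser.com/social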

Extend and embrace existing tech: Extend and leverage existing technologies like OpenID and Atom to authenticate users and to advertise the services available to them, depending on their access levels.
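
A trivial sketch of that last point, with the service catalogue and access tiers invented for illustration – once the requester has been authenticated via OpenID, the provider advertises only what that access level permits:

    # Invented service catalogue keyed by access level:
    SERVICES = {
        "public": ["profile", "blog"],
        "friend": ["profile", "blog", "photos"],
        "family": ["profile", "blog", "photos", "location"],
    }

    def advertise_services(access_level):
        # Unknown or unauthenticated requesters fall back to public.
        return SERVICES.get(access_level, SERVICES["public"])

    print(advertise_services("friend"))  # ['profile', 'blog', 'photos']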

What this could mean

For companies: They will have to change the way they look at usage, data and their own business models. Throwing away locked-in logins would be a scary thing to do, but in return you get better quality and better-profiled usage.

In the short run, you are looking at existing companies turning themselves into data buses. In the longer run, it should be business as usual.

Redundancy: Since your data is replicated across different subscribers, you can push updates out to different services and assign fallbacks (primary subscriber: Twitter; secondary: Pownce; and so on).

Subscriber applications can cache advertised fallback options and try known options if the primary ones are unavailable. 
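
As a sketch – the push functions here are stand-ins for whatever the real subscriber APIs would be – the fallback logic is as simple as walking the advertised list in priority order:

    def push_with_fallback(update, subscribers):
        for name, push in subscribers:
            try:
                push(update)
                return name  # delivered; report which node took it
            except ConnectionError:
                continue  # node is down, try the next fallback
        raise RuntimeError("all subscribers unreachable")

    def twitter_push(update):
        raise ConnectionError("down again")

    def pownce_push(update):
        print("pownce stored:", update)

    print(push_with_fallback(
        "status update",
        [("twitter", twitter_push), ("pownce", pownce_push)],
    ))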

For users: They will need to sign up with a HyperID provider, or host one on their own if they are savvy enough. On the surface, though, it should all be business as usual, since a well-executed API and vocabulary should do the heavy lifting behind the scenes.
 
The Opportunity

For someone like WordPress.com, diversifying into the HyperID space would be a natural extension. They could even call it Socialpress. The hypothetical service would have a dashboard-like interface to control your settings, subscriptions and trusted users, and an API endpoint specific to each user.

Risks

Complexity: Since data is replicated and pushed out to different subscribers, controls will have to be granular by default, and managing them across different providers could prove very cumbersome.

Security: Even though attacks against OpenID have not been a matter of concern so far, extending it would bring with it the risk of opening up new fronts in what is essentially a simple identity verification mechanism.

Synchronization: Since there is data replication involved (bi-directional, as in any decent framework), there is the possibility of lag. Improperly implemented HyperID-compliant websites could, in theory, retain data that should have been deleted across all subscribed nodes.

Traction: Without widespread support from the major players the initiative just won’t go anywhere. This is even more troublesome because it involves bi-directional syncing and all the parties involved are expected to play nice. If they don’t, it just won’t work. We could probably get into certification, compliance and all that jazz, but that would make it insanely complicated.

Exceptions: We are assuming here that users would want to aggregate all of their activity under a single identity. I am well aware that there are valid use cases where users may not want to do that. HyperID does not prevent them from opting out; in fact, you could use different HyperIDs, or even specify which services you don’t want published at all.

Feedback

The comment space awaits you!
 
p.s.: Apologies for the crappy graphic that goes with the post. I am an absolute newbie with OmniGraffle and it shows!

Written by shyam

February 4, 2008 at 1:46 pm

Sign o’ the times

with one comment

There are events, and then there are not-so-ordinary events that give us hints, even in their disassociation, about the direction in which technological developments (or any other kind, for that matter) will head.

In the past week we have seen three such events – Microsoft’s formal overture towards Yahoo!, Facebook’s less-than-stellar numbers and Twitter’s ongoing saga in trying to keep a web-scale messaging framework up and running – that give us tasty hints as to where we may be headed.

The simpler, shorter version of the Microsoft – Yahoo! story is that companies that do business the old-school way – in the manner of a behemoth, clumsy and ugly in gait – are history on the internet. Locking users and their data into platforms or products is a strategy that is history too. Only a stellar product will keep companies alive in the future, and neither Microsoft nor Yahoo! has built an in-house hit web-scale product in recent times.

The feeling that keeps coming back to me is that Microsoft and Yahoo! will be one of those weddings that look perfect as a mental image (for the shareholders and business wonks) but turn out to be an absolute nightmare in practice. There is a staggering amount of redundancy (for every Yahoo! product you can think of, there is almost always a competing one from MSN/Live.com), and the integration will be rotten in terms of both platforms and cultures.

Even if you set aside the strong stench of desperation in the move, the fact remains that these are two companies struggling to catch the imagination of the younger, upcoming generation. By the time the dust settles on this one, much confusion will have ensued, ticking off the loyal users who make up a vast majority of the numbers that make the deal look exciting.

That said, it is indeed sad to see an internet icon like Yahoo! in the position it finds itself in now. And in that state of distress lies a lesson for everyone who makes a living off the internet: don’t take anything for granted. Earlier, a company’s lifecycle – from inception to success to demise – used to take decades; now the same is being compressed into ten years.

It is a theme that I will never tire of telling everyone I know: being nimble is a priceless asset in doing business now – nurture it, grow it and covet it with as much care as you covet your bottom line.

Written by shyam

February 4, 2008 at 12:32 pm

Leopard Tip: Flushing DNS cache

leave a comment »

On Leopard, the Tiger-era “lookupd -flushcache” no longer works; instead, you have to run “dscacheutil -flushcache”.

Written by shyam

February 1, 2008 at 12:20 pm

Posted in Apple, Leopard, OSX
