Archive for the ‘Wordpress’ Category
As mentioned earlier, WordPress.com has moved from LiteSpeed to Nginx, the little lightning-fast server from Russia, for its frontend serving needs. Matt had mentioned that they were quite happy with LiteSpeed, but wanted to move to something else purely to have their entire stack run on open source software.
It is a huge boost for Nginx, which has in any case been growing at a rapid pace in terms of adoption in recent years, especially as a reverse proxying solution for the Ruby on Rails crowd. What is quite interesting is that WordPress.com is running the development version of the software (0.6.29) rather than the stable one (0.5.35). There is, though, no clarity on whether Nginx is being used purely as a reverse proxy for WordPress.com, or whether it is actually serving PHP too through the FCGI route.
According to Netcraft, the switchover was made on the 11th of April.
Automattic, the company that created the WordPress.com blogging platform and oversees the WordPress.org open source project, has rejected a $200 million acquisition offer, says multiple sources. — Techcrunch
I’d said a while ago that even the $300 million figure quoted by Rafat earlier was not the right valuation for the company. Automattic does not only create software that helps people (and now a fair number of publications) publish content; they also organise it in a consistent manner using global tags and categories. They are, in fact, a content management, content delivery and allied services company, which is very different from how the world thinks of them: as a blog software company.
If they stick it out in the long run by themselves (no reason why they should not do it, since they have not taken on board any major funding in a while now, which is a good indication that the company is doing well in terms of cash flow/reserves), they would be worth a whole lot more.
Update: More evidence that WordPress is much more than a blog software company. How long will it be before they get an actual editor to run that page and push out more organised content?
A while ago, WordPress.com took down the ‘Feed Stats’ module that used to help users see the traffic a blog’s feed was getting and the breakdown by client. Going by the responses (210 comments on that thread), it was missed by quite a few, and it was a page I used to rely on heavily.
Ever since Google Reader started the practice of reporting subscription numbers in its User Agent header, a lot of other web-based clients have started doing the same (I checked this against our internal server logs, and a majority of them do), and it has been a good way of seeing actual and sustained usage of the feeds. While the Feed Stats module never reported subscription numbers, it still gave us a fair idea of how much usage there was, even if it was not exactly uniques or absolute uniques.
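As a rough illustration of how such counts can be pulled out of server logs, here is a minimal Python sketch. The sample User Agent strings, the pattern and the function names are assumptions made for illustration; real clients differ in how they word this header, so the pattern would need checking against your own logs.

```python
import re

# Matches phrases such as "12 subscribers" or "1 subscriber" in a User Agent.
# The exact wording differs between feed readers; this pattern is an assumption.
SUBSCRIBERS_RE = re.compile(r"(\d+)\s+subscribers?", re.IGNORECASE)


def subscribers_from_user_agent(user_agent):
    """Return the subscriber count reported in a User Agent, or None if absent."""
    match = SUBSCRIBERS_RE.search(user_agent)
    return int(match.group(1)) if match else None


def total_subscribers(user_agents):
    """Sum the highest count seen per client.

    A client is identified, crudely, by the text that precedes the count.
    """
    best = {}
    for ua in user_agents:
        match = SUBSCRIBERS_RE.search(ua)
        if match is None:
            continue  # ordinary browsers and readers that report nothing
        client = ua[:match.start()].strip()
        best[client] = max(best.get(client, 0), int(match.group(1)))
    return sum(best.values())


# Illustrative log entries, not taken from real traffic.
SAMPLE_USER_AGENTS = [
    "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 12 subscribers; feed-id=1)",
    "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 15 subscribers; feed-id=1)",
    "ExampleReader/1.0 (3 subscribers)",
    "Mozilla/5.0 (X11; Linux)",
]
```

Taking the highest count per client, rather than adding every log line, avoids counting the same subscribers again each time a reader polls the feed.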
Now, it is entirely possible to use something like Feedburner for this, but the problem is that since the free WordPress.com offering does not allow template editing, you can’t point the autodiscovery links anywhere other than the WordPress.com feed. Moreover, even in cases where you can edit the templates, clients that are still requesting the old feed URLs need to be served a 301 redirect to the right URL, which is not possible right now.
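For anyone running their own server, the redirect part would look roughly like the sketch below. It is not something WordPress.com exposes; the paths, the target URL and the function name are all hypothetical.

```python
# Hypothetical mapping from old feed paths to the new feed location.
FEED_REDIRECTS = {
    "/feed/": "http://feeds.example.com/myblog",
    "/feed/atom/": "http://feeds.example.com/myblog",
}


def respond_to(path):
    """Return (status_line, headers) for a request path.

    Old feed paths get a permanent redirect, so well-behaved clients update
    their stored URL. Everything else gets a 200 here, purely to keep the
    sketch short.
    """
    target = FEED_REDIRECTS.get(path)
    if target is not None:
        return "301 Moved Permanently", [("Location", target)]
    return "200 OK", []
```

A 301 rather than a 302 matters here: it tells the client the move is permanent, so existing subscribers end up being counted at the new feed URL.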
Even though Matt has commented that the feature may return someday, it is very important for people who track their traffic closely to have this information and making this feature available would make our lives considerably easier and WordPress.com that much better.
What we have here today is Matt Mullenweg, who runs the entire WordPress show via Automattic, making good on his promise and answering a couple of pesky questions I had for him regarding WordPress.com.
Interesting comments from the answers include content filtering (I mean it as a positive rather than a negative, considering how little adult content on WordPress.com gets on to the homepages, while some of the best adult blogs I know are hosted on it), the usage of a CDN called Netli for content caching, and rough numbers regarding usage in India, which is by far the highest I’ve ever seen anywhere.
If you are keen on helping blogging in India in the local languages, they are looking for translators.
And as an interesting aside, this is the first interview-ish sort of thing I’ve done in a long time, coming after I’d hung up my journalistic shoes about three years ago, and on my own blog at that. Strange to see how the tables have turned.
Meanwhile, here’s a big thanks to Matt. Read on:
1) How big is the uptake for WordPress.com in India (Since you guys are one of the oddest web ops in terms of sharing internal numbers, it would be nice if you could give the breakdown in terms of sign-ups, regular users (average 3-4 logins a week), and probably a percentage figure and numbers of the traffic originating from India)?
We don’t have a breakdown for sign ups in that detail but Google analytics does say we get about 221,984 page views from India, mostly from Maharashtra. The best I can tell we have a few thousand blogs that self-classify as being in an Indian language like Hindi.
2) Do you have more goodies lined up for India, beyond improving on the serving infrastructure? I remember seeing a lot of blogs in the heydays of Blogger with a lot of bloggers blogging in the local languages. Are you planning to reach out to these bloggers?
I’m very open to suggestions in that regard, the obvious things after making the service faster is local support forums and a better translation of the interface. (People can donate translations at translate.wordpress.com.)
3) Is the VSNL IP that we’ve seen before just a co-located server you have in the IDC or is it a part of the infrastructure of a CDN? Or even better, is Automattic now getting into the CDN business?
It’s part of a dynamic CDN called Netli, which accelerates our dynamic pages and provides standard CDN static caching stuff as well.
4) How well is WordPress.com dealing with its current growth? After all, the code base you have for the framework is still the single user/multi-user WordPress installation. Is it stable enough for me to dump five years worth of blogging on to the framework?
The growth has surprised everyone, even myself. We now regularly break 4 million page views a day and serve close to 350 reqs a second. (Non-cached.) I think our current infrastructure is in a very good place right now, thanks mostly to the work of Barry. We’ve got a robust setup in Dallas and are aggressively expanding in other DCs in the US.
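To put those two figures side by side, here is a quick back-of-the-envelope calculation in Python. The only inputs are the numbers quoted in the answer; treating 350 requests a second as a rate sustained all day is an assumption, since the quoted figure may well be a peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

page_views_per_day = 4_000_000      # figure quoted in the answer
uncached_requests_per_second = 350  # figure quoted in the answer

# Average page views per second if traffic were spread evenly over the day.
avg_page_views_per_second = page_views_per_day / SECONDS_PER_DAY

# Daily request volume if 350 req/s were sustained all day (an assumption;
# the quoted rate may be a peak rather than an average).
requests_per_day_if_sustained = uncached_requests_per_second * SECONDS_PER_DAY

print(f"{avg_page_views_per_second:.1f} page views/s on average")
print(f"{requests_per_day_if_sustained:,} requests/day at a sustained 350 req/s")
```

That works out to about 46.3 page views a second on average. Sustaining 350 requests a second for a whole day would mean 30,240,000 requests.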
5) Even when the framework handles the stress well, there are other factors you need to keep an eye out for: storage, bandwidth, billing (for premium accounts) etc. Do you guys have the leeway to scale it to handle, let us assume, a 4x growth across all those variables?
All of that is pretty easy, except for storage which can be tricky to synchronize cross-datacenter. Right now we have a method that works well, but I’d like something cleaner before we get into terabytes of files.
6) WordPress.com has stopped redirecting all requests to wp-login.php to the encrypted SSL URL. Not quite a smart thing to do, you’d agree, even if it has sped up the pages a lot. You could probably do what Yahoo!, Google and Microsoft do: force the authentication part to redirect to the SSL version and switch back to plain HTTP for the other admin pages.
Yep, that’s definitely something we’re looking at. I had no idea when we originally added SSL that it would be such a pain.
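The approach suggested in question 6 can be sketched as a small redirect rule. This is only an illustration in Python, not how WordPress.com implements anything; the function name and the example hostname are hypothetical, and treating wp-login.php as the only SSL-only path is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Paths that must always be served over SSL. wp-login.php comes from the
# question above; treating it as the only such path is an assumption.
SSL_ONLY_PATHS = {"/wp-login.php"}


def redirect_target(url):
    """Return the URL to redirect to, or None if the request can be served as is."""
    parts = urlsplit(url)
    needs_ssl = parts.path in SSL_ONLY_PATHS
    if needs_ssl and parts.scheme != "https":
        # Force the login page on to the encrypted URL.
        return urlunsplit(parts._replace(scheme="https"))
    if not needs_ssl and parts.scheme == "https":
        # Switch every other page back to plain HTTP.
        return urlunsplit(parts._replace(scheme="http"))
    return None
```

For example, `redirect_target("http://example.wordpress.com/wp-login.php")` returns the `https://` version of the same URL, while a plain HTTP request for an admin page returns `None` and is served unchanged.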
7) Why was the SSL option dropped anyway? Were you guys using an add on crypto processor or was it being handled by the acceleration appliance?
It had nothing to do with processing speed, our dashboard traffic is light enough that it wasn’t a problem at all, it all had to do with user speed. The increased latency and sucky client caching of SSL content made the admin interface just crawl, especially for international folks, and in our testing we found nothing helped as much as just turning off SSL.
8) Why LiteSpeed?
It’s the fastest and most robust web server we’ve tested. The only thing I dislike about it is that it’s not open source, and we’re seriously considering replacing it purely for philosophical reasons. It’s the only non-OS app in our stack.
9) Will WordPress.com ever give up PHP/MySQL and move into the Java scheme of things? At this point an application server would surely look like an enticing prospect to you?
Never, that would be the biggest waste of time.
10) Is there any sort of content filtering on WordPress.com ? There are plenty of sex blogs and adult content being served from the framework, but these don’t tend to show up normally on the regular WordPress.com listing pages.
Yes we pretty aggressively filter mature blogs from public listings to try and keep the front page and admin areas PG-rated.
Right after a comprehensive write up on Myspace’s troubles in scaling up their infrastructure to meet the needs of a million teenagers competing to make some of the ugliest and heaviest pages on the internet, comes a post by Matt on Automattic’s troubles getting Sun to make good on their promises made under their Startup Essentials Program. I had written earlier about how Sun has made some surprise incursions into what’s been traditionally Microsoft and LAMP territory in the media sector, but if what Matt’s saying is true (BTW, I still have not received my free Solaris 10 DVDs they’d promised to ship from Sun.de), it is a bad move on Sun’s part. Promising and not delivering is a lot worse than not promising at all, be it love or be it business.
That apart, I continue to be amazed by WordPress.com’s growth and stability. At least from the point of view of a user, it does not feel anything like what Matt says it is (an operation run on a tight budget), and for the first time I am not considering anything else to satisfy my blogging needs. I am one of those bloggers who are technically competent enough to run their own server/framework etc. But after a day of dealing with, among other things, servers spread over three full racks, the last thing I need is to manage one for myself. And that was the primary reason why I’ve always stuck with Blogger. Nothing beats the luxury of having your data hosted on Google’s servers and even after the recent troubles with Gmail, I am yet to experience any data loss with the company.
I do not know exactly what setup WordPress.com is using (I am assuming that all the reads are handled out of one DC and the writes out of another, with redundant high-speed links handling the MySQL replication between the two and another framework handling all the media files), but it’s been very impressive to date and it should be a lesson for Myspace in scaling and redundancy (yeah, I know the scales we are talking about are very different, but just do a comparison of the outages/costs and you’ll know what I am talking about).
From my own experience, I’ve always leaned towards PostgreSQL in the database wars. Since my first brush with the elephant in 2002, I’ve never experienced any data loss, even when someone pulled the plug on the server, while the case has been different with MySQL. Moreover, in the current setup, I’ve seen our PostgreSQL servers handling around 2500 connections (mixed read and write) without breaking a sweat (load level 2.5 average) on a single server, while the same load on a single MySQL server causes it to throw its hands up and say “I give up.” Of course, MySQL is very fast when you offload the reads on to a cluster of slaves, but that brings the cost factor into the equation. That, I’ll tackle in a different post.
I’ve been seeing individual blogs on WordPress.com being intermittently served out of a VSNL IP address (188.8.131.52). The IP address is from a range not normally seen on sites served out of VSNL, which usually belong to the 203.199.xxx.xxx range (which, if my memory serves me right, is the Prabhadevi IDC). The WordPress IP has a reverse DNS of vashi-netli-idc-ill184.108.40.206.static.vsnl.net.in and, as you can imagine, it belongs to the Vashi IDC.
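The range comparison above is easy to reproduce with Python’s standard `ipaddress` module. This is a minimal sketch: writing 203.199.xxx.xxx as a /16 network is my reading of the notation, and the constant and function names are hypothetical.

```python
import ipaddress

# 203.199.xxx.xxx from the text, written as a /16 network (an assumption
# about how that notation maps on to a CIDR block).
USUAL_VSNL_RANGE = ipaddress.ip_network("203.199.0.0/16")


def in_usual_range(ip):
    """True if the address falls inside the 203.199.xxx.xxx block."""
    return ipaddress.ip_address(ip) in USUAL_VSNL_RANGE
```

Calling `in_usual_range("188.8.131.52")` returns `False`, which is the mismatch described above. The reverse DNS name itself can be looked up with `socket.gethostbyaddr`, which needs a working resolver and will return whatever the PTR record says today.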
WordPress.com has been fiddling around with its content serving infrastructure for a while now, initially using mirrored multihomed servers out of Arizona and San Diego IDCs and, from what I can guess, recently switching to a content accelerator such as BigIP or ServerIron switches, thus drastically reducing the number of IPs that resolve to the wordpress.com subdomains.
The geeky parts apart, I would really like to know what’s prompted this India-specific change for WordPress.com. One of the most obvious explanations is that they could have signed up with Akamai or Limelight Networks for their CDN services. But Akamai has been serving Indian content for their clients from their Reliance IDC IPs, and from what I vaguely remember this was not the IP block they used in India even on their network in VSNL. Limelight does not specialise in serving text content; they are multimedia content specialists, and even their known client, Rajshri, is being served from their IDC in Arizona. And from the little I once discussed with the Limelight guys, they still don’t have any POPs in India.
If it is due to an uptake in traffic and registrations from India, that would be something worth noticing on WordPress because it has not been an easy task to get a blogging framework to do well in India and having a dedicated serving infrastructure within India would be excellent news for Indian WordPress.com users.
On a related note, Google has joined the ranks of Microsoft and Yahoo! in having a dedicated cluster for serving India. Yahoo! Mail users and old Hotmail users would fondly remember the cluster-based subdomains like f18.mail.yahoo.com and baym-cs237.msgr.hotmail.com. These days, a ‘netstat -a’ gives interesting Google-related reverse DNS entries like po.in.f125.google.com and in.in.f19.google.com. These are interesting times indeed.