Back in 2004, on January 11th to be exact, a website we hosted on one of our shared webservers got mentioned on Slashdot and promptly overloaded the webserver in question, as tens of thousands of people tried to access it (to make matters worse, it was very image heavy as well!).
We had to switch the website off, as it was killing all the other sites on the server. The problem was that it was so popular that the server was still struggling with the ‘page not found’ responses that people were now getting.
A few hours later it calmed down, but it got us thinking about how we could stop (or, even better, control) this happening again in the future.
Initially we said, let’s build a cluster and load balance it; that way all the webservers we run can share the load. That’ll be great! Ah, but what if the total load that a single customer gets is enough to make all the servers struggle? Unlikely, but possible (it’s pretty difficult to limit customer resources on a traditional shared hosting platform).
Then we thought, well hang on, we can just configure the load balancers (at Layer 7) to drop an individual website if it gets really busy, and the servers would be OK. That saves the rest of the customers, but penalises the one customer who is probably over the moon about the traffic they’re getting, which really sucks for them.
We were still mulling these ideas over a few weeks later when we got an enquiry from a worldwide racing organisation, looking for a hosting platform that would cope with a fairly small requirement for 12 days out of 14, but then expand to handle 5000% more on the other two days (when the races were on). We had a long think about it and reluctantly decided not to supply a quote, as the only way we would be able to provide the service was to sell them the capacity at the higher level on a permanent basis, which we knew wouldn’t be cost effective for them.
That’s when the lightbulb appeared above my head and I put these two problems together.
We needed to build a hosting platform that allowed customers to flex their requirements up and down, as frequently as they needed, on a dedicated server scale, and only pay for the service that they actually used. This would solve both of these problems: the Slashdot customer could have flexed up for a few hours, which I’m quite sure they would have happily paid for given the amount of exposure they could have had, and the racing organisation would also have found the solution very cost effective.
So that was that: we wanted to build a flexible, scalable, automated hosting platform. Brilliant, we said, we’ve got a winner here, so let’s do it.
Hrm, that’s where things started getting complicated…
(The next post will explain what happened between then, Q1 2004, and now, Q3 2007.)