High Scalability - How High is REALLY High?


If you check out the links on the right you'll see one for High Scalability - Building bigger, faster, more reliable websites, which I follow closely for a number of reasons. It's easily the most concentrated source of information on website scalability there is, but before you can really appreciate that you should understand what is meant by the word scalability. Wikipedia defines it as the ability of a system or network either to handle growing amounts of work in a graceful manner or to be readily enlarged in order to do so. While the term "readily enlarged" is unfortunate as it triggers every spam filter in existence, think instead of a rubber band that can stretch without breaking and you have the picture.

High Scalability is a treasure trove of information about what is "under the hood" at some of the most popular and well-known websites in the world and what makes them capable of such amazing feats of scalability. For example, I found it extremely interesting that "PlentyOfFish", a free dating site with millions of users, ran on only a single server managed by one guy until recently. And reading the history of the eBay site gave me a completely different picture from the one I had in my head - it evolved over quite some time and took a number of strange turns along the way. Not at all what I imagined, but it works and that's what counts.

On the subject of being able to handle growing amounts of work in a graceful manner, what better proof would there be than a million users showing up in a four-minute period? First of all, it would blow right past whatever built-in limits are normally defined in the operating system, web server and database - one of the many useful things about High Scalability is how it uncovers those things and shows you how to fix them before they bite you in the arsenal. Windows is especially notorious for setting internal limits at arbitrarily low values and waiting for you to discover them the hard way (like when all of your visitors are treated to an ugly "500 server error" message, for example - definitely not graceful).
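
To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. A million users in four minutes works out to about 4,167 new arrivals per second. The 30-second average session length and the connection limit of 1,024 are assumptions picked purely for illustration, not figures from any real server, and the concurrency estimate uses Little's Law (arrival rate times time spent in the system), which is only an approximation for a short burst like this one.

```python
def arrival_rate(total_users: int, window_seconds: float) -> float:
    """Average new arrivals per second, assuming arrivals are spread evenly."""
    return total_users / window_seconds


def peak_concurrency(total_users: int, window_seconds: float,
                     session_seconds: float) -> float:
    """Estimate simultaneous users via Little's Law (rate x time in system).

    The estimate is capped at total_users, since no more than that ever arrive.
    """
    estimate = total_users * session_seconds / window_seconds
    return min(float(total_users), estimate)


def exceeds_limit(total_users: int, window_seconds: float,
                  session_seconds: float, connection_limit: int) -> bool:
    """True if the estimated concurrency is above a configured limit."""
    concurrent = peak_concurrency(total_users, window_seconds, session_seconds)
    return concurrent > connection_limit


if __name__ == "__main__":
    users, window = 1_000_000, 4 * 60   # one million users in four minutes
    session = 30                        # assumed average session length (seconds)
    limit = 1024                        # hypothetical built-in connection limit

    print(f"{arrival_rate(users, window):.0f} new users per second")
    print(f"{peak_concurrency(users, window, session):.0f} users at once")
    print("limit exceeded:", exceeds_limit(users, window, session, limit))
```

Under those assumptions the estimated concurrency lands around 125,000, which is why a default limit in the low thousands turns into a wall of error pages long before the hardware itself runs out of steam.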

What High Scalability consistently reminds us is that you can have all the capacity in the world but not be scalable if you don't pay attention to all the relevant details. That's why I think of scalability testing as a cross between performance testing and functional testing. You should never deploy a new server without subjecting it to a scalability test designed to uncover whatever artificial limits may be lurking underneath and to discover what real and practical limits there are on throughput and so forth.
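
One way to picture such a test is as a stepped ramp: raise the load a notch at a time and watch for the point where throughput stops climbing. The sketch below is only an illustration of that idea, not how any particular tool works. `simulated_server` is a made-up stand-in with a hidden cap of 500 requests per second, and `find_throughput_ceiling` and its parameters are hypothetical names; in a real test the `measure` function would drive actual traffic at the server and report what it observed.

```python
from typing import Callable, Optional, Tuple


def simulated_server(concurrent_users: int, hidden_cap: int = 500) -> int:
    """Made-up stand-in for a real system.

    Serves one request per user per second until it hits a hidden cap,
    the kind of artificial limit a scalability test is meant to expose.
    """
    return min(concurrent_users, hidden_cap)


def find_throughput_ceiling(
    measure: Callable[[int], int],
    start: int = 100,
    step: int = 100,
    max_users: int = 2000,
) -> Optional[Tuple[int, int]]:
    """Step the load up and return (users, throughput) at the last level
    before throughput stopped increasing, or None if it never flattened."""
    previous = measure(start)
    users = start + step
    while users <= max_users:
        current = measure(users)
        if current <= previous:
            # Adding users bought no extra throughput: a limit was hit.
            return users - step, previous
        previous = current
        users += step
    return None


if __name__ == "__main__":
    result = find_throughput_ceiling(simulated_server)
    if result is None:
        print("No ceiling found in the tested range")
    else:
        users, throughput = result
        print(f"Throughput flattened at {throughput} req/s with {users} users")
```

Whether the ceiling it finds is an artificial limit or a real one is the interesting part: if raising a configuration setting moves the plateau, it was artificial.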

I've asked Todd Hoff at High Scalability to help me find a suitable candidate for the first one million user load test but I'm sure he won't find anything as good as diapers.com. Nonetheless, being first means being the first of many and I hope to be doing these around the clock before long. In fact I'm at work on an icon that companies can display on their websites indicating that they have passed a CapCal Crash Test of whatever number of users they desire. Of course, with Amazon EC2 there to allow capacity to be r e a d i l y e n l a r g e d, that number could be very, very big indeed - once we've done a million users we'll be aiming for 10 million and so on.
