Performance Testing in the Dinosaur Age


Can you imagine a full Broadway production that only runs for a single night? It would have to be a really bad show! But that’s exactly what load testing used to be like, and still is if you automate the old-fashioned way. Writing test scripts, setting up computers to generate the load, running the test scripts by hand, and evaluating the results is like a really awful Broadway production. Problem is, the next time you make changes to your application you have to do the whole thing all over again. At least a bad show never comes back—this one does, but not often enough to keep the whole cast and crew hanging around!

That reminds me of a story. Back in 1981 (the dinosaur age in Internet time), I worked as a software test engineer for a company called Basic Four in Irvine, California. Basic Four manufactured a “minicomputer” that was the size of a small refrigerator and about half as fast as that old 386 machine in your closet. As many as 100 dumb terminals could be attached to a single machine and used for data entry tasks (or so the marketing guys claimed). My job was to prove that the computer really could support 100 terminals.

Proving that it couldn’t did not seem to be an option. So I dutifully wheeled in several racks of CRT terminals, exactly 100, from the manufacturing floor and painstakingly connected all of the cables and cords. My boss cheerfully suggested that we post an announcement on the cafeteria bulletin board, asking for volunteers from the factory floor to help us with the test. We could even offer free pizza to entice them!

“Forget it,” I said, “most of the people on the floor don’t speak English and have better taste in food. And it seems like a silly thing to do if we have to repeat it every time we change the software—can we really afford that much pizza? If we offered free beer we would draw more people for sure, but it might mess up the test results if they throw up on the keyboard or get tipsy. So I’ll just throw together a little program to write to the disk drive and display characters on the CRT—at least it will be similar to what happens in real life.”

So that’s what I did. One after the other, I logged in at each terminal and started up my test program. The screen began to fill up with dots and then I moved on to the next one. I began to notice that the dots on all the screens moved just a little bit slower with each one I added. By the time I got to about 30, they were barely even moving at all. “Oops!” I thought, “My test program must be too demanding on the CPU—back to the drawing board.”

So I rewrote the test program to be a bit more realistic. I started with what a fast typist might do (say, eighty words a minute) and calculated the average delay between keystrokes. Then I wrote a routine that put the program to sleep for that period of time and woke it back up long enough to send a dot to the screen. After eighty such iterations, I told it to send eighty bytes of data to the disk to simulate saving a record. Then, of course, I had to go back to all the terminals, one by one, to kill the previous version of the program that was running and start up the new one.
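The logic of that second test program translates readily into a modern language. What follows is only a rough Python sketch, not the original RandyTest, and every name in it (`keystroke_delay`, `virtual_typist`, the `screen` and `disk` parameters) is invented for illustration. It also assumes the common convention of five characters per "word," which the story does not specify. Under that assumption, eighty words a minute works out to 60 / (80 × 5) = 0.15 seconds between keystrokes.

```python
import time


def keystroke_delay(words_per_minute, chars_per_word=5):
    """Average seconds between keystrokes for a typist at the given speed.

    chars_per_word=5 is an assumed convention, not a figure from the story.
    """
    return 60.0 / (words_per_minute * chars_per_word)


def virtual_typist(words_per_minute, records, screen, disk,
                   record_size=80, sleep=time.sleep):
    """Simulate one data entry clerk at one terminal.

    Sleeps between keystrokes, echoes a dot per keystroke to `screen`,
    and writes `record_size` bytes to `disk` after every `record_size` dots.
    """
    delay = keystroke_delay(words_per_minute)
    for _ in range(records):
        for _ in range(record_size):
            sleep(delay)           # idle until the next "keystroke"
            screen.write(".")      # send one dot to the terminal
        screen.flush()
        disk.write(b"x" * record_size)  # simulate saving a record
```

Calling `virtual_typist(80, records=1, screen=sys.stdout, disk=open("records.dat", "ab"))` (after `import sys`) would take about 12 seconds: 80 keystrokes at 0.15 seconds each, followed by one 80-byte write. The `sleep` parameter is there so the timing can be swapped out when checking the logic without waiting.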

By then it was nighttime, and my boss showed up with pizza. (Was it the great philosopher Dilbert who said, “Pizza is the opiate of the bosses”?) “How’s it going?” he asked. I explained what I was doing, and he seemed very excited about the whole thing. “That’s great! Just think of all the money we’ll save on beer and pizza!”

“Now that you mention it, there’s that little matter of the bonus you promised me,” I laughed. “Here, make yourself useful. You start on that end and I’ll start on this end. Just log in to each one, start up RandyTest in the main directory, and then go on to the next one.”

Thrilled to be doing something “technical,” he complied. This time the speed of the dots didn’t degrade as much every time a new terminal was activated. But when we got to around 50, something weird happened: the “refrigerator’s” disk light suddenly started flashing furiously and all the dots slowed to a crawl! Apparently, I had overwhelmed the hard disk by having so many high-speed “virtual typists” save so many records so quickly. It has been my job for many years to torture computers, but in this case I needed a realistic load test that focused on the terminals, not the disk.

I went back and adjusted my routine to simulate a range of slower typists. The adjustment allowed me to get almost 70 terminals up and running before the disk began to thrash and the screens froze up. By then my boss had finished his pizza and gone home, after congratulating me on a job well done. Well done? Depends on how you look at it, I guess. I couldn’t see any way I could possibly back up our marketing claims without assuming that the average data entry clerk can’t type faster than a nine-year-old! I had to set the typing speed to about ten words a minute before 100 terminals could run at once, and even then the dots were barely moving. But hey, that’s what exaggerated marketing claims are all about, right?
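A back-of-the-envelope calculation shows how much the typing speed changes the total load. The Python sketch below is illustrative only: the function name `aggregate_load` is made up, and it again assumes five characters per word. The 80-byte record saved every 80 keystrokes comes from the test program described above. It does not model the actual minicomputer or its disk, so the numbers show relative load, not the point at which the machine gave out.

```python
def aggregate_load(terminals, words_per_minute, chars_per_word=5, record_size=80):
    """Return (keystrokes per second, record writes per second) across all terminals.

    chars_per_word=5 is an assumption; one record is saved per
    `record_size` keystrokes, as in the test program in the story.
    """
    keystrokes_per_second = terminals * words_per_minute * chars_per_word / 60.0
    records_per_second = keystrokes_per_second / record_size
    return keystrokes_per_second, records_per_second


for terminals, wpm in [(50, 80), (100, 10)]:
    keys, records = aggregate_load(terminals, wpm)
    print(f"{terminals} terminals at {wpm} wpm: "
          f"{keys:.1f} keystrokes/s, {records:.2f} record writes/s")
# 50 terminals at 80 wpm: 333.3 keystrokes/s, 4.17 record writes/s
# 100 terminals at 10 wpm: 83.3 keystrokes/s, 1.04 record writes/s
```

Under these assumptions, 100 typists at ten words a minute generate only about a quarter of the keystrokes and disk writes that 50 typists at eighty words a minute do.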

The art and science of load testing has evolved quite a bit since those days, but the basics are still the same. Can a system or network of a given capacity deliver adequate performance under real-world usage conditions? There is only one way to find out for sure, and that means putting the application “on stage” to see how it really performs. Fortunately, we’ve at least taken most of the legwork out of the process these days, with the sophisticated load testing tools that are available. And we don’t even have to haul 100 (or 10,000) terminals into the lab to do it. Only the best Broadway shows run night after night—no messy setup and teardown with each production. If you can do your load testing this way, you are way ahead of the curve.

(This article was first published by StickyMinds.)
