Performance-Testing

Friday, March 18, 2016

Practical mathematics

For a brief while, I used to be a mathematician. But believe me, statistics was never my strong suit! So this article offers some very basic thoughts on a way to get more out of performance testing with really very little effort.

I've come up against a few real-world situations where performance test analysis is almost complete and test script development is about to begin. The analysis has yielded a breakdown of the client's activities into business processes and given an indication of the average sizes of the data objects those processes handle: for example, "12 rows on an invoice" or "10 items in an order". You're using a realistic spread of data and you know that by measuring performance for enough transactions, you'll get valid results. Then the client points out that they care quite a lot about the invoices or orders that are much larger than the average. Perhaps these represent more important customers or suppliers. And checking your figures, you find that when you look at just the top 5% of the objects, their average is indeed much larger: they could average 50 or 100 items, say.
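A quick back-of-the-envelope check makes the problem concrete. Here's a minimal sketch in Python (the item counts are invented purely for illustration) showing how a skewed distribution lets the overall average hide a much larger top 5%:

```python
import statistics

# 100 invoices: mostly small, with a handful of very large ones.
item_counts = sorted([10] * 90 + [12] * 5 + [60, 70, 80, 90, 100])

overall_mean = statistics.mean(item_counts)
# Top 5% of objects (at least one, for very small data sets).
top_5_percent = item_counts[-max(1, len(item_counts) // 20):]

print(f"Overall average: {overall_mean:.1f} items")                     # 13.6
print(f"Top 5% average:  {statistics.mean(top_5_percent):.1f} items")   # 80.0
```

An overall average of about 14 items looks comfortably close to the "10 items in an order" assumption, yet the largest orders are several times bigger, which is exactly the gap the client is worried about.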

You feel a bit sheepish at this point, but you know you're dealing with a fairly skewed distribution, and you're secretly pleased this question was asked before it was too late. So what can you do?

An approach I've often used during performance test analysis is to divide the data into two sets: one representing the top 10% and one representing the remaining 90%. You may have to identify test data that matches the criteria, or perhaps load enough fresh data to represent the "normal" and "large" items. Then, when you set up your load test, you include two copies of the script, one using only "normal" data and one using only "large" data. If you're lucky, the load test tool will automatically collect performance data for the two sets of test users side by side - if not, you may need to use a trick such as changing the transaction names so you can tell apart results from the "normal" and "large" data.
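The split itself is trivial to automate. Here's a minimal sketch, assuming the test data sits in a CSV file with an "item_count" column; the file and column names are illustrative, not from any particular tool:

```python
import csv

# Read the candidate test data.
with open("orders.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Sort by size, then cut at the 90th percentile:
# the bottom 90% become "normal" data, the top 10% "large" data.
rows.sort(key=lambda r: int(r["item_count"]))
cutoff = int(len(rows) * 0.9)

for name, subset in [("normal", rows[:cutoff]), ("large", rows[cutoff:])]:
    with open(f"orders_{name}.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(subset)
```

Each script copy then points at its own data file, and the load test runs both populations concurrently.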

When you execute load tests you will find that some transactions, such as opening a line item, perform much the same in each script. But you will also see interesting effects: perhaps the "large" script takes five times as long (on average) to open 50 items as it does to open 10. You can then go back to your non-functional requirements and assess the likely impact on users. You can have much more interesting conversations with the designers and developers of the system, and spot quick ways to make significant performance improvements. You can also use monitoring tools such as CA Wily Introscope to probe the behaviour of the application during tests. So this is a technique well worth considering.
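If your tool doesn't report the two populations side by side, a few lines of post-processing will do it. A minimal sketch, assuming the tool can export raw timings to a CSV with the tagged transaction names suggested above (all file, column, and transaction names here are illustrative):

```python
import csv
from collections import defaultdict
from statistics import mean

# Group raw response times by transaction name,
# e.g. "open_items_normal" vs "open_items_large".
timings = defaultdict(list)
with open("results.csv", newline="") as f:
    for row in csv.DictReader(f):
        timings[row["transaction"]].append(float(row["response_time"]))

for txn in sorted(timings):
    print(f"{txn:<25} avg {mean(timings[txn]):6.2f}s  n={len(timings[txn])}")
```

Seeing "open_items_normal" and "open_items_large" printed next to each other makes the size effect obvious at a glance.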

There will always be situations where this simple approach does not work. For example, you may need to produce a graph showing the variation in transaction response time caused by data size. To be honest, I've only done that once in eleven years!
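If you ever do need that graph, a scatter plot of raw measurements is usually enough. A minimal sketch using matplotlib, again with illustrative file and column names:

```python
import csv
import matplotlib.pyplot as plt

# Raw measurements: one row per transaction, with its data size.
sizes, times = [], []
with open("results_by_size.csv", newline="") as f:
    for row in csv.DictReader(f):
        sizes.append(int(row["item_count"]))
        times.append(float(row["response_time"]))

plt.scatter(sizes, times, alpha=0.5)
plt.xlabel("Items per transaction")
plt.ylabel("Response time (s)")
plt.title("Response time vs data size")
plt.savefig("response_time_vs_size.png")
```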