Friday, 27 August 2010

Two Types of Performance




Dear Junior

In architecture one of the most important tasks is to keep an eye on the non-functional (or quality) attributes of the system. Often this importance is underlined by stakeholders holding up some non-functional requirement (NFR), saying "the system must really fulfill this". Unfortunately, these NFRs are often quite sloppily formulated, and a key example is "performance". I have stopped counting the times I have heard "we must have high performance".

I think a baseline requirement for any NFR is that it should be specific: there should be no doubt about what quality we are talking about. In my experience "performance" can mean at least two very different things. Instead of accepting requirements on "performance", I rather try to reformulate the NFR using some other wording. I have found that the "performance" asked for can often be split into two different qualities: latency and throughput.

Latency or Response Time


With latency, or response time, I mean the time it takes for some job to pass through the system. A really simple case is the loading of a web page, where the job is to take a request and deliver a response. So we can get out our stopwatch and measure the time from the moment we click "search" until the result page shows up. This latency is probably in the range of 100 ms to 10 s. Of course, this response time is crucial to keep the user happy.
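As a sketch of what that stopwatch looks like in code (the `handle_request` function here is a hypothetical stand-in for the real request handling, not an API from any particular framework):

```python
import time

def handle_request(query):
    """Hypothetical stand-in for serving a search request."""
    time.sleep(0.05)  # simulate 50 ms of real work
    return f"results for {query!r}"

start = time.perf_counter()
response = handle_request("architecture")
latency = time.perf_counter() - start  # seconds for this single request

print(f"latency: {latency * 1000:.0f} ms")
```

Note that the stopwatch brackets exactly one request: latency is always a per-job measurement.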

But latency can also be an important property even without human interaction. In the context of cron-started batch jobs, it might be the time from when the input file is read until the processing has committed to the database. The latency for this processing might have to be short enough that the result does not miss the next batch downstream. E.g. it might be crucial that the salary calculation is finished before the payment batch is sent to the bank on salary day.

In a less batch-oriented scenario the system might process data asynchronously, pulling it from one queue, processing it, and pushing it onto another queue. Then the latency will be the time it takes from data being pulled in until the corresponding data is pushed out at the other end.
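A minimal sketch of measuring that kind of latency, using Python's standard `queue` module and timestamping each item as it is pulled in (the processing step is just an illustrative placeholder):

```python
import queue
import time

inbox = queue.Queue()
outbox = queue.Queue()

inbox.put("order-42")

# Pull the data in, noting when processing starts.
item = inbox.get()
pulled_at = time.perf_counter()

result = item.upper()  # placeholder for the real processing

# Push the result out, together with the pull-to-push latency.
outbox.put((result, time.perf_counter() - pulled_at))

result, latency = outbox.get()
print(f"{result} took {latency * 1000:.2f} ms")
```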

All in all, latency is seen from the perspective of one single request or transaction. Latency talks about how fast the system is from one traveller's point of view. Latency is about "fast" in the same way as an F1 car is fast, even though it cannot carry a lot of load.

Throughput or Capacity


On the other hand, throughput or capacity talks about how much work the system can process. For example, a news information portal might have to handle a thousand simultaneous requests, because at the nine o'clock coffee break a few thousand people might simultaneously surf to that site to check out the news.

Throughput is also important in the non-interactive scenario. Each salary calculation might only take a few seconds, but how many will the system be able to process during the roughly 10 000 seconds between midnight (when it starts) and 02:45, when the bank batch leaves? If we cannot process all 50 000 employees, some will complain. To meet the goal we need a throughput of five transactions per second.
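The arithmetic behind that requirement is simple enough to check in a couple of lines:

```python
employees = 50_000       # salary calculations to run
window_seconds = 10_000  # roughly midnight to 02:45

required_throughput = employees / window_seconds
print(required_throughput)  # 5.0 transactions per second
```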

In other words, where latency was the performance from the perspective of one client or transaction, throughput is the performance seen from the perspective of the collective of all clients or transactions: how much load the system can carry. Here "load" is meant in the same way as a bus takes a lot of load, transporting lots of people at once, even if it is not fast.


Fastness vs Load Capacity

F1 cars and heavy-duty trucks are both, no doubt, "high-performing" vehicles. But they are so in completely different ways. To have an F1 car show up at the coal mine would be a misunderstanding that could only be matched by the truck at the race track.


So, I avoid talking about "performance" and risking misunderstanding. Instead I try to use "latency" and "response time" to talk about how fast things happen, while thinking about an F1 car. And I use "throughput" and "capacity" to talk about how much load the system can handle, while thinking about a bus full of people.

What is the latency for transporting yourself between Stockholm and Gothenburg using an F1 car or a public-transport bus? What is the throughput when transporting a few thousand people from Stockholm to Gothenburg using an F1 car or a public-transport bus?

Yours

    Dan


PS. Now that we are moving to multicore, we will see increasing capacity, but latency levelling out or getting worse. This in itself will be a reason to move to non-traditional architectures, where I think event-driven architecture (EDA) is a good candidate for saving the day. My presentation at the upcoming JavaZone will mainly revolve around this issue.

4 comments:

  1. Excellent post, as usual. On a side note: Some stakeholders will often mix functional and non-functional requirements like "We want a search function that works like Google". Typically this means that they want the same latency in returning the first results and not necessarily a similar algorithm.

    Also, in web application scenarios it is interesting to talk about the user-perceived latency of e.g. a specific web page. It has been shown that this is not necessarily the same as the measured time for a complete page load. It may be a good idea to have a discussion on performance goals and their quantification in specific contexts as suggested here:

    http://msdn.microsoft.com/en-us/library/bb924365.aspx

  2. event driven architecture, was reading about node.js in that respect.

    interesting article

  3. Dear Peter

    Thanks - I humbly bow.

    Functional and non-functional requirements are hard to keep apart. Something first expressed as a functional requirement can actually be a need for a non-functional quality attribute, and vice versa.

    My favourite example is "there should be a login window", which is phrased as a functional requirement. However, what is probably wanted is: "you should not be able to access any functionality at all without properly passing the login process" - a non-functional requirement.

    Thanks for pointing out the subtle and important distinction between actual and perceived performance as well as the link to the discussion on contextual quantifications of performance qualities.

    Yours /Dan

  4. Dear Anonymous

    Glad you liked the article. Event-Driven Architectures are indeed an interesting subject, which I think I will have reason to return to at some later occasion.

    Yours /Dan
