Load Testing Metrics

Source : http://loadstorm.com

There are many measurements that you can use when load testing. The following metrics are key performance indicators for your web application or web site.

  • Average Response Times
  • Peak Response Times
  • Error Rates
  • Throughput
  • Requests per Second
  • Concurrent Users



Average Response Time

When you measure every request and its response, you have round-trip data: what was sent from the browser and how long the target web application took to deliver what was needed.

For example, one request will be for a web page, say the home page of the web site. The load testing system simulates the user's browser by sending a request for the "home.html" resource. On the target's side, the web server receives the request and makes further requests of the application to dynamically build the page. When the full HTML document is compiled, the web server returns that document along with a response header.

The Average Response Time takes into consideration every round trip request/response cycle up until that point in time of the load test and calculates the mathematical mean of all response times.
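As a minimal sketch of that calculation, the Python below keeps a running mean that can be read at any point during a test. The class name `RunningAverage` and the sample timings are assumptions made for illustration; they are not taken from any particular load testing tool.

```python
class RunningAverage:
    """Tracks the mean of every response time recorded so far."""

    def __init__(self) -> None:
        self.count = 0
        self.total_seconds = 0.0

    def add(self, response_seconds: float) -> None:
        # Record one completed request/response round trip.
        self.count += 1
        self.total_seconds += response_seconds

    @property
    def mean(self) -> float:
        # Mean of everything seen up to this point; 0.0 before any data.
        return self.total_seconds / self.count if self.count else 0.0


avg = RunningAverage()
for seconds in (0.5, 0.25, 0.75):  # hypothetical response times
    avg.add(seconds)
print(avg.mean)  # 0.5
```

Keeping only a count and a total means the average can be updated per response without storing every sample.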

The resulting metric reflects the speed of the web application being tested - the BEST indicator of how the target site is performing from the users' perspective. The Average Response Time includes the delivery of HTML, images, CSS, XML, JavaScript files, and any other resource being used. Thus, the average will be significantly affected by any slow components.

Response times can be measured as either:

  • Time to First Byte
  • Time to Last Byte

Some people like to know when the first byte of the response is received by the load generator (simulated browser). This shows how long the request took to get there and how long the server took to start replying. However, that is only part of the real equation. It seems to be much more valuable to know the entire cycle of response that encompasses the duration of download for the resource. Meaning, why would I want to know only part of the response time? What is most important is what the user experiences, and that includes the delivery of the full payload from the server. A user wants to see the HTML page - which requires receipt of the full document. So the Time to Last Byte would be preferred as a Key Performance Indicator (KPI) over Time to First Byte.
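The difference between the two measurements can be sketched with three timestamps per request. The `RequestTiming` class and its field names are hypothetical, and the timestamps are made-up values in seconds.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Timestamps (in seconds) captured by a load generator for one request."""
    sent_at: float
    first_byte_at: float
    last_byte_at: float

    @property
    def time_to_first_byte(self) -> float:
        # Travel time of the request plus the server's time to start replying.
        return self.first_byte_at - self.sent_at

    @property
    def time_to_last_byte(self) -> float:
        # Full cycle, including the download of the whole payload.
        return self.last_byte_at - self.sent_at


timing = RequestTiming(sent_at=10.0, first_byte_at=10.25, last_byte_at=11.5)
print(timing.time_to_first_byte)  # 0.25
print(timing.time_to_last_byte)   # 1.5
```

In this made-up case the user waits 1.5 seconds for the document even though the server started replying after 0.25 seconds.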

Peak Response Time

Similar to the previous metric, Peak Response Time measures the round trip of a request/response cycle. However, the peak tells us the LONGEST cycle up to this point in the test.

For example, if the graph shows a Peak Response Time of 12 seconds at 5 minutes into the load test, then we know that one of our requests took that long. The average may still be sub-second because our other resources responded quickly.
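A small worked example shows how one slow request sets the peak while barely moving the average. The sample data is invented: 99 requests at 0.2 seconds and one at 12 seconds.

```python
# Hypothetical response times in seconds: 99 fast requests and one slow one.
response_times = [0.2] * 99 + [12.0]

peak = max(response_times)
average = sum(response_times) / len(response_times)

print(f"peak: {peak:.1f} s")        # peak: 12.0 s
print(f"average: {average:.3f} s")  # average: 0.318 s
```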

The Peak Response Time shows us that at least one of our resources is potentially problematic. It can reflect an anomaly in the application where a specific request was mishandled by the target system. Usually though, an "expensive" database query is involved in fulfilling a certain request, such as a page, and makes it take much longer. This metric is great for exposing those issues.

Typically images and stylesheets are not the slowest (although they can be when a mistake is made like using a BMP file). In a web application, the process of dynamically building the HTML document from application logic and database queries is usually the most time intensive part of the system. It is less common, yet occurs more often with open source apps, to have very slow JavaScript files because of their enormous size. Large files can produce slow responses that will show up in Peak Response Time, so be careful when using big images or calling big JS libraries. Many times, you really only need less than 20% of the JavaScript inside those libraries. Lazy coders won't take the trouble to clean out the other 80%, and that will hurt their system performance.

Error Rate

It is to be expected that some errors may occur when processing requests, especially under load. Most of the time you will see errors begin to be reported when the load has reached a point that exceeds the web application's ability to deliver what is necessary.

The Error Rate is the mathematical calculation that produces a percentage of problem requests to all requests. The percentage reflects how many responses are HTTP status codes indicating an error on the server, as well as any request that never gets a response.
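The calculation might look like the sketch below. Two choices here are assumptions rather than anything a specific tool prescribes: a request that never got a response is represented as `None`, and any status code of 400 or above is counted as an error.

```python
from typing import Optional, Sequence


def error_rate(statuses: Sequence[Optional[int]]) -> float:
    """Return the percentage of problem requests among all requests.

    Each entry is an HTTP status code, or None when no response arrived.
    Assumption: codes >= 400 count as errors.
    """
    if not statuses:
        return 0.0
    errors = sum(1 for status in statuses if status is None or status >= 400)
    return 100.0 * errors / len(statuses)


# Hypothetical results: eight good responses, one 500, one with no response.
results = [200, 200, 302, 500, None, 200, 200, 200, 200, 200]
print(error_rate(results))  # 20.0
```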

The web server will return an HTTP Status Code in the response header. Normal codes are usually 200 (OK) or something in the 3xx range indicating a redirect on the server. A common error code is 500, which means the web server knows it has a problem with fulfilling that request. That of course doesn't tell you what caused the problem, but at least you know that the server knows there is a definitive technical defect in the functioning of the system somewhere.

It is much trickier to measure something you never receive, so an error code can be reported by the load testing tool for a condition not indicated by the server. Specifically, the tool must wait for some period of time before it quits "listening" for a response. The tool must determine when it will "give up" on a request and declare a timeout condition. A timeout of this kind is not a code received from the web server, so the tool must choose a code such as 408 to represent the timeout error.

Other errors can be hard to describe because they do not occur at the HTTP level. A good example is when the web server refuses a connection at the TCP network layer. There is no way to receive an HTTP Status Code for this, thus the load testing tool must choose some error code to use for reporting this condition back to you in the load testing results. A code of 417 is what LoadStorm reports.

Error Rate is a significant metric because it measures "performance failure" in the application. It tells you how many failed requests are occurring at a particular point in time of your load test. The value of this metric is most evident when you can easily see the percentage of problems increase significantly as the higher load produces more errors. In many load tests, this climb in Error Rate will be drastic. This rapid rise in errors tells you where the target system is stressed beyond its ability to deliver adequate performance.

No one can define the tolerance for Error Rate in your web application. Some testers consider less than 1% Error Rate successful if the test is delivering greater than 95% of the maximum expected traffic. However, other testers consider any errors to be a big problem and work to eliminate them. It is not uncommon to have a few errors in web applications - especially when you are dealing with thousands of concurrent users.

Throughput

Throughput is the measurement of bandwidth consumed during the test. It shows how much data is flowing back and forth from your servers.

Throughput is measured in units of Kilobytes Per Second.
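As a rough sketch, throughput over an interval is the number of bytes transferred divided by the elapsed time. The function name is hypothetical, and treating a kilobyte as 1,024 bytes is an assumption; some tools use 1,000.

```python
def throughput_kb_per_sec(total_bytes: int, elapsed_seconds: float) -> float:
    """Kilobytes transferred per second over a measurement interval.

    Assumption: 1 kilobyte = 1024 bytes.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_bytes / 1024 / elapsed_seconds


# Hypothetical interval: 3,072,000 bytes moved in 60 seconds.
print(throughput_kb_per_sec(3_072_000, 60))  # 50.0
```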

Requests per Second

RPS is the measurement of how many requests are being sent to the target server. It includes requests for HTML pages, CSS stylesheets, XML documents, JavaScript libraries, images and Flash/multimedia files.
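One way to derive RPS from raw data is to count requests in one-second buckets. The sketch below assumes each request has a send time in seconds measured from the start of the test; the function name and sample times are made up.

```python
from collections import Counter
from typing import Dict, Iterable


def requests_per_second(send_times: Iterable[float]) -> Dict[int, int]:
    """Count requests sent in each whole second of the test.

    send_times are non-negative offsets in seconds from the test start.
    Seconds with no requests are simply absent from the result.
    """
    return dict(Counter(int(t) for t in send_times))


# Hypothetical send times for six requests (pages, images, CSS, and so on).
print(requests_per_second([0.1, 0.4, 0.9, 1.2, 1.8, 3.5]))
# {0: 3, 1: 2, 3: 1}
```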

RPS will be affected by how many resources are called from the site's pages. Some sites can have 50-100 images per page, and as long as these images are small in size (e.g. <25k),>

Concurrent Users

Concurrent users is the most common way to express the load being applied during a test. This metric measures how many virtual users are active at any particular point in time. It does not equate to RPS because one user can generate a high number of requests, and each vuser will not constantly be generating requests.

A virtual user does what a "real" user does as specified by the scenarios and steps that you have created in the load testing tool. If there are 1,000 vusers, then there are 1,000 scenarios running at that particular time. Many of those 1,000 vusers may be spawning requests at the same time, but there are many vusers that are not because of "think time". Simply put, think time is the pause between vuser actions that simulates what happens with a real user as he or she reads the page received before clicking again.
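To see why concurrent users and RPS differ, a back-of-the-envelope estimate can help. The formula below is an assumption, not something a load testing tool reports: it supposes every vuser loops steadily through "send requests, wait for responses, think", so each vuser completes one step every `think_seconds + response_seconds`.

```python
def estimated_rps(vusers: int,
                  think_seconds: float,
                  response_seconds: float,
                  requests_per_step: int = 1) -> float:
    """Rough steady-state requests per second generated by a group of vusers.

    Assumption: each vuser repeats a cycle of one step (requests_per_step
    requests) followed by think time, with no ramp-up and no errors.
    """
    cycle_seconds = think_seconds + response_seconds
    if cycle_seconds <= 0:
        raise ValueError("cycle time must be positive")
    return vusers * requests_per_step / cycle_seconds


# Hypothetical scenario: 1,000 vusers, 9.5 s think time, 0.5 s responses.
print(estimated_rps(1000, 9.5, 0.5))  # 100.0
```

Under these made-up numbers, 1,000 concurrent users produce only about 100 requests per second, because most vusers are in think time at any given moment.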

Other Thoughts on Load Testing Metrics

The SOA Testing blog lists the most important load testing metrics in its context as:

* Response time: It's the most important parameter to reflect the quality of a Web Service. Response time is the total time it takes after the client sends a request till it gets a response. This includes the time the message remains in transit on the network, which can't be measured exclusively by any load-testing tool. So we're restricted to testing Web Services deployed on a local machine. The result will be a graph measuring the average response time against the number of virtual users.

* Number of transactions passed/failed: This parameter simply shows the total number of transactions passed or failed.

* Throughput: It's measured in bytes and represents the amount of data that the virtual users receive from the server at any given second. We can compare this graph to the response-time graph to see how the throughput affects transaction performance.

* Load size: The number of concurrent virtual users trying to access the Web Service at any particular instance in an interval of time.

* CPU utilization: The amount of CPU time used by the Web Service while processing the request.

* Memory utilization: The amount of memory used by the Web Service while processing the request.

* Wait Time (Average Latency): The time it takes from when a request is sent until the first byte is received.
