Reported response times higher than observed

Why does Flood report a much higher response time than what I can see manually via my browser?

So you've started testing with Flood and you notice that the response times being reported in the dashboard are quite high. You visit your application manually on your local machine but are unable to reproduce the high response times that Flood reports. Why could this be?

The most common reason is node over-utilization.

If that common cause doesn't fit your circumstances, there are a few other reasons this might be happening. Keep reading for the full list.

1. Latency

If you're wondering why your grid node in Frankfurt is reporting higher response times than what you can see through your browser in North America, the answer may be latency. Latency is the time it takes for a request to be sent to your server and back, and is affected by geographical distance.

To see whether this is the reason, rerun your test with a grid located closer to you.

2. Unreliability of "user feel"

Sometimes it's difficult to put a number to how fast your application seems to respond when you're manually simulating a user accessing it through a browser. A page may appear to load very quickly while other resources, such as JavaScript files, continue to load in the background, even though the site looks fully formed. Instead of guessing, we recommend quantifying the response time using tools such as your browser's Developer Tools or a proxy sniffer like Fiddler. If you're not familiar with either of those, even running your script as a single user locally and looking at the elapsed time is better than trying to estimate how long a page took to load.

This way, you are able to attach a number to the response time that you can more accurately use for comparison purposes.

Note: Make sure you take response time measurements at the same time that your flood is running; otherwise you're comparing a full-load scenario to a single-user scenario, which is apples and oranges.
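If you'd rather quantify this in code than eyeball it in DevTools, a minimal sketch is to time a full request from your own machine. This is an illustrative helper, not a Flood feature; the URL in the comment is hypothetical.

```python
import time
import urllib.request

def time_request(url: str) -> float:
    """Return the wall-clock time (seconds) to fetch a URL and read the body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()  # include the body download, not just time-to-first-byte
    return time.perf_counter() - start

# Example (hypothetical URL):
# elapsed = time_request("https://your-app.example.com/login")
# print(f"Response time: {elapsed:.3f} s")
```

Note that this measures a single page or endpoint from one location, so it is only a rough baseline for comparison against Flood's numbers.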

3. Grid node over-utilization

The reported response times may be higher due to your grid nodes struggling to keep up with the requests, which means they're using up too much memory or CPU. When this happens, the results are unreliable because the grid nodes themselves are the bottleneck, not the application you're testing.

How can you check to see if this is happening in your case?

Check your grid nodes' resource utilization while your flood is running. Click on Grids, select the node you'd like to check, and scroll down to the Info tab to see the CPU, Memory, and IO Utilization.

What we usually see happening is the CPU utilization of the node staying at or close to 100%. If this is happening, it means your grid node is being over-utilized.

What can you do to fix this?

Consider adding think time to your script. Very often we see scripts that don't have any think time at all. Consider the timing of requests sent to your application in production. Is it realistic to assume requests will be received almost simultaneously?

In most cases, the answer is no. Running a test with no think time unduly stresses the grid nodes; it's almost as if you were testing the grid nodes rather than your application. To prevent this, add think time to your scripts. For JMeter, we recommend using a Gaussian Random Timer to space out your requests. This more accurately simulates users interacting with your site: typing their details into a form, then clicking a button that sends the request.
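If your tool doesn't have a built-in equivalent of JMeter's Gaussian Random Timer, the idea is easy to replicate: each pause is a constant offset plus a normally distributed deviation. Here's a minimal Python sketch; the 3000 ms offset and 1000 ms deviation are illustrative values, not recommendations.

```python
import random
import time

def gaussian_think_time(offset_ms: float = 3000.0, deviation_ms: float = 1000.0,
                        rng: random.Random = random) -> float:
    """Return a randomized think time in milliseconds, modelled on JMeter's
    Gaussian Random Timer: a constant offset plus a normally distributed
    deviation. Negative samples are clamped to zero."""
    delay = offset_ms + rng.gauss(0.0, deviation_ms)
    return max(delay, 0.0)

def pause_between_requests() -> None:
    """Sleep for a randomized think time before sending the next request."""
    time.sleep(gaussian_think_time() / 1000.0)
```

Randomizing the pause (rather than sleeping a fixed interval) avoids every virtual user firing in lockstep, which is what hammers both the grid node and your application.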

If your answer was yes, and your application really does receive near-simultaneous requests, another option is to increase the number of nodes. If one node is struggling to keep up with sending the requests, increase the number of nodes you use for each test; you can do this easily via the UI at the start of the test. Then, during the test, check the resource utilization again to see whether the CPU is still hitting 100%. If so, keep adding nodes until the CPU utilization starts to drop. Here are some recommended user counts per node for every tool we support, but we still suggest you baseline the performance of your script on a single node.

If you get to the point that the CPU utilization is at a more reasonable level and you're still seeing response times higher than what you observe, keep reading. You may be seeing another issue.

4. Response time spikes are skewing the average

This one relates to how Flood calculates data points. For simplicity, Flood reports the average response time per transaction over the last 15 seconds. This means that even if you have 100 requests for one transaction in the last 15 seconds, Flood records only one data point: the average response time of those 100 requests.

But what happens when one request out of 100 has a response time of 3 minutes and the rest of the response times are 1 second? In this case, the average response time will be calculated as follows:

(99 requests × 1 second) + (1 request × 180 seconds) = 279 seconds
279 seconds ÷ 100 requests = 2.79 seconds

As you can see, the average response time can be reported as 2.79 seconds even though 99% of the requests took 1 second: a single very slow measurement is enough to skew the average.
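The worked example above can be reproduced in a few lines using Python's standard library, which also shows why the median and percentiles are more robust summaries when outliers are present:

```python
# 99 requests at 1 s, plus one outlier at 180 s (3 minutes).
from statistics import mean, median, quantiles

response_times = [1.0] * 99 + [180.0]

avg = mean(response_times)                  # pulled up to 2.79 s by the outlier
med = median(response_times)                # 1.0 s: what the typical user saw
p95 = quantiles(response_times, n=100)[94]  # 95th percentile, also 1.0 s here

print(f"mean={avg:.2f}s  median={med:.2f}s  p95={p95:.2f}s")
```

The mean lands at 2.79 s while the median and 95th percentile stay at 1 s, which is exactly the mismatch described above.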

How can you check to see if this is happening in your case?

Download the Archived Results for the flood in question. These are your raw results, exactly as JMeter outputs them. The raw data will show every single request made during your test.

Analyse the data using a spreadsheet or visualisation tool. Isolate requests belonging only to the transaction in question and compute the average response time. It may also be easier to visualise the measurements in a graph. This allows you to see at a glance whether there were any outliers in your test with regards to response time.
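If you prefer a script to a spreadsheet, here's a minimal sketch that reads a JMeter CSV (JTL) file and summarizes one transaction. It assumes the default JMeter CSV header, which includes a `label` column (sampler or transaction name) and an `elapsed` column (response time in milliseconds); the file name and transaction name in the example are hypothetical.

```python
import csv
from statistics import mean

def transaction_stats(jtl_path: str, label: str) -> dict:
    """Summarize response times for one transaction in a JMeter CSV (JTL) file.

    Assumes the default JMeter CSV header, including 'label' and 'elapsed'
    (response time in milliseconds).
    """
    elapsed = []
    with open(jtl_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["label"] == label:
                elapsed.append(int(row["elapsed"]))
    if not elapsed:
        raise ValueError(f"no samples found for label {label!r}")
    return {
        "count": len(elapsed),
        "mean_ms": mean(elapsed),
        "max_ms": max(elapsed),  # a max far above the mean points to outliers
    }

# Example (hypothetical file and transaction name):
# print(transaction_stats("archived-results.csv", "Login"))
```

A maximum far above the mean is the quickest sign that one or two slow requests, rather than uniformly slow responses, are driving the reported average up.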

What can you do to fix this?

Unfortunately, we can only offer one-second resolution at an additional cost, and only to enterprise customers at this stage, as the computing requirements for that are significant. We would love to increase the granularity of results, but that's something we are still exploring.

In the meantime, if you really need that granularity, you can work around this by manually downloading your raw results and calculating your average response time and other metrics separately.