
Response Time vs. Latency

7 April 2005

Some people use the terms response time and latency interchangeably when talking about software performance. It is important to distinguish between them.

Latency is the delay incurred in communicating a message (the time the message spends “on the wire”). The word latent means inactive or dormant, so the processing of a user action is latent while it is traveling across a network.

Latency typically cannot be reduced through changes to your code. Latency is a resource issue, affected by hardware adequacy and utilization.

Example: The latency in a phone call is the amount of time from when you ask a question until the other party hears it. If you’ve ever talked to somebody on a cell phone while standing in the same room, you’ve probably experienced latency firsthand: you can see their lips moving, but what you hear in the phone is delayed.

Response time is the total time it takes from when a user makes a request until they receive a response.

Response time can be affected by changes to the processing time of your system and by changes in latency, which occur due to changes in hardware resources or utilization.

Example: The response time in a phone conversation is the amount of time it takes for you to ask a question and get a response back from the person you’re talking to.

Processing time is the amount of time a system takes to process a given request, not including the time it takes the message to get from the user to the system or the time it takes to get from the system back to the user.

Processing time can be affected by changes to your code, changes to systems that your code depends on (e.g. databases), or improvements in hardware.

Example: The processing time in a phone conversation is the amount of time the person you ask takes to ponder the question and speak the answer (after hearing the question, of course).

In these terms:
Latency + Processing Time = Response Time
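The relationship can be sketched in a few lines of code. The timing values below are hypothetical, just to make the arithmetic concrete:

```python
# Illustrative breakdown of a single request's response time.
# The numbers are made-up example values, not measurements.

network_latency = 0.040   # seconds the request and reply spend "on the wire"
processing_time = 0.150   # seconds the system spends handling the request

response_time = network_latency + processing_time
print(f"response time: {response_time:.3f} s")  # prints "response time: 0.190 s"
```

The point of the decomposition is that code changes move `processing_time`, while `network_latency` is largely fixed by the hardware and the network.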

In many cases, you can assert that your latency is negligible, making your response time and your processing time pretty much the same. I guess it doesn’t matter what you call things as long as everybody involved in your performance analysis understands these different aspects of the system. For example, it is useful to make a graph of latency vs. response time, and it is important for all the parties involved to know the difference between the two.

UPDATE: Some people have pointed me to books that support the definitions I’ve made. See:

– Concurrent Programming In Java by Doug Lea (chapter 1.3.3)
– Patterns of Enterprise Application Architecture by Martin Fowler (Introduction)

Both Lea’s and Fowler’s definitions of latency corroborate mine. Fowler’s definitions are pretty much the same as what I’ve laid out here. Fowler additionally defines responsiveness as the amount of time it takes to hear back something (anything), and points out that this may be the most important performance measurement from a user’s perspective. The only difference is that Lea’s definition of response time matches Fowler’s definition of responsiveness. I tend to like Fowler’s definition better.

Also, if you do a search for latency (try googling: “define:latency”) most of the definitions you get will agree with mine: latency is the amount of time a request spends “on the wire”.


  • Anonymous said:

    Thanks a lot for this piece of information. I was pondering over the differences between the latency and response time for quite some time now.

  • Maddy said:

    I have another example for this Latency and Processing Time.

    Ten people each send a single-sheet print job to a printer, and I am the last one to submit mine.
    The printer takes 10 seconds to print each sheet.

    So for me.

    Latency = 90 Seconds
    Processing Time = 10 Seconds
    Response Time = 90 + 10 = 100 Seconds.

    But for the first guy in the queue,

    Latency = 0 Seconds (as he is first in the queue)
    Processing Time = 10 Seconds
    Response Time = 0 + 10 = 10 Seconds
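    Maddy’s queue arithmetic can be sketched as a tiny model (using the numbers from the comment; treating queue wait as latency is the comment’s assumption):

    ```python
    # Sketch of the printer-queue example: ten people, one sheet each,
    # 10 seconds per sheet; time spent waiting in the queue counts as latency.

    SECONDS_PER_SHEET = 10

    def timings(position_in_queue):
        """Return (latency, processing, response) for a 1-based queue position."""
        latency = (position_in_queue - 1) * SECONDS_PER_SHEET  # jobs ahead of you
        processing = SECONDS_PER_SHEET                         # printing your sheet
        return latency, processing, latency + processing

    print(timings(10))  # last in queue:  (90, 10, 100)
    print(timings(1))   # first in queue: (0, 10, 10)
    ```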

  • Okoduwa James said:

    It was a good experience to learn the subtle difference between these two words. Hoping to see more details on the topic.
    My regards to you all.

  • LJ said:

    In response to Maddy’s analogy…excellent, thank you.

  • Vinny said:

    Hi all, I have a question. How would you interpret the correlation between network latency and screen refresh rate? For example, would a game character controlled by a human player use more frames to go from point A to point B on a computer screen if the monitor (and all the hardware involved) was capable of 120 Hz versus 60 Hz?

  • Dzmitry Kashlach said:

    Thank you, good article. In addition, I’d like to share a piece of advice on how to write a good load report.

  • Nil said:

    This is very useful, and the language you use is very simple. Thank you very much.

  • Ankit said:

    In VoIP communication, latency, processing time, and response time can be defined by the example below:

    Person A sends a message to Person B at 12:15.
    Person B receives the message at 12:16 (so the latency is 1 min).

    Person B takes 3 minutes to compose a response and replies at 12:19 (so the processing time is 3 mins).

    Person A hears the reply at 12:20 (latency 1 min).

    Total response time is 1 + 3 + 1 = 5 mins (the time from the first message being sent to the reply being heard).

  • Armando said:

    Thank you for breaking it down to a simple and immediately understandable description. I consider myself highly technical and have read many technical definitions of latency and response time, all of which create more confusion. Your explanation is simple, clear and gives me what I need. As for technicalities not covered here, I can fill those in.


  • Akash said:

    Really, it helped me a lot. The example part is excellent. Thanks for clearing up my doubt.