What is latency?

Published: 15 August 2023
Contributor: Michael Goodwin

Latency is a measurement of delay in a system. Network latency is the amount of time it takes for data to travel from one point to another across a network. A network with high latency will have slower response times, while a low-latency network will have faster response times.

Though in principle data should traverse the internet at nearly the speed of light, in practice data packets move at a somewhat slower rate because of delays caused by distance, internet infrastructure and other variables.1 The sum of these delays makes up a network’s latency.


Why does network latency matter?

Maintaining a low-latency network is important because latency directly affects productivity, collaboration, application performance and user experience. The higher the latency (and the slower the response times), the more these areas suffer. Low latency is especially crucial as companies pursue digital transformation and become increasingly reliant on cloud-based applications, services and the Internet of Things.

Let’s start with an obvious example. If high network latency causes inadequate application performance or slow load times for your clients, they are likely to look for alternative solutions. Now more than ever, individual and enterprise users alike expect lightning-fast performance. If your organization uses enterprise applications that rely on real-time data pulled from different sources to make resourcing recommendations, high latency can create inefficiencies. These inefficiencies can negatively impact the applications' performance and value.

All businesses prefer low latency. However, in industries and use cases that depend on sensor data or high-performance computing, like automated manufacturing, video-enabled remote operations (think cameras used in surgeries), live streaming or high-frequency trading, low latency is essential to the endeavor’s success.

High latency can also cause wasteful spending. Let’s say you aim to improve application and network performance by increasing or reallocating your compute, storage and network resource spend. If you fail to address existing latency issues, you might end up with a larger bill without realizing any improvement in performance, productivity or customer satisfaction.

How is latency measured?

Network latency is measured in milliseconds by calculating the time interval between the initiation of a send operation from a source system and the completion of the matching receive operation by the target system.2

One simple way to measure latency is by running a “ping” command, which is a network diagnostic tool used to test the connection between two devices or servers. During these speed tests, latency is often referred to as a ping rate.

In this test, an Internet Control Message Protocol (ICMP) echo request packet is sent to a target server and returned. A ping command calculates the time it takes for the packet to travel from source to destination and back again. This total travel time is referred to as round-trip time (RTT) and is equal to roughly double the latency, since data must travel to the server and back again. Ping is not considered an exact measurement of latency, nor an ideal test for detecting one-way latency issues, because data can travel over different network paths and encounter different conditions on each leg of the trip.
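For a hands-on sense of what such a test measures, here is a minimal Python sketch. Sending raw ICMP echo requests (what ping actually does) typically requires elevated privileges, so this sketch approximates RTT by timing a TCP handshake instead, which also completes in roughly one round trip; the host example.com and port 443 are placeholders.

```python
import socket
import time

def tcp_rtt_ms(host, port=443, samples=5):
    """Estimate round-trip time (ms) by timing TCP handshakes."""
    rtts = []
    for _ in range(samples):
        start = time.perf_counter()
        # connect() returns once the SYN -> SYN/ACK exchange completes,
        # which takes roughly one network round trip
        with socket.create_connection((host, port), timeout=2):
            pass
        rtts.append((time.perf_counter() - start) * 1000)
    return rtts

samples = tcp_rtt_ms("example.com")  # placeholder destination
print("RTT samples (ms):", [round(r, 1) for r in samples])
print("Approximate one-way latency (ms):", round(min(samples) / 2, 1))
```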

Latency, bandwidth and throughput

Latency, bandwidth and throughput are related, and sometimes confused with one another, but they refer to distinct network characteristics. As we’ve noted, latency is the amount of time it takes for a packet of data to travel between two points across a network connection.

Bandwidth

Bandwidth is a measure of the volume of data that can pass through a network at any given time. It is measured in data units per second, such as megabits per second (Mbps) or gigabits per second (Gbps). This is the measurement you’re used to hearing from your service provider when choosing connection options for your home, and it is a common source of confusion, because bandwidth is a measure not of speed but of capacity. While high bandwidth can facilitate high internet speed, actual speed also depends on factors like latency and throughput.

Throughput

Throughput is a measurement of the average amount of data that actually passes through a network in a specific time frame, taking into account the impact of latency. It reflects the number of data packets that arrive successfully as well as the amount of data packet loss. It is usually measured in bits per second or, sometimes, in bytes of data per second.

Jitter

Another factor in network performance is jitter. Jitter refers to the variation in latency of packet flows across a network. A consistent latency is preferable to high jitter, which can contribute to packet loss—data packets that are dropped during transmission and never arrive at their destination.
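To make the idea concrete, jitter can be estimated from a series of latency samples, for instance as the mean absolute difference between consecutive measurements (a common simplified definition). A minimal Python sketch with hypothetical values:

```python
def jitter_ms(latencies):
    """Jitter as the mean absolute difference between consecutive
    latency samples, in milliseconds (a simplified definition)."""
    diffs = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    return sum(diffs) / len(diffs)

# Two hypothetical connections with the same average latency (~42.5 ms)
steady = [42.0, 43.1, 41.8, 42.5, 42.9]
spiky = [22.0, 61.3, 30.8, 58.5, 39.7]
print(f"Steady connection jitter: {jitter_ms(steady):.1f} ms")  # ~0.9 ms
print(f"Spiky connection jitter:  {jitter_ms(spiky):.1f} ms")   # ~29.1 ms
```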

A simplified but helpful way to remember the relationship between latency, bandwidth and throughput is this: bandwidth is how much data could travel over a network per second, throughput is how much data actually does, and latency is how long an individual packet takes to make the trip.
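A short worked example shows how latency can cap throughput far below the available bandwidth. A window-based protocol such as TCP can have at most one window of unacknowledged data in flight per round trip, so achievable throughput is bounded by window size divided by RTT; the figures below are illustrative.

```python
bandwidth_bps = 1_000_000_000   # link capacity: 1 Gbps
window_bytes = 64 * 1024        # a classic 64 KB TCP receive window
rtt_s = 0.050                   # 50 ms round-trip time

# At most one window of data can be in flight per round trip, so
# achievable throughput <= window / RTT, regardless of link capacity.
max_throughput_bps = window_bytes * 8 / rtt_s

print(f"Bandwidth:      {bandwidth_bps / 1e6:,.0f} Mbps")
print(f"Max throughput: {max_throughput_bps / 1e6:.1f} Mbps")  # ~10.5 Mbps
```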

What causes network latency?

Visualizing the journey data takes from client to server and back helps to understand latency and the various factors that contribute to it. Common causes of network latency are:

Distance data must travel

Plainly put, the greater the distance between the client initiating a request and the server responding to it, the higher the latency. The difference between a server in Chicago and one in New York responding to a user request in Los Angeles may be only a handful of milliseconds, but in this game that’s a big deal, and those milliseconds add up.
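A back-of-the-envelope calculation makes the distance effect concrete. Light in optical fiber travels at roughly two-thirds of its speed in a vacuum, about 200,000 km per second, and real cable routes are longer than straight lines, so actual delays run higher than this best case; the distances below are approximate.

```python
FIBER_KM_PER_SECOND = 200_000  # ~2/3 the speed of light in a vacuum

def one_way_delay_ms(distance_km):
    """Best-case propagation delay over fiber, in milliseconds."""
    return distance_km / FIBER_KM_PER_SECOND * 1000

# Approximate straight-line distances from Los Angeles
for city, km in [("Chicago", 2_800), ("New York", 3_940)]:
    ms = one_way_delay_ms(km)
    print(f"LA -> {city}: ~{ms:.0f} ms one way, ~{2 * ms:.0f} ms round trip")
```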

Transmission medium and network hops

Next, consider the medium across which your data is traveling. Is it a network of fiber optic cables (generally lower latency) or a wireless network (generally higher latency), or a complex web of networks with multiple mediums, as is often the case?

The medium used for data travel affects latency, as does the number of times data must pass through network devices like routers to move from one network segment to the next (network hops) before it reaches its destination. The greater the hop count, the higher the latency.
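You can observe the hops on a path with the system traceroute utility, which lists each router a packet passes through along with per-hop timings. Here is a rough Python sketch that invokes it; the destination is a placeholder, and the simple line parsing is approximate and varies by platform.

```python
import subprocess
import sys

host = "example.com"  # placeholder destination
tool = "tracert" if sys.platform == "win32" else "traceroute"

# Each numbered output line is one hop; its timings show where
# along the path latency accumulates.
result = subprocess.run([tool, host], capture_output=True, text=True)
hops = [line for line in result.stdout.splitlines()
        if line.strip().split(" ")[0].isdigit()]
print(result.stdout)
print(f"{len(hops)} hops to {host}")
```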

Data packet size and network congestion

The size of data packets, as well as overall data volume on a network, both affect latency. Larger packets take longer to transmit, and if data volume exceeds the compute capacity of network infrastructure, you’re likely to incur bottlenecks and increased latency.
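The packet-size effect can be quantified as transmission (serialization) delay: the time needed to push all of a packet’s bits onto the link. A quick illustration in Python:

```python
def transmission_delay_ms(packet_bytes, link_bps):
    """Time to serialize one packet onto a link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000

# A standard 1,500-byte Ethernet frame on links of different speeds
for label, bps in [("10 Mbps", 10e6), ("100 Mbps", 100e6), ("1 Gbps", 1e9)]:
    print(f"{label}: {transmission_delay_ms(1500, bps):.3f} ms per hop")
```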

Hardware performance

Outdated or insufficiently resourced servers, routers, hubs, switches and other network hardware can cause slower response times. For instance, if servers are receiving more data than they can handle, packets will be delayed, resulting in slower page loads, download speeds and application performance.

Web page construction

Page assets like images and videos with large file sizes, render-blocking resources and unnecessary characters in source code can all contribute to higher latency.

User-side factors

Sometimes latency is caused by factors on the user side, like insufficient bandwidth, poor internet connections or outdated equipment.

How can you reduce latency?

To reduce latency on your network, you might start with this network assessment:

- Is our data traveling along the shortest, most efficient route?
- Do our applications have the necessary resourcing for optimal performance?
- Is our network infrastructure up to date and appropriate for the job?

Distribute data globally

Let’s start with the distance issue. Where are your users located? And where are the servers that respond to their requests? By distributing your servers and databases geographically closer to your users, you can cut down on the physical distance data needs to travel and reduce inefficient routing and network hops.

One way to distribute data globally is with a content delivery network, or CDN. Using a network of distributed servers allows you to store content closer to your end users, reducing the distance data packets need to travel. But what if you want to move beyond serving cached content?

Edge computing is a useful strategy, one that enables organizations to extend their cloud environment from the core data center to physical locations closer to their users and data. Through edge computing, organizations can run applications closer to end users and reduce latency.

Subnetting

A subnet is essentially a smaller network inside your network. Subnetting groups together end points that frequently communicate with each other, which can cut down on inefficient routing and reduce latency.
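As a simple illustration, Python’s standard ipaddress module can carve an address block into smaller subnets; the 10.0.0.0/22 block below is hypothetical.

```python
import ipaddress

# Split one /22 block into four /24 subnets, e.g. one per team or floor,
# so endpoints that communicate frequently share a subnet and their
# traffic does not have to cross extra routers.
network = ipaddress.ip_network("10.0.0.0/22")  # hypothetical address block

for subnet in network.subnets(new_prefix=24):
    print(f"{subnet}  ({subnet.num_addresses} addresses)")
```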

Use an application performance management solution

Traditional monitoring tools are not fast or thorough enough to proactively spot and contextualize performance issues in today’s complex environments. To stay ahead of issues, you can use a solution like the Instana® Observability platform that provides real-time, end-to-end observability and dependency mapping. These capabilities allow teams to pinpoint, contextualize, address and prevent application performance issues that contribute to network latency.

Optimize resource allocation and workload placement

If workloads do not have the appropriate compute, storage and network resources, latency increases and performance suffers. Trying to solve this problem by overprovisioning is inefficient and wasteful, and attempting to manually match dynamic demand with resources in complex modern infrastructures is an impossible task.

An application resource management solution like the IBM® Turbonomic® platform, which continually analyzes resource utilization and the performance of applications and infrastructure components in real time, can help solve resourcing issues and reduce latency.

For example, if the platform detects an application experiencing high latency due to resource contention on a server, it can reduce that latency by automatically allocating the necessary resources to the application or by moving it to a less congested server.

Monitor network performance

Tests like the ping command can provide a simple measurement of network latency but are insufficient for pinpointing issues, much less addressing them. You can use a network performance management solution like IBM SevOne® NPM, which provides a unified platform to help your teams spot, address and prevent network performance issues and reduce latency.

Maintain capable, up-to-date infrastructure

Make sure you are using up-to-date hardware, software and network configurations and that your infrastructure can handle what you are asking of it. Performing regular checks and maintenance on your network will also help reduce performance issues and latency.

Optimize page assets and coding

Developers can take steps to make sure that page construction does not add to latency, like optimizing videos, images and other page assets for faster loading, and through code minification.

Related solutions
Observability: IBM Instana Observability

The IBM® Instana™ Observability platform provides enhanced application performance monitoring with automated full-stack visibility, 1-second granularity and 3 seconds to notify.

Hybrid cloud cost optimization: IBM Turbonomic

The Turbonomic® hybrid cloud cost optimization platform allows you to continuously automate critical actions in real time that proactively deliver the most efficient use of compute, storage and network resources to your apps at every layer of the stack.

Network performance management: IBM® SevOne® Network Performance Management

Designed for modern networks, IBM SevOne Network Performance Management (NPM) helps you proactively spot, address and prevent network performance issues with hybrid network observability. 

Resources

Modernize your network performance monitoring

Benefit from modern NPM capabilities that are dynamic, flexible and scalable.

Address the growing complexity of network performance monitoring

Learn about advanced capabilities for network and application visibility, insight and action.

5 steps to turbocharge your network performance management

Learn about 5 steps that will help network operators and engineers quickly measure their network performance management capabilities against what is actually required in modern IT environments.

Enento Group

Learn how the leading credit information provider in the Nordics used Instana Observability to enable the fast identification of bugs, lower existing latency and provide real-time visibility into every service request (with no sampling).

Dealerware

Learn how Dealerware's DevOps team used Instana Observability to reduce delivery latency by 98% during a period of exponential growth.

Take the next step

IBM Instana provides real-time observability that everyone and anyone can use. It delivers quick time-to-value while verifying that your observability strategy can keep up with the dynamic complexity of current and future environments. From mobile to mainframe, Instana supports over 250 technologies and growing. 

Footnotes

1. "Internet at the Speed of Light" (link resides outside ibm.com), Yale.edu, 3 May 2022

2. "Effect of the network on performance," IBM.com, 3 March 2021