TCP/IP IPv4 settings

This topic lists all the adjustments that were made to the IPv4 settings.

net.ipv4.tcp_congestion_control

Network congestion in data networking [...] is the reduced quality of service that occurs when a network node is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. Networks use congestion control and congestion avoidance techniques to try to avoid congestion collapse. [1]

TCP supports a number of network congestion-avoidance algorithms, each in a separate loadable module. Most Linux distributions default to the Reno algorithm. To list the modules available to your Linux installation, run the following command:
[root@kvmhost ~] # sysctl net.ipv4.tcp_available_congestion_control
The default algorithm for most kernels is reno [2]. The recommended algorithm is cubic [3]. Testing evaluated only the algorithms available in this Linux distribution. Other modules, added after the initial installation, could be better suited to your specific workload environment. For a more comprehensive list of algorithms that might be available for your Linux distribution, see:
https://en.wikipedia.org/wiki/TCP_congestion-avoidance_algorithm#Algorithms
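
The availability check described above can be scripted before switching algorithms. The following is a minimal sketch, assuming a POSIX shell. The function name has_congestion_algo is hypothetical, and the list of algorithms is passed in as a string so that the helper itself does not depend on the host it runs on.

```shell
#!/bin/sh
# has_congestion_algo ALGO "LIST"   (hypothetical helper name)
# Succeeds (exit status 0) if ALGO appears as a whole word in the
# space-separated LIST, which is the format printed by
# net.ipv4.tcp_available_congestion_control.
has_congestion_algo() {
    for candidate in $2; do
        [ "$candidate" = "$1" ] && return 0
    done
    return 1
}

# Example use on a live host (commented out because it needs root):
#   available=$(sysctl -n net.ipv4.tcp_available_congestion_control)
#   if has_congestion_algo cubic "$available"; then
#       sysctl -w net.ipv4.tcp_congestion_control=cubic
#   fi
```

A change made with sysctl -w lasts only until the next reboot.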

net.ipv4.tcp_fin_timeout

This parameter determines the length of time an orphaned connection (one that is no longer referenced by any application) remains in the FIN_WAIT_2 state before it is aborted at the local end. This parameter is especially helpful when something happens to the remote peer that prevents or excessively delays a response. Because each socket used for connections consumes approximately 1.5 KB of memory, the kernel must proactively abort and purge dead or stale resources.

The default value for this parameter is typically 60 (seconds).

[root@kvmhost ~] # sysctl net.ipv4.tcp_fin_timeout
net.ipv4.tcp_fin_timeout = 60

For workloads or systems that generate or support high levels of network traffic, it can be advantageous to reclaim dead or stale resources more aggressively. For these configurations, it is recommended to reduce this value to less than 10 seconds.
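
To see why reclaiming stale sockets matters, the per-socket figure quoted above can be turned into a rough estimate. This is a sketch only: it assumes 1.5 KB means 1536 bytes, the function name stale_socket_kib is hypothetical, and the socket count in the example is an arbitrary illustration rather than a measured value.

```shell
#!/bin/sh
# stale_socket_kib COUNT   (hypothetical helper name)
# Prints the approximate memory, in KiB, held by COUNT stale sockets,
# assuming 1536 bytes (1.5 KB) per socket as quoted in the text.
stale_socket_kib() {
    echo $(( $1 * 1536 / 1024 ))
}

stale_socket_kib 100000   # prints 150000 (about 146 MiB)
```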

net.ipv4.tcp_limit_output_bytes

This parameter controls the TCP small queue limit on a per-socket basis. TCP tends to increase the amount of data in flight until loss notifications are received. Combined with TCP send auto-tuning, this can cause large amounts of data to be queued at the device on the local machine, which can adversely affect the latency of other streams. tcp_limit_output_bytes limits the number of bytes queued on a device to reduce the latency effects caused by a larger queue size.

The default value is 262,144 bytes. For workloads or environments where latency is a higher priority than throughput, lowering this value can improve latency. For these tests, this value was set to 131,072 bytes.
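
The latency effect of the queue limit can be approximated by the time needed to transmit one socket's queued bytes at line rate. The following sketch makes several assumptions that are not part of the tested configuration: a 1 Gbit/s link, no framing overhead, and a single socket filling its limit. The function name queue_drain_us is hypothetical.

```shell
#!/bin/sh
# queue_drain_us BYTES LINK_MBPS   (hypothetical helper name)
# Prints the time, in whole microseconds, needed to transmit BYTES at
# LINK_MBPS megabits per second (1 Mbit/s equals 1 bit per microsecond).
queue_drain_us() {
    echo $(( $1 * 8 / $2 ))
}

queue_drain_us 262144 1000   # default limit, 1 Gbit/s link: prints 2097
queue_drain_us 131072 1000   # tested limit, 1 Gbit/s link: prints 1048
```

Under these assumptions, halving the limit roughly halves the worst-case delay that one socket's queued data adds for other streams.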

net.ipv4.tcp_low_latency

The normal TCP stack behavior is set to favor decisions that maximize network throughput. This parameter, when set, tells TCP to instead make decisions that would prefer lower latency.

The default value is 0 (off). For workloads or environments where latency is a higher priority, the recommended value is 1 (on). On kernels 4.14 and later, this parameter is retained only as a legacy option and has no effect.

net.ipv4.tcp_max_tw_buckets

Specifies the maximum number of sockets in the “time-wait” state allowed to exist at any time. If the maximum value is exceeded, sockets in the “time-wait” state are immediately destroyed and a warning is displayed. This setting exists to thwart certain types of Denial of Service attacks. Care should be exercised before lowering this value. If you change the value, increase it rather than lower it, especially when more memory has been added to the system, or when network demands are high and the environment is less exposed to external threats.

The default value is 262,144. When network demands are high and the environment is less exposed to external threats, the value can be increased to 450,000.
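
Before raising the limit, it can help to see how many sockets are actually in the “time-wait” state. The following sketch counts them from text in /proc/net/tcp format, where the fourth column holds the socket state as two hexadecimal digits and 06 is TIME_WAIT. The function name count_tcp_state is hypothetical, and the sketch covers IPv4 sockets only.

```shell
#!/bin/sh
# count_tcp_state STATE_HEX   (hypothetical helper name)
# Reads text in /proc/net/tcp format on standard input, skips the
# header line, and prints how many sockets are in the given state.
# The state column is compared as a string, so pass two hex digits.
count_tcp_state() {
    awk -v want="$1" 'NR > 1 && ($4 "") == want { n++ } END { print n + 0 }'
}

# Example use on a live Linux host:
#   count_tcp_state 06 < /proc/net/tcp
#   sysctl -n net.ipv4.tcp_max_tw_buckets
```

Comparing the two numbers shows how close the system is to the configured maximum.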

net.ipv4.tcp_rmem

Contains three values that represent the minimum, default and maximum size of the TCP socket receive buffer.

The minimum represents the smallest receive buffer size guaranteed, even under memory pressure. The minimum value defaults to 1 page or 4096 bytes.

The default value represents the initial size of a TCP socket's receive buffer. This value supersedes net.core.rmem_default used by other protocols. The default value for this setting is 87380 bytes. With the default setting of tcp_adv_win_scale, this value results in an initial TCP window size of 65535 bytes.

The maximum represents the largest receive buffer size automatically selected for TCP sockets. This value does not override net.core.rmem_max. The default value for this setting is somewhere between 87380 bytes and 6M bytes based on the amount of memory in the system.

The recommendation is to use a maximum value of 16 MB or higher (depending on the kernel level), especially for 10 Gigabit adapters.
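
One common way to reason about the maximum buffer size is the bandwidth-delay product: the amount of data that must be in flight to keep a link busy for one round trip. This rule of thumb is not part of the tested configuration, the 10 ms round-trip time below is an assumed figure, and the function name bdp_bytes is hypothetical.

```shell
#!/bin/sh
# bdp_bytes LINK_MBPS RTT_MS   (hypothetical helper name)
# Prints the bandwidth-delay product in bytes:
#   (LINK_MBPS * 1,000,000 bits/s) * (RTT_MS / 1000 s) / 8 bits per byte
# which simplifies to LINK_MBPS * RTT_MS * 125.
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}

bdp_bytes 10000 10             # 10 Gbit/s, 10 ms RTT: prints 12500000
echo $(( 16 * 1024 * 1024 ))   # 16 MB in bytes: prints 16777216
```

Under these assumptions, a 16 MB maximum is large enough to cover the bandwidth-delay product of a 10 Gigabit link.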

net.ipv4.tcp_tw_reuse

Permits sockets in the time-wait state to be reused for new connections.

In high-traffic environments, sockets are created and destroyed at very high rates. This parameter, when set, allows sockets that are no longer needed and are about to be destroyed to be used for new connections. When enabled, this parameter can bypass the allocation and initialization overhead normally associated with socket creation, which saves CPU cycles and time and reduces system load.

The default value is 0 (off). The recommended value is 1 (on).

Note: Consult with your technical expert to ensure this change is valid in your configuration.

net.ipv4.tcp_wmem

Similar to net.ipv4.tcp_rmem, this parameter consists of three values (minimum, default, and maximum), which apply to the TCP socket send buffer.

The minimum represents the smallest send buffer size a newly created socket is entitled to as part of its creation. The minimum value defaults to 1 page or 4096 bytes.

The default value represents the initial size of a TCP socket's send buffer. This value supersedes net.core.wmem_default used by other protocols. It is typically set lower than net.core.wmem_default. The default value for this setting is 16K bytes.

The maximum represents the largest size for auto-tuned send buffers for TCP sockets. This value does not override net.core.wmem_max. The default value for this setting is somewhere between 64K bytes and 4M bytes based on the amount of memory available in the system.

The recommendation is to use a maximum value of 16 MB or higher (depending on the kernel level), especially for 10 Gigabit adapters.
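
Because tcp_rmem and tcp_wmem each hold three values on one line, it is easy to misread which number is which. The following sketch labels the three fields. The function name describe_triple is hypothetical, and the example values combine the defaults quoted above with a 16 MB maximum purely for illustration.

```shell
#!/bin/sh
# describe_triple "MIN DEFAULT MAX"   (hypothetical helper name)
# Labels the three whitespace-separated values of a tcp_rmem or
# tcp_wmem setting. Fails if the input does not hold three values.
describe_triple() {
    set -- $1
    [ $# -eq 3 ] || { echo "expected three values" >&2; return 1; }
    echo "min=$1 default=$2 max=$3"
}

describe_triple "4096 16384 16777216"
# prints: min=4096 default=16384 max=16777216

# Example use on a live Linux host:
#   describe_triple "$(sysctl -n net.ipv4.tcp_wmem)"
```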

[1] Network Congestion: Wikipedia
[2] TCP Tahoe and Reno: TCP Congestion Avoidance: Wikipedia
[3] Cubic TCP: TCP Congestion Avoidance: Wikipedia