Investigating the LTM TCP Profile: Congestion Control Algorithms

Introduction

The LTM TCP profile has over thirty settings that can be manipulated to enhance the experience between client and server.  Because the TCP profile is applied to the virtual server, the flexibility exists to customize the stack (in both client & server directions) for every application delivered by the LTM.  In this series, we will dive into several of the configurable options and discuss the pros and cons of their inclusion in delivering applications.
  1. Nagle's Algorithm
  2. Max Syn Retransmissions & Idle Timeout
  3. Windows & Buffers
  4. Timers
  5. QoS
  6. Slow Start
  7. Congestion Control Algorithms
  8. Acknowledgements
  9. Extended Congestion Notification & Limited Transmit Recovery
  10. The Finish Line
Quick aside for those unfamiliar with TCP: the transmission control protocol (layer 4) rides on top of the internet protocol (layer 3) and is responsible for establishing connections between clients and servers so data can be exchanged reliably between them. 

Normal TCP communication consists of a client and a server, a 3-way handshake, reliable data exchange, and a four-way close.  With the LTM as an intermediary in the client/server architecture, the session setup/teardown is duplicated, with the LTM playing the role of server to the client and client to the server.  These sessions are completely independent, even though the LTM can reuse the client's TCP source port on the server-side connection in most cases and, depending on your underlying network architecture, can also reuse the source IP.

Definitions

    • cwnd -- congestion window; sender-side limitation on the amount of data that can be sent
    • rwnd -- receive window; receiver-side limitation on the amount of data that can be received
    • ssthresh -- slow start threshold; value at which tcp toggles between slow start and congestion avoidance
    • FlightSize -- the amount of data that has been sent but not yet acknowledged
    • SMSS -- sender maximum segment size; largest segment the sender can transmit, based on the MTU minus overhead, path MTU discovery, or the RMSS.
    • RMSS -- receiver max segment size; largest segment the receiver is willing to accept.
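
To see how these definitions fit together, here's a quick worked example (a hedged sketch in Python, not LTM code) of the ssthresh reset that RFC 2581 prescribes when loss is detected; the byte values are illustrative:

    # On a loss event, RFC 2581 (equation 3) resets the slow start threshold
    # from the amount of data currently in flight:
    #     ssthresh = max(FlightSize / 2, 2 * SMSS)

    SMSS = 1460                    # sender MSS in bytes, typical for a 1500 MTU
    flightsize = 80 * SMSS         # data sent but not yet acknowledged

    ssthresh = max(flightsize // 2, 2 * SMSS)
    print(ssthresh)                # 58400 bytes; above this, cwnd grows via congestion avoidance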

Congestion Control

In modern TCP implementations (Reno forward), the main congestion control mechanism consists of four algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery.  RFC 1122 required the first two, and the latter two were introduced with BSD version 4.3, code name Reno.  The TCP implementation detailed in RFC 2581 has adopted that code name.  New Reno, detailed in RFC 2582, introduces a slight modification to the Reno fast recovery algorithm for use in the absence of selective acknowledgements.  Note that if selective acknowledgements are enabled in the profile, there will be no functional difference between Reno and New Reno.  That said, the key departure from the Reno standard (as defined in RFC 2582) is in how fast recovery handles incoming ACKs: Reno exits fast recovery on the first ACK that advances the window, while New Reno distinguishes a partial ACK (one that acknowledges some, but not all, of the data outstanding when recovery began) from a full ACK, treats the partial ACK as evidence that the next segment was also lost, retransmits that segment, and stays in fast recovery until all of the data outstanding at the start of recovery has been acknowledged.

Note that the New Reno fast recovery algorithm implemented on the LTM is the careful variant of New Reno, defined in RFC 3782.  It's a little more complex, so the summary above sticks to the base algorithms for clarity in distinguishing Reno from New Reno; the careful variant attempts to avoid the unnecessary multiple fast retransmits that can occur after a timeout.  All LTM version 9.x releases prior to 9.4 implement the careful variant of New Reno.  Beginning in version 9.4, you can optionally select Reno, New Reno, High Speed, or Scalable.  High Speed is based on Reno, and Scalable is a variant of High Speed.
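
To make the Reno/New Reno difference concrete, here is a minimal sketch (illustrative Python, not LTM source) of how each algorithm reacts to an ACK arriving during fast recovery.  Sequence numbers are counted in segments for simplicity, and the state keys (recovery_point, snd_una) and the retransmit() stub are names invented for this example:

    def retransmit(seq):
        # Stub standing in for an actual segment retransmission.
        print(f"retransmitting segment {seq}")

    def ack_in_fast_recovery(algo, state, ack_seq):
        # state tracks cwnd/ssthresh (in segments), snd_una (oldest
        # unacknowledged segment), and recovery_point (highest segment
        # outstanding when fast retransmit fired).
        if ack_seq >= state["recovery_point"]:
            # Full ACK: everything outstanding at the start of recovery is
            # now acknowledged, so both algorithms deflate cwnd and exit.
            state["cwnd"] = state["ssthresh"]
            state["in_recovery"] = False
        elif algo == "reno":
            # Reno exits on the first ACK that advances the window, even if
            # more segments from the original flight were lost.
            state["cwnd"] = state["ssthresh"]
            state["in_recovery"] = False
        else:
            # New Reno partial ACK: assume the next segment was also lost.
            # Retransmit it, deflate cwnd by the new data acknowledged (then
            # add back one segment), and stay in fast recovery.
            state["cwnd"] = state["cwnd"] - (ack_seq - state["snd_una"]) + 1
            retransmit(ack_seq)
        state["snd_una"] = ack_seq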

Congestion Window

During congestion avoidance, the congestion window is set differently among the available options:

  • Reno/New Reno
    • ACK  ==> cwnd = cwnd + (1/cwnd)
    • LOSS ==> cwnd = cwnd - (cwnd/2)
  • High Speed
    • ACK  ==> cwnd = cwnd + (a(cwnd)/cwnd)
    • LOSS ==> cwnd = cwnd - (cwnd * b(cwnd))
  • Scalable
    • ACK  ==> cwnd = cwnd + 0.01
    • LOSS ==> cwnd = cwnd * 0.875
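
Expressed as code, those per-ACK and per-loss updates look like the following sketch (Python, with cwnd counted in segments).  For High Speed, the a(cwnd) and b(cwnd) functions come from the lookup table in RFC 3649 and are passed in as parameters here rather than reproduced:

    def reno_ack(cwnd):
        return cwnd + 1.0 / cwnd        # sums to roughly +1 segment per round trip

    def reno_loss(cwnd):
        return cwnd - cwnd / 2.0        # halve the window on loss

    def highspeed_ack(cwnd, a):
        return cwnd + a(cwnd) / cwnd    # a(cwnd) grows as the window grows

    def highspeed_loss(cwnd, b):
        return cwnd - cwnd * b(cwnd)    # b(cwnd) shrinks as the window grows

    def scalable_ack(cwnd):
        return cwnd + 0.01              # fixed increase per ACK

    def scalable_loss(cwnd):
        return cwnd * 0.875             # give back only 12.5% on loss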

With Reno (or stock, standard, normal, etc.) TCP, cwnd increases by one packet every round trip.  When congestion is detected, cwnd is halved.  For long fat networks, the optimal cwnd size could be 10,000 packets.  This means recovery will take at least 5,000 round trips, and on a 100 ms link, that means a recovery time of 500 seconds (yeah, you read that right!).  The goals of High Speed and Scalable are similar (sustain high speeds without requiring unrealistically low loss rates, reach high speed quickly in slow start, recover from congestion without huge delays, treat standard TCP fairly), but the approaches are different.  The High Speed implementation alters cwnd up or down as a function of the size of the window: if cwnd is small, High Speed is switched off and behaves like Reno, but as the window grows, cwnd increases more aggressively and decreases more gently than with Reno.  This results in better utilization (overall and early in a connection) on long fat networks.  The Scalable implementation has a multiplicative increase, unlike Reno/New Reno and High Speed.  Its loss recovery time is independent of the congestion window size and is therefore much quicker than normal (some studies show recovery as quick as 2.7 seconds even on gigabit links).  The performance improvements with High Speed and Scalable can be huge for bulk transfers, perhaps double or better.  Throughput results (condensed from http://www-iepm.slac.stanford.edu/monitoring/bulk/fast/) based on a transmit queue length of 100 and an MTU of 1500 are shown in the table below.

Throughput Results (condensed)

  TCP Implementation    Mbps (after 80s)    Mbps (after 1000s)
  Reno                                56                   128
  Scalable                           387                   551
  High Speed                         881                   913
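
To tie the table back to the recovery-time arithmetic above, here's a small simulation (a sketch under the same simplifying assumptions: per-round-trip updates, a single loss, no slow start or further losses) counting how many round trips Reno and Scalable need to climb back to a 10,000-packet window:

    def rtts_to_recover(per_rtt_growth, cwnd_after_loss, target_cwnd):
        # Count round trips until cwnd climbs back to its pre-loss size.
        cwnd, rtts = cwnd_after_loss, 0
        while cwnd < target_cwnd:
            cwnd = per_rtt_growth(cwnd)
            rtts += 1
        return rtts

    # Per round trip, Reno's +1/cwnd per ACK sums to roughly +1 segment,
    # while Scalable's +0.01 per ACK sums to roughly +1% of the window.
    print(rtts_to_recover(lambda w: w + 1,    10_000 / 2,     10_000))  # 5000 RTTs -> 500 s at 100 ms
    print(rtts_to_recover(lambda w: w * 1.01, 10_000 * 0.875, 10_000))  # 14 RTTs   -> ~1.4 s at 100 ms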

Conclusion

Since the arrival of LTM version 9.4, you have been armed with the option to increase the performance of your TCP stack significantly, while maintaining compatibility with the standard implementations.  Testing is always encouraged, as every scenario brings its own challenges.

Updated Nov 30, 2023
Version 2.0

Comments

  • Jason, just wanted to point out for the benefit of others that the Reno related RFCs are actually 2581 and 2582.
  • Good explanation of Reno vs. High Speed. I'd recommend that anyone reading this also look at Illinois and Woodside, especially if serving mobile clients: https://devcentral.f5.com/s/articles/f5-synthesis-trickle-up-performance
  • Diego_Arguello1: Hi Jason, what about other TCP congestion control algorithms like BIC and CUBIC? Are they used in another profile, for example the wan-optimization profile?
  • Hi Diego, yes, there are several algorithms available for you to select, and CUBIC is on the ever-changing list of options. For example, for the mctcp profile, the Illinois algorithm is the default.