How to setup DSR in Kubernetes with BIG-IP
Using Direct Server Return (DSR) in Kubernetes can have benefits when you have workloads that require low latency, high throughput, and/or you want to preserve the source IP address of the connection. The following will guide you through how to configure Kubernetes and BIG-IP to use DSR for traffic to a Kubernetes Pod.
Why DSR?
I’m not a huge fan of DSR. It’s a weird way of having a client send traffic to a Load Balancer (LB), the LB forwards to a backend server WITHOUT rewriting the destination address, and the backend server responds directly back to the client.
It looks WEIRD! But there are some benefits, the backend server sees the original client IP address without the need for the LB to be in the return path of traffic and the LB only has to handle one side of the connection. This is also the downside because it’s not straightforward to do any type of intelligent LB if you only see half the conversation. It also involves doing weird things on your backend servers to configure loopback devices so that it will answer for the traffic when it is received, but not create an IP conflict on the network.
DSR in Kubernetes
The following uses IP Virtual Server (IPVS) to setup DSR in Kubernetes. IPVS has been supported in Kubernetes since 1.11. When using IPVS it replaces IP Tables for the kube-proxy (internal LB). When you provision a LoadBalancer or NodePort service (method to expose traffic outside the cluster) you can add “externalTrafficPolicy: Local” to enable DSR. This is mentioned in the Kubernetes documentation for GCP and Azure environments.
DSR in BIG-IP
On the BIG-IP DSR is referred to as “nPath”. K11116 discusses the steps involved in getting it setup. The steps create a profile that will disable destination address translation and allow the BIG-IP to not maintain the state of TCP connections (since it will only see half the conversation).
Putting the Pieces Together
To enable DSR from Kubernetes the first step is to create a LoadBalancer service where you define the external LB IP address.
apiVersion: v1 kind: Service metadata: name: my-frontend spec: ports: - port: 80 protocol: TCP targetPort: 80 type: LoadBalancer loadBalancerIP: 10.1.10.10 externalTrafficPolicy: Local selector: run: my-frontend
After you create the service you need to update Service to add the following status (example in YAML format, this needs to be done via the API vs. kubectl):
status: loadBalancer: ingress: - ip: 10.1.10.10
Once this is done you run “ipvsadm -ln” to verify that you now have an IPVS rule to rewrite the destination address to the Pod IP Address.
IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn .. TCP 10.1.10.10:80 rr -> 10.233.90.25:80 Masq 1 0 0 -> 10.233.90.28:80 Masq 1 0 0 …
You can verify that DSR is working by connecting to the external IP address and observing that the MAC address that the traffic is sent to is different than the MAC address that the reply is sent from.
$ sudo tcpdump -i eth1 -nnn -e host 10.1.10.10 … 01:30:02.579765 06:ba:49:38:53:f0 > 06:1f:8a:6c:8e:d2, ethertype IPv4 (0x0800), length 143: 10.1.10.100.37664 > 10.1.10.10.80: Flags [P.], seq 1:78, ack 1, win 229, options [nop,nop,TS val 3625903493 ecr 3191715024], length 77: HTTP: GET /txt HTTP/1.1 01:30:02.582457 06:d2:0a:b1:14:20 > 06:ba:49:38:53:f0, ethertype IPv4 (0x0800), length 66: 10.1.10.10.80 > 10.1.10.100.37664: Flags [.], ack 78, win 227, options [nop,nop,TS val 3191715027 ecr 3625903493], length 0 01:30:02.584176 06:d2:0a:b1:14:20 > 06:ba:49:38:53:f0, ethertype IPv4 (0x0800), length 692: 10.1.10.10.80 > 10.1.10.100.37664: Flags [P.], seq 1:627, ack 78, win 227, options [nop,nop,TS val 3191715028 ecr 3625903493], length 626: HTTP: HTTP/1.1 200 OK ...
Automate it
Using Container Ingress Services we can automate this setup with the following AS3 declaration (note the formatting is off and this will not copy-and-paste cleanly, only provided for illustrative purposes).
kind: ConfigMap apiVersion: v1 metadata: name: f5demo-as3-configmap namespace: default labels: f5type: virtual-server as3: "true" data: template: | { "class": "AS3", "action": "deploy", "declaration": { "class": "ADC", "schemaVersion": "3.10.0", "id": "DSR Demo", "AS3": { "class": "Tenant", "MyApps": { "class": "Application", "template": "shared", "frontend_pool": { "members": [ { "servicePort": 80, "serverAddresses": [] } ], "monitors": [ "http" ], "class": "Pool" }, "l2dsr_http": { "layer4": "tcp", "pool": "frontend_pool", "persistenceMethods": [], "sourcePortAction": "preserve-strict", "translateServerAddress": false, "translateServerPort": false, "class": "Service_L4", "profileL4": { "use": "fastl4_dsr" }, "virtualAddresses": [ "10.1.10.10" ], "virtualPort": 80, "snat": "none" }, "dsrhash": { "hashAlgorithm": "carp", "class": "Persist", "timeout": "indefinite", "persistenceMethod": "source-address" }, "fastl4_dsr": { "looseClose": true, "looseInitialization": true, "resetOnTimeout": false, "class": "L4_Profile" } } } } }
You can then have the BIG-IP automatically pick-up the location of the pods by annotating the service.
apiVersion: v1 kind: Service metadata: name: my-frontend labels: run: my-frontend cis.f5.com/as3-tenant: AS3 cis.f5.com/as3-app: MyApps cis.f5.com/as3-pool: frontend_pool ...
Not so weird?
DSR is a weird way to load balance traffic, but it can have some benefits. For a more exhaustive list of the reasons not to do DSR; we can reach back to 2008 for the following gem from Lori MacVittie. What is old is new again!