Last night I was upgrading some ESX 3.5 VMs from “flexible” NICs to “VMXNET Enhanced” NICs and ran into a problem with one of the servers. As an aside, some rudimentary throughput testing with the iperf tool showed a surprisingly significant throughput increase and CPU usage decrease when switching from flexible (i.e. VMXNET) to VMXNET Enhanced (i.e. VMXNET2) NICs. Well worth doing, apart from the gotcha I ran into…
The one VM that showed a problem runs Quagga (a BGP daemon) to inject routes for a local AS112 server. Once I switched over the NICs, the remote Cisco router started logging the following errors:
%TCP-6-BADAUTH: No MD5 digest from
%TCP-6-BADAUTH: Invalid MD5 digest from
and would not bring up the BGP peering session. BGP MD5 authentication is enabled between the router and the Quagga daemon. The use of “debug ip tcp transactions” also shows the invalid MD5 signatures.
I initially suspected it was related to an offloaded checksumming issue which I had previously observed (http://communities.vmware.com/thread/250159). It turned out not to be a case of incorrect TCP checksums being calculated (the post above concerned incorrect UDP checksums).
Digging a bit deeper I came across a post on the quagga-users mailing list describing a similar problem to the one I was observing. The usage of MD5 checksums “complicates” the process of offloading checksumming to NICs.
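To make the interaction a little more concrete, here is a minimal Python sketch of how a TCP MD5 signature is computed according to RFC 2385: the digest covers the pseudo-header, the fixed TCP header (checksum zeroed, options excluded), the segment data and the shared key. The function name, the 192.0.2.x addresses, the port and sequence numbers and the 20 bytes of options are all illustrative assumptions, and the key is simply the placeholder password from the tcpdump example in this post. It illustrates one plausible failure mode rather than what ESX has been shown to do: if something downstream re-segments a signed segment and copies the signature option unchanged, the sequence number, length and data of each resulting segment no longer match what was signed.

```python
import hashlib
import socket
import struct


def tcp_md5_signature(src_ip, dst_ip, src_port, dst_port, seq, ack,
                      flags, window, payload, key, options_len=20):
    """Compute an RFC 2385 digest for one TCP segment (illustrative sketch).

    options_len is an assumption: the 18-byte MD5 option padded to 20 bytes.
    """
    header_len = 20 + options_len
    segment_len = header_len + len(payload)
    # 1. Pseudo-header: source, destination, zero, protocol, segment length.
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, segment_len))
    # 2. Fixed 20-byte TCP header with the checksum field set to zero;
    #    the options themselves are not part of the digest.
    offset_and_flags = ((header_len // 4) << 12) | flags
    header = struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                         offset_and_flags, window, 0, 0)
    # 3. Segment data, then 4. the shared key.
    return hashlib.md5(pseudo + header + payload + key).digest()


key = b"md5password"      # placeholder key
data = bytes(2000)        # one large segment handed down by the stack
common = dict(src_ip="192.0.2.1", dst_ip="192.0.2.2", src_port=40000,
              dst_port=179, ack=5000, flags=0x18, window=65535, key=key)

signed_by_stack = tcp_md5_signature(seq=1000, payload=data, **common)
# What a receiver would compute for the two wire segments after a split:
first_on_wire = tcp_md5_signature(seq=1000, payload=data[:1460], **common)
second_on_wire = tcp_md5_signature(seq=2460, payload=data[1460:], **common)

print(signed_by_stack != first_on_wire, signed_by_stack != second_on_wire)
```

Because the segment length and sequence number feed into the digest, a signature has to be calculated per segment as it appears on the wire, which is exactly what a segmentation offload engine that knows nothing about the key cannot do.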
The VMXNET Enhanced NIC enables additional offloading from the VM to the NIC to enhance performance. This works well for many use cases but causes problems when using TCP MD5 signatures. In my case, turning off “TCP Segmentation Offload” and scatter-gather has worked around the problem to some degree. I added the following commands to a startup script:
ethtool -K eth0 tso off
ethtool -K eth0 sg off
In an ideal world, the VMXNET driver would allow “tx-checksumming” to be turned off using ethtool as well.
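If the startup logic is scripted rather than typed by hand, it can be useful to try each offload feature separately and record which ones the driver refuses, since a refusal of one feature should not stop the others from being applied. The following is a rough sketch under that assumption; `offload_commands` and `disable_offloads` are hypothetical helper names, and the only ethtool syntax relied on is the `-K <iface> <feature> off` form already used above.

```python
import subprocess


def offload_commands(iface, features=("tso", "sg")):
    """Build one 'ethtool -K <iface> <feature> off' command per feature."""
    return [["ethtool", "-K", iface, feature, "off"] for feature in features]


def disable_offloads(iface, features=("tso", "sg"), run=subprocess.run):
    """Run each command and return the features that could not be disabled.

    'run' is injectable so the logic can be exercised without ethtool
    or root privileges; by default it is subprocess.run.
    """
    failed = []
    for argv in offload_commands(iface, features):
        result = run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(argv[3])  # argv[3] is the feature name
    return failed
```

Calling `disable_offloads("eth0", ("tso", "sg", "tx"))` as root would attempt all three and return the names of any the driver rejected.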
In fairness to VMware on this one, this issue appears to not be specific to virtual machines but may in fact be observed on physical hardware with NICs providing offload functions.
Even with the two ethtool commands above in place, which allow the BGP session to be established, I still see the following errors on the Cisco router:
TCP0: bad seg from x.x.x.x -- no MD5 string: port 179 seq
%TCP-6-BADAUTH: No MD5 digest from
%TCP-6-BADAUTH: Invalid MD5 digest from
Interestingly, with the flexible (and hence original VMXNET) vNIC in the VM, which oddly uses the same actual VMXNET driver binary!!, tx-checksumming and scatter-gather are enabled for the NIC:
$ ethtool -k eth0
Offload parameters for eth0:
Cannot get device rx csum settings: Operation not supported
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
$
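When comparing several VMs or vNIC types, eyeballing this output gets tedious. A small sketch of turning the `ethtool -k` text into a dictionary is shown here; `parse_offload_settings` is a hypothetical helper, and it assumes the "name: on/off" line format seen in the output above (tolerating a trailing annotation after the on/off word).

```python
def parse_offload_settings(text):
    """Map each 'name: on|off' line of ethtool -k output to a boolean.

    Lines without an on/off value (the header, error messages, the
    shell prompt) are ignored.
    """
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        if sep and words and words[0] in ("on", "off"):
            settings[name.strip()] = words[0] == "on"
    return settings
```

Fed the output above, this yields one entry per offload feature, so a check such as `settings["tcp segmentation offload"]` can be used to confirm that a startup script actually took effect.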
For the record, all the above is using the latest ESX 3.5 VMware tools build 317866 from VMwareTools-3.5.0-317866.tar.gz.
Doing a tcpdump (from a physical server) of the traffic between the router and the VM reveals that packets which leave the VM with a valid MD5 signature (as per tcpdump from within the VM) arrive on the wire with an invalid MD5 signature (tcpdump -s0 -M md5password port 179). This indicates that VMware ESXi may in fact be altering the packets between the VM and the wire in some way which invalidates the MD5 signature 🙁
For now, some errors with an established session are better than lots of errors and no BGP session. Ideally the Cisco router would log no errors at all.