Chelsio provides some of the best line speed network adapters with complete driver support in FreeBSD. According to Chelsio, "The T6 is a highly integrated, hyper-virtualized 1/10/25/40/50/100 GbE controller with full offload support of a complete Unified Wire solution. T6 provides no-compromise performance with both low latency (sub 1µsec through hardware) and high bandwidth, limited only by the PCI bus. Furthermore, [the T6] scales to true 100 Gigabit line rate operation from a single TCP connection to thousands of connections, and allows simultaneous low latency and high bandwidth operation thanks to multiple physical channels through the ASIC."
These are some of our notes...
The Chelsio T5 and T6 adapters support filtering on the card itself, which we can set up as a hardware firewall similar to Pf's pass and block rules. The firewall rules can drop packets at full line speed (i.e. 100 gigabit) on the hardware interface itself, before interrupts are triggered or the packets get to FreeBSD and the Pf firewall. By offloading the packet filtering to the NIC we can save CPU time for other tasks like a webserver or the ZFS file system.
The following shell script is called "chelsio_rules_of_engagement.sh". The script will program the Chelsio card to allow certain traffic in the external interface (port 0) and in the internal LAN interface (port 1). Note that the filter rules only filter traffic coming into the interface. Traffic going out of the interface is unfiltered.
For this example we will be allowing traffic to our web server, ssh from an external machine to the server, and some traffic from inside the LAN to be NAT'd using Pf. It is important to remember that the rules are first match, meaning the packet will match the rule with the lowest "filter" number first. This is the reason the pass rules are at the top and the drop all rules follow. Let's go over some of the rules so you can understand our logic and then create your own rules.
#!/bin/sh
set -euf
#
# Chelsio Rules of Engagement (T520-CR and T520-BT)
#

# drop fragments
cxgbetool t5nex0 filter 0 frag 1 action drop

# pass upper ports 32768 through 65535 (16bit mask)
# (sysctl net.inet.ip.portrange.first=32768)
cxgbetool t5nex0 filter 10 action pass dport 32768:0x8000
#cxgbetool t5nex0 filter 11 action pass dport 16384:0xc000
#cxgbetool t5nex0 filter 12 action pass dport 8192:0xe000
#cxgbetool t5nex0 filter 13 action pass dport 4096:0xf000

# pass http TCP
cxgbetool t5nex0 filter 20 iport 0 action pass dport 80 proto 6

# pass https TCP
cxgbetool t5nex0 filter 30 iport 0 action pass dport 443 proto 6

# pass https UDP
cxgbetool t5nex0 filter 40 iport 0 action pass dport 443 proto 17

# pass ssh from 1.2.3.4
cxgbetool t5nex0 filter 50 iport 0 action pass sip 1.2.3.4 dport 22 proto 6

# pass ping from 1.2.3.4
cxgbetool t5nex0 filter 60 iport 0 action pass sip 1.2.3.4 proto 1

# pass dhcp UDP
cxgbetool t5nex0 filter 70 iport 0 action pass sport 67 dport 68 proto 17

# pass ping from 10.10.10
cxgbetool t5nex0 filter 80 iport 1 action pass sip 10.10.10.0/24 proto 1

# pass http TCP 10.10.10
cxgbetool t5nex0 filter 81 iport 1 action pass sip 10.10.10.0/24 dport 80 proto 6

# pass https TCP 10.10.10
cxgbetool t5nex0 filter 82 iport 1 action pass sip 10.10.10.0/24 dport 443 proto 6

# pass dns UDP 10.10.10
cxgbetool t5nex0 filter 83 iport 1 action pass sip 10.10.10.0/24 dport 53 proto 17

# drop all TCP (hex=0x06 , ProtoNum=6)
cxgbetool t5nex0 filter 100 action drop proto 6

# drop all UDP (hex=0x11 , ProtoNum=17)
cxgbetool t5nex0 filter 101 action drop proto 17

# drop all ICMP (hex=0x01 , ProtoNum=1)
cxgbetool t5nex0 filter 102 action drop proto 1

# Connection Offload Policies (COP)
#cxgbetool t5nex0 policy /root/chelsio_offload_policy

## EOF ##
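The "dport 32768:0x8000" rule is a value:mask pair, which is easy to misread. Here is a minimal sketch of how such a pair selects a port range, assuming the usual masked compare where a port matches when (port AND mask) equals (value AND mask). The helper names port_matches and port_range are our own for illustration and are not part of cxgbetool; port_range also assumes the mask is a contiguous run of high bits, as in the rules above.

```shell
#!/bin/sh
# Sketch only: port_matches and port_range are hypothetical helpers,
# not cxgbetool commands. Assumes a masked compare:
#   (port AND mask) == (value AND mask)

# Return success when the port matches the value:mask pair.
port_matches() {
    port=$1 value=$2 mask=$3
    [ $(( $port & $mask )) -eq $(( $value & $mask )) ]
}

# Print the lowest and highest 16-bit port covered by value:mask.
# Only meaningful when the mask is a contiguous run of high bits.
port_range() {
    value=$1 mask=$2
    low=$(( $value & $mask ))
    high=$(( $low | (~$mask & 0xffff) ))
    echo "$low-$high"
}

port_range 32768 0x8000   # prints 32768-65535
port_range 16384 0xc000   # prints 16384-32767
port_range 8192  0xe000   # prints 8192-16383
port_range 4096  0xf000   # prints 4096-8191
```

Read this way, the commented-out filters 11 through 13 would each extend the passed range downward by one more power of two.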
What does the output of the filter rules look like? Use cxgbetool to list the on-chip rules and show the hit counter per rule. The hit counter will tick up every time a packet matches a rule.
$ cxgbetool t5nex0 filter list
Idx     Hits  FCoE  Port  vld:VLAN       Prot   MPS  Frag  DIP                SIP                DPORT      SPORT      Action
  0        0  0/0   0/0   0:0000/0:0000  00/00  0/0  1/1   00000000/00000000  00000000/00000000  0000/0000  0000/0000  Drop
 10  1029025  0/0   0/0   0:0000/0:0000  00/00  0/0  0/0   00000000/00000000  00000000/00000000  8000/8000  0000/0000  Pass: Q=RSS
 20        0  0/0   0/7   0:0000/0:0000  06/ff  0/0  0/0   00000000/00000000  00000000/00000000  0050/ffff  0000/0000  Pass: Q=RSS
 30        0  0/0   0/7   0:0000/0:0000  06/ff  0/0  0/0   00000000/00000000  00000000/00000000  01bb/ffff  0000/0000  Pass: Q=RSS
 40        0  0/0   0/7   0:0000/0:0000  11/ff  0/0  0/0   00000000/00000000  00000000/00000000  01bb/ffff  0000/0000  Pass: Q=RSS
 50    11887  0/0   0/7   0:0000/0:0000  06/ff  0/0  0/0   00000000/00000000  c0a80504/ffffffff  0016/ffff  0000/0000  Pass: Q=RSS
 60       14  0/0   0/7   0:0000/0:0000  01/ff  0/0  0/0   00000000/00000000  c0a80504/ffffffff  0000/0000  0000/0000  Pass: Q=RSS
 70      286  0/0   0/7   0:0000/0:0000  11/ff  0/0  0/0   00000000/00000000  00000000/00000000  0044/ffff  0043/ffff  Pass: Q=RSS
 80        0  0/0   1/7   0:0000/0:0000  01/ff  0/0  0/0   00000000/00000000  0a000a00/ffffff00  0000/0000  0000/0000  Pass: Q=RSS
 81      620  0/0   1/7   0:0000/0:0000  06/ff  0/0  0/0   00000000/00000000  0a000a00/ffffff00  0050/ffff  0000/0000  Pass: Q=RSS
 82        0  0/0   1/7   0:0000/0:0000  06/ff  0/0  0/0   00000000/00000000  0a000a00/ffffff00  01bb/ffff  0000/0000  Pass: Q=RSS
 83        3  0/0   1/7   0:0000/0:0000  11/ff  0/0  0/0   00000000/00000000  0a000a00/ffffff00  0035/ffff  0000/0000  Pass: Q=RSS
100    82735  0/0   0/0   0:0000/0:0000  06/ff  0/0  0/0   00000000/00000000  00000000/00000000  0000/0000  0000/0000  Drop
101     8515  0/0   0/0   0:0000/0:0000  11/ff  0/0  0/0   00000000/00000000  00000000/00000000  0000/0000  0000/0000  Drop
102     6234  0/0   0/0   0:0000/0:0000  01/ff  0/0  0/0   00000000/00000000  00000000/00000000  0000/0000  0000/0000  Drop
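The list output prints addresses, ports and protocol numbers as hexadecimal value/mask pairs, which makes it awkward to compare against the rules you typed in. The following is a small sketch for decoding those fields with plain sh arithmetic. The helper names hex_to_ip and hex_to_port are hypothetical and only illustrate the conversion.

```shell
#!/bin/sh
# Sketch only: hex_to_ip and hex_to_port are hypothetical helpers for
# reading the hex fields of "cxgbetool ... filter list" output.

# Convert an 8-digit hex IPv4 address (e.g. c0a80504) to dotted quad.
hex_to_ip() {
    n=$(( 0x$1 ))
    echo "$(( ($n >> 24) & 255 )).$(( ($n >> 16) & 255 )).$(( ($n >> 8) & 255 )).$(( $n & 255 ))"
}

# Convert a 4-digit hex port (e.g. 01bb) to decimal.
hex_to_port() {
    echo "$(( 0x$1 ))"
}

hex_to_ip c0a80504    # SIP of filter 50, prints 192.168.5.4
hex_to_port 01bb      # DPORT of filters 30 and 40, prints 443
hex_to_port 0016      # DPORT of filter 50, prints 22
```

The same conversion applies to the mask half of each pair: ffffffff is a single host and ffffff00 is a /24.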
The TCP Offload Engine (TOE) will allow the Chelsio hardware to completely offload the entire TCP connection into hardware. A connection using TOE will use less CPU time leaving more CPU resources for applications.
We ran some basic tests using a single wget download, first with the default FreeBSD TCP stack and then with the TCP connection offloaded to the Chelsio card using the TCP Offload Engine (TOE). With TOE, downloading the same file took roughly 1.5x to 5x less CPU time than with the default TCP stack. Wget was rate limited so we could compare CPU usage at different download speeds.
TCP TIMING: We noticed that short-lived connections of less than 0.6 seconds will NOT use the Chelsio TCP Offload Engine (TOE), even if TOE is allowed universally or through Chelsio Offload Policy (COP). We are not sure of the reason.
CPU usage of a single wget process downloading a file with the FreeBSD TCP stack compared to Chelsio TCP Offload Engine (TOE). Lower CPU time is better.

FreeBSD 12 (wget) -> TOE or TCP -> Ubuntu 16.04 Nginx HTTP

File size  : 487 MBytes
Source NIC : 1 Gbit/sec
MTU Size   : 1500 bytes

CPU Time     TOE         TCP
112 MB/s     0m0.370s    0m1.547s
 50 MB/s     0m0.420s    0m1.568s
 25 MB/s     0m0.363s    0m1.539s
 10 MB/s     0m0.266s    0m1.346s
  5 MB/s     0m0.247s    0m1.548s
  1 MB/s     0m0.264s    0m0.381s
500 kb/s     0m0.297s    0m0.473s

File size  : 487 MBytes
Source NIC : 10 Gbit/sec
MTU Size   : 9000 bytes

CPU Time     TOE         TCP
500 MB/s     -           0m0.562s
400 MB/s     0m0.191s    0m0.570s
300 MB/s     0m0.174s    0m0.681s
200 MB/s     0m0.174s    0m0.771s
100 MB/s     0m0.191s    0m0.702s
 10 MB/s     0m0.159s    0m0.801s
  5 MB/s     0m0.200s    0m0.756s
  1 MB/s     0m0.264s    0m0.431s
500 kb/s     0m0.336s    0m0.433s
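To turn any row of the table into a "how many times less CPU" figure, divide the TCP time by the TOE time. A minimal sketch follows, using awk for the floating point division. The helper name cpu_ratio is hypothetical, and the inputs are the seconds portion of the timings listed above.

```shell
#!/bin/sh
# Sketch only: cpu_ratio is a hypothetical helper.
# Usage: cpu_ratio TOE_SECONDS TCP_SECONDS
# Prints how many times more CPU time the TCP stack used than TOE.
cpu_ratio() {
    awk -v toe="$1" -v tcp="$2" 'BEGIN { printf "%.2f\n", tcp / toe }'
}

cpu_ratio 0.370 1.547   # 1 Gbit/sec source, 112 MB/s row, prints 4.18
cpu_ratio 0.264 0.381   # 1 Gbit/sec source, 1 MB/s row, prints 1.44
```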
The Chelsio Offload Policy (COP) manages when the TCP Offload Engine (TOE) takes effect, allowing the card to offload only the TCP connections you want offloaded and leave the other connections to the default FreeBSD TCP stack.
To apply the Chelsio Offload Policy (COP), run "cxgbetool t5nex0 policy /root/chelsio_offload_policy" once the chelsio_offload_policy file has been configured with your offload preferences. Make sure to add hw.cxgbe.cop_managed_offloading="1" to /boot/loader.conf so that TOE will only be enabled for connections defined in COP.
The Chelsio Offload Policy (COP) uses the following directives to tell the card which connections to apply offload logic to.
SECURITY NOTE: The Chelsio TCP Offload Engine (TOE) will completely bypass the FreeBSD TCP stack as well as any Chelsio filter rules. This means that traffic using TOE will NOT be filtered using our Chelsio Rules of Engagement filter rules or the Pf packet filter, nor will Pf log TOE connections. Netstat will show the connections using "netstat -np tcp" though.
Here are a few examples of a Chelsio Offload Policy (COP) config file:
# TOE only outgoing TCP connections. Incoming connections
# will still use FreeBSD's TCP stack including Pf.
$ cat /root/chelsio_offload_policy
[A] all => offload

# TOE incoming connections on port 80 and 443. If you have
# a Chelsio T6 card, TLS can also be offloaded to the card,
# T4 and T5 do not support TLS offloading.
$ cat /root/chelsio_offload_policy
[L] port 80 => offload
[L] port 443 => offload
[P] dst port 80 => offload
[P] dst port 443 => offload tls
Can Chelsio NAT replace Pf NAT ?
No. Chelsio Network Address Translation (NAT) is stateless NAT and FreeBSD's Pf is stateful NAT. Stateless NAT requires you to define every mapping from source IP address and port to destination IP address and port yourself. Stateful NAT, like in Pf or IPFW, does all the mapping for you.
What kernel modules are needed to start the Chelsio NIC on boot ?
To boot FreeBSD with the Chelsio T5 network card, use the following directives in /boot/loader.conf. Make sure to add cop_managed_offloading when defining Chelsio Offload Policy (COP) rules.
$ cat /boot/loader.conf

# Chelsio T5 (cxl) kernel module
#
#t4fw_cfg_load="YES"
t5fw_cfg_load="YES"
#t6fw_cfg_load="YES"
if_cxgbe_load="YES"

# Chelsio Offload Policy (COP) manages TCP Offload Engine (TOE)
hw.cxgbe.cop_managed_offloading="1"
How can I enable the TCP Offload Engine (TOE) on boot ?
Use /etc/rc.local to load the TOE kernel driver on boot and then enable the Direct Data Placement (DDP) and Zero Copy sysctl variables. The following rc.local will load the t4_tom kernel module, enable TOE on both the cxl0 and cxl1 interfaces, enable DDP and ZCopy, and then apply our chelsio_rules_of_engagement rules from the top of this page.
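$ cat /etc/rc.local

# Load t4_tom only if it is not already loaded. The command
# substitution is quoted so the test also works when grep matches.
if [ -z "$(/sbin/kldstat | /usr/bin/grep 't4_tom\.ko')" ]; then
    /sbin/kldload t4_tom && \
    /sbin/ifconfig cxl0 toe && \
    /sbin/ifconfig cxl1 toe && \
    /sbin/sysctl -q dev.t5nex.0.toe.ddp=1 >/dev/null && \
    /sbin/sysctl -q dev.t5nex.0.toe.tx_zcopy=1 >/dev/null
fi

# Apply the hardware filter rules
/root/chelsio_rules_of_engagement.sh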
What is a "Can't set DCB Priority" error ?
TLDR: Disable the Link Layer Discovery Protocol (LLDP) on the switch.
Link Layer Discovery Protocol (LLDP) offload can be used for Data Center Bridging (DCB). The Data Center Bridging Capabilities Exchange Protocol (DCBX) is used to convey the capabilities and configuration between neighbors to ensure consistent configuration across the network. If LLDP is not configured properly then the DCB negotiation will fail and the Chelsio card will show the following errors before taking the interface offline. The easiest solution is to disable LLDP on the switch when LLDP is not needed.
# dmesg
cxgb4 0000:04:00.4: Coming up as MASTER: Initializing adapter
cxgb4 0000:04:00.4: Successfully configured using Firmware Configuration File "Firmware Default", version 0x0, computed checksum 0x0
cxgb4 0000:04:00.4 eth0: Chelsio T520-CR rev 1 1000/10GBASE-R SFP+ RNIC PCIe x8 8 GT/s MSI-X
cxgb4 0000:02:00.4 enp2s0f4: SR module inserted
csiostor 0000:02:00.6: Port:0 - LINK DOWN
cxgb4 0000:02:00.4 enp2s0f4: link up, 10Gbps, full-duplex, Tx/Rx PAUSE
csiostor 0000:02:00.6: Port:0 - LINK UP
IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0f4: link becomes ready
cxgb4 0000:02:00.4 enp2s0f4: TX Packet without VLAN Tag on DCB Link
cxgb4 0000:02:00.4 enp2s0f4: TX Packet without VLAN Tag on DCB Link
cxgb4 0000:02:00.4 enp2s0f4: TX Packet without VLAN Tag on DCB Link

# console
command 0x8 in mailbox 4 timed out
Can't set DCB Priority on port 0, TX Queue 0: err=110
Can't set DCB Priority on port 0, TX Queue 1: err=110
Can't set DCB Priority on port 0, TX Queue 2: err=110
Can't set DCB Priority on port 0, TX Queue 3: err=110
Can't set DCB Priority on port 0, TX Queue 4: err=110
Can't set DCB Priority on port 0, TX Queue 5: err=110
Can't set DCB Priority on port 0, TX Queue 6: err=110
Can't set DCB Priority on port 0, TX Queue 7: err=110
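If you suspect this problem on a machine, a quick way to confirm it is to count the matching lines in the kernel log. The following is a minimal sketch. The helper name count_dcb_errors is hypothetical, and it simply searches whatever log text it is given on standard input for the error string shown above.

```shell
#!/bin/sh
# Sketch only: count_dcb_errors is a hypothetical helper.
# Reads log text on stdin and prints the number of lines that
# report the DCB priority failure. grep -c exits non-zero when the
# count is 0, so "|| true" keeps the helper's exit status clean.
count_dcb_errors() {
    grep -c "Can't set DCB Priority" || true
}

# Typical use on a live system:
#   dmesg | count_dcb_errors
```

A count that keeps growing after the link comes up suggests the DCB negotiation is still failing and that LLDP on the switch port is worth checking.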