Hi Aurelien,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]
[also build test WARNING on next-20221026]
[cannot apply to net/master linus/master v6.1-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Aurelien-Aptel/nvme-tcp-receive-offloads/20221025-221001
patch link:    https://lore.kernel.org/r/20221025135958.6242-11-aaptel%40nvidia.com
patch subject: [PATCH v7 10/23] Documentation: add ULP DDP offload documentation
reproduce:
        # https://github.com/intel-lab-lkp/linux/commit/c0839ff3b217d1ad295c08fc8b3c07d64eefcf4f
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Aurelien-Aptel/nvme-tcp-receive-offloads/20221025-221001
        git checkout c0839ff3b217d1ad295c08fc8b3c07d64eefcf4f
        make menuconfig
        # enable CONFIG_COMPILE_TEST, CONFIG_WARN_MISSING_DOCUMENTS, CONFIG_WARN_ABI_ERRORS
        make htmldocs

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot

All warnings (new ones prefixed by >>):

>> Documentation/networking/ulp-ddp-offload.rst:239: WARNING: Error in "code-block" directive:
>> Documentation/networking/ulp-ddp-offload.rst:60: WARNING: undefined label: tls_offload (if the link has no caption the label must precede a section header)

vim +239 Documentation/networking/ulp-ddp-offload.rst

    59
  > 60  Offloading does require NIC hardware to track L5P protocol framing, similarly
    61  to RX TLS offload (see documentation at
    62  :ref:`Documentation/networking/tls-offload.rst `). NIC hardware
    63  will parse PDU headers, extract fields such as operation type, length, tag
    64  identifier, etc. and only offload segments that correspond to tags registered
    65  with the NIC, see the :ref:`buf_reg` section.
    66
    67  Device configuration
    68  ====================
    69
    70  During driver initialization the device sets the ``NETIF_F_HW_ULP_DDP`` feature
    71  and installs its
    72  :c:type:`struct ulp_ddp_ops `
    73  pointer in the :c:member:`ulp_ddp_ops` member of the
    74  :c:type:`struct net_device `.
    75
    76  Later, after the L5P completes its handshake, the L5P queries the
    77  device driver for its ULP capabilities (:c:type:`enum ulp_ddp_offload_capabilities`)
    78  and runtime limitations via the :c:member:`ulp_ddp_limits` callback:
    79
    80  .. code-block:: c
    81
    82    int (*ulp_ddp_limits)(struct net_device *netdev,
    83                          struct ulp_ddp_limits *limits);
    84
    85  The current list of capabilities is:
    86
    87  .. code-block:: c
    88
    89    enum ulp_ddp_offload_capabilities {
    90          ULP_DDP_C_NVME_TCP = 1,
    91          ULP_DDP_C_NVME_TCP_DDGST_RX = 2,
    92    };
    93
    94  All L5P share a common set of limits and parameters (:c:type:`struct ulp_ddp_limits`):
    95
    96  .. code-block:: c
    97
    98    /**
    99     * struct ulp_ddp_limits - Generic ulp ddp limits: tcp ddp
   100     * protocol limits.
   101     * Protocol implementations must use this as the first member.
   102     * Add new instances of ulp_ddp_limits below (nvme-tcp, etc.).
   103     *
   104     * @max_ddp_sgl_len: maximum sgl size supported (zero means no limit)
   105     * @io_threshold: minimum payload size required to offload
   106     */
   107    struct ulp_ddp_limits {
   108          enum ulp_ddp_type type;
   109          u64 offload_capabilities;
   110          int max_ddp_sgl_len;
   111          int io_threshold;
   112          unsigned char buf[];
   113    };
   114
   115  But each L5P can also add protocol-specific limits e.g.:
   116
   117  .. code-block:: c
   118
   119    /**
   120     * struct nvme_tcp_ddp_limits - nvme tcp driver limitations
   121     *
   122     * @full_ccid_range: true if the driver supports the full CID range
   123     */
   124    struct nvme_tcp_ddp_limits {
   125          struct ulp_ddp_limits lmt;
   126
   127          bool full_ccid_range;
   128    };
   129
   130  Once the L5P has made sure the device is supported the offload
   131  operations are installed on the socket.
   132
   133  If offload installation fails, then the connection is handled by software as if
   134  offload was not attempted.
   135
   136  To request offload for a socket `sk`, the L5P calls :c:member:`ulp_ddp_sk_add`:
   137
   138  .. code-block:: c
   139
   140    int (*ulp_ddp_sk_add)(struct net_device *netdev,
   141                          struct sock *sk,
   142                          struct ulp_ddp_config *config);
   143
   144  The function return 0 for success. In case of failure, L5P software should
   145  fallback to normal non-offloaded operations.  The `config` parameter indicates
   146  the L5P type and any metadata relevant for that protocol. For example, in
   147  NVMe-TCP the following config is used:
   148
   149  .. code-block:: c
   150
   151    /**
   152     * struct nvme_tcp_ddp_config - nvme tcp ddp configuration for an IO queue
   153     *
   154     * @pfv:        pdu version (e.g., NVME_TCP_PFV_1_0)
   155     * @cpda:       controller pdu data alignment (dwords, 0's based)
   156     * @dgst:       digest types enabled.
   157     *              The netdev will offload crc if L5P data digest is supported.
   158     * @queue_size: number of nvme-tcp IO queue elements
   159     * @queue_id:   queue identifier
   160     * @cpu_io:     cpu core running the IO thread for this queue
   161     */
   162    struct nvme_tcp_ddp_config {
   163          struct ulp_ddp_config cfg;
   164
   165          u16 pfv;
   166          u8 cpda;
   167          u8 dgst;
   168          int queue_size;
   169          int queue_id;
   170          int io_cpu;
   171    };
   172
   173  When offload is not needed anymore, e.g. when the socket is being released, the L5P
   174  calls :c:member:`ulp_ddp_sk_del` to release device contexts:
   175
   176  .. code-block:: c
   177
   178    void (*ulp_ddp_sk_del)(struct net_device *netdev,
   179                           struct sock *sk);
   180
   181  Normal operation
   182  ================
   183
   184  At the very least, the device maintains the following state for each connection:
   185
   186   * 5-tuple
   187   * expected TCP sequence number
   188   * mapping between tags and corresponding buffers
   189   * current offset within PDU, PDU length, current PDU tag
   190
   191  NICs should not assume any correlation between PDUs and TCP packets.
   192  If TCP packets arrive in-order, offload will place PDU payloads
   193  directly inside corresponding registered buffers. NIC offload should
   194  not delay packets. If offload is not possible, than the packet is
   195  passed as-is to software. To perform offload on incoming packets
   196  without buffering packets in the NIC, the NIC stores some inter-packet
   197  state, such as partial PDU headers.
   198
   199  RX data-path
   200  ------------
   201
   202  After the device validates TCP checksums, it can perform DDP offload.  The
   203  packet is steered to the DDP offload context according to the 5-tuple.
   204  Thereafter, the expected TCP sequence number is checked against the packet
   205  TCP sequence number. If there is a match, offload is performed: the PDU payload
   206  is DMA written to the corresponding destination buffer according to the PDU header
   207  tag.  The data should be DMAed only once, and the NIC receive ring will only
   208  store the remaining TCP and PDU headers.
   209
   210  We remark that a single TCP packet may have numerous PDUs embedded inside. NICs
   211  can choose to offload one or more of these PDUs according to various
   212  trade-offs. Possibly, offloading such small PDUs is of little value, and it is
   213  better to leave it to software.
   214
   215  Upon receiving a DDP offloaded packet, the driver reconstructs the original SKB
   216  using page frags, while pointing to the destination buffers whenever possible.
   217  This method enables seamless integration with the network stack, which can
   218  inspect and modify packet fields transparently to the offload.
   219
   220  .. _buf_reg:
   221
   222  Destination buffer registration
   223  -------------------------------
   224
   225  To register the mapping between tags and destination buffers for a socket
   226  `sk`, the L5P calls :c:member:`ulp_ddp_setup` of :c:type:`struct ulp_ddp_ops
   227  `:
   228
   229  .. code-block:: c
   230
   231    int (*ulp_ddp_setup)(struct net_device *netdev,
   232                         struct sock *sk,
   233                         struct ulp_ddp_io *io);
   234
   235
   236  The `io` provides the buffer via scatter-gather list (`sg_table`) and
   237  corresponding tag (`command_id`):
   238
 > 239  .. code-block:: c
   240    /**
   241     * struct ulp_ddp_io - tcp ddp configuration for an IO request.
   242     *
   243     * @command_id:  identifier on the wire associated with these buffers
   244     * @nents:       number of entries in the sg_table
   245     * @sg_table:    describing the buffers for this IO request
   246     * @first_sgl:   first SGL in sg_table
   247     */
   248    struct ulp_ddp_io {
   249          u32 command_id;
   250          int nents;
   251          struct sg_table sg_table;
   252          struct scatterlist first_sgl[SG_CHUNK_SIZE];
   253    };
   254

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp