From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kavanagh, Mark B"
Subject: Re: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
Date: Wed, 4 Oct 2017 14:30:52 +0000
References: <1506636833-25851-1-git-send-email-mark.b.kavanagh@intel.com> <1506962749-106779-1-git-send-email-mark.b.kavanagh@intel.com> <1506962749-106779-3-git-send-email-mark.b.kavanagh@intel.com> <2601191342CEEE43887BDE71AB9772585FAA3D5A@IRSMSX103.ger.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Cc: "Hu, Jiayu", "Tan, Jianfeng", "Yigit, Ferruh", "thomas@monjalon.net"
To: "Ananyev, Konstantin", "dev@dpdk.org"
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id C4ED31B654 for ; Wed, 4 Oct 2017 16:30:57 +0200 (CEST)
In-Reply-To: <2601191342CEEE43887BDE71AB9772585FAA3D5A@IRSMSX103.ger.corp.intel.com>
Content-Language: en-US
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 2:32 PM
>To: Kavanagh, Mark B; dev@dpdk.org
>Cc: Hu, Jiayu; Tan, Jianfeng;
>Yigit, Ferruh; thomas@monjalon.net
>Subject: RE: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>
>Hi Mark,
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Monday, October 2, 2017 5:46 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu; Tan, Jianfeng; Ananyev, Konstantin; Yigit,
>> Ferruh; thomas@monjalon.net; Kavanagh, Mark B
>> Subject: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>>
>> From: Jiayu Hu
>>
>> This patch adds GSO support for TCP/IPv4 packets. Supported packets
>> may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
>> packets have correct checksums, and doesn't update checksums for
>> output packets (the responsibility for this lies with the application).
>> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
>>
>> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
>> MBUF, to organize an output packet. Note that we refer to these two
>> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
>> header, while the indirect mbuf simply points to a location within the
>> original packet's payload. Consequently, use of the GSO library requires
>> multi-segment MBUF support in the TX functions of the NIC driver.
>>
>> If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
>> result, when all of its GSOed segments are freed, the packet is freed
>> automatically.
>>
>> Signed-off-by: Jiayu Hu
>> Signed-off-by: Mark Kavanagh
>> Tested-by: Lei Yao
>> ---
>>  doc/guides/rel_notes/release_17_11.rst  |  12 +++
>>  lib/librte_eal/common/include/rte_log.h |   1 +
>>  lib/librte_gso/Makefile                 |   2 +
>>  lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
>>  lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
>>  lib/librte_gso/rte_gso.c                |  52 ++++++++++-
>>  8 files changed, 536 insertions(+), 3 deletions(-)
>>  create mode 100644 lib/librte_gso/gso_common.c
>>  create mode 100644 lib/librte_gso/gso_common.h
>>  create mode 100644 lib/librte_gso/gso_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tcp4.h
>>
>> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
>> index 7508be7..c414f73 100644
>> --- a/doc/guides/rel_notes/release_17_11.rst
>> +++ b/doc/guides/rel_notes/release_17_11.rst
>> @@ -41,6 +41,18 @@ New Features
>>       Also, make sure to start the actual text at the margin.
>>       =========================================================
>>
>> +* **Added the Generic Segmentation Offload Library.**
>> +
>> +  Added the Generic Segmentation Offload (GSO) library to enable
>> +  applications to split large packets (e.g. MTU is 64KB) into small
>> +  ones (e.g. MTU is 1500B). Supported packet types are:
>> +
>> +  * TCP/IPv4 packets, which may include a single VLAN tag.
>> +
>> +  The GSO library doesn't check if the input packets have correct
>> +  checksums, and doesn't update checksums for output packets.
>> +  Additionally, the GSO library doesn't process IP fragmented packets.
>> +
>>
>>  Resolved Issues
>>  ---------------
>> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
>> index ec8dba7..2fa1199 100644
>> --- a/lib/librte_eal/common/include/rte_log.h
>> +++ b/lib/librte_eal/common/include/rte_log.h
>> @@ -87,6 +87,7 @@ struct rte_logs {
>>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
>> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
>>
>>  /* these log types can be used in an application */
>>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1.
*/ >> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile >> index aeaacbc..2be64d1 100644 >> --- a/lib/librte_gso/Makefile >> +++ b/lib/librte_gso/Makefile >> @@ -42,6 +42,8 @@ LIBABIVER :=3D 1 >> >> #source files >> SRCS-$(CONFIG_RTE_LIBRTE_GSO) +=3D rte_gso.c >> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) +=3D gso_common.c >> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) +=3D gso_tcp4.c >> >> # install this header file >> SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include +=3D rte_gso.h >> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c >> new file mode 100644 >> index 0000000..ee75d4c >> --- /dev/null >> +++ b/lib/librte_gso/gso_common.c >> @@ -0,0 +1,153 @@ >> +/*- >> + * BSD LICENSE >> + * >> + * Copyright(c) 2017 Intel Corporation. All rights reserved. >> + * All rights reserved. >> + * >> + * Redistribution and use in source and binary forms, with or without >> + * modification, are permitted provided that the following conditions >> + * are met: >> + * >> + * * Redistributions of source code must retain the above copyright >> + * notice, this list of conditions and the following disclaimer. >> + * * Redistributions in binary form must reproduce the above copyri= ght >> + * notice, this list of conditions and the following disclaimer i= n >> + * the documentation and/or other materials provided with the >> + * distribution. >> + * * Neither the name of Intel Corporation nor the names of its >> + * contributors may be used to endorse or promote products derive= d >> + * from this software without specific prior written permission. >> + * >> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTOR= S >> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT >> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS = FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIG= HT >> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENT= AL, >> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT >> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF U= SE, >> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON = ANY >> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TOR= T >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE = USE >> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAG= E. >> + */ >> + >> +#include >> +#include >> + >> +#include >> +#include >> + >> +#include "gso_common.h" >> + >> +static inline void >> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt, >> + uint16_t pkt_hdr_offset) >> +{ >> + /* Copy MBUF metadata */ >> + hdr_segment->nb_segs =3D 1; >> + hdr_segment->port =3D pkt->port; >> + hdr_segment->ol_flags =3D pkt->ol_flags; >> + hdr_segment->packet_type =3D pkt->packet_type; >> + hdr_segment->pkt_len =3D pkt_hdr_offset; >> + hdr_segment->data_len =3D pkt_hdr_offset; >> + hdr_segment->tx_offload =3D pkt->tx_offload; >> + >> + /* Copy the packet header */ >> + rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *), >> + rte_pktmbuf_mtod(pkt, char *), >> + pkt_hdr_offset); >> +} >> + >> +static inline void >> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts) >> +{ >> + uint16_t i; >> + >> + for (i =3D 0; i < nb_pkts; i++) >> + rte_pktmbuf_free(pkts[i]); >> +} >> + >> +int >> +gso_do_segment(struct rte_mbuf *pkt, >> + uint16_t pkt_hdr_offset, >> + uint16_t pyld_unit_size, >> + struct rte_mempool *direct_pool, >> + struct rte_mempool *indirect_pool, >> + struct rte_mbuf **pkts_out, >> + uint16_t nb_pkts_out) >> +{ >> + struct rte_mbuf *pkt_in; >> + struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment; >> + uint16_t pkt_in_data_pos, segment_bytes_remaining; >> + uint16_t pyld_len, nb_segs; >> + bool more_in_pkt, more_out_segs; >> + >> 
+ pkt_in =3D pkt; >> + nb_segs =3D 0; >> + more_in_pkt =3D 1; >> + pkt_in_data_pos =3D pkt_hdr_offset; >> + >> + while (more_in_pkt) { >> + if (unlikely(nb_segs >=3D nb_pkts_out)) { >> + free_gso_segment(pkts_out, nb_segs); >> + return -EINVAL; >> + } >> + >> + /* Allocate a direct MBUF */ >> + hdr_segment =3D rte_pktmbuf_alloc(direct_pool); >> + if (unlikely(hdr_segment =3D=3D NULL)) { >> + free_gso_segment(pkts_out, nb_segs); >> + return -ENOMEM; >> + } >> + /* Fill the packet header */ >> + hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset); >> + >> + prev_segment =3D hdr_segment; >> + segment_bytes_remaining =3D pyld_unit_size; >> + more_out_segs =3D 1; >> + >> + while (more_out_segs && more_in_pkt) { >> + /* Allocate an indirect MBUF */ >> + pyld_segment =3D rte_pktmbuf_alloc(indirect_pool); >> + if (unlikely(pyld_segment =3D=3D NULL)) { >> + rte_pktmbuf_free(hdr_segment); >> + free_gso_segment(pkts_out, nb_segs); >> + return -ENOMEM; >> + } >> + /* Attach to current MBUF segment of pkt */ >> + rte_pktmbuf_attach(pyld_segment, pkt_in); >> + >> + prev_segment->next =3D pyld_segment; >> + prev_segment =3D pyld_segment; >> + >> + pyld_len =3D segment_bytes_remaining; >> + if (pyld_len + pkt_in_data_pos > pkt_in->data_len) >> + pyld_len =3D pkt_in->data_len - pkt_in_data_pos; >> + >> + pyld_segment->data_off =3D pkt_in_data_pos + >> + pkt_in->data_off; >> + pyld_segment->data_len =3D pyld_len; >> + >> + /* Update header segment */ >> + hdr_segment->pkt_len +=3D pyld_len; >> + hdr_segment->nb_segs++; >> + >> + pkt_in_data_pos +=3D pyld_len; >> + segment_bytes_remaining -=3D pyld_len; >> + >> + /* Finish processing a MBUF segment of pkt */ >> + if (pkt_in_data_pos =3D=3D pkt_in->data_len) { >> + pkt_in =3D pkt_in->next; >> + pkt_in_data_pos =3D 0; >> + if (pkt_in =3D=3D NULL) >> + more_in_pkt =3D 0; >> + } >> + >> + /* Finish generating a GSO segment */ >> + if (segment_bytes_remaining =3D=3D 0) >> + more_out_segs =3D 0; >> + } >> + pkts_out[nb_segs++] =3D 
hdr_segment; >> + } >> + return nb_segs; >> +} >> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h >> new file mode 100644 >> index 0000000..8d9b94e >> --- /dev/null >> +++ b/lib/librte_gso/gso_common.h >> @@ -0,0 +1,141 @@ >> +/*- >> + * BSD LICENSE >> + * >> + * Copyright(c) 2017 Intel Corporation. All rights reserved. >> + * All rights reserved. >> + * >> + * Redistribution and use in source and binary forms, with or without >> + * modification, are permitted provided that the following conditions >> + * are met: >> + * >> + * * Redistributions of source code must retain the above copyright >> + * notice, this list of conditions and the following disclaimer. >> + * * Redistributions in binary form must reproduce the above copyri= ght >> + * notice, this list of conditions and the following disclaimer i= n >> + * the documentation and/or other materials provided with the >> + * distribution. >> + * * Neither the name of Intel Corporation nor the names of its >> + * contributors may be used to endorse or promote products derive= d >> + * from this software without specific prior written permission. >> + * >> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTOR= S >> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT >> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS = FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIG= HT >> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENT= AL, >> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT >> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF U= SE, >> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON = ANY >> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TOR= T >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE = USE >> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAG= E. 
>> + */ >> + >> +#ifndef _GSO_COMMON_H_ >> +#define _GSO_COMMON_H_ >> + >> +#include >> + >> +#include >> +#include >> +#include >> + >> +#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != =3D 0 \ >> + || ((frag_off) & IPV4_HDR_MF_FLAG) =3D=3D IPV4_HDR_MF_FLAG) >> + >> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08) >> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01) >> + >> +#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) = =3D=3D \ >> + (PKT_TX_TCP_SEG | PKT_TX_IPV4)) >> + >> +/** >> + * Internal function which updates the TCP header of a packet, followin= g >> + * segmentation. This is required to update the header's 'sent' sequenc= e >> + * number, and also to clear 'PSH' and 'FIN' flags for non-tail segment= s. >> + * >> + * @param pkt >> + * The packet containing the TCP header. >> + * @param l4_offset >> + * The offset of the TCP header from the start of the packet. >> + * @param sent_seq >> + * The sent sequence number. >> + * @param non-tail >> + * Indicates whether or not this is a tail segment. >> + */ >> +static inline void >> +update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t >sent_seq, >> + uint8_t non_tail) >> +{ >> + struct tcp_hdr *tcp_hdr; >> + >> + tcp_hdr =3D (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) + >> + l4_offset); >> + tcp_hdr->sent_seq =3D rte_cpu_to_be_32(sent_seq); >> + if (likely(non_tail)) >> + tcp_hdr->tcp_flags &=3D (~(TCP_HDR_PSH_MASK | >> + TCP_HDR_FIN_MASK)); >> +} >> + >> +/** >> + * Internal function which updates the IPv4 header of a packet, followi= ng >> + * segmentation. This is required to update the header's 'total_length' >field, >> + * to reflect the reduced length of the now-segmented packet. Furthermo= re, >the >> + * header's 'packet_id' field must be updated to reflect the new ID of = the >> + * now-segmented packet. >> + * >> + * @param pkt >> + * The packet containing the IPv4 header. 
>> + * @param l3_offset >> + * The offset of the IPv4 header from the start of the packet. >> + * @param id >> + * The new ID of the packet. >> + */ >> +static inline void >> +update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t i= d) >> +{ >> + struct ipv4_hdr *ipv4_hdr; >> + >> + ipv4_hdr =3D (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + >> + l3_offset); >> + ipv4_hdr->total_length =3D rte_cpu_to_be_16(pkt->pkt_len - l3_offset); >> + ipv4_hdr->packet_id =3D rte_cpu_to_be_16(id); >> +} >> + >> +/** >> + * Internal function which divides the input packet into small segments= . >> + * Each of the newly-created segments is organized as a two-segment MBU= F, >> + * where the first segment is a standard mbuf, which stores a copy of >> + * packet header, and the second is an indirect mbuf which points to a >> + * section of data in the input packet. >> + * >> + * @param pkt >> + * Packet to segment. >> + * @param pkt_hdr_offset >> + * Packet header offset, measured in bytes. >> + * @param pyld_unit_size >> + * The max payload length of a GSO segment. >> + * @param direct_pool >> + * MBUF pool used for allocating direct buffers for output segments. >> + * @param indirect_pool >> + * MBUF pool used for allocating indirect buffers for output segments. >> + * @param pkts_out >> + * Pointer array used to keep the mbuf addresses of output segments. I= f >> + * the memory space in pkts_out is insufficient, gso_do_segment() fail= s >> + * and returns -EINVAL. >> + * @param nb_pkts_out >> + * The max number of items that pkts_out can keep. >> + * >> + * @return >> + * - The number of segments created in the event of success. >> + * - Return -ENOMEM if run out of memory in MBUF pools. >> + * - Return -EINVAL for invalid parameters. 
>> + */ >> +int gso_do_segment(struct rte_mbuf *pkt, >> + uint16_t pkt_hdr_offset, >> + uint16_t pyld_unit_size, >> + struct rte_mempool *direct_pool, >> + struct rte_mempool *indirect_pool, >> + struct rte_mbuf **pkts_out, >> + uint16_t nb_pkts_out); >> +#endif >> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c >> new file mode 100644 >> index 0000000..d83e610 >> --- /dev/null >> +++ b/lib/librte_gso/gso_tcp4.c >> @@ -0,0 +1,104 @@ >> +/*- >> + * BSD LICENSE >> + * >> + * Copyright(c) 2017 Intel Corporation. All rights reserved. >> + * All rights reserved. >> + * >> + * Redistribution and use in source and binary forms, with or without >> + * modification, are permitted provided that the following conditions >> + * are met: >> + * >> + * * Redistributions of source code must retain the above copyright >> + * notice, this list of conditions and the following disclaimer. >> + * * Redistributions in binary form must reproduce the above copyri= ght >> + * notice, this list of conditions and the following disclaimer i= n >> + * the documentation and/or other materials provided with the >> + * distribution. >> + * * Neither the name of Intel Corporation nor the names of its >> + * contributors may be used to endorse or promote products derive= d >> + * from this software without specific prior written permission. >> + * >> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTOR= S >> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT >> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS = FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIG= HT >> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENT= AL, >> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT >> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF U= SE, >> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON = ANY >> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TOR= T >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE = USE >> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAG= E. >> + */ >> + >> +#include "gso_common.h" >> +#include "gso_tcp4.h" >> + >> +static void >> +update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta, >> + struct rte_mbuf **segs, uint16_t nb_segs) >> +{ >> + struct ipv4_hdr *ipv4_hdr; >> + struct tcp_hdr *tcp_hdr; >> + uint32_t sent_seq; >> + uint16_t id, tail_idx, i; >> + uint16_t l3_offset =3D pkt->l2_len; >> + uint16_t l4_offset =3D l3_offset + pkt->l3_len; >> + >> + ipv4_hdr =3D (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) + >> + l3_offset); >> + tcp_hdr =3D (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len); >> + id =3D rte_be_to_cpu_16(ipv4_hdr->packet_id); >> + sent_seq =3D rte_be_to_cpu_32(tcp_hdr->sent_seq); >> + tail_idx =3D nb_segs - 1; >> + >> + for (i =3D 0; i < nb_segs; i++) { >> + update_ipv4_header(segs[i], l3_offset, id); >> + update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx); >> + id +=3D ipid_delta; >> + sent_seq +=3D (segs[i]->pkt_len - segs[i]->data_len); >> + } >> +} >> + >> +int >> +gso_tcp4_segment(struct rte_mbuf *pkt, >> + uint16_t gso_size, >> + uint8_t ipid_delta, >> + struct rte_mempool *direct_pool, >> + struct rte_mempool *indirect_pool, >> + struct rte_mbuf **pkts_out, >> + uint16_t nb_pkts_out) >> +{ >> + struct ipv4_hdr *ipv4_hdr; >> + uint16_t tcp_dl; >> + uint16_t pyld_unit_size, hdr_offset; >> + uint16_t frag_off; >> + int ret; >> + >> + /* Don't process the fragmented packet */ >> + 
ipv4_hdr =3D (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + >> + pkt->l2_len); >> + frag_off =3D rte_be_to_cpu_16(ipv4_hdr->fragment_offset); >> + if (unlikely(IS_FRAGMENTED(frag_off))) { >> + pkts_out[0] =3D pkt; >> + return 1; >> + } >> + >> + /* Don't process the packet without data */ >> + tcp_dl =3D pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len; >> + if (unlikely(tcp_dl =3D=3D 0)) { >> + pkts_out[0] =3D pkt; >> + return 1; >> + } >> + >> + hdr_offset =3D pkt->l2_len + pkt->l3_len + pkt->l4_len; >> + pyld_unit_size =3D gso_size - hdr_offset; >> + >> + /* Segment the payload */ >> + ret =3D gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool, >> + indirect_pool, pkts_out, nb_pkts_out); >> + if (ret > 1) >> + update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret); >> + >> + return ret; >> +} >> diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h >> new file mode 100644 >> index 0000000..1c57441 >> --- /dev/null >> +++ b/lib/librte_gso/gso_tcp4.h >> @@ -0,0 +1,74 @@ >> +/*- >> + * BSD LICENSE >> + * >> + * Copyright(c) 2017 Intel Corporation. All rights reserved. >> + * All rights reserved. >> + * >> + * Redistribution and use in source and binary forms, with or without >> + * modification, are permitted provided that the following conditions >> + * are met: >> + * >> + * * Redistributions of source code must retain the above copyright >> + * notice, this list of conditions and the following disclaimer. >> + * * Redistributions in binary form must reproduce the above copyri= ght >> + * notice, this list of conditions and the following disclaimer i= n >> + * the documentation and/or other materials provided with the >> + * distribution. >> + * * Neither the name of Intel Corporation nor the names of its >> + * contributors may be used to endorse or promote products derive= d >> + * from this software without specific prior written permission. 
>> + * >> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTOR= S >> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT >> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS = FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIG= HT >> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENT= AL, >> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT >> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF U= SE, >> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON = ANY >> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TOR= T >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE = USE >> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAG= E. >> + */ >> + >> +#ifndef _GSO_TCP4_H_ >> +#define _GSO_TCP4_H_ >> + >> +#include >> +#include >> + >> +/** >> + * Segment an IPv4/TCP packet. This function doesn't check if the input >> + * packet has correct checksums, and doesn't update checksums for outpu= t >> + * GSO segments. Furthermore, it doesn't process IP fragment packets. >> + * >> + * @param pkt >> + * The packet mbuf to segment. >> + * @param gso_size >> + * The max length of a GSO segment, measured in bytes. >> + * @param ipid_delta >> + * The increasing unit of IP ids. >> + * @param direct_pool >> + * MBUF pool used for allocating direct buffers for output segments. >> + * @param indirect_pool >> + * MBUF pool used for allocating indirect buffers for output segments. >> + * @param pkts_out >> + * Pointer array used to store the MBUF addresses of output GSO >> + * segments, when the function succeeds. If the memory space in >> + * pkts_out is insufficient, it fails and returns -EINVAL. >> + * @param nb_pkts_out >> + * The max number of items that 'pkts_out' can keep. >> + * >> + * @return >> + * - The number of GSO segments filled in pkts_out on success. 
>> + * - Return -ENOMEM if run out of memory in MBUF pools. >> + * - Return -EINVAL for invalid parameters. >> + */ >> +int gso_tcp4_segment(struct rte_mbuf *pkt, >> + uint16_t gso_size, >> + uint8_t ip_delta, >> + struct rte_mempool *direct_pool, >> + struct rte_mempool *indirect_pool, >> + struct rte_mbuf **pkts_out, >> + uint16_t nb_pkts_out); >> +#endif >> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c >> index b773636..a4fce50 100644 >> --- a/lib/librte_gso/rte_gso.c >> +++ b/lib/librte_gso/rte_gso.c >> @@ -33,7 +33,12 @@ >> >> #include >> >> +#include >> +#include >> + >> #include "rte_gso.h" >> +#include "gso_common.h" >> +#include "gso_tcp4.h" >> >> int >> rte_gso_segment(struct rte_mbuf *pkt, >> @@ -41,12 +46,53 @@ >> struct rte_mbuf **pkts_out, >> uint16_t nb_pkts_out) >> { >> + struct rte_mempool *direct_pool, *indirect_pool; >> + struct rte_mbuf *pkt_seg; >> + uint64_t ol_flags; >> + uint16_t gso_size; >> + uint8_t ipid_delta; >> + int ret =3D 1; >> + >> if (pkt =3D=3D NULL || pkts_out =3D=3D NULL || gso_ctx =3D=3D NULL || >> nb_pkts_out < 1) >> return -EINVAL; >> >> - pkt->ol_flags &=3D (~PKT_TX_TCP_SEG); >> - pkts_out[0] =3D pkt; >> + if ((gso_ctx->gso_size >=3D pkt->pkt_len) || (gso_ctx->gso_types & >> + DEV_TX_OFFLOAD_TCP_TSO) !=3D >> + gso_ctx->gso_types) { >> + pkt->ol_flags &=3D (~PKT_TX_TCP_SEG); >> + pkts_out[0] =3D pkt; >> + return 1; >> + } >> + >> + direct_pool =3D gso_ctx->direct_pool; >> + indirect_pool =3D gso_ctx->indirect_pool; >> + gso_size =3D gso_ctx->gso_size; >> + ipid_delta =3D (gso_ctx->ipid_flag !=3D RTE_GSO_IPID_FIXED); >> + ol_flags =3D pkt->ol_flags; >> + >> + if (IS_IPV4_TCP(pkt->ol_flags)) { >> + pkt->ol_flags &=3D (~PKT_TX_TCP_SEG); >> + ret =3D gso_tcp4_segment(pkt, gso_size, ipid_delta, >> + direct_pool, indirect_pool, >> + pkts_out, nb_pkts_out); >> + } else { >> + pkt->ol_flags &=3D (~PKT_TX_TCP_SEG); > >Not sure why do you clean this flag if you don't support that packet type >and no action was 
performed?
>Suppose you have a mix ipv4 and ipv6 packets - gso lib would do ipv4 and
>someone else
>(HW?) can do ipv4 segmentation.

I can't say for definite, since I didn't implement this change. However, I can only presume that the assumption here is that, since segmentation is being done in S/W, the underlying H/W does not support TSO.

Since the underlying H/W can't segment the packet itself, we should clear the flag; otherwise, if an mbuf marked for TCP segmentation is passed to the driver of a NIC that does not support/understand that feature, the behavior is undefined.

Is this a fair assumption in your opinion, or is it the case that the packet would simply be transmitted un-segmented in that case, and so we shouldn't clear the flag?

Thanks again,
Mark

>BTW, did you notice that building of shared target fails?
>Konstantin

I didn't, but I'll take a look right now - thanks for the catch!

>
>
>> +		pkts_out[0] = pkt;
>> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
>> +		return 1;
>> +	}
>> +
>> +	if (ret > 1) {
>> +		pkt_seg = pkt;
>> +		while (pkt_seg) {
>> +			rte_mbuf_refcnt_update(pkt_seg, -1);
>> +			pkt_seg = pkt_seg->next;
>> +		}
>> +	} else if (ret < 0) {
>> +		/* Revert the ol_flags in the event of failure. */
>> +		pkt->ol_flags = ol_flags;
>> +	}
>>
>> -	return 1;
>> +	return ret;
>>  }
>> --
>> 1.9.3
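
For reference, the length, sequence-number and IP ID arithmetic that the quoted gso_tcp4_segment() and update_ipv4_tcp_headers() apply can be sketched on plain integers, without any DPDK dependency. This is only an illustration under stated assumptions: plan_segments() and struct seg_info are hypothetical names that do not exist in librte_gso, all values are in host byte order, and the check that gso_size exceeds the header length is an addition of this sketch rather than something the quoted patch performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one output segment (not a DPDK type). */
struct seg_info {
	uint16_t payload_len; /* TCP payload bytes carried by this segment */
	uint32_t sent_seq;    /* TCP sequence number, host byte order */
	uint16_t ip_id;       /* IPv4 packet ID, host byte order */
};

/*
 * Mirror the arithmetic of the quoted patch on plain integers:
 *   payload unit = gso_size - header length
 *   each segment's sequence number advances by the previous payload length
 *   each segment's IP ID advances by ipid_delta (0 or 1 in the patch)
 *
 * Returns the number of segments, or -1 if 'out' is too small or the
 * inputs leave no room for payload. (The patch itself returns -EINVAL
 * for a too-small array and passes a payload-less packet through
 * unchanged; this sketch simply reports -1 in both cases.)
 */
int
plan_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size,
	uint32_t first_seq, uint16_t first_id, uint8_t ipid_delta,
	struct seg_info *out, int max_out)
{
	uint32_t remaining;
	uint16_t unit;
	int n = 0;

	assert(out != NULL);

	/* Assumption of this sketch: reject sizes that would underflow. */
	if (gso_size <= hdr_len || pkt_len <= hdr_len)
		return -1;

	unit = (uint16_t)(gso_size - hdr_len);
	remaining = pkt_len - hdr_len;

	while (remaining > 0) {
		if (n >= max_out)
			return -1;

		out[n].payload_len =
			remaining < unit ? (uint16_t)remaining : unit;
		out[n].sent_seq = first_seq;
		out[n].ip_id = first_id;

		first_seq += out[n].payload_len;
		first_id = (uint16_t)(first_id + ipid_delta);
		remaining -= out[n].payload_len;
		n++;
	}
	return n;
}
```

With a 4054-byte packet, a 54-byte Ethernet/IPv4/TCP header and a gso_size of 1514, the payload unit is 1460 bytes, so plan_segments(4054, 54, 1514, 1000, 7, 1, segs, 8) yields three segments carrying 1460, 1460 and 1080 payload bytes, with sequence numbers 1000, 2460 and 3920 and IP IDs 7, 8 and 9.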