From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiayu Hu
Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
Date: Wed, 13 Sep 2017 10:48:01 +0800
Message-ID: <20170913024801.GB44293@dpdk15.sh.intel.com>
References: <1504598270-60080-1-git-send-email-jiayu.hu@intel.com>
 <1505184211-36728-1-git-send-email-jiayu.hu@intel.com>
 <1505184211-36728-3-git-send-email-jiayu.hu@intel.com>
 <2601191342CEEE43887BDE71AB9772584F249E8E@irsmsx105.ger.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "dev@dpdk.org", "Kavanagh, Mark B", "Tan, Jianfeng"
To: "Ananyev, Konstantin"
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 5A5161041
 for ; Wed, 13 Sep 2017 04:45:15 +0200 (CEST)
Content-Disposition: inline
In-Reply-To: <2601191342CEEE43887BDE71AB9772584F249E8E@irsmsx105.ger.corp.intel.com>
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Hi Konstantin,

On Tue, Sep 12, 2017 at 07:17:49PM +0800, Ananyev, Konstantin wrote:
> Hi Jayu,
>
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Tuesday, September 12, 2017 3:43 AM
> > To: dev@dpdk.org
> > Cc: Ananyev, Konstantin ; Kavanagh, Mark B ; Tan, Jianfeng
> > ; Hu, Jiayu
> > Subject: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > packets have correct checksums, and doesn't update checksums for output
> > packets (the responsibility for this lies with the application).
>
> Probably it shouldn't say that checksum have to be valid, right?
> As you don't update checksum(s) inside the lib - it probably doesn't matter.

Yes, you are right. It's better to use: "TCP/IPv4 GSO doesn't check if
checksums are correct and doesn't update checksums for output packets".
> > > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets. > > > > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indrect > > MBUF, to organize an output packet. Note that we refer to these two > > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet > > header, while the indirect mbuf simply points to a location within the > > original packet's payload. Consequently, use of the GSO library requires > > multi-segment MBUF support in the TX functions of the NIC driver. > > > > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a > > result, when all of its GSOed segments are freed, the packet is freed > > automatically. > > > > Signed-off-by: Jiayu Hu > > Signed-off-by: Mark Kavanagh > > --- > > lib/librte_eal/common/include/rte_log.h | 1 + > > lib/librte_gso/Makefile | 2 + > > lib/librte_gso/gso_common.c | 202 ++++++++++++++++++++++++++++++++ > > lib/librte_gso/gso_common.h | 113 ++++++++++++++++++ > > lib/librte_gso/gso_tcp4.c | 83 +++++++++++++ > > lib/librte_gso/gso_tcp4.h | 76 ++++++++++++ > > lib/librte_gso/rte_gso.c | 41 ++++++- > > 7 files changed, 515 insertions(+), 3 deletions(-) > > create mode 100644 lib/librte_gso/gso_common.c > > create mode 100644 lib/librte_gso/gso_common.h > > create mode 100644 lib/librte_gso/gso_tcp4.c > > create mode 100644 lib/librte_gso/gso_tcp4.h > > > > diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h > > index ec8dba7..2fa1199 100644 > > --- a/lib/librte_eal/common/include/rte_log.h > > +++ b/lib/librte_eal/common/include/rte_log.h > > @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs; > > #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */ > > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */ > > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */ > > +#define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. 
*/ > > > > /* these log types can be used in an application */ > > #define RTE_LOGTYPE_USER1 24 /**< User-defined log type 1. */ > > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile > > index aeaacbc..2be64d1 100644 > > --- a/lib/librte_gso/Makefile > > +++ b/lib/librte_gso/Makefile > > @@ -42,6 +42,8 @@ LIBABIVER := 1 > > > > #source files > > SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c > > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c > > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c > > > > # install this header file > > SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h > > diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c > > new file mode 100644 > > index 0000000..7c32e03 > > --- /dev/null > > +++ b/lib/librte_gso/gso_common.c > > @@ -0,0 +1,202 @@ > > +/*- > > + * BSD LICENSE > > + * > > + * Copyright(c) 2017 Intel Corporation. All rights reserved. > > + * All rights reserved. > > + * > > + * Redistribution and use in source and binary forms, with or without > > + * modification, are permitted provided that the following conditions > > + * are met: > > + * > > + * * Redistributions of source code must retain the above copyright > > + * notice, this list of conditions and the following disclaimer. > > + * * Redistributions in binary form must reproduce the above copyright > > + * notice, this list of conditions and the following disclaimer in > > + * the documentation and/or other materials provided with the > > + * distribution. > > + * * Neither the name of Intel Corporation nor the names of its > > + * contributors may be used to endorse or promote products derived > > + * from this software without specific prior written permission. > > + * > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > > + */ > > + > > +#include > > +#include > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +#include "gso_common.h" > > + > > +static inline void > > +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt, > > + uint16_t pkt_hdr_offset) > > +{ > > + /* Copy MBUF metadata */ > > + hdr_segment->nb_segs = 1; > > + hdr_segment->port = pkt->port; > > + hdr_segment->ol_flags = pkt->ol_flags; > > + hdr_segment->packet_type = pkt->packet_type; > > + hdr_segment->pkt_len = pkt_hdr_offset; > > + hdr_segment->data_len = pkt_hdr_offset; > > + hdr_segment->tx_offload = pkt->tx_offload; > > + > > + /* Copy the packet header */ > > + rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *), > > + rte_pktmbuf_mtod(pkt, char *), > > + pkt_hdr_offset); > > +} > > + > > +static inline void > > +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts) > > +{ > > + uint16_t i; > > + > > + for (i = 0; i < nb_pkts; i++) > > + rte_pktmbuf_free(pkts[i]); > > +} > > + > > +int > > +gso_do_segment(struct rte_mbuf *pkt, > > + uint16_t pkt_hdr_offset, > > + uint16_t pyld_unit_size, > > + struct rte_mempool *direct_pool, > > + struct rte_mempool *indirect_pool, > > + struct rte_mbuf **pkts_out, > > + uint16_t nb_pkts_out) > > +{ > > + struct rte_mbuf *pkt_in; > > + struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment; > > + uint16_t pkt_in_data_pos, segment_bytes_remaining; > > + 
uint16_t pyld_len, nb_segs; > > + bool more_in_pkt, more_out_segs; > > + > > + pkt_in = pkt; > > + nb_segs = 0; > > + more_in_pkt = 1; > > + pkt_in_data_pos = pkt_hdr_offset; > > + > > + while (more_in_pkt) { > > + if (unlikely(nb_segs >= nb_pkts_out)) { > > + free_gso_segment(pkts_out, nb_segs); > > + return -EINVAL; > > + } > > + > > + /* Allocate a direct MBUF */ > > + hdr_segment = rte_pktmbuf_alloc(direct_pool); > > + if (unlikely(hdr_segment == NULL)) { > > + free_gso_segment(pkts_out, nb_segs); > > + return -ENOMEM; > > + } > > + /* Fill the packet header */ > > + hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset); > > + > > + prev_segment = hdr_segment; > > + segment_bytes_remaining = pyld_unit_size; > > + more_out_segs = 1; > > + > > + while (more_out_segs && more_in_pkt) { > > + /* Allocate an indirect MBUF */ > > + pyld_segment = rte_pktmbuf_alloc(indirect_pool); > > + if (unlikely(pyld_segment == NULL)) { > > + rte_pktmbuf_free(hdr_segment); > > + free_gso_segment(pkts_out, nb_segs); > > + return -ENOMEM; > > + } > > + /* Attach to current MBUF segment of pkt */ > > + rte_pktmbuf_attach(pyld_segment, pkt_in); > > + > > + prev_segment->next = pyld_segment; > > + prev_segment = pyld_segment; > > + > > + pyld_len = segment_bytes_remaining; > > + if (pyld_len + pkt_in_data_pos > pkt_in->data_len) > > + pyld_len = pkt_in->data_len - pkt_in_data_pos; > > + > > + pyld_segment->data_off = pkt_in_data_pos + > > + pkt_in->data_off; > > + pyld_segment->data_len = pyld_len; > > + > > + /* Update header segment */ > > + hdr_segment->pkt_len += pyld_len; > > + hdr_segment->nb_segs++; > > + > > + pkt_in_data_pos += pyld_len; > > + segment_bytes_remaining -= pyld_len; > > + > > + /* Finish processing a MBUF segment of pkt */ > > + if (pkt_in_data_pos == pkt_in->data_len) { > > + pkt_in = pkt_in->next; > > + pkt_in_data_pos = 0; > > + if (pkt_in == NULL) > > + more_in_pkt = 0; > > + } > > + > > + /* Finish generating a GSO segment */ > > + if (segment_bytes_remaining == 
0) > > + more_out_segs = 0; > > + } > > + pkts_out[nb_segs++] = hdr_segment; > > + } > > + return nb_segs; > > +} > > + > > +static inline void > > +update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta, > > + struct rte_mbuf **segs, uint16_t nb_segs) > > +{ > > + struct tcp_hdr *tcp_hdr; > > + struct ipv4_hdr *ipv4_hdr; > > + struct rte_mbuf *seg; > > + uint32_t sent_seq; > > + uint16_t inner_l2_offset; > > + uint16_t id, i; > > + > > + inner_l2_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len; > > Shouldn't it be: pkt->l2_len here? > Or probably even better to pass l2_len as an input parameter. Oh, yes. Applications won't guarantee outer_l2_len and outer_l3_len are 0 for non-tunnelling packets. I will add l2_len as a parameter instead. > > > + ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + > > + inner_l2_offset); > > + tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len); > > + id = rte_be_to_cpu_16(ipv4_hdr->packet_id); > > + sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq); > > + > > + for (i = 0; i < nb_segs; i++) { > > + seg = segs[i]; > > + /* Update the inner IPv4 header */ > > + ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(seg, char *) + > > + inner_l2_offset); > > + ipv4_hdr->total_length = rte_cpu_to_be_16(seg->pkt_len - > > + inner_l2_offset); > > + ipv4_hdr->packet_id = rte_cpu_to_be_16(id); > > + id += ipid_delta; > > + > > + /* Update the inner TCP header */ > > + tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + seg->l3_len); > > + tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq); > > + if (likely(i < nb_segs - 1)) > > + tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK | > > + TCP_HDR_FIN_MASK)); > > + sent_seq += (seg->pkt_len - seg->data_len); > > + } > > +} > > + > > +void > > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta, > > + struct rte_mbuf **segs, uint16_t nb_segs) > > +{ > > + if (is_ipv4_tcp(pkt->packet_type)) > > + update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs); > > +} > > 
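To make the l2_len change concrete, here is a minimal, self-contained sketch of the per-segment values the header update computes once l2_len is passed in explicitly. It is only an illustration under stated assumptions: the struct and function names (sk_seg_fields, sk_fill_seg_fields) are hypothetical and not part of the patch, the fields are kept in host byte order, and real code would write them into the mbuf headers with rte_cpu_to_be_16()/rte_cpu_to_be_32().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the fields rewritten in each GSO segment. */
struct sk_seg_fields {
	uint16_t ip_total_length; /* host byte order in this sketch */
	uint16_t ip_id;
	uint32_t tcp_seq;
};

/*
 * seg_pkt_len[i] is the full length of output segment i (headers + payload).
 * l2_len is passed in by the caller, so nothing here depends on
 * outer_l2_len or outer_l3_len being zero for non-tunnelled packets.
 * hdr_len is l2_len + l3_len + l4_len, i.e. the size of the copied header.
 */
static void
sk_fill_seg_fields(const uint16_t *seg_pkt_len, size_t nb_segs,
		uint16_t l2_len, uint16_t hdr_len,
		uint16_t first_id, uint8_t ipid_delta, uint32_t first_seq,
		struct sk_seg_fields *out)
{
	uint16_t id = first_id;
	uint32_t seq = first_seq;
	size_t i;

	for (i = 0; i < nb_segs; i++) {
		/* IPv4 total length covers everything after the L2 header. */
		out[i].ip_total_length = (uint16_t)(seg_pkt_len[i] - l2_len);
		out[i].ip_id = id;
		out[i].tcp_seq = seq;

		/* The next segment continues after this segment's payload. */
		id = (uint16_t)(id + ipid_delta);
		seq += (uint32_t)(seg_pkt_len[i] - hdr_len);
	}
}
```

With three segments of 1514, 1514 and 554 bytes, l2_len 14 and hdr_len 54, the sequence number advances by 1460 bytes after each full-sized segment.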
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h > > new file mode 100644 > > index 0000000..3c76520 > > --- /dev/null > > +++ b/lib/librte_gso/gso_common.h > > @@ -0,0 +1,113 @@ > > +/*- > > + * BSD LICENSE > > + * > > + * Copyright(c) 2017 Intel Corporation. All rights reserved. > > + * All rights reserved. > > + * > > + * Redistribution and use in source and binary forms, with or without > > + * modification, are permitted provided that the following conditions > > + * are met: > > + * > > + * * Redistributions of source code must retain the above copyright > > + * notice, this list of conditions and the following disclaimer. > > + * * Redistributions in binary form must reproduce the above copyright > > + * notice, this list of conditions and the following disclaimer in > > + * the documentation and/or other materials provided with the > > + * distribution. > > + * * Neither the name of Intel Corporation nor the names of its > > + * contributors may be used to endorse or promote products derived > > + * from this software without specific prior written permission. > > + * > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
> > + */ > > + > > +#ifndef _GSO_COMMON_H_ > > +#define _GSO_COMMON_H_ > > + > > +#include > > +#include > > + > > +#define IPV4_HDR_DF_SHIFT 14 > > We have that already defined in librte_net/rte_ip.h Yes. I will remove it here. > > > +#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT) > > + > > +#define TCP_HDR_PSH_MASK ((uint8_t)0x08) > > +#define TCP_HDR_FIN_MASK ((uint8_t)0x01) > > + > > +#define ETHER_TCP_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L4_TCP) > > +#define ETHER_VLAN_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L4_TCP) > > +static inline uint8_t is_ipv4_tcp(uint32_t ptype) > > +{ > > + switch (ptype & (~RTE_PTYPE_L3_MASK)) { > > + case ETHER_VLAN_TCP_PKT: > > + case ETHER_TCP_PKT: > > Why not just: > return RTE_ETH_IS_IPV4_HDR(ptype) && (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP; > ? Yes, we don't need to check if the packet is vlan encapsulated. > > > + return RTE_ETH_IS_IPV4_HDR(ptype); > > + default: > > + return 0; > > + } > > +} > > + > > +/** > > + * Internal function which updates relevant packet headers, following > > + * segmentation. This is required to update, for example, the IPv4 > > + * 'total_length' field, to reflect the reduced length of the now- > > + * segmented packet. > > + * > > + * @param pkt > > + * The original packet. > > + * @param ipid_delta > > + * The increasing uint of IP ids. > > + * @param segs > > + * Pointer array used for storing mbuf addresses for GSO segments. > > + * @param nb_segs > > + * The number of GSO segments placed in segs. > > + */ > > +void gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta, > > + struct rte_mbuf **segs, uint16_t nb_segs); > > + > > +/** > > + * Internal function which divides the input packet into small segments. 
> > + * Each of the newly-created segments is organized as a two-segment MBUF, > > + * where the first segment is a standard mbuf, which stores a copy of > > + * packet header, and the second is an indirect mbuf which points to a > > + * section of data in the input packet. > > + * > > + * @param pkt > > + * Packet to segment. > > + * @param pkt_hdr_offset > > + * Packet header offset, measured in bytes. > > + * @param pyld_unit_size > > + * The max payload length of a GSO segment. > > + * @param direct_pool > > + * MBUF pool used for allocating direct buffers for output segments. > > + * @param indirect_pool > > + * MBUF pool used for allocating indirect buffers for output segments. > > + * @param pkts_out > > + * Pointer array used to keep the mbuf addresses of output segments. If > > + * the memory space in pkts_out is insufficient, gso_do_segment() fails > > + * and returns -EINVAL. > > + * @param nb_pkts_out > > + * The max number of items that pkts_out can keep. > > + * > > + * @return > > + * - The number of segments created in the event of success. > > + * - Return -ENOMEM if run out of memory in MBUF pools. > > + * - Return -EINVAL for invalid parameters. > > + */ > > +int gso_do_segment(struct rte_mbuf *pkt, > > + uint16_t pkt_hdr_offset, > > + uint16_t pyld_unit_size, > > + struct rte_mempool *direct_pool, > > + struct rte_mempool *indirect_pool, > > + struct rte_mbuf **pkts_out, > > + uint16_t nb_pkts_out); > > +#endif > > diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c > > new file mode 100644 > > index 0000000..8d4bfb2 > > --- /dev/null > > +++ b/lib/librte_gso/gso_tcp4.c > > @@ -0,0 +1,83 @@ > > +/*- > > + * BSD LICENSE > > + * > > + * Copyright(c) 2017 Intel Corporation. All rights reserved. > > + * All rights reserved. 
> > + * > > + * Redistribution and use in source and binary forms, with or without > > + * modification, are permitted provided that the following conditions > > + * are met: > > + * > > + * * Redistributions of source code must retain the above copyright > > + * notice, this list of conditions and the following disclaimer. > > + * * Redistributions in binary form must reproduce the above copyright > > + * notice, this list of conditions and the following disclaimer in > > + * the documentation and/or other materials provided with the > > + * distribution. > > + * * Neither the name of Intel Corporation nor the names of its > > + * contributors may be used to endorse or promote products derived > > + * from this software without specific prior written permission. > > + * > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
> > + */
> > +
> > +
> > +#include
> > +#include
> > +
> > +#include "gso_common.h"
> > +#include "gso_tcp4.h"
> > +
> > +int
> > +gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		uint8_t ipid_delta,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	uint16_t tcp_dl;
> > +	uint16_t pyld_unit_size;
> > +	uint16_t hdr_offset;
> > +	int ret = 1;
> > +
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			pkt->l2_len);
> > +	/* Don't process the fragmented packet */
> > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> > +				IPV4_HDR_DF_MASK)) == 0)) {
> >
> It is not a check for fragmented packet - it is a check that fragmentation is allowed for that packet.
> Should be IPV4_HDR_DF_MASK - 1, I think.

IMO, IPV4_HDR_DF_MASK, whose value is (1 << 14), is used to get the DF
bit. It is a host-byte-order value, but ipv4_hdr->fragment_offset is in
big-endian order. So the value of the DF bit should be
"ipv4_hdr->fragment_offset & rte_cpu_to_be_16(IPV4_HDR_DF_MASK)". If this
value is 0, the DF bit is clear, so the input packet may be fragmented.

>
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> > +		pkt->l4_len;
>
> Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?

Yes, we can use pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len
here.

>
> > +	/* Don't process the packet without data */
> > +	if (unlikely(tcp_dl == 0)) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
>
> Hmm, why do we need to count CRC_LEN here?

Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be included
in gso_size.
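For reference, a minimal sketch of the three points discussed above: telling a fragment apart from a packet that merely has DF clear, computing the TCP payload length from the mbuf lengths, and deriving the payload unit size without subtracting the CRC. The bit layout is the one from RFC 791; the sk_ names are hypothetical, and the field is assumed to have been converted to host byte order first (in DPDK that would be rte_be_to_cpu_16()). Note that the fragment mask used here, MF plus the 13 offset bits, equals (1 << 14) - 1, which is the value suggested in the review comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* 16-bit IPv4 flags/fragment-offset field, host byte order (RFC 791). */
#define SK_IPV4_DF_FLAG     0x4000u /* Don't Fragment */
#define SK_IPV4_MF_FLAG     0x2000u /* More Fragments */
#define SK_IPV4_OFFSET_MASK 0x1fffu /* fragment offset, in 8-byte units */

/* True only if the packet really is a fragment (first, middle or last). */
static bool
sk_ipv4_is_fragment(uint16_t frag_off_host)
{
	return (frag_off_host & (SK_IPV4_MF_FLAG | SK_IPV4_OFFSET_MASK)) != 0;
}

/* True if the sender forbids fragmentation; says nothing about fragments. */
static bool
sk_ipv4_df_set(uint16_t frag_off_host)
{
	return (frag_off_host & SK_IPV4_DF_FLAG) != 0;
}

/* TCP payload length taken from the mbuf lengths instead of total_length. */
static uint32_t
sk_tcp_payload_len(uint32_t pkt_len, uint16_t l2_len, uint16_t l3_len,
		uint16_t l4_len)
{
	return pkt_len - l2_len - l3_len - l4_len;
}

/* Payload bytes per output segment; gso_size already accounts for the CRC. */
static uint16_t
sk_pyld_unit_size(uint16_t gso_size, uint16_t hdr_offset)
{
	return (uint16_t)(gso_size - hdr_offset);
}
```

A packet with DF clear and neither MF nor an offset set is not a fragment, so under this sketch it would still be eligible for GSO.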
> > > + > > + /* Segment the payload */ > > + ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool, > > + indirect_pool, pkts_out, nb_pkts_out); > > + if (ret > 1) > > + gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret); > > + > > + return ret; > > +} > > diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h > > new file mode 100644 > > index 0000000..9c07984 > > --- /dev/null > > +++ b/lib/librte_gso/gso_tcp4.h > > @@ -0,0 +1,76 @@ > > +/*- > > + * BSD LICENSE > > + * > > + * Copyright(c) 2017 Intel Corporation. All rights reserved. > > + * All rights reserved. > > + * > > + * Redistribution and use in source and binary forms, with or without > > + * modification, are permitted provided that the following conditions > > + * are met: > > + * > > + * * Redistributions of source code must retain the above copyright > > + * notice, this list of conditions and the following disclaimer. > > + * * Redistributions in binary form must reproduce the above copyright > > + * notice, this list of conditions and the following disclaimer in > > + * the documentation and/or other materials provided with the > > + * distribution. > > + * * Neither the name of Intel Corporation nor the names of its > > + * contributors may be used to endorse or promote products derived > > + * from this software without specific prior written permission. > > + * > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > > + */ > > + > > +#ifndef _GSO_TCP4_H_ > > +#define _GSO_TCP4_H_ > > + > > +#include > > +#include > > + > > +/** > > + * Segment an IPv4/TCP packet. This function assumes the input packet has > > + * correct checksums and doesn't update checksums for GSO segment. > > + * Furthermore, it doesn't process IP fragment packets. > > + * > > + * @param pkt > > + * The packet mbuf to segment. > > + * @param gso_size > > + * The max length of a GSO segment, measured in bytes. > > + * @param ipid_delta > > + * The increasing uint of IP ids. > > + * @param direct_pool > > + * MBUF pool used for allocating direct buffers for output segments. > > + * @param indirect_pool > > + * MBUF pool used for allocating indirect buffers for output segments. > > + * @param pkts_out > > + * Pointer array used to store the MBUF addresses of output GSO > > + * segments, when gso_tcp4_segment() successes. If the memory space in > > + * pkts_out is insufficient, gso_tcp4_segment() fails and returns > > + * -EINVAL. > > + * @param nb_pkts_out > > + * The max number of items that 'pkts_out' can keep. > > + * > > + * @return > > + * - The number of GSO segments filled in pkts_out on success. > > + * - Return -ENOMEM if run out of memory in MBUF pools. > > + * - Return -EINVAL for invalid parameters. 
> > + */
> > +int gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		uint8_t ip_delta,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +
> > +#endif
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > index dda50ee..95f6ea6 100644
> > --- a/lib/librte_gso/rte_gso.c
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -33,18 +33,53 @@
> >
> >  #include
> >
> > +#include
> > +
> >  #include "rte_gso.h"
> > +#include "gso_common.h"
> > +#include "gso_tcp4.h"
> >
> >  int
> >  rte_gso_segment(struct rte_mbuf *pkt,
> > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > +		struct rte_gso_ctx gso_ctx,
> >  		struct rte_mbuf **pkts_out,
> >  		uint16_t nb_pkts_out)
> >  {
> > +	struct rte_mempool *direct_pool, *indirect_pool;
> > +	struct rte_mbuf *pkt_seg;
> > +	uint16_t gso_size;
> > +	uint8_t ipid_delta;
> > +	int ret = 1;
> > +
> >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> >  		return -EINVAL;
> >
> > -	pkts_out[0] = pkt;
> > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > +			pkt->packet_type) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	direct_pool = gso_ctx.direct_pool;
> > +	indirect_pool = gso_ctx.indirect_pool;
> > +	gso_size = gso_ctx.gso_size;
> > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > +
> > +	if (is_ipv4_tcp(pkt->packet_type)) {
>
> Probably we need here:
> If (is_ipv4_tcp(pkt->packet_type) && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
>
> > +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > +				direct_pool, indirect_pool,
> > +				pkts_out, nb_pkts_out);
> > +	} else
> > +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
>
> Shouldn't we do pkt_out[0] = pkt; here?

Yes, we need to add it here. Thanks for the reminder.
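A rough sketch of the two agreed changes taken together: the packet-type test reduced to "IPv4 at L3 and TCP at L4", and the unsupported branch handing the input packet back in pkts_out[0]. This is not the code for the next version of the patch. The SK_ constants are stand-in values for illustration (real code would use RTE_ETH_IS_IPV4_HDR() and the RTE_PTYPE_L4_* macros), and sk_pkt and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in packet-type bits for this sketch only. */
#define SK_PTYPE_L3_IPV4 0x00000010u
#define SK_PTYPE_L4_MASK 0x00000f00u
#define SK_PTYPE_L4_TCP  0x00000100u
#define SK_PTYPE_L4_UDP  0x00000200u

/* Hypothetical minimal packet descriptor. */
struct sk_pkt {
	uint32_t packet_type;
};

/* IPv4 + TCP, regardless of whether the L2 header carries a VLAN tag. */
static int
sk_is_ipv4_tcp(uint32_t ptype)
{
	return (ptype & SK_PTYPE_L3_IPV4) != 0 &&
		(ptype & SK_PTYPE_L4_MASK) == SK_PTYPE_L4_TCP;
}

/*
 * Returns 0 if the packet is TCP/IPv4 and should go on to segmentation.
 * Otherwise stores the untouched input packet in pkts_out[0] and returns 1,
 * so the caller always gets a valid output array back.
 */
static int
sk_passthrough_if_unsupported(struct sk_pkt *pkt, struct sk_pkt **pkts_out)
{
	if (sk_is_ipv4_tcp(pkt->packet_type))
		return 0;

	pkts_out[0] = pkt;
	return 1;
}
```

Returning the original packet keeps the "one packet in, at least one packet out" contract intact for callers that transmit whatever lands in pkts_out.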
> > > + > > + if (ret > 1) { > > + pkt_seg = pkt; > > + while (pkt_seg) { > > + rte_mbuf_refcnt_update(pkt_seg, -1); > > + pkt_seg = pkt_seg->next; > > + } > > + } > > > > - return 1; > > + return ret; > > } > > -- > > 2.7.4
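
The refcnt handling at the end of rte_gso_segment() can be modelled with a toy reference counter. The sketch below is a simplified model, not DPDK code: sk_buf, sk_attach() and sk_release() are hypothetical names, and it only assumes what the commit message describes, namely that every indirect mbuf attached to the input packet holds one reference and that the library drops the reference it started with.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one mbuf segment of the input packet. */
struct sk_buf {
	uint16_t refcnt;
	bool freed;
};

/* An indirect buffer attaching to the data takes one reference. */
static void
sk_attach(struct sk_buf *b)
{
	b->refcnt++;
}

/* Dropping the last reference releases the buffer. */
static void
sk_release(struct sk_buf *b)
{
	if (--b->refcnt == 0)
		b->freed = true;
}
```

Starting from refcnt 1, three attached GSO segments raise it to 4, the library's decrement brings it to 3, and the buffer is released only when the third segment is freed.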