* [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
@ 2017-08-24 14:15 Jiayu Hu
  2017-08-24 14:15 ` [PATCH 1/5] lib: add Generic Segmentation Offload API framework Jiayu Hu
                   ` (6 more replies)
  0 siblings, 7 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-08-24 14:15 UTC
  To: dev; +Cc: mark.b.kavanagh, konstantin.ananyev, jianfeng.tan, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.
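
For illustration only (not part of this series): a minimal sketch of how
many output packets result from splitting one contiguous payload. The
helper name gso_segment_count() and its parameters are hypothetical. It
takes the per-segment payload size directly and ignores header bytes, and
the count can be higher for multi-segment input, since the payload is cut
again at each input mbuf boundary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the GSO library: number of output
 * segments needed to carry payload_len bytes when each segment holds
 * at most seg_payload bytes of payload.
 */
static inline uint32_t
gso_segment_count(uint32_t payload_len, uint16_t seg_payload)
{
	assert(seg_payload != 0);

	/* ceiling division, written so that it cannot overflow */
	return payload_len / seg_payload +
		(payload_len % seg_payload != 0 ? 1 : 0);
}
```

With a 64000-byte payload and 1460 bytes of payload per segment, this
gives 43 full segments plus one shorter tail segment.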

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch series adds GSO support to DPDK for
three packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4 and inner TCP/IPv4 headers (plus optional inner and/or
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4 and inner TCP/IPv4 headers (with an optional
outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine.

The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. The test setup is as follows:

a. Connect two 10Gbps physical ports (P0, P1) directly to each other.
b. Launch testpmd with P0 and a vhost-user port, and use the csum
   forwarding engine.
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for the vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on the virtio-net port in the VM to send TCP packets.

With GSO enabled for P0 in testpmd, the observed iperf throughput is
~9Gbps. Experimental data for VxLAN and GRE will be provided later.

Jiayu Hu (3):
  lib: add Generic Segmentation Offload API framework
  gso/lib: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (2):
  lib/gso: add VxLAN GSO support
  lib/gso: add GRE GSO support

 app/test-pmd/cmdline.c                  | 121 +++++++++
 app/test-pmd/config.c                   |  25 ++
 app/test-pmd/csumonly.c                 |  68 ++++-
 app/test-pmd/testpmd.c                  |   9 +
 app/test-pmd/testpmd.h                  |  10 +
 config/common_base                      |   5 +
 lib/Makefile                            |   2 +
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |  52 ++++
 lib/librte_gso/gso_common.c             | 431 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 180 +++++++++++++
 lib/librte_gso/gso_tcp.c                |  82 ++++++
 lib/librte_gso/gso_tcp.h                |  73 ++++++
 lib/librte_gso/gso_tunnel.c             |  62 +++++
 lib/librte_gso/gso_tunnel.h             |  46 ++++
 lib/librte_gso/rte_gso.c                | 100 ++++++++
 lib/librte_gso/rte_gso.h                | 122 +++++++++
 lib/librte_gso/rte_gso_version.map      |   7 +
 mk/rte.app.mk                           |   1 +
 19 files changed, 1392 insertions(+), 5 deletions(-)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp.c
 create mode 100644 lib/librte_gso/gso_tcp.h
 create mode 100644 lib/librte_gso/gso_tunnel.c
 create mode 100644 lib/librte_gso/gso_tunnel.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
2.7.4


* [PATCH 1/5] lib: add Generic Segmentation Offload API framework
  2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
@ 2017-08-24 14:15 ` Jiayu Hu
  2017-08-30  1:38   ` Ananyev, Konstantin
  2017-08-24 14:15 ` [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support Jiayu Hu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-08-24 14:15 UTC
  To: dev; +Cc: mark.b.kavanagh, konstantin.ananyev, jianfeng.tan, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch introduces the GSO API framework to DPDK.

The GSO library provides a segmentation API, rte_gso_segment(), for
applications. Each invocation splits one input packet into smaller
packets, which the GSO library refers to as GSO segments. When all GSO
segments are freed, the input packet is freed automatically.
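
For illustration only (not part of this patch): a sketch of how a caller
might interpret the return value, following the convention documented in
rte_gso.h below (a count on success, 1 if no GSO was performed, -ENOMEM
or -EINVAL on failure). The enum and classify_gso_result() are
hypothetical names, and treating any other non-positive value as an
unspecified error is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome categories for a segmentation call. */
enum gso_outcome {
	GSO_SEGMENTED,   /* two or more GSO segments were produced */
	GSO_SKIPPED,     /* no GSO performed; send the original packet */
	GSO_NO_MEMORY,   /* an mbuf pool ran out of buffers */
	GSO_BAD_ARGS,    /* invalid parameters or pkts_out too small */
	GSO_OTHER_ERROR  /* anything the header does not document */
};

/* Map a documented return value onto one of the outcomes above. */
static enum gso_outcome
classify_gso_result(int ret)
{
	if (ret == -ENOMEM)
		return GSO_NO_MEMORY;
	if (ret == -EINVAL)
		return GSO_BAD_ARGS;
	if (ret <= 0)
		return GSO_OTHER_ERROR;
	if (ret == 1)
		return GSO_SKIPPED;
	return GSO_SEGMENTED;
}
```

A caller would typically transmit pkts_out[0..ret-1] for GSO_SEGMENTED
and fall back to transmitting the original packet for GSO_SKIPPED.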

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                 |   5 ++
 lib/Makefile                       |   2 +
 lib/librte_gso/Makefile            |  49 ++++++++++++++++
 lib/librte_gso/rte_gso.c           |  47 ++++++++++++++++
 lib/librte_gso/rte_gso.h           | 111 +++++++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map |   7 +++
 mk/rte.app.mk                      |   1 +
 7 files changed, 222 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 5e97a08..603e340 100644
--- a/config/common_base
+++ b/config/common_base
@@ -652,6 +652,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b81afce
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		struct rte_gso_ctx gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out __rte_unused)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
+			NULL || gso_ctx.indirect_pool == NULL)
+		return -EINVAL;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..5a8389a
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,111 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t gso_types;
+	/**< GSO types to perform */
+	uint16_t gso_size;
+	/**< maximum size of a GSO segment, measured in bytes */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi-segment packets. rte_gso_segment() assumes the input packet
+ * has correct checksums, and it doesn't process IP fragment packets.
+ * Additionally, it assumes that 'pkts_out' is large enough to hold all GSO
+ * segments.
+ *
+ * We refer to the packets that are segmented from the input packet as 'GSO
+ * segments'. If the input packet is GSOed, its mbuf refcnt reduces by 1.
+ * Therefore, when all GSO segments are freed, the input packet is freed
+ * automatically. If the input packet doesn't match the criteria for GSO
+ * (e.g. 'pkt' is short and doesn't need segmentation), the packet is
+ * skipped and this function returns 1. If the MBUF pools run out of
+ * memory, the packet is skipped and this function returns -ENOMEM.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object.
+ * @param pkts_out
+ *  Pointer array used to store the mbuf addresses of GSO segments.
+ *  Applications must ensure pkts_out is large enough to hold all GSO
+ *  segments. If the memory space in pkts_out is insufficient, the input
+ *  packet is skipped and this function returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments created on success.
+ *  - Return 1 if no GSO is performed.
+ *  - Return -ENOMEM if the MBUF pools run out of memory.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		struct rte_gso_ctx ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
2.7.4


* [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  2017-08-24 14:15 ` [PATCH 1/5] lib: add Generic Segmentation Offload API framework Jiayu Hu
@ 2017-08-24 14:15 ` Jiayu Hu
  2017-08-30  1:38   ` Ananyev, Konstantin
  2017-08-24 14:15 ` [PATCH 3/5] lib/gso: add VxLAN " Jiayu Hu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-08-24 14:15 UTC
  To: dev; +Cc: mark.b.kavanagh, konstantin.ananyev, jianfeng.tan, Jiayu Hu

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
packets have correct checksums, and doesn't update checksums for output
packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
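
For illustration only: checksums aside, the header fields that differ
between output segments can be modelled as below. This is a simplified,
self-contained sketch of what gso_update_pkt_headers() in this patch
does, not the library code. It works on host-order values in a
hypothetical struct instead of real mbufs, and the names
seg_hdr_fields and fill_segment_headers() are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_TCP_FIN 0x01
#define MODEL_TCP_PSH 0x08

/* Hypothetical, host-order view of the per-segment header fields. */
struct seg_hdr_fields {
	uint16_t ip_total_len; /* IPv4 header + TCP header + payload */
	uint16_t ip_id;        /* IPv4 identification */
	uint32_t tcp_seq;      /* TCP sequence number */
	uint8_t tcp_flags;     /* TCP flags byte */
};

/*
 * Fill header fields for nb_segs segments. pyld_len[i] is the payload
 * length of segment i, l3_l4_hdr_len the IPv4 + TCP header length.
 */
static void
fill_segment_headers(struct seg_hdr_fields *out, const uint16_t *pyld_len,
		uint16_t nb_segs, uint16_t l3_l4_hdr_len,
		uint16_t first_id, uint32_t first_seq, uint8_t orig_flags)
{
	uint16_t id = first_id;
	uint32_t seq = first_seq;
	uint16_t i;

	for (i = 0; i < nb_segs; i++) {
		out[i].ip_total_len = l3_l4_hdr_len + pyld_len[i];
		out[i].ip_id = id++;    /* one IP ID per segment */
		out[i].tcp_seq = seq;   /* sequence of first payload byte */
		out[i].tcp_flags = orig_flags;
		/* FIN and PSH survive only on the tail segment */
		if (i + 1 < nb_segs)
			out[i].tcp_flags &=
				(uint8_t)~(MODEL_TCP_PSH | MODEL_TCP_FIN);
		seq += pyld_len[i];
	}
}
```

Because each segment gets a new total length, IP ID, and sequence
number, the IPv4 and TCP checksums of every output packet become stale,
which is why the application (or the NIC) has to recompute them.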

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. We refer to these two chained MBUFs
as a two-segment MBUF. The direct MBUF stores the packet header, while
the indirect MBUF simply points to a location within the original
packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.
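
For illustration only: the two-segment layout can be modelled without
mbufs as a private header copy plus a pointer into the original buffer.
This is a sketch for a single contiguous input packet. struct
model_segment, MODEL_MAX_HDR, and model_segment_packet() are
hypothetical names, and the plain -1 error return is a simplification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_HDR 128

/* Hypothetical stand-in for one two-segment output packet. */
struct model_segment {
	uint8_t hdr[MODEL_MAX_HDR]; /* "direct" part: copied header */
	uint16_t hdr_len;
	const uint8_t *pyld;        /* "indirect" part: no data copied */
	uint16_t pyld_len;
};

/*
 * Split pkt (hdr_len header bytes followed by payload) into segments
 * carrying at most pyld_unit payload bytes each. Returns the number of
 * segments written to out, or -1 on bad input or if out is too small.
 */
static int
model_segment_packet(const uint8_t *pkt, uint32_t pkt_len,
		uint16_t hdr_len, uint16_t pyld_unit,
		struct model_segment *out, uint16_t max_out)
{
	uint32_t off = hdr_len;
	int n = 0;

	if (pkt == NULL || out == NULL || pyld_unit == 0 ||
			hdr_len > MODEL_MAX_HDR || hdr_len >= pkt_len)
		return -1;

	while (off < pkt_len) {
		uint32_t left = pkt_len - off;

		if (n >= max_out)
			return -1;

		/* every segment gets its own copy of the header */
		memcpy(out[n].hdr, pkt, hdr_len);
		out[n].hdr_len = hdr_len;
		/* the payload is referenced in place */
		out[n].pyld = pkt + off;
		out[n].pyld_len =
			(uint16_t)(left < pyld_unit ? left : pyld_unit);
		off += out[n].pyld_len;
		n++;
	}
	return n;
}
```

Only the header is copied per segment, so the cost of segmentation does
not grow with the payload size. The price is that each output packet
spans two buffers, hence the multi-segment TX requirement.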

If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSOed segments are freed, the packet is freed
automatically.
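
For illustration only: the reference counting described above can be
modelled as follows. The sketch assumes that attaching an indirect MBUF
takes one reference on the original packet and that freeing a GSO
segment drops it again. struct model_pkt, model_attach(), and
model_release() are hypothetical names, not DPDK functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the original packet's mbuf. */
struct model_pkt {
	uint16_t refcnt;
	bool freed;
};

/* An indirect buffer attaches to the packet: take one reference. */
static void
model_attach(struct model_pkt *p)
{
	p->refcnt++;
}

/* Drop one reference; the packet is freed when none remain. */
static void
model_release(struct model_pkt *p)
{
	assert(p->refcnt > 0);
	if (--p->refcnt == 0)
		p->freed = true;
}
```

With three GSO segments, the count goes from 1 to 4 during
segmentation, back to 3 when GSO drops the caller's reference, and
reaches 0 only when the last segment is freed.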

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 270 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 120 ++++++++++++++
 lib/librte_gso/gso_tcp.c                |  82 ++++++++++
 lib/librte_gso/gso_tcp.h                |  73 +++++++++
 lib/librte_gso/rte_gso.c                |  44 +++++-
 lib/librte_gso/rte_gso.h                |   3 +
 8 files changed, 593 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp.c
 create mode 100644 lib/librte_gso/gso_tcp.h

diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..0f8e38f 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..2b54fbd
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,270 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* copy mbuf metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+	/* copy packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++) {
+		rte_pktmbuf_detach(pkts[i]->next);
+		rte_pktmbuf_free(pkts[i]);
+		pkts[i] = NULL;
+	}
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment;
+	uint32_t pkt_in_pyld_off;
+	uint16_t pkt_in_segment_len, pkt_out_segment_len;
+	uint16_t nb_segs;
+	bool pkt_in_segment_processed;
+
+	pkt_in_pyld_off = pkt->data_off + pkt_hdr_offset;
+	pkt_in = pkt;
+	nb_segs = 0;
+
+	while (pkt_in) {
+		pkt_in_segment_processed = false;
+		pkt_in_segment_len = pkt_in->data_off + pkt_in->data_len;
+
+		while (!pkt_in_segment_processed) {
+			if (unlikely(nb_segs >= nb_pkts_out)) {
+				free_gso_segment(pkts_out, nb_segs);
+				return -EINVAL;
+			}
+
+			/* allocate direct mbuf */
+			hdr_segment = rte_pktmbuf_alloc(direct_pool);
+			if (unlikely(hdr_segment == NULL)) {
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+
+			/* allocate indirect mbuf */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+
+			/* copy packet header */
+			hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+			/* attach payload mbuf to current packet segment */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			hdr_segment->next = pyld_segment;
+			pkts_out[nb_segs++] = hdr_segment;
+
+			/* calculate payload length */
+			pkt_out_segment_len = pyld_unit_size;
+			if (pkt_in_pyld_off + pkt_out_segment_len >
+					pkt_in_segment_len) {
+				pkt_out_segment_len = pkt_in_segment_len -
+					pkt_in_pyld_off;
+			}
+
+			/* update payload segment */
+			pyld_segment->data_off = pkt_in_pyld_off;
+			pyld_segment->data_len = pkt_out_segment_len;
+
+			/* update header segment */
+			hdr_segment->pkt_len += pyld_segment->data_len;
+			hdr_segment->nb_segs++;
+
+			/* update pkt_in_pyld_off */
+			pkt_in_pyld_off += pkt_out_segment_len;
+			if (pkt_in_pyld_off == pkt_in_segment_len)
+				pkt_in_segment_processed = true;
+		}
+
+		/* 'pkt_in' may contain numerous segments */
+		pkt_in = pkt_in->next;
+		if (pkt_in != NULL)
+			pkt_in_pyld_off = pkt_in->data_off;
+	}
+	return nb_segs;
+}
+
+static inline void
+parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct rte_mbuf *pkt)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	switch (ipv4_hdr->next_proto_id) {
+	case IPPROTO_TCP:
+		pkt->packet_type |= RTE_PTYPE_L4_TCP;
+		pkt->l3_len = IPv4_HDR_LEN(ipv4_hdr);
+		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+		pkt->l4_len = TCP_HDR_LEN(tcp_hdr);
+		break;
+	}
+}
+
+static inline void
+parse_ethernet(struct ether_hdr *eth_hdr, struct rte_mbuf *pkt)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct vlan_hdr *vlan_hdr;
+	uint16_t ethertype;
+
+	ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
+	if (ethertype == ETHER_TYPE_VLAN) {
+		vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
+		pkt->l2_len = sizeof(struct vlan_hdr);
+		pkt->packet_type |= RTE_PTYPE_L2_ETHER_VLAN;
+		ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
+	}
+
+	switch (ethertype) {
+	case ETHER_TYPE_IPv4:
+		if (IS_VLAN_PKT(pkt)) {
+			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
+		} else {
+			pkt->packet_type |= RTE_PTYPE_L2_ETHER;
+			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
+		}
+		pkt->l2_len += sizeof(struct ether_hdr);
+		ipv4_hdr = (struct ipv4_hdr *) ((char *)eth_hdr +
+				pkt->l2_len);
+		parse_ipv4(ipv4_hdr, pkt);
+		break;
+	}
+}
+
+void
+gso_parse_packet(struct rte_mbuf *pkt)
+{
+	struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
+
+	pkt->packet_type = pkt->tx_offload = 0;
+	parse_ethernet(eth_hdr, pkt);
+}
+
+static inline void
+update_ipv4_header(char *base, uint16_t offset, uint16_t length, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr = (struct ipv4_hdr *)(base + offset);
+
+	ipv4_hdr->total_length = rte_cpu_to_be_16(length - offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+static inline void
+update_tcp_header(char *base, uint16_t offset, uint32_t sent_seq,
+	uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr = (struct tcp_hdr *)(base + offset);
+
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	/* clean FIN and PSH for non-tail segments */
+	if (non_tail)
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK | TCP_HDR_FIN_MASK));
+}
+
+void
+gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
+		struct rte_mbuf **out_segments)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	struct rte_mbuf *seg;
+	uint32_t sent_seq;
+	uint16_t offset, i;
+	uint16_t tail_seg_idx = nb_segments - 1, id;
+
+	switch (pkt->packet_type) {
+	case ETHER_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_TCP_PKT:
+		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+				pkt->l2_len);
+		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+
+		for (i = 0; i < nb_segments; i++) {
+			seg = out_segments[i];
+
+			offset = seg->l2_len;
+			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
+					offset, seg->pkt_len, id);
+			id++;
+
+			offset += seg->l3_len;
+			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
+					offset, sent_seq, i < tail_seg_idx);
+			sent_seq += seg->next->data_len;
+		}
+		break;
+	}
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..d750041
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,120 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+#define IPV4_HDR_DF_SHIFT 14
+#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
+#define IPv4_HDR_LEN(iph) ((iph->version_ihl & 0x0f) * 4)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+#define TCP_HDR_LEN(tcph) ((tcph->data_off & 0xf0) >> 2)
+
+#define ETHER_IPv4_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4)
+/* Supported packet types */
+/* TCP/IPv4 packet. */
+#define ETHER_IPv4_TCP_PKT (ETHER_IPv4_PKT | RTE_PTYPE_L4_TCP)
+
+/* TCP/IPv4 packet with VLAN tag. */
+#define ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
+
+#define IS_VLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_L2_ETHER_VLAN) == \
+		RTE_PTYPE_L2_ETHER_VLAN)
+
+/**
+ * Internal function which parses a packet, setting outer_l2/l3_len and
+ * l2/l3/l4_len and packet_type.
+ *
+ * @param pkt
+ *  Packet to parse.
+ */
+void gso_parse_packet(struct rte_mbuf *pkt);
+
+/**
+ * Internal function which updates relevant packet headers, following
+ * segmentation. This is required to update, for example, the IPv4
+ * 'total_length' field, to reflect the reduced length of the now-
+ * segmented packet.
+ *
+ * @param pkt
+ *  The original packet.
+ * @param nb_segments
+ *  The number of GSO segments into which pkt was split.
+ * @param out_segments
+ *  Pointer array used for storing mbuf addresses for GSO segments.
+ */
+void gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
+		struct rte_mbuf **out_segments);
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment mbuf,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in byte.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - If no GSO is performed, return 1.
+ *  - If available memory in mempools is insufficient, return -ENOMEM.
+ *  - -EINVAL for invalid parameters
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp.c b/lib/librte_gso/gso_tcp.c
new file mode 100644
index 0000000..9d5fc30
--- /dev/null
+++ b/lib/librte_gso/gso_tcp.c
@@ -0,0 +1,82 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#include "gso_common.h"
+#include "gso_tcp.h"
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ether_hdr *eth_hdr;
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size;
+	uint16_t hdr_offset;
+	int ret = 1;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+
+	/* skip packets without the DF bit set; they may be IP fragments */
+	if ((ipv4_hdr->fragment_offset &
+				rte_cpu_to_be_16(IPV4_HDR_DF_MASK)) == 0)
+		return ret;
+
+	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) -
+		pkt->l3_len - pkt->l4_len;
+	/* don't process packet without data */
+	if (tcp_dl == 0)
+		return ret;
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
+
+	/* segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+
+	if (ret > 1)
+		gso_update_pkt_headers(pkt, ret, pkts_out);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp.h b/lib/librte_gso/gso_tcp.h
new file mode 100644
index 0000000..f291ccb
--- /dev/null
+++ b/lib/librte_gso/gso_tcp.h
@@ -0,0 +1,73 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP_H_
+#define _GSO_TCP_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function assumes the input packet has
+ * correct checksums and doesn't update checksums for the GSO segments.
+ * Furthermore, it doesn't process IP-fragmented packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array, which is used to store mbuf addresses of GSO segments.
+ *  Caller should guarantee that 'pkts_out' is sufficiently large to store
+ *  all GSO segments.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments on success.
+ *   - Return 1 if no GSO is performed.
+ *   - Return -ENOMEM if available memory in mempools is insufficient.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b81afce..fac95f2 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -31,17 +31,57 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include <rte_log.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
 		struct rte_gso_ctx gso_ctx,
 		struct rte_mbuf **pkts_out,
-		uint16_t nb_pkts_out __rte_unused)
+		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint16_t nb_segments, gso_size;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
 			NULL || gso_ctx.indirect_pool == NULL)
 		return -EINVAL;
 
-	return 1;
+	if ((gso_ctx.gso_types & RTE_GSO_TCP_IPV4) == 0 ||
+			gso_ctx.gso_size >= pkt->pkt_len ||
+			gso_ctx.gso_size == 0)
+		return 1;
+
+	pkt_seg = pkt;
+	gso_size = gso_ctx.gso_size;
+	direct_pool = gso_ctx.direct_pool;
+	indirect_pool = gso_ctx.indirect_pool;
+
+	/* Parse packet headers to determine how to segment 'pkt' */
+	gso_parse_packet(pkt);
+
+	switch (pkt->packet_type) {
+	case ETHER_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_TCP_PKT:
+		nb_segments = gso_tcp4_segment(pkt, gso_size,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+		break;
+	default:
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		nb_segments = 1;
+	}
+
+	if (nb_segments > 1) {
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	}
+
+	return nb_segments;
 }
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
index 5a8389a..77853fa 100644
--- a/lib/librte_gso/rte_gso.h
+++ b/lib/librte_gso/rte_gso.h
@@ -46,6 +46,9 @@ extern "C" {
 #include <stdint.h>
 #include <rte_mbuf.h>
 
+#define RTE_GSO_TCP_IPV4 (1ULL << 0)
+/**< GSO flag for TCP/IPv4 packets (containing optional VLAN tag) */
+
 /**
  * GSO context structure.
  */
-- 
2.7.4

* [PATCH 3/5] lib/gso: add VxLAN GSO support
  2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  2017-08-24 14:15 ` [PATCH 1/5] lib: add Generic Segmentation Offload API framework Jiayu Hu
  2017-08-24 14:15 ` [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-08-24 14:15 ` Jiayu Hu
  2017-08-24 14:15 ` [PATCH 4/5] lib/gso: add GRE " Jiayu Hu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-08-24 14:15 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, konstantin.ananyev, jianfeng.tan, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for VxLAN-encapsulated packets. Supported
VxLAN packets must have an outer IPv4 header (optionally preceded by a
VLAN tag) and contain an inner TCP/IPv4 packet (with an optional inner
VLAN tag).

VxLAN GSO assumes that all input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.
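
To make the "IP fragmented" condition concrete, here is a minimal standalone
sketch. It assumes the standard IPv4 flags/fragment-offset layout and a field
already converted to host byte order. The helper ipv4_is_fragment() and the
IPV4_* macros are hypothetical names for illustration only; they are not part
of this series, whose TCP/IPv4 path tests the DF bit instead.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical masks for the 16-bit flags/fragment-offset field (host order). */
#define IPV4_DF_FLAG     0x4000u /* Don't Fragment */
#define IPV4_MF_FLAG     0x2000u /* More Fragments */
#define IPV4_OFFSET_MASK 0x1fffu /* fragment offset, in 8-byte units */

/*
 * A packet is a fragment if more fragments follow (MF set) or if it does
 * not start at offset 0. The DF bit alone says nothing about whether the
 * packet has already been fragmented.
 */
static inline bool
ipv4_is_fragment(uint16_t frag_off_host)
{
	return (frag_off_host & (IPV4_MF_FLAG | IPV4_OFFSET_MASK)) != 0;
}
```

A check of this shape would let packets with DF clear but no fragmentation
through to GSO; whether that is desirable is a design choice for the library.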

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSOed
segments are freed, the packet is freed automatically.
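
The header-offset and payload-size arithmetic used for tunnel GSO can be
sketched as standalone C. This is only an illustration under stated
assumptions: struct hdr_lens is a hypothetical stand-in for the rte_mbuf
length fields, CRC_LEN mirrors ETHER_CRC_LEN, and the guard against a
gso_size smaller than the headers is an addition that the patch itself does
not perform.

```c
#include <stdint.h>

#define CRC_LEN 4 /* Ethernet FCS length, mirrors ETHER_CRC_LEN */

/* Hypothetical stand-in for the rte_mbuf length fields used by tunnel GSO. */
struct hdr_lens {
	uint16_t outer_l2_len; /* outer Ethernet (+ optional VLAN) */
	uint16_t outer_l3_len; /* outer IPv4 */
	uint16_t l2_len;       /* UDP + VxLAN + inner Ethernet (+ optional VLAN) */
	uint16_t l3_len;       /* inner IPv4 */
	uint16_t l4_len;       /* inner TCP */
};

/* Total length of the headers copied into every output segment. */
static inline uint16_t
tunnel_hdr_offset(const struct hdr_lens *h)
{
	return h->outer_l2_len + h->outer_l3_len + h->l2_len +
		h->l3_len + h->l4_len;
}

/*
 * Number of output segments for a packet of pkt_len bytes.
 * Returns 1 if the packet carries no payload, and 0 if gso_size cannot
 * hold the headers, the CRC and at least one payload byte.
 */
static inline uint32_t
gso_segment_count(const struct hdr_lens *h, uint32_t pkt_len,
		uint16_t gso_size)
{
	uint32_t hdr = tunnel_hdr_offset(h);
	uint32_t unit, payload;

	if (gso_size <= hdr + CRC_LEN)
		return 0;
	if (pkt_len <= hdr)
		return 1;

	unit = gso_size - hdr - CRC_LEN;
	payload = pkt_len - hdr;
	return (payload + unit - 1) / unit; /* round up */
}
```

For example, with a 14-byte outer Ethernet header, 20-byte outer IPv4 header,
30 bytes of UDP + VxLAN + inner Ethernet, and 20-byte inner IPv4 and TCP
headers, the header offset is 104 bytes, so a gso_size of 1518 leaves 1410
payload bytes per segment.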

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 lib/librte_gso/Makefile     |   1 +
 lib/librte_gso/gso_common.c | 109 ++++++++++++++++++++++++++++++++++++++++++--
 lib/librte_gso/gso_common.h |  41 ++++++++++++++++-
 lib/librte_gso/gso_tunnel.c |  62 +++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel.h |  46 +++++++++++++++++++
 lib/librte_gso/rte_gso.c    |  12 ++++-
 lib/librte_gso/rte_gso.h    |   4 ++
 7 files changed, 268 insertions(+), 7 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel.c
 create mode 100644 lib/librte_gso/gso_tunnel.h

diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 0f8e38f..a4d1a81 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index 2b54fbd..65cec44 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -39,6 +39,7 @@
 #include <rte_ether.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #include "gso_common.h"
 
@@ -156,18 +157,60 @@ gso_do_segment(struct rte_mbuf *pkt,
 	return nb_segs;
 }
 
+static inline void parse_ethernet(struct ether_hdr *eth_hdr,
+		struct rte_mbuf *pkt);
+
+static inline void
+parse_vxlan(struct udp_hdr *udp_hdr, struct rte_mbuf *pkt)
+{
+	struct ether_hdr *eth_hdr;
+
+	eth_hdr = (struct ether_hdr *)((char *)udp_hdr +
+			sizeof(struct udp_hdr) +
+			sizeof(struct vxlan_hdr));
+
+	pkt->packet_type |= RTE_PTYPE_TUNNEL_VXLAN;
+	pkt->outer_l2_len = pkt->l2_len;
+	parse_ethernet(eth_hdr, pkt);
+	pkt->l2_len += ETHER_VXLAN_HLEN; /* add udp + vxlan */
+}
+
+static inline void
+parse_udp(struct udp_hdr *udp_hdr, struct rte_mbuf *pkt)
+{
+	/* Outer UDP header of VxLAN packet */
+	if (udp_hdr->dst_port == rte_cpu_to_be_16(VXLAN_DEFAULT_PORT)) {
+		pkt->packet_type |= RTE_PTYPE_L4_UDP;
+		parse_vxlan(udp_hdr, pkt);
+	} else {
+		/* IPv4/UDP packet */
+		pkt->l4_len = sizeof(struct udp_hdr);
+		pkt->packet_type |= RTE_PTYPE_L4_UDP;
+	}
+}
+
 static inline void
 parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct rte_mbuf *pkt)
 {
 	struct tcp_hdr *tcp_hdr;
+	struct udp_hdr *udp_hdr;
 
 	switch (ipv4_hdr->next_proto_id) {
 	case IPPROTO_TCP:
-		pkt->packet_type |= RTE_PTYPE_L4_TCP;
+		if (IS_VXLAN_PKT(pkt)) {
+			pkt->outer_l3_len = pkt->l3_len;
+			pkt->packet_type |= RTE_PTYPE_INNER_L4_TCP;
+		} else
+			pkt->packet_type |= RTE_PTYPE_L4_TCP;
 		pkt->l3_len = IPv4_HDR_LEN(ipv4_hdr);
 		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
 		pkt->l4_len = TCP_HDR_LEN(tcp_hdr);
 		break;
+	case IPPROTO_UDP:
+		pkt->l3_len = IPv4_HDR_LEN(ipv4_hdr);
+		udp_hdr = (struct udp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+		parse_udp(udp_hdr, pkt);
+		break;
 	}
 }
 
@@ -182,13 +225,21 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct rte_mbuf *pkt)
 	if (ethertype == ETHER_TYPE_VLAN) {
 		vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
 		pkt->l2_len = sizeof(struct vlan_hdr);
-		pkt->packet_type |= RTE_PTYPE_L2_ETHER_VLAN;
+		if (IS_VXLAN_PKT(pkt))
+			pkt->packet_type |= RTE_PTYPE_INNER_L2_ETHER_VLAN;
+		else
+			pkt->packet_type |= RTE_PTYPE_L2_ETHER_VLAN;
 		ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
-	}
+	} else
+		pkt->l2_len = 0;
 
 	switch (ethertype) {
 	case ETHER_TYPE_IPv4:
-		if (IS_VLAN_PKT(pkt)) {
+		if (IS_VXLAN_PKT(pkt)) {
+			if (!IS_INNER_VLAN_PKT(pkt))
+				pkt->packet_type |= RTE_PTYPE_INNER_L2_ETHER;
+			pkt->packet_type |= RTE_PTYPE_INNER_L3_IPV4;
+		} else if (IS_VLAN_PKT(pkt)) {
 			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
 		} else {
 			pkt->packet_type |= RTE_PTYPE_L2_ETHER;
@@ -236,14 +287,62 @@ void
 gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
 		struct rte_mbuf **out_segments)
 {
-	struct ipv4_hdr *ipv4_hdr;
+	struct ipv4_hdr *ipv4_hdr, *outer_ipv4_hdr;
 	struct tcp_hdr *tcp_hdr;
+	struct udp_hdr *udp_hdr;
 	struct rte_mbuf *seg;
 	uint32_t sent_seq;
 	uint16_t offset, i;
 	uint16_t tail_seg_idx = nb_segments - 1, id;
+	uint16_t outer_id;
 
 	switch (pkt->packet_type) {
+	case ETHER_VLAN_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+		outer_ipv4_hdr =
+			(struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+				pkt->outer_l2_len);
+		ipv4_hdr = (struct ipv4_hdr *)((char *)outer_ipv4_hdr +
+				pkt->outer_l3_len + pkt->l2_len);
+		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+
+		outer_id = rte_be_to_cpu_16(outer_ipv4_hdr->packet_id);
+		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+
+		for (i = 0; i < nb_segments; i++) {
+			seg = out_segments[i];
+
+			/* Update outer IPv4 header */
+			offset = seg->outer_l2_len;
+			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
+					offset, seg->pkt_len, outer_id);
+			outer_id++;
+
+			/* Update outer UDP header */
+			offset += seg->outer_l3_len;
+			udp_hdr = (struct udp_hdr *)(
+					rte_pktmbuf_mtod(seg, char *) +
+					offset);
+			udp_hdr->dgram_len = rte_cpu_to_be_16(seg->pkt_len -
+					offset);
+
+			/* Update inner IPv4 header */
+			offset += seg->l2_len;
+			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
+					offset, seg->pkt_len, id);
+			id++;
+
+			/* Update inner TCP header */
+			offset += seg->l3_len;
+			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
+					offset, sent_seq, i < tail_seg_idx);
+
+			sent_seq += seg->next->data_len;
+		}
+		break;
 	case ETHER_VLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_TCP_PKT:
 		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index d750041..0ad95d3 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -46,6 +46,8 @@
 #define TCP_HDR_LEN(tcph) ((tcph->data_off & 0xf0) >> 2)
 
 #define ETHER_IPv4_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4)
+#define INNER_ETHER_IPv4_TCP_PKT (RTE_PTYPE_INNER_L2_ETHER |\
+		RTE_PTYPE_INNER_L3_IPV4 | RTE_PTYPE_INNER_L4_TCP)
 /* Supported packet types */
 /* TCP/IPv4 packet. */
 #define ETHER_IPv4_TCP_PKT (ETHER_IPv4_PKT | RTE_PTYPE_L4_TCP)
@@ -54,9 +56,46 @@
 #define ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
 		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
 
+/* VxLAN packet */
+#define ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT (ETHER_IPv4_PKT | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		INNER_ETHER_IPv4_TCP_PKT)
+
+/* VxLAN packet with outer VLAN tag. */
+#define ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4 | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		INNER_ETHER_IPv4_TCP_PKT)
+
+/* VxLAN packet with inner VLAN tag. */
+#define ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT (ETHER_IPv4_PKT | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		RTE_PTYPE_INNER_L2_ETHER_VLAN | \
+		RTE_PTYPE_INNER_L3_IPV4  | \
+		RTE_PTYPE_INNER_L4_TCP)
+
+/* VxLAN packet with both outer and inner VLAN tags. */
+#define ETHER_VLAN_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT (\
+		RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4 | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		RTE_PTYPE_INNER_L2_ETHER_VLAN | \
+		RTE_PTYPE_INNER_L3_IPV4 | \
+		RTE_PTYPE_INNER_L4_TCP)
+
 #define IS_VLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_L2_ETHER_VLAN) == \
 		RTE_PTYPE_L2_ETHER_VLAN)
+#define IS_INNER_VLAN_PKT(pkt) (\
+		(pkt->packet_type & RTE_PTYPE_INNER_L2_ETHER_VLAN) == \
+		RTE_PTYPE_INNER_L2_ETHER_VLAN)
 
+#define VXLAN_DEFAULT_PORT 4789
+#define IS_VXLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_TUNNEL_VXLAN) == \
+		RTE_PTYPE_TUNNEL_VXLAN)
 /**
  * Internal function which parses a packet, setting outer_l2/l3_len and
  * l2/l3/l4_len and packet_type.
@@ -92,7 +131,7 @@ void gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
  * @param pkt
  *  Packet to segment.
  * @param pkt_hdr_offset
- *  Packet header offset, measured in byte.
+ *  Packet header offset, measured in bytes.
  * @param pyld_unit_size
  *  The max payload length of a GSO segment.
  * @param direct_pool
diff --git a/lib/librte_gso/gso_tunnel.c b/lib/librte_gso/gso_tunnel.c
new file mode 100644
index 0000000..6a04697
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel.c
@@ -0,0 +1,62 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ether.h>
+
+#include "gso_common.h"
+#include "gso_tunnel.h"
+
+int
+gso_tunnel_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	uint16_t pyld_unit_size, hdr_offset;
+	int ret;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len +
+		pkt->l3_len + pkt->l4_len;
+
+	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
+
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+
+	if (ret > 1)
+		gso_update_pkt_headers(pkt, ret, pkts_out);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel.h b/lib/librte_gso/gso_tunnel.h
new file mode 100644
index 0000000..a9b2363
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel.h
@@ -0,0 +1,46 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_H_
+#define _GSO_TUNNEL_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+int gso_tunnel_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index fac95f2..f110f18 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -36,6 +36,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp.h"
+#include "gso_tunnel.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -51,7 +52,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 			NULL || gso_ctx.indirect_pool == NULL)
 		return -EINVAL;
 
-	if ((gso_ctx.gso_types & RTE_GSO_TCP_IPV4) == 0 ||
+	if ((gso_ctx.gso_types & (RTE_GSO_TCP_IPV4 |
+					RTE_GSO_IPV4_VXLAN_TCP_IPV4)) == 0 ||
 			gso_ctx.gso_size >= pkt->pkt_len ||
 			gso_ctx.gso_size == 0)
 		return 1;
@@ -71,6 +73,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
 		break;
+	case ETHER_VLAN_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+		nb_segments = gso_tunnel_segment(pkt, gso_size,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+		break;
 	default:
 		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
 		nb_segments = 1;
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
index 77853fa..e1b2c23 100644
--- a/lib/librte_gso/rte_gso.h
+++ b/lib/librte_gso/rte_gso.h
@@ -48,6 +48,10 @@ extern "C" {
 
 #define RTE_GSO_TCP_IPV4 (1ULL << 0)
 /**< GSO flag for TCP/IPv4 packets (containing optional VLAN tag) */
+#define RTE_GSO_IPV4_VXLAN_TCP_IPV4 (1ULL << 1)
+/**< GSO flag for VxLAN packets that contain outer IPv4, and inner
+ * TCP/IPv4 headers (plus optional inner and/or outer VLAN tags).
+ */
 
 /**
  * GSO context structure.
-- 
2.7.4

* [PATCH 4/5] lib/gso: add GRE GSO support
  2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                   ` (2 preceding siblings ...)
  2017-08-24 14:15 ` [PATCH 3/5] lib/gso: add VxLAN " Jiayu Hu
@ 2017-08-24 14:15 ` Jiayu Hu
  2017-08-24 14:15 ` [PATCH 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-08-24 14:15 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, konstantin.ananyev, jianfeng.tan, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO assumes that all input
packets have correct checksums and doesn't update checksums for output
packets. Additionally, it doesn't process IP fragmented packets.

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.
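
The per-segment header values that follow from segmentation (outer and inner
IPv4 IDs, TCP sequence number) can be sketched as standalone C. The struct
and function names below are hypothetical and chosen for illustration; they
do not appear in this series. The non_tail flag corresponds to the
"i < tail_seg_idx" argument passed to update_tcp_header(); its use for
clearing FIN/PSH on all but the last segment is an assumption based on the
TCP_HDR_FIN_MASK and TCP_HDR_PSH_MASK definitions.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-segment header values; not a structure from this series. */
struct seg_hdr_vals {
	uint16_t outer_ip_id; /* outer IPv4 packet_id */
	uint16_t inner_ip_id; /* inner IPv4 packet_id */
	uint32_t tcp_seq;     /* inner TCP sequence number */
	bool non_tail;        /* true for every segment except the last */
};

/*
 * Derive the header values for each output segment from the values read
 * from the original packet. payload_lens[i] is the TCP payload length
 * carried by segment i.
 */
static inline void
fill_seg_hdr_vals(struct seg_hdr_vals *out, uint16_t nb_segments,
		const uint16_t *payload_lens, uint16_t outer_id,
		uint16_t id, uint32_t sent_seq)
{
	uint16_t i;

	for (i = 0; i < nb_segments; i++) {
		out[i].outer_ip_id = outer_id++; /* wraps modulo 2^16 */
		out[i].inner_ip_id = id++;
		out[i].tcp_seq = sent_seq;
		out[i].non_tail = i < nb_segments - 1;
		/* the next segment starts where this payload ends */
		sent_seq += payload_lens[i];
	}
}
```

Each segment receives consecutive IP IDs, and its sequence number advances by
the payload carried in the preceding segments, which mirrors the loop in
gso_update_pkt_headers().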

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 lib/librte_gso/gso_common.c | 66 +++++++++++++++++++++++++++++++++++++++++++--
 lib/librte_gso/gso_common.h | 21 +++++++++++++++
 lib/librte_gso/rte_gso.c    |  5 +++-
 lib/librte_gso/rte_gso.h    |  4 +++
 4 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index 65cec44..b3e7f9d 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -37,6 +37,7 @@
 #include <rte_malloc.h>
 
 #include <rte_ether.h>
+#include <rte_gre.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
 #include <rte_udp.h>
@@ -159,6 +160,8 @@ gso_do_segment(struct rte_mbuf *pkt,
 
 static inline void parse_ethernet(struct ether_hdr *eth_hdr,
 		struct rte_mbuf *pkt);
+static inline void parse_ipv4(struct ipv4_hdr *ipv4_hdr,
+		struct rte_mbuf *pkt);
 
 static inline void
 parse_vxlan(struct udp_hdr *udp_hdr, struct rte_mbuf *pkt)
@@ -190,15 +193,29 @@ parse_udp(struct udp_hdr *udp_hdr, struct rte_mbuf *pkt)
 }
 
 static inline void
+parse_gre(struct gre_hdr *gre_hdr, struct rte_mbuf *pkt)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	if (gre_hdr->proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) {
+		ipv4_hdr = (struct ipv4_hdr *)(gre_hdr + 1);
+		pkt->packet_type |= RTE_PTYPE_INNER_L3_IPV4;
+		parse_ipv4(ipv4_hdr, pkt);
+	}
+}
+
+static inline void
 parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct rte_mbuf *pkt)
 {
+	struct gre_hdr *gre_hdr;
 	struct tcp_hdr *tcp_hdr;
 	struct udp_hdr *udp_hdr;
 
 	switch (ipv4_hdr->next_proto_id) {
 	case IPPROTO_TCP:
-		if (IS_VXLAN_PKT(pkt)) {
-			pkt->outer_l3_len = pkt->l3_len;
+		if (IS_TUNNEL_PKT(pkt)) {
+			if (IS_VXLAN_PKT(pkt))
+				pkt->outer_l3_len = pkt->l3_len;
 			pkt->packet_type |= RTE_PTYPE_INNER_L4_TCP;
 		} else
 			pkt->packet_type |= RTE_PTYPE_L4_TCP;
@@ -211,6 +228,14 @@ parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct rte_mbuf *pkt)
 		udp_hdr = (struct udp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
 		parse_udp(udp_hdr, pkt);
 		break;
+	case IPPROTO_GRE:
+		gre_hdr = (struct gre_hdr *)(ipv4_hdr + 1);
+		pkt->outer_l2_len = pkt->l2_len;
+		pkt->outer_l3_len = IPv4_HDR_LEN(ipv4_hdr);
+		pkt->l2_len = sizeof(*gre_hdr);
+		pkt->packet_type |= RTE_PTYPE_TUNNEL_GRE;
+		parse_gre(gre_hdr, pkt);
+		break;
 	}
 }
 
@@ -343,6 +368,43 @@ gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
 			sent_seq += seg->next->data_len;
 		}
 		break;
+	case ETHER_VLAN_IPv4_GRE_IPv4_TCP_PKT:
+	case ETHER_IPv4_GRE_IPv4_TCP_PKT:
+		outer_ipv4_hdr =
+			(struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+				pkt->outer_l2_len);
+		ipv4_hdr = (struct ipv4_hdr *)((char *)outer_ipv4_hdr +
+				pkt->outer_l3_len + pkt->l2_len);
+		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+
+		/* Retrieve values from original packet */
+		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+		outer_id = rte_be_to_cpu_16(outer_ipv4_hdr->packet_id);
+		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+
+		for (i = 0; i < nb_segments; i++) {
+			seg = out_segments[i];
+
+			/* Update outer IPv4 header */
+			offset = seg->outer_l2_len;
+			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
+					offset, seg->pkt_len, outer_id);
+			outer_id++;
+
+			/* Update inner IPv4 header */
+			offset += seg->outer_l3_len + seg->l2_len;
+			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
+					offset, seg->pkt_len, id);
+			id++;
+
+			/* Update inner TCP header */
+			offset += seg->l3_len;
+			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
+					offset, sent_seq, i < tail_seg_idx);
+
+			sent_seq += seg->next->data_len;
+		}
+		break;
 	case ETHER_VLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_TCP_PKT:
 		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 0ad95d3..2ed264a 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -87,6 +87,21 @@
 		RTE_PTYPE_INNER_L3_IPV4 | \
 		RTE_PTYPE_INNER_L4_TCP)
 
+/* GRE packet. */
+#define ETHER_IPv4_GRE_IPv4_TCP_PKT (\
+		ETHER_IPv4_PKT          | \
+		RTE_PTYPE_TUNNEL_GRE    | \
+		RTE_PTYPE_INNER_L3_IPV4 | \
+		RTE_PTYPE_INNER_L4_TCP)
+
+/* GRE packet with VLAN tag. */
+#define ETHER_VLAN_IPv4_GRE_IPv4_TCP_PKT (\
+		RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4       | \
+		RTE_PTYPE_TUNNEL_GRE    | \
+		RTE_PTYPE_INNER_L3_IPV4 | \
+		RTE_PTYPE_INNER_L4_TCP)
+
 #define IS_VLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_L2_ETHER_VLAN) == \
 		RTE_PTYPE_L2_ETHER_VLAN)
 #define IS_INNER_VLAN_PKT(pkt) (\
@@ -96,6 +111,12 @@
 #define VXLAN_DEFAULT_PORT 4789
 #define IS_VXLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_TUNNEL_VXLAN) == \
 		RTE_PTYPE_TUNNEL_VXLAN)
+
+#define IS_GRE_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_TUNNEL_MASK) == \
+		RTE_PTYPE_TUNNEL_GRE)
+
+#define IS_TUNNEL_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_TUNNEL_VXLAN) | \
+		(pkt->packet_type & RTE_PTYPE_TUNNEL_GRE))
 /**
  * Internal function which parses a packet, setting outer_l2/l3_len and
  * l2/l3/l4_len and packet_type.
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index f110f18..244bbf6 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -53,7 +53,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 		return -EINVAL;
 
 	if ((gso_ctx.gso_types & (RTE_GSO_TCP_IPV4 |
-					RTE_GSO_IPV4_VXLAN_TCP_IPV4)) == 0 ||
+					RTE_GSO_IPV4_VXLAN_TCP_IPV4 |
+					RTE_GSO_IPV4_GRE_TCP_IPV4)) == 0 ||
 			gso_ctx.gso_size >= pkt->pkt_len ||
 			gso_ctx.gso_size == 0)
 		return 1;
@@ -77,6 +78,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	case ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+	case ETHER_VLAN_IPv4_GRE_IPv4_TCP_PKT:
+	case ETHER_IPv4_GRE_IPv4_TCP_PKT:
 		nb_segments = gso_tunnel_segment(pkt, gso_size,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
index e1b2c23..86ca790 100644
--- a/lib/librte_gso/rte_gso.h
+++ b/lib/librte_gso/rte_gso.h
@@ -52,6 +52,10 @@ extern "C" {
 /**< GSO flag for VxLAN packets that contain outer IPv4, and inner
  * TCP/IPv4 headers (plus optional inner and/or outer VLAN tags).
  */
+#define RTE_GSO_IPV4_GRE_TCP_IPV4 (1ULL << 2)
+/**< GSO flag for GRE packets that contain outer IPv4, and inner
+ * TCP/IPv4 headers (with optional outer VLAN tag).
+ */
 
 /**
  * GSO context structure.
-- 
2.7.4
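
The per-segment bookkeeping in the tunnel loop above (outer and inner IPv4 IDs incremented once per segment, TCP sequence number advanced by each segment's payload length) can be sketched on its own. This is only an illustration under assumptions: struct seg_fields and plan_segments() are hypothetical names that do not appear in the patch, and the sketch works on plain integers rather than mbufs. What update_tcp_header() does with its last argument is not shown in this section, so the sketch merely records which segment is the tail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of the fields rewritten in each output segment. */
struct seg_fields {
	uint16_t outer_ip_id;
	uint16_t inner_ip_id;
	uint32_t tcp_seq;
	int is_tail;
};

/*
 * Plan the header fields for a TCP payload of 'payload_len' bytes cut into
 * pieces of at most 'seg_payload' bytes. Returns the number of segments,
 * or 0 if the input is empty, seg_payload is 0, or 'out' is too small.
 */
static unsigned int
plan_segments(uint16_t outer_id, uint16_t inner_id, uint32_t seq,
		uint32_t payload_len, uint32_t seg_payload,
		struct seg_fields *out, unsigned int out_max)
{
	unsigned int n, i;

	if (seg_payload == 0 || payload_len == 0)
		return 0;
	n = (payload_len - 1) / seg_payload + 1;	/* ceiling division */
	if (n > out_max)
		return 0;

	for (i = 0; i < n; i++) {
		/* Every segment is full-sized except possibly the last. */
		uint32_t len = (i == n - 1) ?
			payload_len - seg_payload * (n - 1) : seg_payload;

		out[i].outer_ip_id = outer_id++;	/* wraps modulo 2^16 */
		out[i].inner_ip_id = inner_id++;
		out[i].tcp_seq = seq;			/* wraps modulo 2^32 */
		out[i].is_tail = (i == n - 1);
		seq += len;
	}
	return n;
}
```

For example, a 3000-byte payload cut at 1448 bytes gives three segments whose sequence numbers start at seq, seq + 1448 and seq + 2896.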


* [PATCH 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                   ` (3 preceding siblings ...)
  2017-08-24 14:15 ` [PATCH 4/5] lib/gso: add GRE " Jiayu Hu
@ 2017-08-24 14:15 ` Jiayu Hu
  2017-08-30  1:37 ` [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Ananyev, Konstantin
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-08-24 14:15 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, konstantin.ananyev, jianfeng.tan, Jiayu Hu

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.

GSO support may be toggled on a per-port basis, using the command

        "set port <port_id> gso on|off".

The maximum packet length for GSO segments may be set with the command

        "set port <port_id> gso_segsz <length>"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c  | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/config.c   |  25 ++++++++++
 app/test-pmd/csumonly.c |  68 +++++++++++++++++++++++++--
 app/test-pmd/testpmd.c  |   9 ++++
 app/test-pmd/testpmd.h  |  10 ++++
 5 files changed, 228 insertions(+), 5 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index cd8c358..754e249 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,13 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set port <port_id> gso_segsz <length>\n"
+			"    Set max packet length for GSO segment.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3963,6 +3970,118 @@ cmdline_parse_inst_t cmd_gro_set = {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO FOR PORTS *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENT *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint16_t cmd_segsz;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (port_id_is_invalid(res->cmd_pid, ENABLED_WARN))
+		return;
+
+	if (!strcmp(res->cmd_keyword, "gso_segsz")) {
+		if (res->cmd_segsz == 0) {
+			gso_ports[res->cmd_pid].enable = 0;
+			gso_ports[res->cmd_pid].gso_segsz = 0;
+			printf("Input gso_segsz is 0. Disable GSO for"
+					" port %u\n", res->cmd_pid);
+		} else
+			gso_ports[res->cmd_pid].gso_segsz = res->cmd_segsz;
+
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso_segsz");
+cmdline_parse_token_num_t cmd_gso_size_segsz =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, UINT16);
+cmdline_parse_token_num_t cmd_gso_size_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso_segsz <length>: set max "
+		"packet length for GSO segment (0 to disable GSO)",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_port,
+		(void *)&cmd_gso_size_pid,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14251,6 +14370,8 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..1837fb1 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,31 @@ setup_gro(const char *mode, uint8_t port_id)
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+		gso_ports[port_id].gso_segsz = ETHER_MAX_LEN;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..f55bb0f 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,7 @@
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -627,6 +628,9 @@ static void
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -641,6 +645,9 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint8_t no_gso[GSO_MAX_PKT_BURST] = {0};
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -851,13 +858,54 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		}
 	}
 
+	if (unlikely(gso_ports[fs->tx_port].enable)) {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_ports[fs->tx_port].gso_segsz > 0 ?
+			gso_ports[fs->tx_port].gso_segsz : gso_ctx->gso_size;
+		for (i = 0; i < nb_rx; i++) {
+			ret = rte_gso_segment(pkts_burst[i], *gso_ctx,
+					&gso_segments[nb_segments],
+					GSO_MAX_PKT_BURST - nb_segments);
+			if (ret > 1)
+				nb_segments += ret;
+			else if (ret == 1) {
+				gso_segments[nb_segments] = pkts_burst[i];
+				no_gso[nb_segments++] = 1;
+			} else {
+				/* insufficient MBUFs, stop GSO */
+				memcpy(&gso_segments[nb_segments],
+						&pkts_burst[i],
+						sizeof(struct rte_mbuf *) *
+						(nb_rx - i));
+				nb_segments += (nb_rx - i);
+				break;
+			}
+			if (unlikely(nb_rx - i >= GSO_MAX_PKT_BURST -
+						nb_segments)) {
+				/*
+				 * insufficient space in gso_segments,
+				 * stop GSO.
+				 */
+				memcpy(&gso_segments[nb_segments],
+						&pkts_burst[i],
+						sizeof(struct rte_mbuf *) *
+						(nb_rx - i));
+				nb_segments += (nb_rx - i);
+				break;
+			}
+		}
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	} else
+		tx_pkts_burst = pkts_burst;
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +916,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -878,12 +926,22 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
 	fs->tx_burst_stats.pkt_burst_spread[nb_tx]++;
 #endif
-	if (unlikely(nb_tx < nb_rx)) {
+
+	if (unlikely(nb_tx < nb_rx) &&
+			unlikely(gso_ports[fs->tx_port].enable)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			if (no_gso[nb_tx] == 0)
+				rte_pktmbuf_detach(tx_pkts_burst[nb_tx]->next);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
+		} while (++nb_tx < nb_rx);
+	} else if (unlikely(nb_tx < nb_rx)) {
+		fs->fwd_dropped += (nb_rx - nb_tx);
+		do {
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 7d40139..16c60f0 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,8 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -664,6 +666,13 @@ init_config(void)
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = RTE_GSO_TCP_IPV4 |
+			RTE_GSO_IPV4_VXLAN_TCP_IPV4 |
+			RTE_GSO_IPV4_GRE_TCP_IPV4;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index c9d7739..3697d3f 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint16_t gso_segsz;
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -641,6 +650,7 @@ void get_5tuple_filter(uint8_t port_id, uint16_t index);
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
-- 
2.7.4


* Re: [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                   ` (4 preceding siblings ...)
  2017-08-24 14:15 ` [PATCH 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
@ 2017-08-30  1:37 ` Ananyev, Konstantin
  2017-08-30  7:36   ` Jiayu Hu
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
  6 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-08-30  1:37 UTC (permalink / raw)
  To: Hu, Jiayu, dev; +Cc: Kavanagh, Mark B, Tan, Jianfeng


Hi Jiayu,
A few questions/comments from me below and in the next few mails.
Thanks
Konstantin

> 
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To enable more flexibility to applications, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. This patch adds GSO support to DPDK for specific
> packet types: specifically, TCP/IPv4, VxLAN, and GRE.
> 
> The first patch introduces the GSO API framework. The second patch
> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> tag). The third patch adds GSO support for VxLAN packets that contain
> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
> outer VLAN tags). The fourth patch adds GSO support for GRE packets
> that contain outer IPv4, and inner TCP/IPv4 headers (with optional
> outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
> and GRE GSO in testpmd's checksum forwarding engine.
> 
> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> iperf. Setup for the test is described as follows:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum
>    forwarding engine.
> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>    checksum calculation for vhost-user port.
> d. Launch a VM with csum and tso offloading enabled.
> e. Run iperf-client on virtio-net port in the VM to send TCP packets.

Not sure I understand the setup correctly:
So testpmd forwards packets between P0 and vhost-user port, right?
And who uses P1? iperf-server over linux kernel?
Also is P1 on another box or not?

> 
> With GSO enabled for P0 in testpmd, observed iperf throughput is ~9Gbps.

Ok, and if GSO is disabled what is the throughput?
Another stupid question: if P0 is a physical 10G port (ixgbe?), we can just enable TSO on it, right?
If so, what would be the TSO numbers here?

In fact, could you explain a bit more what is supposed to be the main usage model for that library?
Is it to perform segmentation on (virtual) devices that don't support HW TSO, or ...?
Again, would it be for a termination point (packets just formed and filled by the caller),
or for a box in the middle which just forwards packets between nodes?
If the latter, then we'll probably already have most of our packets segmented properly, no?
  
> The experimental data of VxLAN and GRE will be shown later.
> 
> Jiayu Hu (3):
>   lib: add Generic Segmentation Offload API framework
>   gso/lib: add TCP/IPv4 GSO support
>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> Mark Kavanagh (2):
>   lib/gso: add VxLAN GSO support
>   lib/gso: add GRE GSO support
> 
>  app/test-pmd/cmdline.c                  | 121 +++++++++
>  app/test-pmd/config.c                   |  25 ++
>  app/test-pmd/csumonly.c                 |  68 ++++-
>  app/test-pmd/testpmd.c                  |   9 +
>  app/test-pmd/testpmd.h                  |  10 +
>  config/common_base                      |   5 +
>  lib/Makefile                            |   2 +
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_gso/Makefile                 |  52 ++++
>  lib/librte_gso/gso_common.c             | 431 ++++++++++++++++++++++++++++++++
>  lib/librte_gso/gso_common.h             | 180 +++++++++++++
>  lib/librte_gso/gso_tcp.c                |  82 ++++++
>  lib/librte_gso/gso_tcp.h                |  73 ++++++
>  lib/librte_gso/gso_tunnel.c             |  62 +++++
>  lib/librte_gso/gso_tunnel.h             |  46 ++++
>  lib/librte_gso/rte_gso.c                | 100 ++++++++
>  lib/librte_gso/rte_gso.h                | 122 +++++++++
>  lib/librte_gso/rte_gso_version.map      |   7 +
>  mk/rte.app.mk                           |   1 +
>  19 files changed, 1392 insertions(+), 5 deletions(-)
>  create mode 100644 lib/librte_gso/Makefile
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp.c
>  create mode 100644 lib/librte_gso/gso_tcp.h
>  create mode 100644 lib/librte_gso/gso_tunnel.c
>  create mode 100644 lib/librte_gso/gso_tunnel.h
>  create mode 100644 lib/librte_gso/rte_gso.c
>  create mode 100644 lib/librte_gso/rte_gso.h
>  create mode 100644 lib/librte_gso/rte_gso_version.map
> 
> --
> 2.7.4


* Re: [PATCH 1/5] lib: add Generic Segmentation Offload API framework
  2017-08-24 14:15 ` [PATCH 1/5] lib: add Generic Segmentation Offload API framework Jiayu Hu
@ 2017-08-30  1:38   ` Ananyev, Konstantin
  2017-08-30  7:57     ` Jiayu Hu
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-08-30  1:38 UTC (permalink / raw)
  To: Hu, Jiayu, dev; +Cc: Kavanagh, Mark B, Tan, Jianfeng

Hi Jiayu,

> 
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To enable more flexibility to applications, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. This patch introduces the GSO API framework to DPDK.
> 
> The GSO library provides a segmentation API, rte_gso_segment(), for
> applications. It splits an input packet into small ones in each
> invocation. The GSO library refers to these small packets generated
> by rte_gso_segment() as GSO segments. When all GSO segments are freed,
> the input packet is freed automatically.
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> ---
>  config/common_base                 |   5 ++
>  lib/Makefile                       |   2 +
>  lib/librte_gso/Makefile            |  49 ++++++++++++++++
>  lib/librte_gso/rte_gso.c           |  47 ++++++++++++++++
>  lib/librte_gso/rte_gso.h           | 111 +++++++++++++++++++++++++++++++++++++
>  lib/librte_gso/rte_gso_version.map |   7 +++
>  mk/rte.app.mk                      |   1 +
>  7 files changed, 222 insertions(+)
>  create mode 100644 lib/librte_gso/Makefile
>  create mode 100644 lib/librte_gso/rte_gso.c
>  create mode 100644 lib/librte_gso/rte_gso.h
>  create mode 100644 lib/librte_gso/rte_gso_version.map
> 
> diff --git a/config/common_base b/config/common_base
> index 5e97a08..603e340 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -652,6 +652,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
>  CONFIG_RTE_LIBRTE_GRO=y
> 
>  #
> +# Compile GSO library
> +#
> +CONFIG_RTE_LIBRTE_GSO=y
> +
> +#
>  # Compile librte_meter
>  #
>  CONFIG_RTE_LIBRTE_METER=y
> diff --git a/lib/Makefile b/lib/Makefile
> index 86caba1..3d123f4 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>  DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
>  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>  DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> +DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
> +DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
> 
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> new file mode 100644
> index 0000000..aeaacbc
> --- /dev/null
> +++ b/lib/librte_gso/Makefile
> @@ -0,0 +1,49 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_gso.a
> +
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
> +
> +EXPORT_MAP := rte_gso_version.map
> +
> +LIBABIVER := 1
> +
> +#source files
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> new file mode 100644
> index 0000000..b81afce
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso.c
> @@ -0,0 +1,47 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include "rte_gso.h"
> +
> +int
> +rte_gso_segment(struct rte_mbuf *pkt,
> +		struct rte_gso_ctx gso_ctx,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out __rte_unused)
> +{
> +	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
> +			NULL || gso_ctx.indirect_pool == NULL)
> +		return -EINVAL;
> +
> +	return 1;
> +}
> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> new file mode 100644
> index 0000000..5a8389a
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso.h
> @@ -0,0 +1,111 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_GSO_H_
> +#define _RTE_GSO_H_
> +
> +/**
> + * @file
> + * Interface to GSO library
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * GSO context structure.
> + */
> +struct rte_gso_ctx {
> +	struct rte_mempool *direct_pool;
> +	/**< MBUF pool for allocating direct buffers, which are used
> +	 * to store packet headers for GSO segments.
> +	 */
> +	struct rte_mempool *indirect_pool;
> +	/**< MBUF pool for allocating indirect buffers, which are used
> +	 * to locate packet payloads for GSO segments. The indirect
> +	 * buffer doesn't contain any data, but simply points to an
> +	 * offset within the packet to segment.
> +	 */
> +	uint64_t gso_types;
> +	/**< GSO types to perform */

Looking at the way it is used right now, there seems to be not much value in it...
Why not make it a mask of ptypes for which GSO should be performed?
Let's say for a gso_ctx that supports only ip4/tcp it would be:
gso_types = (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
and then in rte_gso_segment() we can perform GSO only on packets of the requested ptype:

if ((pkt->packet_type & gso_ctx->gso_types) == pkt->packet_type) {
   /* do segmentation */
} else {
  /* skip segmentation for that packet */
}
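
The check suggested above can be tried out in a small standalone form. This is only a sketch under assumptions: the PT_* constants are illustrative stand-ins, not values copied from the RTE_PTYPE_* definitions, and should_segment() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for RTE_PTYPE_* values; not copied from DPDK. */
#define PT_L2_ETHER 0x001u
#define PT_L3_IPV4  0x010u
#define PT_L4_TCP   0x100u
#define PT_L4_UDP   0x200u

/*
 * Return non-zero when every bit set in the packet's type is also set in
 * the context's mask, i.e. the packet type is covered by gso_types.
 */
static int
should_segment(uint32_t packet_type, uint64_t gso_types)
{
	return (packet_type & gso_types) == packet_type;
}
```

Note that a packet_type of 0 also passes this test, so a caller would probably want to reject unparsed packets before applying it.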

> +	uint16_t gso_size;
> +	/**< maximum size of a GSO segment, measured in bytes */

Is that MSS or MTU?
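
The distinction matters for the segment count. As a rough illustration (the helper names below are hypothetical, and which interpretation the library uses is exactly the open question), the two readings of gso_size can be compared like this:

```c
#include <assert.h>
#include <stdint.h>

/* Segments needed when 'limit' bounds the TCP payload only (MSS-like). */
static uint32_t
nb_segs_payload_limit(uint32_t payload_len, uint32_t limit)
{
	if (limit == 0 || payload_len == 0)
		return 0;
	return (payload_len - 1) / limit + 1;	/* ceiling division */
}

/* Segments needed when 'limit' bounds headers plus payload (MTU-like). */
static uint32_t
nb_segs_packet_limit(uint32_t payload_len, uint32_t hdr_len, uint32_t limit)
{
	if (limit <= hdr_len)
		return 0;	/* no room left for any payload */
	return nb_segs_payload_limit(payload_len, limit - hdr_len);
}
```

For 14600 bytes of payload, 54 bytes of Ethernet/IPv4/TCP headers and a limit of 1460, the first reading gives 10 segments and the second gives 11.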

> +};
> +
> +/**
> + * Segmentation function, which supports processing of both single- and
> + * multi- segment packets. rte_gso_segment() assumes the input packet
> + * has correct checksums, and it doesn't process IP fragment packets.
> + * Additionally, it assumes that 'pkts_out' is large enough to hold all GSO
> + * segments.
> + *
> + * We refer to the packets that are segmented from the input packet as 'GSO
> + * segments'. If the input packet is GSOed, its mbuf refcnt reduces by 1.
> + * Therefore, when all GSO segments are freed, the input packet is freed
> + * automatically. If the input packet doesn't match the criteria for GSO
> + * (e.g. 'pkt's length is small and doesn't need segmentation), the packet
> + * is skipped and this function returns 1. If the available memory space
> + * in MBUF pools is insufficient, the packet is skipped and return -ENOMEM.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param ctx
> + *  GSO context object.
> + * @param pkts_out
> + *  Pointer array used to stores the mbuf addresses of GSO segments.
> + *  Applications must ensure pkts_out is large enough to hold all GSO
> + *  segments. If the memory space in pkts_out is insufficient, the input
> + *  packet is skipped and return -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of GSO segments created on success.
> + *  - Return 1 if no GSO is performed.

Wouldn't it be better to return the number of elements filled in pkts_out[] on success?

> + *  - Return -ENOMEM if run out of memory in MBUF pools.
> + *  - Return -EINVAL for invalid parameters.
> + */
> +int rte_gso_segment(struct rte_mbuf *pkt,
> +		struct rte_gso_ctx ctx,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_GSO_H_ */
> diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
> new file mode 100644
> index 0000000..e1fd453
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso_version.map
> @@ -0,0 +1,7 @@
> +DPDK_17.11 {
> +	global:
> +
> +	rte_gso_segment;
> +
> +	local: *;
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index c25fdd9..d4c9873 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
> --
> 2.7.4


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-24 14:15 ` [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-08-30  1:38   ` Ananyev, Konstantin
  2017-08-30  2:55     ` Jiayu Hu
                       ` (2 more replies)
  0 siblings, 3 replies; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-08-30  1:38 UTC (permalink / raw)
  To: Hu, Jiayu, dev; +Cc: Kavanagh, Mark B, Tan, Jianfeng



> -----Original Message-----
> From: Hu, Jiayu
> Sent: Thursday, August 24, 2017 3:16 PM
> To: dev@dpdk.org
> Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> 
> This patch adds GSO support for TCP/IPv4 packets. Supported packets
> may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> packets have correct checksums, and doesn't update checksums for output
> packets (the responsibility for this lies with the application).
> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> 
> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indrect
> MBUF, to organize an output packet. Note that we refer to these two
> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> header, while the indirect mbuf simply points to a location within the
> original packet's payload. Consequently, use of the GSO library requires
> multi-segment MBUF support in the TX functions of the NIC driver.
> 
> If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> result, when all of its GSOed segments are freed, the packet is freed
> automatically.
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> ---
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_gso/Makefile                 |   2 +
>  lib/librte_gso/gso_common.c             | 270 ++++++++++++++++++++++++++++++++
>  lib/librte_gso/gso_common.h             | 120 ++++++++++++++
>  lib/librte_gso/gso_tcp.c                |  82 ++++++++++
>  lib/librte_gso/gso_tcp.h                |  73 +++++++++
>  lib/librte_gso/rte_gso.c                |  44 +++++-
>  lib/librte_gso/rte_gso.h                |   3 +
>  8 files changed, 593 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp.c
>  create mode 100644 lib/librte_gso/gso_tcp.h
> 
> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> index ec8dba7..2fa1199 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
> 
>  /* these log types can be used in an application */
>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index aeaacbc..0f8e38f 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -42,6 +42,8 @@ LIBABIVER := 1
> 
>  #source files
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp.c
> 
>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> new file mode 100644
> index 0000000..2b54fbd
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.c
> @@ -0,0 +1,270 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdbool.h>
> +#include <string.h>
> +
> +#include <rte_malloc.h>
> +
> +#include <rte_ether.h>
> +#include <rte_ip.h>
> +#include <rte_tcp.h>
> +
> +#include "gso_common.h"
> +
> +static inline void
> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset)
> +{
> +	/* copy mbuf metadata */
> +	hdr_segment->nb_segs = 1;
> +	hdr_segment->port = pkt->port;
> +	hdr_segment->ol_flags = pkt->ol_flags;
> +	hdr_segment->packet_type = pkt->packet_type;
> +	hdr_segment->pkt_len = pkt_hdr_offset;
> +	hdr_segment->data_len = pkt_hdr_offset;
> +	hdr_segment->tx_offload = pkt->tx_offload;
> +	/* copy packet header */
> +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> +			rte_pktmbuf_mtod(pkt, char *),
> +			pkt_hdr_offset);
> +}
> +
> +static inline void
> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < nb_pkts; i++) {
> +		rte_pktmbuf_detach(pkts[i]->next);

I don't think you need to call detach() here explicitly.
Just rte_pktmbuf_free(pkts[i]) should do I think.

> +		rte_pktmbuf_free(pkts[i]);
> +		pkts[i] = NULL;
> +	}
> +}
> +
> +int
> +gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct rte_mbuf *pkt_in;
> +	struct rte_mbuf *hdr_segment, *pyld_segment;
> +	uint32_t pkt_in_pyld_off;
> +	uint16_t pkt_in_segment_len, pkt_out_segment_len;
> +	uint16_t nb_segs;
> +	bool pkt_in_segment_processed;
> +
> +	pkt_in_pyld_off = pkt->data_off + pkt_hdr_offset;
> +	pkt_in = pkt;
> +	nb_segs = 0;
> +
> +	while (pkt_in) {
> +		pkt_in_segment_processed = false;
> +		pkt_in_segment_len = pkt_in->data_off + pkt_in->data_len;
> +
> +		while (!pkt_in_segment_processed) {
> +			if (unlikely(nb_segs >= nb_pkts_out)) {
> +				free_gso_segment(pkts_out, nb_segs);
> +				return -EINVAL;
> +			}
> +
> +			/* allocate direct mbuf */
> +			hdr_segment = rte_pktmbuf_alloc(direct_pool);
> +			if (unlikely(hdr_segment == NULL)) {
> +				free_gso_segment(pkts_out, nb_segs);
> +				return -ENOMEM;
> +			}
> +
> +			/* allocate indirect mbuf */
> +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> +			if (unlikely(pyld_segment == NULL)) {
> +				rte_pktmbuf_free(hdr_segment);
> +				free_gso_segment(pkts_out, nb_segs);
> +				return -ENOMEM;
> +			}

So, if I understand correctly, each new packet would always contain just one data segment?
Why couldn't several data segments be chained together (if the sum of their data_len <= mss)?
In the same way as is done here:
http://dpdk.org/browse/dpdk/tree/lib/librte_ip_frag/rte_ipv4_fragmentation.c#n93
or here:
https://gerrit.fd.io/r/gitweb?p=tldk.git;a=blob;f=lib/libtle_l4p/tcp_tx_seg.h;h=a8d2425597a7ad6f598aa4bb7fcd7f1da74305f0;hb=HEAD#l23
?
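
To make the chaining idea concrete, here is a rough sketch that only plans the split: it walks the payload lengths of the input segments and packs them into output packets of at most mss bytes, so one output packet may reference several input segments. The struct out_pkt descriptor and plan_segments() are hypothetical names; no mbufs are allocated or attached here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor for one planned output packet. */
struct out_pkt {
	uint16_t pyld_len;  /* payload bytes carried, at most mss */
	uint16_t nb_pieces; /* indirect mbufs that would be chained */
};

/*
 * Pack the payload of the input segments into output packets of at most
 * 'mss' bytes. Returns the number of output packets, or -1 if 'out' is
 * too small to describe them all.
 */
static int
plan_segments(const uint16_t *in_len, uint16_t nb_in, uint16_t mss,
		struct out_pkt *out, uint16_t nb_out)
{
	uint16_t i, n = 0;
	uint16_t room = 0; /* bytes still free in the current output packet */

	assert(mss > 0);
	for (i = 0; i < nb_in; i++) {
		uint16_t left = in_len[i];

		while (left > 0) {
			uint16_t take;

			if (room == 0) {
				/* current packet is full: start a new one */
				if (n == nb_out)
					return -1;
				out[n].pyld_len = 0;
				out[n].nb_pieces = 0;
				n++;
				room = mss;
			}
			take = left < room ? left : room;
			out[n - 1].pyld_len += take;
			out[n - 1].nb_pieces++;
			left -= take;
			room -= take;
		}
	}
	return n;
}
```

With four input segments of 1000 bytes each and an mss of 1460, this plan yields three output packets (1460, 1460 and 1080 bytes), whereas the quoted loop, as I read it, would emit four packets of 1000 bytes.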

> +
> +			/* copy packet header */
> +			hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> +
> +			/* attach payload mbuf to current packet segment */
> +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> +
> +			hdr_segment->next = pyld_segment;
> +			pkts_out[nb_segs++] = hdr_segment;
> +
> +			/* calculate payload length */
> +			pkt_out_segment_len = pyld_unit_size;
> +			if (pkt_in_pyld_off + pkt_out_segment_len >
> +					pkt_in_segment_len) {
> +				pkt_out_segment_len = pkt_in_segment_len -
> +					pkt_in_pyld_off;
> +			}
> +
> +			/* update payload segment */
> +			pyld_segment->data_off = pkt_in_pyld_off;
> +			pyld_segment->data_len = pkt_out_segment_len;
> +
> +			/* update header segment */
> +			hdr_segment->pkt_len += pyld_segment->data_len;
> +			hdr_segment->nb_segs++;
> +
> +			/* update pkt_in_pyld_off */
> +			pkt_in_pyld_off += pkt_out_segment_len;
> +			if (pkt_in_pyld_off == pkt_in_segment_len)
> +				pkt_in_segment_processed = true;
> +		}
> +
> +		/* 'pkt_in' may contain numerous segments */
> +		pkt_in = pkt_in->next;
> +		if (pkt_in != NULL)
> +			pkt_in_pyld_off = pkt_in->data_off;
> +	}
> +	return nb_segs;
> +}
> +
> +static inline void
> +parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct rte_mbuf *pkt)
> +{
> +	struct tcp_hdr *tcp_hdr;
> +
> +	switch (ipv4_hdr->next_proto_id) {
> +	case IPPROTO_TCP:
> +		pkt->packet_type |= RTE_PTYPE_L4_TCP;
> +		pkt->l3_len = IPv4_HDR_LEN(ipv4_hdr);
> +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +		pkt->l4_len = TCP_HDR_LEN(tcp_hdr);
> +		break;
> +	}
> +}
> +
> +static inline void
> +parse_ethernet(struct ether_hdr *eth_hdr, struct rte_mbuf *pkt)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	struct vlan_hdr *vlan_hdr;
> +	uint16_t ethertype;
> +
> +	ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> +	if (ethertype == ETHER_TYPE_VLAN) {
> +		vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
> +		pkt->l2_len = sizeof(struct vlan_hdr);
> +		pkt->packet_type |= RTE_PTYPE_L2_ETHER_VLAN;
> +		ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
> +	}
> +
> +	switch (ethertype) {
> +	case ETHER_TYPE_IPv4:
> +		if (IS_VLAN_PKT(pkt)) {
> +			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
> +		} else {
> +			pkt->packet_type |= RTE_PTYPE_L2_ETHER;
> +			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
> +		}
> +		pkt->l2_len += sizeof(struct ether_hdr);
> +		ipv4_hdr = (struct ipv4_hdr *) ((char *)eth_hdr +
> +				pkt->l2_len);
> +		parse_ipv4(ipv4_hdr, pkt);
> +		break;
> +	}
> +}
> +
> +void
> +gso_parse_packet(struct rte_mbuf *pkt)

There is a function, rte_net_get_ptype(), that is supposed to provide similar functionality.
So we probably don't need to create a new SW parse function here; it would be better
to reuse (and update if needed) the existing one.
Again, the user might already have the l2/l3/l4.../_len fields and packet_type set up.
So it is better to keep SW packet parsing out of the scope of this library.

> +{
> +	struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
> +
> +	pkt->packet_type = pkt->tx_offload = 0;
> +	parse_ethernet(eth_hdr, pkt);
> +}
> +
> +static inline void
> +update_ipv4_header(char *base, uint16_t offset, uint16_t length, uint16_t id)
> +{
> +	struct ipv4_hdr *ipv4_hdr = (struct ipv4_hdr *)(base + offset);
> +
> +	ipv4_hdr->total_length = rte_cpu_to_be_16(length - offset);
> +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> +}
> +
> +static inline void
> +update_tcp_header(char *base, uint16_t offset, uint32_t sent_seq,
> +	uint8_t non_tail)
> +{
> +	struct tcp_hdr *tcp_hdr = (struct tcp_hdr *)(base + offset);
> +
> +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> +	/* clean FIN and PSH for non-tail segments */
> +	if (non_tail)
> +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK | TCP_HDR_FIN_MASK));
> +}
> +
> +void
> +gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> +		struct rte_mbuf **out_segments)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	struct tcp_hdr *tcp_hdr;
> +	struct rte_mbuf *seg;
> +	uint32_t sent_seq;
> +	uint16_t offset, i;
> +	uint16_t tail_seg_idx = nb_segments - 1, id;
> +
> +	switch (pkt->packet_type) {
> +	case ETHER_VLAN_IPv4_TCP_PKT:
> +	case ETHER_IPv4_TCP_PKT:

It might be worth putting the code below in a separate function:
update_inner_tcp_hdr(..) or so.
Then you can reuse it for the tunneled cases too.
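
A possible shape for such a helper, sketched on a simplified host-byte-order view of the headers rather than on real mbufs. The struct seg_hdr type, the parameter list and the flag macros are assumptions made for illustration; only the name update_inner_tcp_hdr comes from the suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_PSH 0x08

/* Hypothetical host-byte-order view of the fields GSO rewrites. */
struct seg_hdr {
	uint16_t ip_total_len; /* IPv4 total length */
	uint16_t ip_id;        /* IPv4 identification */
	uint32_t tcp_seq;      /* TCP sequence number */
	uint8_t tcp_flags;
};

/*
 * Rewrite the IPv4/TCP fields of each output segment. 'pyld_len[i]' is
 * the payload carried by segment i and 'l3_l4_len' is the combined IPv4
 * and TCP header length.
 */
static void
update_inner_tcp_hdr(struct seg_hdr *segs, const uint16_t *pyld_len,
		uint16_t nb_segs, uint16_t l3_l4_len, uint16_t first_id,
		uint32_t first_seq)
{
	uint16_t i;

	assert(segs != NULL && pyld_len != NULL);
	for (i = 0; i < nb_segs; i++) {
		segs[i].ip_total_len = l3_l4_len + pyld_len[i];
		segs[i].ip_id = (uint16_t)(first_id + i);
		segs[i].tcp_seq = first_seq;
		/* only the last segment may keep FIN and PSH */
		if (i + 1 < nb_segs)
			segs[i].tcp_flags &=
				(uint8_t)~(TCP_FLAG_FIN | TCP_FLAG_PSH);
		first_seq += pyld_len[i];
	}
}
```

For a tunneled packet, the caller would pass the inner header values, and the outer headers would be updated separately.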

> +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +				pkt->l2_len);
> +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +
> +		for (i = 0; i < nb_segments; i++) {
> +			seg = out_segments[i];
> +
> +			offset = seg->l2_len;
> +			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
> +					offset, seg->pkt_len, id);
> +			id++;

Who would be responsible for making sure that we don't have consecutive packets with the same IPv4 id?
Would it be the upper layer that forms the packet, the GSO library, or ...?

> +
> +			offset += seg->l3_len;
> +			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
> +					offset, sent_seq, i < tail_seg_idx);
> +			sent_seq += seg->next->data_len;
> +		}
> +		break;
> +	}
> +}
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> new file mode 100644
> index 0000000..d750041
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.h
> @@ -0,0 +1,120 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_COMMON_H_
> +#define _GSO_COMMON_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +#define IPV4_HDR_DF_SHIFT 14
> +#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
> +#define IPv4_HDR_LEN(iph) ((iph->version_ihl & 0x0f) * 4)
> +
> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> +#define TCP_HDR_LEN(tcph) ((tcph->data_off & 0xf0) >> 2)
> +
> +#define ETHER_IPv4_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4)
> +/* Supported packet types */
> +/* TCP/IPv4 packet. */
> +#define ETHER_IPv4_TCP_PKT (ETHER_IPv4_PKT | RTE_PTYPE_L4_TCP)
> +
> +/* TCP/IPv4 packet with VLAN tag. */
> +#define ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
> +		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
> +
> +#define IS_VLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_L2_ETHER_VLAN) == \
> +		RTE_PTYPE_L2_ETHER_VLAN)
> +
> +/**
> + * Internal function which parses a packet, setting outer_l2/l3_len and
> + * l2/l3/l4_len and packet_type.
> + *
> + * @param pkt
> + *  Packet to parse.
> + */
> +void gso_parse_packet(struct rte_mbuf *pkt);
> +
> +/**
> + * Internal function which updates relevant packet headers, following
> + * segmentation. This is required to update, for example, the IPv4
> + * 'total_length' field, to reflect the reduced length of the now-
> + * segmented packet.
> + *
> + * @param pkt
> + *  The original packet.
> + * @param nb_segments
> + *  The number of GSO segments into which pkt was split.
> + * @param out_segements
> + *  Pointer array used for storing mbuf addresses for GSO segments.
> + */
> +void gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> +		struct rte_mbuf **out_segments);
> +
> +/**
> + * Internal function which divides the input packet into small segments.
> + * Each of the newly-created segments is organized as a two-segment mbuf,
> + * where the first segment is a standard mbuf, which stores a copy of
> + * packet header, and the second is an indirect mbuf which points to a
> + * section of data in the input packet.
> + *
> + * @param pkt
> + *  Packet to segment.
> + * @param pkt_hdr_offset
> + *  Packet header offset, measured in byte.
> + * @param pyld_unit_size
> + *  The max payload length of a GSO segment.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to keep the mbuf addresses of output segments.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of segments created in the event of success.
> + *  - If no GSO is performed, return 1.
> + *  - If available memory in mempools is insufficient, return -ENOMEM.
> + *  - -EINVAL for invalid parameters
> + */
> +int gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/gso_tcp.c b/lib/librte_gso/gso_tcp.c
> new file mode 100644
> index 0000000..9d5fc30
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp.c
> @@ -0,0 +1,82 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +
> +#include <rte_ether.h>
> +#include <rte_ip.h>
> +#include <rte_tcp.h>
> +
> +#include "gso_common.h"
> +#include "gso_tcp.h"
> +
> +int
> +gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct ether_hdr *eth_hdr;
> +	struct ipv4_hdr *ipv4_hdr;
> +	uint16_t tcp_dl;
> +	uint16_t pyld_unit_size;
> +	uint16_t hdr_offset;
> +	int ret = 1;
> +
> +	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
> +	ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> +
> +	/* don't process fragmented packet */
> +	if ((ipv4_hdr->fragment_offset &
> +				rte_cpu_to_be_16(IPV4_HDR_DF_MASK)) == 0)
> +		return ret;
> +
> +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) -
> +		pkt->l3_len - pkt->l4_len;
> +	/* don't process packet without data */
> +	if (tcp_dl == 0)
> +		return ret;
> +
> +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> +
> +	/* segment the payload */
> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> +			indirect_pool, pkts_out, nb_pkts_out);
> +
> +	if (ret > 1)
> +		gso_update_pkt_headers(pkt, ret, pkts_out);
> +
> +	return ret;
> +}
> diff --git a/lib/librte_gso/gso_tcp.h b/lib/librte_gso/gso_tcp.h
> new file mode 100644
> index 0000000..f291ccb
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp.h
> @@ -0,0 +1,73 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_TCP_H_
> +#define _GSO_TCP_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * Segment an IPv4/TCP packet. This function assumes the input packet has
> + * correct checksums and doesn't update checksums for GSO segment.
> + * Furthermore, it doesn't process IP fragment packets.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param gso_size
> + *  The max length of a GSO segment, measured in bytes.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array, which is used to store mbuf addresses of GSO segments.
> + *  Caller should guarantee that 'pkts_out' is sufficiently large to store
> + *  all GSO segments.
> + * @param nb_pkts_out
> + *  The max number of items that 'pkts_out' can keep.
> + *
> + * @return
> + *   - The number of GSO segments on success.
> + *   - Return 1 if no GSO is performed.
> + *   - Return -ENOMEM if available memory in mempools is insufficient.
> + *   - Return -EINVAL for invalid parameters.
> + */
> +int gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +
> +#endif
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index b81afce..fac95f2 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -31,17 +31,57 @@
>   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>   */
> 
> +#include <rte_log.h>
> +
>  #include "rte_gso.h"
> +#include "gso_common.h"
> +#include "gso_tcp.h"
> 
>  int
>  rte_gso_segment(struct rte_mbuf *pkt,
>  		struct rte_gso_ctx gso_ctx,
>  		struct rte_mbuf **pkts_out,
> -		uint16_t nb_pkts_out __rte_unused)
> +		uint16_t nb_pkts_out)
>  {
> +	struct rte_mempool *direct_pool, *indirect_pool;
> +	struct rte_mbuf *pkt_seg;
> +	uint16_t nb_segments, gso_size;
> +
>  	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
>  			NULL || gso_ctx.indirect_pool == NULL)
>  		return -EINVAL;

We probably don't need to check the gso_ctx values for each incoming packet.
If you feel it is necessary, create a new function rte_gso_ctx_check() that
could be called just once per ctx.
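
A rough sketch of what such a one-time check could look like. The struct gso_ctx stand-in, the gso_ctx_check() name and the extra checks on gso_types and gso_size are assumptions for illustration; the patch itself only tests the pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_gso_ctx; the pools are opaque pointers here. */
struct gso_ctx {
	void *direct_pool;
	void *indirect_pool;
	uint64_t gso_types;
	uint16_t gso_size;
};

/* Validate a context once, at setup time, instead of on every packet. */
static int
gso_ctx_check(const struct gso_ctx *ctx)
{
	if (ctx == NULL || ctx->direct_pool == NULL ||
			ctx->indirect_pool == NULL)
		return -EINVAL;
	/* assumed extra checks: no GSO type enabled, or zero segment size */
	if (ctx->gso_types == 0 || ctx->gso_size == 0)
		return -EINVAL;
	return 0;
}
```

The per-packet path could then skip these tests and trust a context that has already passed the check.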

> 
> -	return 1;
> +	if ((gso_ctx.gso_types & RTE_GSO_TCP_IPV4) == 0 ||
> +			gso_ctx.gso_size >= pkt->pkt_len ||
> +			gso_ctx.gso_size == 0)


The first and third conditions seem redundant.

> +		return 1;


I think you forgot here:
pkts_out[0] = pkt;


> +
> +	pkt_seg = pkt;
> +	gso_size = gso_ctx.gso_size;
> +	direct_pool = gso_ctx.direct_pool;
> +	indirect_pool = gso_ctx.indirect_pool;
> +
> +	/* Parse packet headers to determine how to segment 'pkt' */
> +	gso_parse_packet(pkt);


I don't think we need to parse the packet here.
Instead, assume that the user has already filled in the packet_type and l2/l3/..._len fields correctly.

> +
> +	switch (pkt->packet_type) {
> +	case ETHER_VLAN_IPv4_TCP_PKT:
> +	case ETHER_IPv4_TCP_PKT:
> +		nb_segments = gso_tcp4_segment(pkt, gso_size,
> +				direct_pool, indirect_pool,
> +				pkts_out, nb_pkts_out);
> +		break;
> +	default:
> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> +		nb_segments = 1;
> +	}
> +
> +	if (nb_segments > 1) {
> +		while (pkt_seg) {
> +			rte_mbuf_refcnt_update(pkt_seg, -1);
> +			pkt_seg = pkt_seg->next;
> +		}
> +	}
> +
> +	return nb_segments;
>  }
> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> index 5a8389a..77853fa 100644
> --- a/lib/librte_gso/rte_gso.h
> +++ b/lib/librte_gso/rte_gso.h
> @@ -46,6 +46,9 @@ extern "C" {
>  #include <stdint.h>
>  #include <rte_mbuf.h>
> 
> +#define RTE_GSO_TCP_IPV4 (1ULL << 0)
> +/**< GSO flag for TCP/IPv4 packets (containing optional VLAN tag) */
> +
>  /**
>   * GSO context structure.
>   */
> --
> 2.7.4


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-30  1:38   ` Ananyev, Konstantin
@ 2017-08-30  2:55     ` Jiayu Hu
  2017-08-30  9:25       ` Kavanagh, Mark B
  2017-08-30  9:03     ` Jiayu Hu
  2017-09-04  3:31     ` Jiayu Hu
  2 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-08-30  2:55 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

Thanks for your valuable suggestions. My feedback is inline.

On Wed, Aug 30, 2017 at 09:38:33AM +0800, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Thursday, August 24, 2017 3:16 PM
> > To: dev@dpdk.org
> > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, Jianfeng
> > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> > 
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > packets have correct checksums, and doesn't update checksums for output
> > packets (the responsibility for this lies with the application).
> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> > 
> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indrect
> > MBUF, to organize an output packet. Note that we refer to these two
> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of the GSO library requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> > 
> > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSOed segments are freed, the packet is freed
> > automatically.
> > 
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> >  lib/librte_eal/common/include/rte_log.h |   1 +
> >  lib/librte_gso/Makefile                 |   2 +
> >  lib/librte_gso/gso_common.c             | 270 ++++++++++++++++++++++++++++++++
> >  lib/librte_gso/gso_common.h             | 120 ++++++++++++++
> >  lib/librte_gso/gso_tcp.c                |  82 ++++++++++
> >  lib/librte_gso/gso_tcp.h                |  73 +++++++++
> >  lib/librte_gso/rte_gso.c                |  44 +++++-
> >  lib/librte_gso/rte_gso.h                |   3 +
> >  8 files changed, 593 insertions(+), 2 deletions(-)
> >  create mode 100644 lib/librte_gso/gso_common.c
> >  create mode 100644 lib/librte_gso/gso_common.h
> >  create mode 100644 lib/librte_gso/gso_tcp.c
> >  create mode 100644 lib/librte_gso/gso_tcp.h
> > 
> > diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> > index ec8dba7..2fa1199 100644
> > --- a/lib/librte_eal/common/include/rte_log.h
> > +++ b/lib/librte_eal/common/include/rte_log.h
> > @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
> >  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
> >  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
> >  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> > +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
> > 
> >  /* these log types can be used in an application */
> >  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> > index aeaacbc..0f8e38f 100644
> > --- a/lib/librte_gso/Makefile
> > +++ b/lib/librte_gso/Makefile
> > @@ -42,6 +42,8 @@ LIBABIVER := 1
> > 
> >  #source files
> >  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp.c
> > 
> >  # install this header file
> >  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> > diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> > new file mode 100644
> > index 0000000..2b54fbd
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_common.c
> > @@ -0,0 +1,270 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include <stdbool.h>
> > +#include <string.h>
> > +
> > +#include <rte_malloc.h>
> > +
> > +#include <rte_ether.h>
> > +#include <rte_ip.h>
> > +#include <rte_tcp.h>
> > +
> > +#include "gso_common.h"
> > +
> > +static inline void
> > +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset)
> > +{
> > +	/* copy mbuf metadata */
> > +	hdr_segment->nb_segs = 1;
> > +	hdr_segment->port = pkt->port;
> > +	hdr_segment->ol_flags = pkt->ol_flags;
> > +	hdr_segment->packet_type = pkt->packet_type;
> > +	hdr_segment->pkt_len = pkt_hdr_offset;
> > +	hdr_segment->data_len = pkt_hdr_offset;
> > +	hdr_segment->tx_offload = pkt->tx_offload;
> > +	/* copy packet header */
> > +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> > +			rte_pktmbuf_mtod(pkt, char *),
> > +			pkt_hdr_offset);
> > +}
> > +
> > +static inline void
> > +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> > +{
> > +	uint16_t i;
> > +
> > +	for (i = 0; i < nb_pkts; i++) {
> > +		rte_pktmbuf_detach(pkts[i]->next);
> 
> I don't think you need to call detach() here explicitly.
> Just rte_pktmbuf_free(pkts[i]) should do I think.

Yes, rte_pktmbuf_free() is enough. I will modify it. Thanks.

> 
> > +		rte_pktmbuf_free(pkts[i]);
> > +		pkts[i] = NULL;
> > +	}
> > +}
> > +
> > +int
> > +gso_do_segment(struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset,
> > +		uint16_t pyld_unit_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct rte_mbuf *pkt_in;
> > +	struct rte_mbuf *hdr_segment, *pyld_segment;
> > +	uint32_t pkt_in_pyld_off;
> > +	uint16_t pkt_in_segment_len, pkt_out_segment_len;
> > +	uint16_t nb_segs;
> > +	bool pkt_in_segment_processed;
> > +
> > +	pkt_in_pyld_off = pkt->data_off + pkt_hdr_offset;
> > +	pkt_in = pkt;
> > +	nb_segs = 0;
> > +
> > +	while (pkt_in) {
> > +		pkt_in_segment_processed = false;
> > +		pkt_in_segment_len = pkt_in->data_off + pkt_in->data_len;
> > +
> > +		while (!pkt_in_segment_processed) {
> > +			if (unlikely(nb_segs >= nb_pkts_out)) {
> > +				free_gso_segment(pkts_out, nb_segs);
> > +				return -EINVAL;
> > +			}
> > +
> > +			/* allocate direct mbuf */
> > +			hdr_segment = rte_pktmbuf_alloc(direct_pool);
> > +			if (unlikely(hdr_segment == NULL)) {
> > +				free_gso_segment(pkts_out, nb_segs);
> > +				return -ENOMEM;
> > +			}
> > +
> > +			/* allocate indirect mbuf */
> > +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> > +			if (unlikely(pyld_segment == NULL)) {
> > +				rte_pktmbuf_free(hdr_segment);
> > +				free_gso_segment(pkts_out, nb_segs);
> > +				return -ENOMEM;
> > +			}
> 
> So, if I understand correctly each new packet would always contain just one data segment?
> Why several data segments couldn't be chained together (if sum of their data_len <= mss)?
> In a same way as done here:
> http://dpdk.org/browse/dpdk/tree/lib/librte_ip_frag/rte_ipv4_fragmentation.c#n93
> or here:
> https://gerrit.fd.io/r/gitweb?p=tldk.git;a=blob;f=lib/libtle_l4p/tcp_tx_seg.h;h=a8d2425597a7ad6f598aa4bb7fcd7f1da74305f0;hb=HEAD#l23
> ?

Oh, yes. I can chain these data segments when their total length does not exceed
the GSO segsz. I will change it in the next patch. Thanks very much.
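
A minimal sketch of the chaining idea, assuming plain length arrays rather than real mbufs: each output packet collects payload pieces from consecutive input segments until it holds segsz bytes. The names plan_gso_pieces and struct piece are hypothetical and are not part of the patch or of DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One payload piece: a slice of an input segment assigned to an output packet. */
struct piece {
	uint16_t out_pkt; /* index of the output packet */
	uint16_t in_seg;  /* index of the input segment */
	uint16_t off;     /* payload offset within that input segment */
	uint16_t len;     /* number of bytes taken */
};

/*
 * Cut the payload carried by 'nb_in' input segments into output packets of
 * at most 'segsz' payload bytes. A piece never crosses an input-segment
 * boundary, but one output packet may chain several pieces.
 * Returns the number of pieces written, or -1 if 'pieces' is too small.
 * '*nb_out' receives the number of output packets.
 */
static inline int
plan_gso_pieces(const uint16_t *seg_len, size_t nb_in, uint16_t segsz,
		struct piece *pieces, size_t max_pieces, uint16_t *nb_out)
{
	size_t n = 0;
	uint16_t out = 0, room = segsz;

	assert(segsz > 0);
	for (size_t i = 0; i < nb_in; i++) {
		uint16_t off = 0;

		while (off < seg_len[i]) {
			uint16_t left = (uint16_t)(seg_len[i] - off);
			uint16_t take = left < room ? left : room;

			if (n == max_pieces)
				return -1;
			pieces[n++] = (struct piece){ out, (uint16_t)i, off, take };
			off = (uint16_t)(off + take);
			room = (uint16_t)(room - take);
			if (room == 0) { /* output packet is full, start the next */
				out++;
				room = segsz;
			}
		}
	}
	/* count a partially filled last packet */
	*nb_out = (room == segsz) ? out : (uint16_t)(out + 1);
	return (int)n;
}
```

With input payload lengths {1000, 1000, 500} and segsz 1460, the first output packet chains 1000 bytes from input segment 0 with 460 bytes from input segment 1, and the second chains the remaining 540 bytes with the 500 bytes of input segment 2. In the real library each piece would correspond to one indirect mbuf attached to the input segment.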

> 
> > +
> > +			/* copy packet header */
> > +			hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> > +
> > +			/* attach payload mbuf to current packet segment */
> > +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> > +
> > +			hdr_segment->next = pyld_segment;
> > +			pkts_out[nb_segs++] = hdr_segment;
> > +
> > +			/* calculate payload length */
> > +			pkt_out_segment_len = pyld_unit_size;
> > +			if (pkt_in_pyld_off + pkt_out_segment_len >
> > +					pkt_in_segment_len) {
> > +				pkt_out_segment_len = pkt_in_segment_len -
> > +					pkt_in_pyld_off;
> > +			}
> > +
> > +			/* update payload segment */
> > +			pyld_segment->data_off = pkt_in_pyld_off;
> > +			pyld_segment->data_len = pkt_out_segment_len;
> > +
> > +			/* update header segment */
> > +			hdr_segment->pkt_len += pyld_segment->data_len;
> > +			hdr_segment->nb_segs++;
> > +
> > +			/* update pkt_in_pyld_off */
> > +			pkt_in_pyld_off += pkt_out_segment_len;
> > +			if (pkt_in_pyld_off == pkt_in_segment_len)
> > +				pkt_in_segment_processed = true;
> > +		}
> > +
> > +		/* 'pkt_in' may contain numerous segments */
> > +		pkt_in = pkt_in->next;
> > +		if (pkt_in != NULL)
> > +			pkt_in_pyld_off = pkt_in->data_off;
> > +	}
> > +	return nb_segs;
> > +}
> > +
> > +static inline void
> > +parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct rte_mbuf *pkt)
> > +{
> > +	struct tcp_hdr *tcp_hdr;
> > +
> > +	switch (ipv4_hdr->next_proto_id) {
> > +	case IPPROTO_TCP:
> > +		pkt->packet_type |= RTE_PTYPE_L4_TCP;
> > +		pkt->l3_len = IPv4_HDR_LEN(ipv4_hdr);
> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +		pkt->l4_len = TCP_HDR_LEN(tcp_hdr);
> > +		break;
> > +	}
> > +}
> > +
> > +static inline void
> > +parse_ethernet(struct ether_hdr *eth_hdr, struct rte_mbuf *pkt)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct vlan_hdr *vlan_hdr;
> > +	uint16_t ethertype;
> > +
> > +	ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> > +	if (ethertype == ETHER_TYPE_VLAN) {
> > +		vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
> > +		pkt->l2_len = sizeof(struct vlan_hdr);
> > +		pkt->packet_type |= RTE_PTYPE_L2_ETHER_VLAN;
> > +		ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
> > +	}
> > +
> > +	switch (ethertype) {
> > +	case ETHER_TYPE_IPv4:
> > +		if (IS_VLAN_PKT(pkt)) {
> > +			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
> > +		} else {
> > +			pkt->packet_type |= RTE_PTYPE_L2_ETHER;
> > +			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
> > +		}
> > +		pkt->l2_len += sizeof(struct ether_hdr);
> > +		ipv4_hdr = (struct ipv4_hdr *) ((char *)eth_hdr +
> > +				pkt->l2_len);
> > +		parse_ipv4(ipv4_hdr, pkt);
> > +		break;
> > +	}
> > +}
> > +
> > +void
> > +gso_parse_packet(struct rte_mbuf *pkt)
> 
> There is a function rte_net_get_ptype() that supposed to provide similar functionality.
> So we probably don't need to create a new SW parse function here, instead would be better
> to reuse (and update if needed) an existing one.
> Again user already might have l2/l3/l4.../_len and packet_type setuped.
> So better to keep SW packet parsing out of scope of that library. 

Hmm, I know we have discussed this design choice for the GRO library, and I also think it's
better to reuse these values.

But from the perspective of OVS, it may add extra overhead, since OVS doesn't currently parse
every packet. Maybe @Mark can give us more input from the OVS point of view.
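
For reference, a minimal sketch of the work a SW parse has to do to obtain l2_len, l3_len and l4_len for an Ether[/VLAN]/IPv4/TCP packet, assuming a flat byte buffer instead of an mbuf. The names parse_tcp4_lens and struct hdr_lens are hypothetical; the sketch does not use rte_net_get_ptype() or any other DPDK call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct hdr_lens {
	uint16_t l2_len; /* Ethernet header plus optional single VLAN tag */
	uint16_t l3_len; /* IPv4 header including options */
	uint16_t l4_len; /* TCP header including options */
};

static inline uint16_t
rd_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 and fills 'lens' for Ether[/VLAN]/IPv4/TCP, -1 for anything else. */
static inline int
parse_tcp4_lens(const uint8_t *buf, size_t len, struct hdr_lens *lens)
{
	size_t l2 = 14, l3, l4;
	uint16_t ethertype;

	assert(buf != NULL && lens != NULL);
	if (len < l2)
		return -1;
	ethertype = rd_be16(buf + 12);
	if (ethertype == 0x8100) { /* single VLAN tag */
		if (len < l2 + 4)
			return -1;
		ethertype = rd_be16(buf + 16);
		l2 += 4;
	}
	if (ethertype != 0x0800 || len < l2 + 20)
		return -1;
	l3 = (size_t)(buf[l2] & 0x0f) * 4; /* IHL counts 32-bit words */
	if (l3 < 20 || buf[l2 + 9] != 6 || len < l2 + l3 + 20)
		return -1; /* bad IHL, not TCP, or truncated */
	l4 = (size_t)(buf[l2 + l3 + 12] >> 4) * 4; /* TCP data offset */
	if (l4 < 20 || len < l2 + l3 + l4)
		return -1;
	lens->l2_len = (uint16_t)l2;
	lens->l3_len = (uint16_t)l3;
	lens->l4_len = (uint16_t)l4;
	return 0;
}
```

An application that already has these lengths from its own classification would skip this step entirely, which is the argument for keeping parsing out of the library.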

> 
> > +{
> > +	struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
> > +
> > +	pkt->packet_type = pkt->tx_offload = 0;
> > +	parse_ethernet(eth_hdr, pkt);
> > +}
> > +
> > +static inline void
> > +update_ipv4_header(char *base, uint16_t offset, uint16_t length, uint16_t id)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr = (struct ipv4_hdr *)(base + offset);
> > +
> > +	ipv4_hdr->total_length = rte_cpu_to_be_16(length - offset);
> > +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> > +}
> > +
> > +static inline void
> > +update_tcp_header(char *base, uint16_t offset, uint32_t sent_seq,
> > +	uint8_t non_tail)
> > +{
> > +	struct tcp_hdr *tcp_hdr = (struct tcp_hdr *)(base + offset);
> > +
> > +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> > +	/* clean FIN and PSH for non-tail segments */
> > +	if (non_tail)
> > +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK | TCP_HDR_FIN_MASK));
> > +}
> > +
> > +void
> > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> > +		struct rte_mbuf **out_segments)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct tcp_hdr *tcp_hdr;
> > +	struct rte_mbuf *seg;
> > +	uint32_t sent_seq;
> > +	uint16_t offset, i;
> > +	uint16_t tail_seg_idx = nb_segments - 1, id;
> > +
> > +	switch (pkt->packet_type) {
> > +	case ETHER_VLAN_IPv4_TCP_PKT:
> > +	case ETHER_IPv4_TCP_PKT:
> 
> Might be worth to put code below in a separate function:
> update_inner_tcp_hdr(..) or so.
> Then you can reuse it for tunneled cases too.

Yes, I will modify it in the next patch. Thanks.

> 
> > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +				pkt->l2_len);
> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > +
> > +		for (i = 0; i < nb_segments; i++) {
> > +			seg = out_segments[i];
> > +
> > +			offset = seg->l2_len;
> > +			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
> > +					offset, seg->pkt_len, id);
> > +			id++;
> 
> Who would be responsible to make sure that we wouldn't have consecutive packets with the same IPv4 ID?
> Would be the upper layer that forms the packet or gso library or ...?

Oh yes. I overlooked this important issue. I don't think applications can guarantee it.
I will check the design of Linux and try to figure out a way. Thanks for the reminder.
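
To make the issue concrete, here is a minimal sketch of the per-segment field updates performed by the loop quoted above, assuming host-order values and plain structs instead of packet headers. The names fill_seg_fields and struct seg_fields are hypothetical. The sketch only mirrors the current behaviour (ID incremented once per segment); it does not solve the uniqueness question.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_PSH 0x08

/* Host-order view of the header fields that differ between segments. */
struct seg_fields {
	uint16_t ip_total_len; /* IPv4 + TCP headers + payload */
	uint16_t ip_id;
	uint32_t tcp_seq;
	uint8_t tcp_flags;
};

static inline void
fill_seg_fields(struct seg_fields *segs, const uint16_t *pyld_len,
		uint16_t nb_segs, uint16_t l3_l4_hdr_len,
		uint16_t first_id, uint32_t first_seq, uint8_t flags)
{
	uint16_t id = first_id;
	uint32_t seq = first_seq;

	assert(nb_segs > 0);
	for (uint16_t i = 0; i < nb_segs; i++) {
		segs[i].ip_total_len = (uint16_t)(l3_l4_hdr_len + pyld_len[i]);
		segs[i].ip_id = id++;   /* wraps from 65535 to 0 */
		segs[i].tcp_seq = seq;
		seq += pyld_len[i];     /* wraps modulo 2^32 */
		/* only the tail segment keeps FIN and PSH */
		segs[i].tcp_flags = (i < nb_segs - 1) ?
			(uint8_t)(flags & ~(TCP_FLAG_FIN | TCP_FLAG_PSH)) : flags;
	}
}
```

If the next large packet from the upper layer arrives with an ID of first_id + 1, its first segment reuses the ID already given to the second segment here, which is exactly the collision raised in the review.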

> 
> > +
> > +			offset += seg->l3_len;
> > +			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
> > +					offset, sent_seq, i < tail_seg_idx);
> > +			sent_seq += seg->next->data_len;
> > +		}
> > +		break;
> > +	}
> > +}
> > diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> > new file mode 100644
> > index 0000000..d750041
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_common.h
> > @@ -0,0 +1,120 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _GSO_COMMON_H_
> > +#define _GSO_COMMON_H_
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +#define IPV4_HDR_DF_SHIFT 14
> > +#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
> > +#define IPv4_HDR_LEN(iph) ((iph->version_ihl & 0x0f) * 4)
> > +
> > +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> > +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> > +#define TCP_HDR_LEN(tcph) ((tcph->data_off & 0xf0) >> 2)
> > +
> > +#define ETHER_IPv4_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4)
> > +/* Supported packet types */
> > +/* TCP/IPv4 packet. */
> > +#define ETHER_IPv4_TCP_PKT (ETHER_IPv4_PKT | RTE_PTYPE_L4_TCP)
> > +
> > +/* TCP/IPv4 packet with VLAN tag. */
> > +#define ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
> > +		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
> > +
> > +#define IS_VLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_L2_ETHER_VLAN) == \
> > +		RTE_PTYPE_L2_ETHER_VLAN)
> > +
> > +/**
> > + * Internal function which parses a packet, setting outer_l2/l3_len and
> > + * l2/l3/l4_len and packet_type.
> > + *
> > + * @param pkt
> > + *  Packet to parse.
> > + */
> > +void gso_parse_packet(struct rte_mbuf *pkt);
> > +
> > +/**
> > + * Internal function which updates relevant packet headers, following
> > + * segmentation. This is required to update, for example, the IPv4
> > + * 'total_length' field, to reflect the reduced length of the now-
> > + * segmented packet.
> > + *
> > + * @param pkt
> > + *  The original packet.
> > + * @param nb_segments
> > + *  The number of GSO segments into which pkt was split.
> > + * @param out_segments
> > + *  Pointer array used for storing mbuf addresses for GSO segments.
> > + */
> > +void gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> > +		struct rte_mbuf **out_segments);
> > +
> > +/**
> > + * Internal function which divides the input packet into small segments.
> > + * Each of the newly-created segments is organized as a two-segment mbuf,
> > + * where the first segment is a standard mbuf, which stores a copy of
> > + * packet header, and the second is an indirect mbuf which points to a
> > + * section of data in the input packet.
> > + *
> > + * @param pkt
> > + *  Packet to segment.
> > + * @param pkt_hdr_offset
> > + *  Packet header offset, measured in byte.
> > + * @param pyld_unit_size
> > + *  The max payload length of a GSO segment.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array used to keep the mbuf addresses of output segments.
> > + * @param nb_pkts_out
> > + *  The max number of items that pkts_out can keep.
> > + *
> > + * @return
> > + *  - The number of segments created in the event of success.
> > + *  - If no GSO is performed, return 1.
> > + *  - If available memory in mempools is insufficient, return -ENOMEM.
> > + *  - -EINVAL for invalid parameters
> > + */
> > +int gso_do_segment(struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset,
> > +		uint16_t pyld_unit_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +#endif
> > diff --git a/lib/librte_gso/gso_tcp.c b/lib/librte_gso/gso_tcp.c
> > new file mode 100644
> > index 0000000..9d5fc30
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tcp.c
> > @@ -0,0 +1,82 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +
> > +#include <rte_ether.h>
> > +#include <rte_ip.h>
> > +#include <rte_tcp.h>
> > +
> > +#include "gso_common.h"
> > +#include "gso_tcp.h"
> > +
> > +int
> > +gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct ether_hdr *eth_hdr;
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	uint16_t tcp_dl;
> > +	uint16_t pyld_unit_size;
> > +	uint16_t hdr_offset;
> > +	int ret = 1;
> > +
> > +	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
> > +	ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
> > +
> > +	/* don't process fragmented packet */
> > +	if ((ipv4_hdr->fragment_offset &
> > +				rte_cpu_to_be_16(IPV4_HDR_DF_MASK)) == 0)
> > +		return ret;
> > +
> > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) -
> > +		pkt->l3_len - pkt->l4_len;
> > +	/* don't process packet without data */
> > +	if (tcp_dl == 0)
> > +		return ret;
> > +
> > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> > +
> > +	/* segment the payload */
> > +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> > +			indirect_pool, pkts_out, nb_pkts_out);
> > +
> > +	if (ret > 1)
> > +		gso_update_pkt_headers(pkt, ret, pkts_out);
> > +
> > +	return ret;
> > +}
> > diff --git a/lib/librte_gso/gso_tcp.h b/lib/librte_gso/gso_tcp.h
> > new file mode 100644
> > index 0000000..f291ccb
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tcp.h
> > @@ -0,0 +1,73 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _GSO_TCP_H_
> > +#define _GSO_TCP_H_
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +/**
> > + * Segment an IPv4/TCP packet. This function assumes the input packet has
> > + * correct checksums and doesn't update checksums for GSO segment.
> > + * Furthermore, it doesn't process IP fragment packets.
> > + *
> > + * @param pkt
> > + *  The packet mbuf to segment.
> > + * @param gso_size
> > + *  The max length of a GSO segment, measured in bytes.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array, which is used to store mbuf addresses of GSO segments.
> > + *  Caller should guarantee that 'pkts_out' is sufficiently large to store
> > + *  all GSO segments.
> > + * @param nb_pkts_out
> > + *  The max number of items that 'pkts_out' can keep.
> > + *
> > + * @return
> > + *   - The number of GSO segments on success.
> > + *   - Return 1 if no GSO is performed.
> > + *   - Return -ENOMEM if available memory in mempools is insufficient.
> > + *   - Return -EINVAL for invalid parameters.
> > + */
> > +int gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +
> > +#endif
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > index b81afce..fac95f2 100644
> > --- a/lib/librte_gso/rte_gso.c
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -31,17 +31,57 @@
> >   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> >   */
> > 
> > +#include <rte_log.h>
> > +
> >  #include "rte_gso.h"
> > +#include "gso_common.h"
> > +#include "gso_tcp.h"
> > 
> >  int
> >  rte_gso_segment(struct rte_mbuf *pkt,
> >  		struct rte_gso_ctx gso_ctx,
> >  		struct rte_mbuf **pkts_out,
> > -		uint16_t nb_pkts_out __rte_unused)
> > +		uint16_t nb_pkts_out)
> >  {
> > +	struct rte_mempool *direct_pool, *indirect_pool;
> > +	struct rte_mbuf *pkt_seg;
> > +	uint16_t nb_segments, gso_size;
> > +
> >  	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
> >  			NULL || gso_ctx.indirect_pool == NULL)
> >  		return -EINVAL;
> 
> Probably we don't need to check gso_ctx values for each incoming packet.
> If you feel it is necessary - create  new function rte_gso_ctx_check() that
> could be performed just once per ctx.

Agree. I will change it. Thanks.
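
A minimal sketch of what a one-time context check could look like, assuming a stand-in struct whose field names follow the quoted code. The names gso_ctx_check_sketch and struct gso_ctx_sketch are hypothetical, and treating gso_size == 0 as invalid is an assumption rather than something the patch defines.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the GSO context; the pools would be struct rte_mempool *. */
struct gso_ctx_sketch {
	void *direct_pool;
	void *indirect_pool;
	uint16_t gso_size;
};

/*
 * Validate a context once, at setup time, so that the per-packet path
 * does not have to repeat the checks.
 * Returns 0 if the context is usable, -EINVAL otherwise.
 */
static inline int
gso_ctx_check_sketch(const struct gso_ctx_sketch *ctx)
{
	if (ctx == NULL || ctx->direct_pool == NULL ||
			ctx->indirect_pool == NULL)
		return -EINVAL;
	if (ctx->gso_size == 0) /* assumption: zero size is a setup error */
		return -EINVAL;
	return 0;
}
```

The application would call this once after filling the context and could then pass the context to the segmentation function for every packet without further pool checks.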

> 
> > 
> > -	return 1;
> > +	if ((gso_ctx.gso_types & RTE_GSO_TCP_IPV4) == 0 ||
> > +			gso_ctx.gso_size >= pkt->pkt_len ||
> > +			gso_ctx.gso_size == 0)
> 
> 
> First and third condition seems redundant.

The reason for checking gso_ctx.gso_types here is that we don't perform
GSO if the application hasn't set RTE_GSO_TCP_IPV4 in gso_ctx.gso_types,
even if the input packet is TCP/IPv4. And if gso_ctx.gso_size is 0,
we don't need to execute the code that follows. So do we still need to
remove these two conditions?

> 
> > +		return 1;
> 
> 
> I think you forgot here:
> pkts_out[0] = pkt;

But why should we keep the input packet in the output array? Currently, if
GSO is not performed, no packets will be kept in pkts_out[]. Applications
can detect this from the return value of 1 from rte_gso_segment().
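
A minimal sketch of the caller-side consequence of this convention, assuming opaque packet pointers. The names select_tx_burst and struct tx_burst are hypothetical. Because pkts_out[] stays empty when the return value is 1, the caller needs a branch to pick between the original packet and the output array.

```c
#include <assert.h>
#include <stddef.h>

struct tx_burst {
	void **pkts; /* array to hand to the TX function */
	int nb;      /* number of packets in 'pkts' */
};

/*
 * 'ret' is the value returned by the segmentation call:
 *   ret == 1 : no GSO performed, pkts_out is empty, send the original;
 *   ret  > 1 : pkts_out holds 'ret' segments;
 *   ret <= 0 : error (the patch documents -EINVAL and -ENOMEM).
 * 'orig' is a one-element array holding the input packet.
 */
static inline struct tx_burst
select_tx_burst(int ret, void **orig, void **pkts_out)
{
	struct tx_burst burst = { NULL, 0 };

	assert(orig != NULL && pkts_out != NULL);
	if (ret <= 0)
		return burst; /* nothing to send; caller decides what to do */
	if (ret == 1) {
		burst.pkts = orig;
		burst.nb = 1;
		return burst;
	}
	burst.pkts = pkts_out;
	burst.nb = ret;
	return burst;
}
```

If the library stored the input packet in pkts_out[0] when no GSO is performed, as suggested in the review, the ret == 1 branch would disappear and the caller could always transmit pkts_out.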

> 
> 
> > +
> > +	pkt_seg = pkt;
> > +	gso_size = gso_ctx.gso_size;
> > +	direct_pool = gso_ctx.direct_pool;
> > +	indirect_pool = gso_ctx.indirect_pool;
> > +
> > +	/* Parse packet headers to determine how to segment 'pkt' */
> > +	gso_parse_packet(pkt);
> 
> 
> I don't think we need to parse packet here.
>  Instead assume that user already filled packet_type and l2/l3/..._len fields correctly.

Hmm, I see it. Thanks.

Thanks,
Jiayu

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-08-30  1:37 ` [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Ananyev, Konstantin
@ 2017-08-30  7:36   ` Jiayu Hu
  2017-08-30 10:49     ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-08-30  7:36 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

Thanks for your suggestions. Feedback is inline.

Thanks,
Jiayu

On Wed, Aug 30, 2017 at 09:37:42AM +0800, Ananyev, Konstantin wrote:
> 
> Hi Jiayu,
> A few questions/comments from me below and in the next few mails.
> Thanks
> Konstantin
> 
> > 
> > Generic Segmentation Offload (GSO) is a SW technique to split large
> > packets into small ones. Akin to TSO, GSO enables applications to
> > operate on large packets, thus reducing per-packet processing overhead.
> > 
> > To enable more flexibility to applications, DPDK GSO is implemented
> > as a standalone library. Applications explicitly use the GSO library
> > to segment packets. This patch adds GSO support to DPDK for specific
> > packet types: specifically, TCP/IPv4, VxLAN, and GRE.
> > 
> > The first patch introduces the GSO API framework. The second patch
> > adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> > tag). The third patch adds GSO support for VxLAN packets that contain
> > outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
> > outer VLAN tags). The fourth patch adds GSO support for GRE packets
> > that contain outer IPv4, and inner TCP/IPv4 headers (with optional
> > outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
> > and GRE GSO in testpmd's checksum forwarding engine.
> > 
> > The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> > iperf. Setup for the test is described as follows:
> > 
> > a. Connect 2 x 10Gbps physical ports (P0, P1), together physically.
> > b. Launch testpmd with P0 and a vhost-user port, and use csum
> >    forwarding engine.
> > c. Select IP and TCP HW checksum calculation for P0; select TCP HW
> >    checksum calculation for vhost-user port.
> > d. Launch a VM with csum and tso offloading enabled.
> > e. Run iperf-client on virtio-net port in the VM to send TCP packets.
> 
> Not sure I understand the setup correctly:
> So testpmd forwards packets between P0 and vhost-user port, right?

Yes.

> And who uses P1? iperf-server over linux kernel?

P1 is owned by the Linux kernel.

> Also is P1 on another box or not?

P0 and P1 are in the same machine and are connected physically.

> 
> > 
> > With GSO enabled for P0 in testpmd, observed iperf throughput is ~9Gbps.
> 
> Ok, and if GSO is disabled what is the throughput?
> Another stupid question: if P0 is physical 10G (ixgbe?) we can just enable a TSO on it, right?
> If so, what would be the TSO numbers here?

Here is more detailed information about the experiment:

test1: only enable GSO for P0, GSO size is 1518, use two iperf-clients (i.e. "-P 2")
test2: only enable TSO for P0, TSO size is 1518, use two iperf-clients
test3: disable TSO and GSO, use two iperf-clients

test1 throughput: 8.6Gbps
test2 throughput: 9.5Gbps
test3 throughput: 3Mbps

> 
> In fact, could you probably explain a bit more, what supposed to be a main usage model for that library?

The GSO library is just a SW segmentation method, which can be used by applications such as OVS.
Currently, most NICs can segment TCP and UDP packets in HW, but not all of them can. So OVS
currently doesn't enable TSO, because it lacks a SW segmentation fallback. Besides, the protocol
types supported by HW segmentation are limited. So it's necessary to provide a SW segmentation solution.

With the GSO library, OVS and other applications are able to receive large packets from VMs and
process these large packets, instead of standard-sized ones (i.e. 1518B). So the per-packet overhead is
reduced, since far fewer packets need to be processed.
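
To put a rough number on that reduction, here is a minimal sketch of the segment-count arithmetic, following the payload-unit formula in patch 2/5 (gso_size minus the header length minus the Ethernet CRC). The name gso_nb_segments is hypothetical, and the 54-byte header length assumes Ethernet + IPv4 + TCP without a VLAN tag or options.

```c
#include <assert.h>
#include <stdint.h>

#define CRC_LEN 4 /* Ethernet CRC, counted inside gso_size by the patch */

/*
 * Number of output segments for 'payload_len' bytes of TCP payload when
 * each segment may be at most 'gso_size' bytes including 'hdr_len' bytes
 * of L2/L3/L4 headers and the CRC.
 */
static inline uint32_t
gso_nb_segments(uint32_t payload_len, uint16_t gso_size, uint16_t hdr_len)
{
	uint32_t unit;

	assert(gso_size > hdr_len + CRC_LEN);
	unit = (uint32_t)gso_size - hdr_len - CRC_LEN;
	return (payload_len + unit - 1) / unit; /* round up */
}
```

With gso_size 1518 and 54 bytes of headers the payload unit is 1460 bytes, so a packet carrying 64000 bytes of payload is processed once by the application and only split into 44 segments at transmit time.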

> Is that to perform segmentation on (virtual) devices that doesn't support HW TSO or ...?

When QEMU is launched with TSO or GSO enabled, the virtual device doesn't actually do segmentation.
It sends large packets directly. Therefore, testpmd can receive large packets from the VM and
then perform GSO. The GSO/TSO behavior of virtual devices is different from that of physical NICs.

> Again would it be for a termination point (packets were just formed and filled) by the caller,
> or is that for box in the middle which just forwards packets between nodes?
> If the later one, then we'll probably already have most of our packets segmented properly, no?
>   
> > The experimental data of VxLAN and GRE will be shown later.
> > 
> > Jiayu Hu (3):
> >   lib: add Generic Segmentation Offload API framework
> >   gso/lib: add TCP/IPv4 GSO support
> >   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> > 
> > Mark Kavanagh (2):
> >   lib/gso: add VxLAN GSO support
> >   lib/gso: add GRE GSO support
> > 
> >  app/test-pmd/cmdline.c                  | 121 +++++++++
> >  app/test-pmd/config.c                   |  25 ++
> >  app/test-pmd/csumonly.c                 |  68 ++++-
> >  app/test-pmd/testpmd.c                  |   9 +
> >  app/test-pmd/testpmd.h                  |  10 +
> >  config/common_base                      |   5 +
> >  lib/Makefile                            |   2 +
> >  lib/librte_eal/common/include/rte_log.h |   1 +
> >  lib/librte_gso/Makefile                 |  52 ++++
> >  lib/librte_gso/gso_common.c             | 431 ++++++++++++++++++++++++++++++++
> >  lib/librte_gso/gso_common.h             | 180 +++++++++++++
> >  lib/librte_gso/gso_tcp.c                |  82 ++++++
> >  lib/librte_gso/gso_tcp.h                |  73 ++++++
> >  lib/librte_gso/gso_tunnel.c             |  62 +++++
> >  lib/librte_gso/gso_tunnel.h             |  46 ++++
> >  lib/librte_gso/rte_gso.c                | 100 ++++++++
> >  lib/librte_gso/rte_gso.h                | 122 +++++++++
> >  lib/librte_gso/rte_gso_version.map      |   7 +
> >  mk/rte.app.mk                           |   1 +
> >  19 files changed, 1392 insertions(+), 5 deletions(-)
> >  create mode 100644 lib/librte_gso/Makefile
> >  create mode 100644 lib/librte_gso/gso_common.c
> >  create mode 100644 lib/librte_gso/gso_common.h
> >  create mode 100644 lib/librte_gso/gso_tcp.c
> >  create mode 100644 lib/librte_gso/gso_tcp.h
> >  create mode 100644 lib/librte_gso/gso_tunnel.c
> >  create mode 100644 lib/librte_gso/gso_tunnel.h
> >  create mode 100644 lib/librte_gso/rte_gso.c
> >  create mode 100644 lib/librte_gso/rte_gso.h
> >  create mode 100644 lib/librte_gso/rte_gso_version.map
> > 
> > --
> > 2.7.4

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH 1/5] lib: add Generic Segmentation Offload API framework
  2017-08-30  1:38   ` Ananyev, Konstantin
@ 2017-08-30  7:57     ` Jiayu Hu
  0 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-08-30  7:57 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

On Wed, Aug 30, 2017 at 09:38:02AM +0800, Ananyev, Konstantin wrote:
> Hi Jiayu,
> 
> > 
> > Generic Segmentation Offload (GSO) is a SW technique to split large
> > packets into small ones. Akin to TSO, GSO enables applications to
> > operate on large packets, thus reducing per-packet processing overhead.
> > 
> > To enable more flexibility to applications, DPDK GSO is implemented
> > as a standalone library. Applications explicitly use the GSO library
> > to segment packets. This patch introduces the GSO API framework to DPDK.
> > 
> > The GSO library provides a segmentation API, rte_gso_segment(), for
> > applications. It splits an input packet into small ones in each
> > invocation. The GSO library refers to these small packets generated
> > by rte_gso_segment() as GSO segments. When all GSO segments are freed,
> > the input packet is freed automatically.
> > 
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> >  config/common_base                 |   5 ++
> >  lib/Makefile                       |   2 +
> >  lib/librte_gso/Makefile            |  49 ++++++++++++++++
> >  lib/librte_gso/rte_gso.c           |  47 ++++++++++++++++
> >  lib/librte_gso/rte_gso.h           | 111 +++++++++++++++++++++++++++++++++++++
> >  lib/librte_gso/rte_gso_version.map |   7 +++
> >  mk/rte.app.mk                      |   1 +
> >  7 files changed, 222 insertions(+)
> >  create mode 100644 lib/librte_gso/Makefile
> >  create mode 100644 lib/librte_gso/rte_gso.c
> >  create mode 100644 lib/librte_gso/rte_gso.h
> >  create mode 100644 lib/librte_gso/rte_gso_version.map
> > 
> > diff --git a/config/common_base b/config/common_base
> > index 5e97a08..603e340 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -652,6 +652,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
> >  CONFIG_RTE_LIBRTE_GRO=y
> > 
> >  #
> > +# Compile GSO library
> > +#
> > +CONFIG_RTE_LIBRTE_GSO=y
> > +
> > +#
> >  # Compile librte_meter
> >  #
> >  CONFIG_RTE_LIBRTE_METER=y
> > diff --git a/lib/Makefile b/lib/Makefile
> > index 86caba1..3d123f4 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
> >  DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
> >  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
> >  DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> > +DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
> > +DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
> > 
> >  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> >  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> > new file mode 100644
> > index 0000000..aeaacbc
> > --- /dev/null
> > +++ b/lib/librte_gso/Makefile
> > @@ -0,0 +1,49 @@
> > +#   BSD LICENSE
> > +#
> > +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > +#   All rights reserved.
> > +#
> > +#   Redistribution and use in source and binary forms, with or without
> > +#   modification, are permitted provided that the following conditions
> > +#   are met:
> > +#
> > +#     * Redistributions of source code must retain the above copyright
> > +#       notice, this list of conditions and the following disclaimer.
> > +#     * Redistributions in binary form must reproduce the above copyright
> > +#       notice, this list of conditions and the following disclaimer in
> > +#       the documentation and/or other materials provided with the
> > +#       distribution.
> > +#     * Neither the name of Intel Corporation nor the names of its
> > +#       contributors may be used to endorse or promote products derived
> > +#       from this software without specific prior written permission.
> > +#
> > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# library name
> > +LIB = librte_gso.a
> > +
> > +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
> > +
> > +EXPORT_MAP := rte_gso_version.map
> > +
> > +LIBABIVER := 1
> > +
> > +#source files
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> > +
> > +# install this header file
> > +SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > new file mode 100644
> > index 0000000..b81afce
> > --- /dev/null
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -0,0 +1,47 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include "rte_gso.h"
> > +
> > +int
> > +rte_gso_segment(struct rte_mbuf *pkt,
> > +		struct rte_gso_ctx gso_ctx,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out __rte_unused)
> > +{
> > +	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
> > +			NULL || gso_ctx.indirect_pool == NULL)
> > +		return -EINVAL;
> > +
> > +	return 1;
> > +}
> > diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> > new file mode 100644
> > index 0000000..5a8389a
> > --- /dev/null
> > +++ b/lib/librte_gso/rte_gso.h
> > @@ -0,0 +1,111 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _RTE_GSO_H_
> > +#define _RTE_GSO_H_
> > +
> > +/**
> > + * @file
> > + * Interface to GSO library
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +/**
> > + * GSO context structure.
> > + */
> > +struct rte_gso_ctx {
> > +	struct rte_mempool *direct_pool;
> > +	/**< MBUF pool for allocating direct buffers, which are used
> > +	 * to store packet headers for GSO segments.
> > +	 */
> > +	struct rte_mempool *indirect_pool;
> > +	/**< MBUF pool for allocating indirect buffers, which are used
> > +	 * to locate packet payloads for GSO segments. The indirect
> > +	 * buffer doesn't contain any data, but simply points to an
> > +	 * offset within the packet to segment.
> > +	 */
> > +	uint64_t gso_types;
> > +	/**< GSO types to perform */
> 
> Looking at the way it is used right now - there seems not much value in it...
> Why not make it a mask of ptypes for which GSO should be performed?
> Let's say for a gso_ctx that supports only ip4/tcp it would be:
> gso_types = (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
> and then in rte_gso_segment() we can perform gso only on packets of requested ptype:
> 
> if ((pkt->packet_type & gso_ctx->gso_types) == pkt->packet_type) {
>    /* do segmentation */
> } else {
>   /* skip segmentation for that packet */
> }

Yes, you are right. It's unnecessary to define GSO type macros; we
can reuse the ptype values. I will change it in the next version.
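
A standalone sketch of the mask test suggested above, for reference. The PT_*
constants are stand-ins defined here so the example compiles on its own; real
code would use the RTE_PTYPE_* values. As far as I know, the real ptype fields
are small enumerations per layer rather than one bit per protocol, so the
subset test should be double-checked against the header before relying on it.
The check for a zero packet_type is an extra guard that is not part of the
snippet quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in packet-type values, for illustration only. */
#define PT_L2_ETHER	0x001u
#define PT_L3_IPV4	0x010u
#define PT_L4_TCP	0x100u
#define PT_L4_UDP	0x200u

/* True when every type bit set in packet_type is also enabled in gso_types. */
static bool
gso_type_enabled(uint32_t packet_type, uint32_t gso_types)
{
	/* assumption: an unknown (zero) packet type is never segmented;
	 * without this guard it would pass the subset test trivially */
	if (packet_type == 0)
		return false;
	return (packet_type & gso_types) == packet_type;
}
```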

> 
> > +	uint16_t gso_size;
> > +	/**< maximum size of a GSO segment, measured in bytes */
> 
> Is that MSS or MTU?

MSS. It's the max length of a complete packet, including packet headers.

> 
> > +};
> > +
> > +/**
> > + * Segmentation function, which supports processing of both single- and
> > + * multi- segment packets. rte_gso_segment() assumes the input packet
> > + * has correct checksums, and it doesn't process IP fragment packets.
> > + * Additionally, it assumes that 'pkts_out' is large enough to hold all GSO
> > + * segments.
> > + *
> > + * We refer to the packets that are segmented from the input packet as 'GSO
> > + * segments'. If the input packet is GSOed, its mbuf refcnt reduces by 1.
> > + * Therefore, when all GSO segments are freed, the input packet is freed
> > + * automatically. If the input packet doesn't match the criteria for GSO
> > + * (e.g. 'pkt's length is small and doesn't need segmentation), the packet
> > + * is skipped and this function returns 1. If the available memory space
> > + * in MBUF pools is insufficient, the packet is skipped and return -ENOMEM.
> > + *
> > + * @param pkt
> > + *  The packet mbuf to segment.
> > + * @param ctx
> > + *  GSO context object.
> > + * @param pkts_out
> > + *  Pointer array used to stores the mbuf addresses of GSO segments.
> > + *  Applications must ensure pkts_out is large enough to hold all GSO
> > + *  segments. If the memory space in pkts_out is insufficient, the input
> > + *  packet is skipped and return -EINVAL.
> > + * @param nb_pkts_out
> > + *  The max number of items that pkts_out can keep.
> > + *
> > + * @return
> > + *  - The number of GSO segments created on success.
> > + *  - Return 1 if no GSO is performed.
> 
> Wouldn't it be better to return number of elems filled in pkts_out[] on success?

Agree. I will change it.
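
One possible reading of that return convention, as a toy model without mbufs:
the function below writes the total length of each output packet into
out_lens[] and returns how many entries it filled. The name toy_gso_segment
and the choice to report an unsegmented packet as a single entry are
assumptions for illustration; the -EINVAL for a too-small output array follows
the pkts_out description in the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the segmentation call: returns the number of entries
 * filled in out_lens[], or -EINVAL for bad arguments or a too-small array. */
static int
toy_gso_segment(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size,
		uint32_t *out_lens, uint16_t nb_out)
{
	uint32_t payload, unit, n, i;

	if (out_lens == NULL || nb_out == 0 ||
			gso_size <= hdr_len || pkt_len < hdr_len)
		return -EINVAL;

	if (pkt_len <= gso_size) {
		/* no segmentation needed: one entry, the packet itself */
		out_lens[0] = pkt_len;
		return 1;
	}

	payload = pkt_len - hdr_len;
	unit = gso_size - hdr_len;	/* payload bytes per output packet */
	n = (payload + unit - 1) / unit;
	if (n > nb_out)
		return -EINVAL;		/* output array too small */

	for (i = 0; i < n; i++) {
		uint32_t chunk = payload < unit ? payload : unit;

		out_lens[i] = hdr_len + chunk;	/* headers are replicated */
		payload -= chunk;
	}
	return (int)n;
}
```

A caller can then loop over exactly the returned count, whether or not the
packet was actually split.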

> 
> > + *  - Return -ENOMEM if run out of memory in MBUF pools.
> > + *  - Return -EINVAL for invalid parameters.
> > + */
> > +int rte_gso_segment(struct rte_mbuf *pkt,
> > +		struct rte_gso_ctx ctx,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_GSO_H_ */
> > diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
> > new file mode 100644
> > index 0000000..e1fd453
> > --- /dev/null
> > +++ b/lib/librte_gso/rte_gso_version.map
> > @@ -0,0 +1,7 @@
> > +DPDK_17.11 {
> > +	global:
> > +
> > +	rte_gso_segment;
> > +
> > +	local: *;
> > +};
> > diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> > index c25fdd9..d4c9873 100644
> > --- a/mk/rte.app.mk
> > +++ b/mk/rte.app.mk
> > @@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
> > +_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
> > --
> > 2.7.4

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-30  1:38   ` Ananyev, Konstantin
  2017-08-30  2:55     ` Jiayu Hu
@ 2017-08-30  9:03     ` Jiayu Hu
  2017-09-04  3:31     ` Jiayu Hu
  2 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-08-30  9:03 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

On Wed, Aug 30, 2017 at 09:38:33AM +0800, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Thursday, August 24, 2017 3:16 PM
> > To: dev@dpdk.org
> > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, Jianfeng
> > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> > 
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > packets have correct checksums, and doesn't update checksums for output
> > packets (the responsibility for this lies with the application).
> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> > 
> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> > MBUF, to organize an output packet. Note that we refer to these two
> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of the GSO library requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> > 
> > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSOed segments are freed, the packet is freed
> > automatically.
> > 
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> >  int
> >  rte_gso_segment(struct rte_mbuf *pkt,
> >  		struct rte_gso_ctx gso_ctx,
> >  		struct rte_mbuf **pkts_out,
> > -		uint16_t nb_pkts_out __rte_unused)
> > +		uint16_t nb_pkts_out)
> >  {
> > +	struct rte_mempool *direct_pool, *indirect_pool;
> > +	struct rte_mbuf *pkt_seg;
> > +	uint16_t nb_segments, gso_size;
> > +
> >  	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
> >  			NULL || gso_ctx.indirect_pool == NULL)
> >  		return -EINVAL;
> 
> Probably we don't need to check gso_ctx values for each incoming packet.
> If you feel it is necessary - create  new function rte_gso_ctx_check() that
> could be performed just once per ctx.
> 
> > 
> > -	return 1;
> > +	if ((gso_ctx.gso_types & RTE_GSO_TCP_IPV4) == 0 ||
> > +			gso_ctx.gso_size >= pkt->pkt_len ||
> > +			gso_ctx.gso_size == 0)
> 
> 
> First and third condition seems redundant.

Yes, we don't need the first and the third check here. Please ignore the redundant
reply in the previous mail.

Thanks,
Jiayu

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-30  2:55     ` Jiayu Hu
@ 2017-08-30  9:25       ` Kavanagh, Mark B
  2017-08-30  9:39         ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-08-30  9:25 UTC (permalink / raw)
  To: Hu, Jiayu, Ananyev, Konstantin; +Cc: dev, Tan, Jianfeng

>From: Hu, Jiayu
>Sent: Wednesday, August 30, 2017 3:56 AM
>To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
>Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
><jianfeng.tan@intel.com>
>Subject: Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
>
>Hi Konstantin,
>
>Thanks for your important suggestions. My feedback is inline.
>
>On Wed, Aug 30, 2017 at 09:38:33AM +0800, Ananyev, Konstantin wrote:
>>
>>
>> > -----Original Message-----
>> > From: Hu, Jiayu
>> > Sent: Thursday, August 24, 2017 3:16 PM
>> > To: dev@dpdk.org
>> > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, Jianfeng
>> > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
>> > Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
>> >
>> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
>> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
>> > packets have correct checksums, and doesn't update checksums for output
>> > packets (the responsibility for this lies with the application).
>> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
>> >
>> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
>> > MBUF, to organize an output packet. Note that we refer to these two
>> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
>> > header, while the indirect mbuf simply points to a location within the
>> > original packet's payload. Consequently, use of the GSO library requires
>> > multi-segment MBUF support in the TX functions of the NIC driver.
>> >
>> > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
>> > result, when all of its GSOed segments are freed, the packet is freed
>> > automatically.
>> >
>> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> > ---
>> >  lib/librte_eal/common/include/rte_log.h |   1 +
>> >  lib/librte_gso/Makefile                 |   2 +
>> >  lib/librte_gso/gso_common.c             | 270 ++++++++++++++++++++++++++++++++
>> >  lib/librte_gso/gso_common.h             | 120 ++++++++++++++
>> >  lib/librte_gso/gso_tcp.c                |  82 ++++++++++
>> >  lib/librte_gso/gso_tcp.h                |  73 +++++++++
>> >  lib/librte_gso/rte_gso.c                |  44 +++++-
>> >  lib/librte_gso/rte_gso.h                |   3 +
>> >  8 files changed, 593 insertions(+), 2 deletions(-)
>> >  create mode 100644 lib/librte_gso/gso_common.c
>> >  create mode 100644 lib/librte_gso/gso_common.h
>> >  create mode 100644 lib/librte_gso/gso_tcp.c
>> >  create mode 100644 lib/librte_gso/gso_tcp.h
>> >
>> > diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
>> > index ec8dba7..2fa1199 100644
>> > --- a/lib/librte_eal/common/include/rte_log.h
>> > +++ b/lib/librte_eal/common/include/rte_log.h
>> > @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
>> >  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>> >  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>> >  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
>> > +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
>> >
>> >  /* these log types can be used in an application */
>> >  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
>> > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
>> > index aeaacbc..0f8e38f 100644
>> > --- a/lib/librte_gso/Makefile
>> > +++ b/lib/librte_gso/Makefile
>> > @@ -42,6 +42,8 @@ LIBABIVER := 1
>> >
>> >  #source files
>> >  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp.c
>> >
>> >  # install this header file
>> >  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
>> > diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
>> > new file mode 100644
>> > index 0000000..2b54fbd
>> > --- /dev/null
>> > +++ b/lib/librte_gso/gso_common.c
>> > @@ -0,0 +1,270 @@
>> > +/*-
>> > + *   BSD LICENSE
>> > + *
>> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> > + *   All rights reserved.
>> > + *
>> > + *   Redistribution and use in source and binary forms, with or without
>> > + *   modification, are permitted provided that the following conditions
>> > + *   are met:
>> > + *
>> > + *     * Redistributions of source code must retain the above copyright
>> > + *       notice, this list of conditions and the following disclaimer.
>> > + *     * Redistributions in binary form must reproduce the above copyright
>> > + *       notice, this list of conditions and the following disclaimer in
>> > + *       the documentation and/or other materials provided with the
>> > + *       distribution.
>> > + *     * Neither the name of Intel Corporation nor the names of its
>> > + *       contributors may be used to endorse or promote products derived
>> > + *       from this software without specific prior written permission.
>> > + *
>> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> > + */
>> > +
>> > +#include <stdbool.h>
>> > +#include <string.h>
>> > +
>> > +#include <rte_malloc.h>
>> > +
>> > +#include <rte_ether.h>
>> > +#include <rte_ip.h>
>> > +#include <rte_tcp.h>
>> > +
>> > +#include "gso_common.h"
>> > +
>> > +static inline void
>> > +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
>> > +		uint16_t pkt_hdr_offset)
>> > +{
>> > +	/* copy mbuf metadata */
>> > +	hdr_segment->nb_segs = 1;
>> > +	hdr_segment->port = pkt->port;
>> > +	hdr_segment->ol_flags = pkt->ol_flags;
>> > +	hdr_segment->packet_type = pkt->packet_type;
>> > +	hdr_segment->pkt_len = pkt_hdr_offset;
>> > +	hdr_segment->data_len = pkt_hdr_offset;
>> > +	hdr_segment->tx_offload = pkt->tx_offload;
>> > +	/* copy packet header */
>> > +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
>> > +			rte_pktmbuf_mtod(pkt, char *),
>> > +			pkt_hdr_offset);
>> > +}
>> > +
>> > +static inline void
>> > +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
>> > +{
>> > +	uint16_t i;
>> > +
>> > +	for (i = 0; i < nb_pkts; i++) {
>> > +		rte_pktmbuf_detach(pkts[i]->next);
>>
>> I don't think you need to call detach() here explicitly.
>> Just rte_pktmbuf_free(pkts[i]) should do I think.
>
>Yes, rte_pktmbuf_free() is enough. I will modify it. Thanks.
>
>>
>> > +		rte_pktmbuf_free(pkts[i]);
>> > +		pkts[i] = NULL;
>> > +	}
>> > +}
>> > +
>> > +int
>> > +gso_do_segment(struct rte_mbuf *pkt,
>> > +		uint16_t pkt_hdr_offset,
>> > +		uint16_t pyld_unit_size,
>> > +		struct rte_mempool *direct_pool,
>> > +		struct rte_mempool *indirect_pool,
>> > +		struct rte_mbuf **pkts_out,
>> > +		uint16_t nb_pkts_out)
>> > +{
>> > +	struct rte_mbuf *pkt_in;
>> > +	struct rte_mbuf *hdr_segment, *pyld_segment;
>> > +	uint32_t pkt_in_pyld_off;
>> > +	uint16_t pkt_in_segment_len, pkt_out_segment_len;
>> > +	uint16_t nb_segs;
>> > +	bool pkt_in_segment_processed;
>> > +
>> > +	pkt_in_pyld_off = pkt->data_off + pkt_hdr_offset;
>> > +	pkt_in = pkt;
>> > +	nb_segs = 0;
>> > +
>> > +	while (pkt_in) {
>> > +		pkt_in_segment_processed = false;
>> > +		pkt_in_segment_len = pkt_in->data_off + pkt_in->data_len;
>> > +
>> > +		while (!pkt_in_segment_processed) {
>> > +			if (unlikely(nb_segs >= nb_pkts_out)) {
>> > +				free_gso_segment(pkts_out, nb_segs);
>> > +				return -EINVAL;
>> > +			}
>> > +
>> > +			/* allocate direct mbuf */
>> > +			hdr_segment = rte_pktmbuf_alloc(direct_pool);
>> > +			if (unlikely(hdr_segment == NULL)) {
>> > +				free_gso_segment(pkts_out, nb_segs);
>> > +				return -ENOMEM;
>> > +			}
>> > +
>> > +			/* allocate indirect mbuf */
>> > +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
>> > +			if (unlikely(pyld_segment == NULL)) {
>> > +				rte_pktmbuf_free(hdr_segment);
>> > +				free_gso_segment(pkts_out, nb_segs);
>> > +				return -ENOMEM;
>> > +			}
>>
>> So, if I understand correctly each new packet would always contain just one data segment?
>> Why several data segments couldn't be chained together (if sum of their data_len <= mss)?
>> In a same way as done here:
>> http://dpdk.org/browse/dpdk/tree/lib/librte_ip_frag/rte_ipv4_fragmentation.c#n93
>> or here:
>> https://gerrit.fd.io/r/gitweb?p=tldk.git;a=blob;f=lib/libtle_l4p/tcp_tx_seg.h;h=a8d2425597a7ad6f598aa4bb7fcd7f1da74305f0;hb=HEAD#l23
>> ?
>
>Oh, yes. I can chain these data segments when their total length is less than the GSO segsz.
>I will change it in the next patch. Thanks very much.
>
>>
>> > +
>> > +			/* copy packet header */
>> > +			hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
>> > +
>> > +			/* attach payload mbuf to current packet segment */
>> > +			rte_pktmbuf_attach(pyld_segment, pkt_in);
>> > +
>> > +			hdr_segment->next = pyld_segment;
>> > +			pkts_out[nb_segs++] = hdr_segment;
>> > +
>> > +			/* calculate payload length */
>> > +			pkt_out_segment_len = pyld_unit_size;
>> > +			if (pkt_in_pyld_off + pkt_out_segment_len >
>> > +					pkt_in_segment_len) {
>> > +				pkt_out_segment_len = pkt_in_segment_len -
>> > +					pkt_in_pyld_off;
>> > +			}
>> > +
>> > +			/* update payload segment */
>> > +			pyld_segment->data_off = pkt_in_pyld_off;
>> > +			pyld_segment->data_len = pkt_out_segment_len;
>> > +
>> > +			/* update header segment */
>> > +			hdr_segment->pkt_len += pyld_segment->data_len;
>> > +			hdr_segment->nb_segs++;
>> > +
>> > +			/* update pkt_in_pyld_off */
>> > +			pkt_in_pyld_off += pkt_out_segment_len;
>> > +			if (pkt_in_pyld_off == pkt_in_segment_len)
>> > +				pkt_in_segment_processed = true;
>> > +		}
>> > +
>> > +		/* 'pkt_in' may contain numerous segments */
>> > +		pkt_in = pkt_in->next;
>> > +		if (pkt_in != NULL)
>> > +			pkt_in_pyld_off = pkt_in->data_off;
>> > +	}
>> > +	return nb_segs;
>> > +}
>> > +
>> > +static inline void
>> > +parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct rte_mbuf *pkt)
>> > +{
>> > +	struct tcp_hdr *tcp_hdr;
>> > +
>> > +	switch (ipv4_hdr->next_proto_id) {
>> > +	case IPPROTO_TCP:
>> > +		pkt->packet_type |= RTE_PTYPE_L4_TCP;
>> > +		pkt->l3_len = IPv4_HDR_LEN(ipv4_hdr);
>> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
>> > +		pkt->l4_len = TCP_HDR_LEN(tcp_hdr);
>> > +		break;
>> > +	}
>> > +}
>> > +
>> > +static inline void
>> > +parse_ethernet(struct ether_hdr *eth_hdr, struct rte_mbuf *pkt)
>> > +{
>> > +	struct ipv4_hdr *ipv4_hdr;
>> > +	struct vlan_hdr *vlan_hdr;
>> > +	uint16_t ethertype;
>> > +
>> > +	ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
>> > +	if (ethertype == ETHER_TYPE_VLAN) {
>> > +		vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
>> > +		pkt->l2_len = sizeof(struct vlan_hdr);
>> > +		pkt->packet_type |= RTE_PTYPE_L2_ETHER_VLAN;
>> > +		ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
>> > +	}
>> > +
>> > +	switch (ethertype) {
>> > +	case ETHER_TYPE_IPv4:
>> > +		if (IS_VLAN_PKT(pkt)) {
>> > +			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
>> > +		} else {
>> > +			pkt->packet_type |= RTE_PTYPE_L2_ETHER;
>> > +			pkt->packet_type |= RTE_PTYPE_L3_IPV4;
>> > +		}
>> > +		pkt->l2_len += sizeof(struct ether_hdr);
>> > +		ipv4_hdr = (struct ipv4_hdr *) ((char *)eth_hdr +
>> > +				pkt->l2_len);
>> > +		parse_ipv4(ipv4_hdr, pkt);
>> > +		break;
>> > +	}
>> > +}
>> > +
>> > +void
>> > +gso_parse_packet(struct rte_mbuf *pkt)
>>
>> There is a function rte_net_get_ptype() that is supposed to provide similar
>> functionality. So we probably don't need to create a new SW parse function
>> here; instead it would be better to reuse (and update if needed) an existing
>> one. Again, the user might already have l2/l3/l4.../_len and packet_type set
>> up. So better to keep SW packet parsing out of scope of that library.
>
>Hmm, I know we have discussed this design choice in the GRO library, and I
>also think it's better to reuse these values.
>
>But from the perspective of OVS, it may add extra overhead, since OVS doesn't
>originally parse every packet. Maybe @Mark can give us more input from the
>OVS point of view.

Hi Jiayu, Konstantin

For GSO, the application needs to know:
- the packet type (as it only currently supports TCP/IPv4, VxLAN, GRE packets)
- the l2/3/4_lens, etc. (in order to replicate the original packet's headers across outgoing segments)

For this, we can use the rte_net_get_ptype function, as per Konstantin's suggestion, as it provides both - thanks Konstantin!

WRT the extra overhead in OvS: TSO is the de facto standard, and GSO is provided purely as a fallback option. As such, and since the additional packet parsing is a necessity in order to facilitate GSO, the additional overhead is IMO acceptable.

Thanks,
Mark
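
For illustration, the two pieces of information listed above (the packet type and the l2/l3/l4 lengths) can be derived from raw frame bytes roughly as in the standalone C sketch below. This is not rte_net_get_ptype() and it does not use DPDK at all; struct hdr_lens and parse_tcp4_lens() are hypothetical names, and the sketch assumes an untagged or single-VLAN-tagged Ethernet/IPv4/TCP frame.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_IPV4 0x0800
#define ETHERTYPE_VLAN 0x8100
#define PROTO_TCP      6

/* Hypothetical result type; the field names mirror the mbuf length
 * fields discussed in the thread, but this is not a DPDK structure. */
struct hdr_lens {
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t l4_len;
};

/* Read a 16-bit big-endian (network order) value. */
uint16_t
read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Fill 'out' for an Ethernet (optionally VLAN-tagged) IPv4/TCP frame.
 * Returns 0 on success, -1 if the frame is truncated or of another type.
 */
int
parse_tcp4_lens(const uint8_t *buf, size_t len, struct hdr_lens *out)
{
	size_t l2 = 14, l3, l4;
	uint16_t ethertype;

	if (len < l2)
		return -1;
	ethertype = read_be16(buf + 12);
	if (ethertype == ETHERTYPE_VLAN) {
		/* a 4-byte VLAN tag follows the MAC addresses */
		l2 += 4;
		if (len < l2)
			return -1;
		ethertype = read_be16(buf + 16);
	}
	if (ethertype != ETHERTYPE_IPV4 || len < l2 + 20)
		return -1;

	/* IHL: low nibble of the first IPv4 byte, in 32-bit words */
	l3 = (size_t)(buf[l2] & 0x0f) * 4;
	if (l3 < 20 || buf[l2 + 9] != PROTO_TCP || len < l2 + l3 + 20)
		return -1;

	/* TCP data offset: high nibble of byte 12, in 32-bit words */
	l4 = (size_t)(buf[l2 + l3 + 12] >> 4) * 4;
	if (l4 < 20 || len < l2 + l3 + l4)
		return -1;

	out->l2_len = (uint16_t)l2;
	out->l3_len = (uint16_t)l3;
	out->l4_len = (uint16_t)l4;
	return 0;
}
```

For a plain 54-byte Ethernet/IPv4/TCP header with no options, this yields l2_len = 14, l3_len = 20 and l4_len = 20; a VLAN tag raises l2_len to 18.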

>
>>
>> > +{
>> > +	struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
>> > +
>> > +	pkt->packet_type = pkt->tx_offload = 0;
>> > +	parse_ethernet(eth_hdr, pkt);
>> > +}
>> > +
>> > +static inline void
>> > +update_ipv4_header(char *base, uint16_t offset, uint16_t length,
>> > +	uint16_t id)
>> > +{
>> > +	struct ipv4_hdr *ipv4_hdr = (struct ipv4_hdr *)(base + offset);
>> > +
>> > +	ipv4_hdr->total_length = rte_cpu_to_be_16(length - offset);
>> > +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
>> > +}
>> > +
>> > +static inline void
>> > +update_tcp_header(char *base, uint16_t offset, uint32_t sent_seq,
>> > +	uint8_t non_tail)
>> > +{
>> > +	struct tcp_hdr *tcp_hdr = (struct tcp_hdr *)(base + offset);
>> > +
>> > +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
>> > +	/* clean FIN and PSH for non-tail segments */
>> > +	if (non_tail)
>> > +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK | TCP_HDR_FIN_MASK));
>> > +}
>> > +
>> > +void
>> > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
>> > +		struct rte_mbuf **out_segments)
>> > +{
>> > +	struct ipv4_hdr *ipv4_hdr;
>> > +	struct tcp_hdr *tcp_hdr;
>> > +	struct rte_mbuf *seg;
>> > +	uint32_t sent_seq;
>> > +	uint16_t offset, i;
>> > +	uint16_t tail_seg_idx = nb_segments - 1, id;
>> > +
>> > +	switch (pkt->packet_type) {
>> > +	case ETHER_VLAN_IPv4_TCP_PKT:
>> > +	case ETHER_IPv4_TCP_PKT:
>>
>> Might be worth to put code below in a separate function:
>> update_inner_tcp_hdr(..) or so.
>> Then you can reuse it for tunneled cases too.
>
>Yes, I will modify it in the next patch. Thanks.
>
>>
>> > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> > +				pkt->l2_len);
>> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
>> > +		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
>> > +		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
>> > +
>> > +		for (i = 0; i < nb_segments; i++) {
>> > +			seg = out_segments[i];
>> > +
>> > +			offset = seg->l2_len;
>> > +			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
>> > +					offset, seg->pkt_len, id);
>> > +			id++;
>>
>> Who would be responsible for making sure that we don't have consecutive
>> packets with the same IPv4 id?
>> Would it be the upper layer that forms the packet, or the gso library, or ...?
>
>Oh yes, I ignored this important issue. I don't think applications can
>guarantee it. I will check the design of Linux and try to figure out a way.
>Thanks for the reminder.
>
>>
>> > +
>> > +			offset += seg->l3_len;
>> > +			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
>> > +					offset, sent_seq, i < tail_seg_idx);
>> > +			sent_seq += seg->next->data_len;
>> > +		}
>> > +		break;
>> > +	}
>> > +}
>> > diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
>> > new file mode 100644
>> > index 0000000..d750041
>> > --- /dev/null
>> > +++ b/lib/librte_gso/gso_common.h
>> > @@ -0,0 +1,120 @@
>> > +/*-
>> > + *   BSD LICENSE
>> > + *
>> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> > + *   All rights reserved.
>> > + *
>> > + *   Redistribution and use in source and binary forms, with or without
>> > + *   modification, are permitted provided that the following conditions
>> > + *   are met:
>> > + *
>> > + *     * Redistributions of source code must retain the above copyright
>> > + *       notice, this list of conditions and the following disclaimer.
>> > + *     * Redistributions in binary form must reproduce the above copyright
>> > + *       notice, this list of conditions and the following disclaimer in
>> > + *       the documentation and/or other materials provided with the
>> > + *       distribution.
>> > + *     * Neither the name of Intel Corporation nor the names of its
>> > + *       contributors may be used to endorse or promote products derived
>> > + *       from this software without specific prior written permission.
>> > + *
>> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> > + */
>> > +
>> > +#ifndef _GSO_COMMON_H_
>> > +#define _GSO_COMMON_H_
>> > +
>> > +#include <stdint.h>
>> > +#include <rte_mbuf.h>
>> > +
>> > +#define IPV4_HDR_DF_SHIFT 14
>> > +#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
>> > +#define IPv4_HDR_LEN(iph) ((iph->version_ihl & 0x0f) * 4)
>> > +
>> > +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
>> > +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
>> > +#define TCP_HDR_LEN(tcph) ((tcph->data_off & 0xf0) >> 2)
>> > +
>> > +#define ETHER_IPv4_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4)
>> > +/* Supported packet types */
>> > +/* TCP/IPv4 packet. */
>> > +#define ETHER_IPv4_TCP_PKT (ETHER_IPv4_PKT | RTE_PTYPE_L4_TCP)
>> > +
>> > +/* TCP/IPv4 packet with VLAN tag. */
>> > +#define ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
>> > +		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
>> > +
>> > +#define IS_VLAN_PKT(pkt) ((pkt->packet_type & RTE_PTYPE_L2_ETHER_VLAN) == \
>> > +		RTE_PTYPE_L2_ETHER_VLAN)
>> > +
>> > +/**
>> > + * Internal function which parses a packet, setting outer_l2/l3_len and
>> > + * l2/l3/l4_len and packet_type.
>> > + *
>> > + * @param pkt
>> > + *  Packet to parse.
>> > + */
>> > +void gso_parse_packet(struct rte_mbuf *pkt);
>> > +
>> > +/**
>> > + * Internal function which updates relevant packet headers, following
>> > + * segmentation. This is required to update, for example, the IPv4
>> > + * 'total_length' field, to reflect the reduced length of the now-
>> > + * segmented packet.
>> > + *
>> > + * @param pkt
>> > + *  The original packet.
>> > + * @param nb_segments
>> > + *  The number of GSO segments into which pkt was split.
>> > + * @param out_segments
>> > + *  Pointer array used for storing mbuf addresses for GSO segments.
>> > + */
>> > +void gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
>> > +		struct rte_mbuf **out_segments);
>> > +
>> > +/**
>> > + * Internal function which divides the input packet into small segments.
>> > + * Each of the newly-created segments is organized as a two-segment mbuf,
>> > + * where the first segment is a standard mbuf, which stores a copy of
>> > + * packet header, and the second is an indirect mbuf which points to a
>> > + * section of data in the input packet.
>> > + *
>> > + * @param pkt
>> > + *  Packet to segment.
>> > + * @param pkt_hdr_offset
>> > + *  Packet header offset, measured in bytes.
>> > + * @param pyld_unit_size
>> > + *  The max payload length of a GSO segment.
>> > + * @param direct_pool
>> > + *  MBUF pool used for allocating direct buffers for output segments.
>> > + * @param indirect_pool
>> > + *  MBUF pool used for allocating indirect buffers for output segments.
>> > + * @param pkts_out
>> > + *  Pointer array used to keep the mbuf addresses of output segments.
>> > + * @param nb_pkts_out
>> > + *  The max number of items that pkts_out can keep.
>> > + *
>> > + * @return
>> > + *  - The number of segments created in the event of success.
>> > + *  - If no GSO is performed, return 1.
>> > + *  - If available memory in mempools is insufficient, return -ENOMEM.
>> > + *  - -EINVAL for invalid parameters
>> > + */
>> > +int gso_do_segment(struct rte_mbuf *pkt,
>> > +		uint16_t pkt_hdr_offset,
>> > +		uint16_t pyld_unit_size,
>> > +		struct rte_mempool *direct_pool,
>> > +		struct rte_mempool *indirect_pool,
>> > +		struct rte_mbuf **pkts_out,
>> > +		uint16_t nb_pkts_out);
>> > +#endif
>> > diff --git a/lib/librte_gso/gso_tcp.c b/lib/librte_gso/gso_tcp.c
>> > new file mode 100644
>> > index 0000000..9d5fc30
>> > --- /dev/null
>> > +++ b/lib/librte_gso/gso_tcp.c
>> > @@ -0,0 +1,82 @@
>> > +/*-
>> > + *   BSD LICENSE
>> > + *
>> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> > + *   All rights reserved.
>> > + *
>> > + *   Redistribution and use in source and binary forms, with or without
>> > + *   modification, are permitted provided that the following conditions
>> > + *   are met:
>> > + *
>> > + *     * Redistributions of source code must retain the above copyright
>> > + *       notice, this list of conditions and the following disclaimer.
>> > + *     * Redistributions in binary form must reproduce the above copyright
>> > + *       notice, this list of conditions and the following disclaimer in
>> > + *       the documentation and/or other materials provided with the
>> > + *       distribution.
>> > + *     * Neither the name of Intel Corporation nor the names of its
>> > + *       contributors may be used to endorse or promote products derived
>> > + *       from this software without specific prior written permission.
>> > + *
>> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> > + */
>> > +
>> > +
>> > +#include <rte_ether.h>
>> > +#include <rte_ip.h>
>> > +#include <rte_tcp.h>
>> > +
>> > +#include "gso_common.h"
>> > +#include "gso_tcp.h"
>> > +
>> > +int
>> > +gso_tcp4_segment(struct rte_mbuf *pkt,
>> > +		uint16_t gso_size,
>> > +		struct rte_mempool *direct_pool,
>> > +		struct rte_mempool *indirect_pool,
>> > +		struct rte_mbuf **pkts_out,
>> > +		uint16_t nb_pkts_out)
>> > +{
>> > +	struct ether_hdr *eth_hdr;
>> > +	struct ipv4_hdr *ipv4_hdr;
>> > +	uint16_t tcp_dl;
>> > +	uint16_t pyld_unit_size;
>> > +	uint16_t hdr_offset;
>> > +	int ret = 1;
>> > +
>> > +	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
>> > +	ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
>> > +
>> > +	/* don't process packets that may be fragmented (DF bit not set) */
>> > +	if ((ipv4_hdr->fragment_offset &
>> > +				rte_cpu_to_be_16(IPV4_HDR_DF_MASK)) == 0)
>> > +		return ret;
>> > +
>> > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) -
>> > +		pkt->l3_len - pkt->l4_len;
>> > +	/* don't process packet without data */
>> > +	if (tcp_dl == 0)
>> > +		return ret;
>> > +
>> > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
>> > +
>> > +	/* segment the payload */
>> > +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
>> > +			indirect_pool, pkts_out, nb_pkts_out);
>> > +
>> > +	if (ret > 1)
>> > +		gso_update_pkt_headers(pkt, ret, pkts_out);
>> > +
>> > +	return ret;
>> > +}
>> > diff --git a/lib/librte_gso/gso_tcp.h b/lib/librte_gso/gso_tcp.h
>> > new file mode 100644
>> > index 0000000..f291ccb
>> > --- /dev/null
>> > +++ b/lib/librte_gso/gso_tcp.h
>> > @@ -0,0 +1,73 @@
>> > +/*-
>> > + *   BSD LICENSE
>> > + *
>> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> > + *   All rights reserved.
>> > + *
>> > + *   Redistribution and use in source and binary forms, with or without
>> > + *   modification, are permitted provided that the following conditions
>> > + *   are met:
>> > + *
>> > + *     * Redistributions of source code must retain the above copyright
>> > + *       notice, this list of conditions and the following disclaimer.
>> > + *     * Redistributions in binary form must reproduce the above copyright
>> > + *       notice, this list of conditions and the following disclaimer in
>> > + *       the documentation and/or other materials provided with the
>> > + *       distribution.
>> > + *     * Neither the name of Intel Corporation nor the names of its
>> > + *       contributors may be used to endorse or promote products derived
>> > + *       from this software without specific prior written permission.
>> > + *
>> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> > + */
>> > +
>> > +#ifndef _GSO_TCP_H_
>> > +#define _GSO_TCP_H_
>> > +
>> > +#include <stdint.h>
>> > +#include <rte_mbuf.h>
>> > +
>> > +/**
>> > + * Segment an IPv4/TCP packet. This function assumes the input packet has
>> > + * correct checksums and doesn't update checksums for GSO segment.
>> > + * Furthermore, it doesn't process IP fragment packets.
>> > + *
>> > + * @param pkt
>> > + *  The packet mbuf to segment.
>> > + * @param gso_size
>> > + *  The max length of a GSO segment, measured in bytes.
>> > + * @param direct_pool
>> > + *  MBUF pool used for allocating direct buffers for output segments.
>> > + * @param indirect_pool
>> > + *  MBUF pool used for allocating indirect buffers for output segments.
>> > + * @param pkts_out
>> > + *  Pointer array, which is used to store mbuf addresses of GSO segments.
>> > + *  Caller should guarantee that 'pkts_out' is sufficiently large to store
>> > + *  all GSO segments.
>> > + * @param nb_pkts_out
>> > + *  The max number of items that 'pkts_out' can keep.
>> > + *
>> > + * @return
>> > + *   - The number of GSO segments on success.
>> > + *   - Return 1 if no GSO is performed.
>> > + *   - Return -ENOMEM if available memory in mempools is insufficient.
>> > + *   - Return -EINVAL for invalid parameters.
>> > + */
>> > +int gso_tcp4_segment(struct rte_mbuf *pkt,
>> > +		uint16_t gso_size,
>> > +		struct rte_mempool *direct_pool,
>> > +		struct rte_mempool *indirect_pool,
>> > +		struct rte_mbuf **pkts_out,
>> > +		uint16_t nb_pkts_out);
>> > +
>> > +#endif
>> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> > index b81afce..fac95f2 100644
>> > --- a/lib/librte_gso/rte_gso.c
>> > +++ b/lib/librte_gso/rte_gso.c
>> > @@ -31,17 +31,57 @@
>> >   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> >   */
>> >
>> > +#include <rte_log.h>
>> > +
>> >  #include "rte_gso.h"
>> > +#include "gso_common.h"
>> > +#include "gso_tcp.h"
>> >
>> >  int
>> >  rte_gso_segment(struct rte_mbuf *pkt,
>> >  		struct rte_gso_ctx gso_ctx,
>> >  		struct rte_mbuf **pkts_out,
>> > -		uint16_t nb_pkts_out __rte_unused)
>> > +		uint16_t nb_pkts_out)
>> >  {
>> > +	struct rte_mempool *direct_pool, *indirect_pool;
>> > +	struct rte_mbuf *pkt_seg;
>> > +	uint16_t nb_segments, gso_size;
>> > +
>> >  	if (pkt == NULL || pkts_out == NULL || gso_ctx.direct_pool ==
>> >  			NULL || gso_ctx.indirect_pool == NULL)
>> >  		return -EINVAL;
>>
>> Probably we don't need to check gso_ctx values for each incoming packet.
>> If you feel it is necessary, create a new function rte_gso_ctx_check() that
>> could be performed just once per ctx.
>
>Agree. I will change it. Thanks.
>
>>
>> >
>> > -	return 1;
>> > +	if ((gso_ctx.gso_types & RTE_GSO_TCP_IPV4) == 0 ||
>> > +			gso_ctx.gso_size >= pkt->pkt_len ||
>> > +			gso_ctx.gso_size == 0)
>>
>>
>> The first and third conditions seem redundant.
>
>The reason to check gso_ctx.gso_types here is that we don't perform
>GSO if applications don't set RTE_GSO_TCP_IPV4 in gso_ctx.gso_types,
>even if the input packet is TCP/IPv4. And if gso_ctx.gso_size is 0,
>we don't need to execute the following code. So do we still need to
>remove these two conditions?
>
>>
>> > +		return 1;
>>
>>
>> I think you forgot here:
>> pkts_out[0] = pkt;
>
>But why should we keep the input packet in the output array? Currently, if
>GSO is not performed, no packets will be kept in pkts_out[]. Applications
>can know it by the return value 1 of rte_gso_segment().
>
>>
>>
>> > +
>> > +	pkt_seg = pkt;
>> > +	gso_size = gso_ctx.gso_size;
>> > +	direct_pool = gso_ctx.direct_pool;
>> > +	indirect_pool = gso_ctx.indirect_pool;
>> > +
>> > +	/* Parse packet headers to determine how to segment 'pkt' */
>> > +	gso_parse_packet(pkt);
>>
>>
>> I don't think we need to parse the packet here.
>> Instead, assume that the user has already filled in packet_type and the
>> l2/l3/..._len fields correctly.
>
>Hmm, I see it. Thanks.
>
>Thanks,
>Jiayu
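
As a rough illustration of the arithmetic in the quoted gso_tcp4_segment() and gso_update_pkt_headers(), the standalone C sketch below computes the payload unit size and the per-segment IPv4/TCP header values without touching any mbufs. struct seg_hdr and plan_tcp4_segments() are hypothetical names used only for this example. The IPv4 id simply increments from the original packet's id, as in the patch; who guarantees id uniqueness across packets is the question left open in the review above.

```c
#include <stdint.h>

#define ETHER_CRC_LEN 4
#define TCP_PSH_FLAG  0x08
#define TCP_FIN_FLAG  0x01

/* Hypothetical per-segment header values; not a DPDK type. */
struct seg_hdr {
	uint16_t payload_len;     /* TCP payload bytes in this segment */
	uint16_t ip_total_length; /* IPv4 hdr + TCP hdr + payload */
	uint16_t ip_id;
	uint32_t tcp_seq;
	uint8_t  tcp_flags;
};

/*
 * Plan the segmentation of 'payload_len' bytes of TCP payload.
 * Returns the number of segments, or -1 if gso_size leaves no room for
 * payload, there is no payload, or 'segs' is too small.
 */
int
plan_tcp4_segments(uint16_t gso_size, uint16_t l2_len, uint16_t l3_len,
		uint16_t l4_len, uint32_t payload_len, uint16_t ip_id,
		uint32_t tcp_seq, uint8_t tcp_flags,
		struct seg_hdr *segs, int max_segs)
{
	uint32_t hdr_len = (uint32_t)l2_len + l3_len + l4_len;
	uint32_t unit, off = 0;
	int n = 0;

	if (gso_size <= hdr_len + ETHER_CRC_LEN || payload_len == 0)
		return -1;
	/* same formula as pyld_unit_size in the quoted patch */
	unit = gso_size - hdr_len - ETHER_CRC_LEN;

	while (off < payload_len) {
		uint32_t chunk = payload_len - off;

		if (chunk > unit)
			chunk = unit;
		if (n == max_segs)
			return -1;

		segs[n].payload_len = (uint16_t)chunk;
		segs[n].ip_total_length = (uint16_t)(l3_len + l4_len + chunk);
		segs[n].ip_id = (uint16_t)(ip_id + n);
		segs[n].tcp_seq = tcp_seq + off;
		segs[n].tcp_flags = tcp_flags;
		/* clear FIN and PSH on every segment except the last */
		if (off + chunk < payload_len)
			segs[n].tcp_flags &=
				(uint8_t)~(TCP_PSH_FLAG | TCP_FIN_FLAG);

		off += chunk;
		n++;
	}
	return n;
}
```

With a gso_size of 1518 and 14/20/20-byte headers, the payload unit is 1518 - 54 - 4 = 1460 bytes, so a 4000-byte payload is planned as three segments carrying 1460, 1460 and 1080 bytes.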


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-30  9:25       ` Kavanagh, Mark B
@ 2017-08-30  9:39         ` Ananyev, Konstantin
  2017-08-30  9:59           ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-08-30  9:39 UTC (permalink / raw)
  To: Kavanagh, Mark B, Hu, Jiayu; +Cc: dev, Tan, Jianfeng

Hi Mark,

> >> > +
> >> > +void
> >> > +gso_parse_packet(struct rte_mbuf *pkt)
> >>
> >> There is a function rte_net_get_ptype() that is supposed to provide similar
> >> functionality. So we probably don't need to create a new SW parse function
> >> here; instead it would be better to reuse (and update if needed) an existing
> >> one. Again, the user might already have l2/l3/l4.../_len and packet_type set
> >> up. So better to keep SW packet parsing out of scope of that library.
> >
> >Hmm, I know we have discussed this design choice in the GRO library, and I
> >also think it's better to reuse these values.
> >
> >But from the perspective of OVS, it may add extra overhead, since OVS doesn't
> >originally parse every packet. Maybe @Mark can give us more input from the
> >OVS point of view.
> 
> Hi Jiayu, Konstantin
> 
> For GSO, the application needs to know:
> - the packet type (as it only currently supports TCP/IPv4, VxLAN, GRE packets)
> - the l2/3/4_lens, etc. (in order to replicate the original packet's headers across outgoing segments)
> 
> For this, we can use the rte_net_get_ptype function, as per Konstantin's suggestion, as it provides both - thanks Konstantin!
> 
> WRT the extra overhead in OvS: TSO is the de facto standard, and GSO is provided purely as a fallback option. As such, and since the
> additional packet parsing is a necessity in order to facilitate GSO, the additional overhead is IMO acceptable.

As I remember, for TSO in DPDK the user still has to provide l2/l3/l4_len and mss information to the PMD.
So unless the user knows these values straight away (the user creates the packet himself), some packet processing will be unavailable anyway.
Konstantin
> 
> Thanks,
> Mark
> 


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-30  9:39         ` Ananyev, Konstantin
@ 2017-08-30  9:59           ` Ananyev, Konstantin
  2017-08-30 13:27             ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-08-30  9:59 UTC (permalink / raw)
  To: Ananyev, Konstantin, Kavanagh, Mark B, Hu, Jiayu; +Cc: dev, Tan, Jianfeng



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Wednesday, August 30, 2017 10:39 AM
> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> 
> Hi Mark,
> 
> > >> > +
> > >> > +void
> > >> > +gso_parse_packet(struct rte_mbuf *pkt)
> > >>
> > >> There is a function rte_net_get_ptype() that is supposed to provide similar
> > >> functionality. So we probably don't need to create a new SW parse function
> > >> here; instead it would be better to reuse (and update if needed) an existing
> > >> one. Again, the user might already have l2/l3/l4.../_len and packet_type set
> > >> up. So better to keep SW packet parsing out of scope of that library.
> > >
> > >Hmm, I know we have discussed this design choice in the GRO library, and I
> > >also think it's better to reuse these values.
> > >
> > >But from the perspective of OVS, it may add extra overhead, since OVS doesn't
> > >originally parse every packet. Maybe @Mark can give us more input from the
> > >OVS point of view.
> >
> > Hi Jiayu, Konstantin
> >
> > For GSO, the application needs to know:
> > - the packet type (as it only currently supports TCP/IPv4, VxLAN, GRE packets)
> > - the l2/3/4_lens, etc. (in order to replicate the original packet's headers across outgoing segments)
> >
> > For this, we can use the rte_net_get_ptype function, as per Konstantin's suggestion, as it provides both - thanks Konstantin!
> >
> > WRT the extra overhead in OvS: TSO is the de facto standard, and GSO is provided purely as a fallback option. As such, and since the
> > additional packet parsing is a necessity in order to facilitate GSO, the additional overhead is IMO acceptable.
> 
> As I remember, for TSO in DPDK the user still has to provide l2/l3/l4_len and mss information to the PMD.
> So unless the user knows these values straight away (the user creates the packet himself), some packet processing will be unavailable anyway.
> Konstantin

s/unavailable/unavoidable/
sorry for bad typing.
Konstantin

> >
> > Thanks,
> > Mark
> >


* Re: [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-08-30  7:36   ` Jiayu Hu
@ 2017-08-30 10:49     ` Ananyev, Konstantin
  2017-08-30 13:32       ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-08-30 10:49 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng



> -----Original Message-----
> From: Hu, Jiayu
> Sent: Wednesday, August 30, 2017 8:37 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: Re: [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
> 
> Hi Konstantin,
> 
> Thanks for your suggestions. Feedback is inline.
> 
> Thanks,
> Jiayu
> 
> On Wed, Aug 30, 2017 at 09:37:42AM +0800, Ananyev, Konstantin wrote:
> >
> > Hi Jiayu,
> > A few questions/comments from me below and in the next few mails.
> > Thanks
> > Konstantin
> >
> > >
> > > Generic Segmentation Offload (GSO) is a SW technique to split large
> > > packets into small ones. Akin to TSO, GSO enables applications to
> > > operate on large packets, thus reducing per-packet processing overhead.
> > >
> > > To enable more flexibility to applications, DPDK GSO is implemented
> > > as a standalone library. Applications explicitly use the GSO library
> > > to segment packets. This patch adds GSO support to DPDK for specific
> > > packet types: specifically, TCP/IPv4, VxLAN, and GRE.
> > >
> > > The first patch introduces the GSO API framework. The second patch
> > > adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> > > tag). The third patch adds GSO support for VxLAN packets that contain
> > > outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
> > > outer VLAN tags). The fourth patch adds GSO support for GRE packets
> > > that contain outer IPv4, and inner TCP/IPv4 headers (with optional
> > > outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
> > > and GRE GSO in testpmd's checksum forwarding engine.
> > >
> > > The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> > > iperf. Setup for the test is described as follows:
> > >
> > > a. Connect 2 x 10Gbps physical ports (P0, P1), together physically.
> > > b. Launch testpmd with P0 and a vhost-user port, and use csum
> > >    forwarding engine.
> > > c. Select IP and TCP HW checksum calculation for P0; select TCP HW
> > >    checksum calculation for vhost-user port.
> > > d. Launch a VM with csum and tso offloading enabled.
> > > e. Run iperf-client on virtio-net port in the VM to send TCP packets.
> >
> > Not sure I understand the setup correctly:
> > So testpmd forwards packets between P0 and vhost-user port, right?
> 
> Yes.
> 
> > And who uses P1? iperf-server over linux kernel?
> 
> P1 is owned by the Linux kernel.
> 
> > Also is P1 on another box or not?
> 
> P0 and P1 are in the same machine and are connected physically.
> 
> >
> > >
> > > With GSO enabled for P0 in testpmd, observed iperf throughput is ~9Gbps.
> >
> > Ok, and if GSO is disabled what is the throughput?
> > Another stupid question: if P0 is physical 10G (ixgbe?) we can just enable a TSO on it, right?
> > If so, what would be the TSO numbers here?
> 
> Here is more detailed information about the experiment:
> 
> test1: only enable GSO for P0, GSO size is 1518, use two iperf-clients (i.e. "-P 2")
> test2: only enable TSO for P0, TSO size is 1518, use two iperf-clients
> test3: disable TSO and GSO, use two iperf-clients
> 
> test1 throughput: 8.6Gbps
> test2 throughput: 9.5Gbps
> test3 throughput: 3Mbps

OK, thanks for the detailed explanation.
I'd suggest you put it into the cover letter of the next version.

> 
> >
> > In fact, could you explain a bit more what the main usage model for that library is supposed to be?
> 
> The GSO library is just a SW segmentation method that applications such as OVS can use.
> Most NICs can segment TCP and UDP packets in HW, but not all of them can. As a result,
> OVS currently doesn't enable TSO, because it lacks a SW segmentation fallback. Besides, the
> protocol types supported by HW segmentation are limited. So it's necessary to provide a SW
> segmentation solution.
> 
> With the GSO library, OVS and other applications are able to receive large packets from VMs and
> process these large packets, instead of standard ones (i.e. 1518B). So the per-packet overhead is
> reduced, since far fewer packets need to be processed.

OK, just out of curiosity: what is the size of the packets coming from the VM?
Konstantin


> 
> > Is that to perform segmentation on (virtual) devices that don't support HW TSO or ...?
> 
> When qemu is launched with TSO or GSO enabled, the virtual device doesn't really do segmentation.
> It directly sends large packets. Therefore, testpmd can receive large packets from the VM and
> then perform GSO. The GSO/TSO behavior of virtual devices is different from physical NICs.
> 
> > Again would it be for a termination point (packets were just formed and filled) by the caller,
> > or is that for box in the middle which just forwards packets between nodes?
> > If the latter, then we'll probably already have most of our packets segmented properly, no?
> >
> > > The experimental data of VxLAN and GRE will be shown later.
> > >
> > > Jiayu Hu (3):
> > >   lib: add Generic Segmentation Offload API framework
> > >   gso/lib: add TCP/IPv4 GSO support
> > >   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> > >
> > > Mark Kavanagh (2):
> > >   lib/gso: add VxLAN GSO support
> > >   lib/gso: add GRE GSO support
> > >
> > >  app/test-pmd/cmdline.c                  | 121 +++++++++
> > >  app/test-pmd/config.c                   |  25 ++
> > >  app/test-pmd/csumonly.c                 |  68 ++++-
> > >  app/test-pmd/testpmd.c                  |   9 +
> > >  app/test-pmd/testpmd.h                  |  10 +
> > >  config/common_base                      |   5 +
> > >  lib/Makefile                            |   2 +
> > >  lib/librte_eal/common/include/rte_log.h |   1 +
> > >  lib/librte_gso/Makefile                 |  52 ++++
> > >  lib/librte_gso/gso_common.c             | 431 ++++++++++++++++++++++++++++++++
> > >  lib/librte_gso/gso_common.h             | 180 +++++++++++++
> > >  lib/librte_gso/gso_tcp.c                |  82 ++++++
> > >  lib/librte_gso/gso_tcp.h                |  73 ++++++
> > >  lib/librte_gso/gso_tunnel.c             |  62 +++++
> > >  lib/librte_gso/gso_tunnel.h             |  46 ++++
> > >  lib/librte_gso/rte_gso.c                | 100 ++++++++
> > >  lib/librte_gso/rte_gso.h                | 122 +++++++++
> > >  lib/librte_gso/rte_gso_version.map      |   7 +
> > >  mk/rte.app.mk                           |   1 +
> > >  19 files changed, 1392 insertions(+), 5 deletions(-)
> > >  create mode 100644 lib/librte_gso/Makefile
> > >  create mode 100644 lib/librte_gso/gso_common.c
> > >  create mode 100644 lib/librte_gso/gso_common.h
> > >  create mode 100644 lib/librte_gso/gso_tcp.c
> > >  create mode 100644 lib/librte_gso/gso_tcp.h
> > >  create mode 100644 lib/librte_gso/gso_tunnel.c
> > >  create mode 100644 lib/librte_gso/gso_tunnel.h
> > >  create mode 100644 lib/librte_gso/rte_gso.c
> > >  create mode 100644 lib/librte_gso/rte_gso.h
> > >  create mode 100644 lib/librte_gso/rte_gso_version.map
> > >
> > > --
> > > 2.7.4


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-30  9:59           ` Ananyev, Konstantin
@ 2017-08-30 13:27             ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-08-30 13:27 UTC (permalink / raw)
  To: Ananyev, Konstantin, Hu, Jiayu; +Cc: dev, Tan, Jianfeng

>From: Ananyev, Konstantin
>Sent: Wednesday, August 30, 2017 10:59 AM
>To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
>Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>Subject: RE: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
>
>
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
>> Sent: Wednesday, August 30, 2017 10:39 AM
>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
><jiayu.hu@intel.com>
>> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
>>
>> Hi Mark,
>>
>> > >> > +
>> > >> > +void
>> > >> > +gso_parse_packet(struct rte_mbuf *pkt)
>> > >>
>> > >> There is a function rte_net_get_ptype() that supposed to provide
>similar
>> > >functionality.
>> > >> So we probably don't need to create a new SW parse function here,
>instead
>> > >would be better
>> > >> to reuse (and update if needed) an existing one.
>> > >> Again user already might have l2/l3/l4.../_len and packet_type setuped.
>> > >> So better to keep SW packet parsing out of scope of that library.
>> > >
>> > >Hmm, I know we have discussed this design choice in the GRO library, and
>I
>> > >also think it's
>> > >better to reuse these values.
>> > >
>> > >But from the perspective of OVS, it may add extra overhead, since OVS
>doesn't
>> > >parse every
>> > >packet originally. Maybe @Mark can give us more inputs from the view of
>OVS.
>> >
>> > Hi Jiayu, Konstantin
>> >
>> > For GSO, the application needs to know:
>> > - the packet type (as it only currently supports TCP/IPv4, VxLAN, GRE
>packets)
>> > - the l2/3/4_lens, etc. (in order to replicate the original packet's
>headers across outgoing segments)
>> >
>> > For this, we can use the rte_net_get_ptype function, as per Konstantin's
>suggestion, as it provides both - thanks Konstantin!
>> >
>> > WRT the extra overhead in OvS: TSO is the defacto standard, and GSO is
>provided purely as a fallback option. As such, and since the
>> > additional packet parsing is a necessity in order to facilitate GSO, the
>additional overhead is IMO acceptable.
>>
>> As I remember, for TSO in DPDK user still have to provide l2/l3/l4_len and
>mss information to the PMD.

Yes, that's correct. 

>> So unless user knows these value straightway (user creates a packet himself)
>some packet processing will be unavailable anyway.

That's correct also. Currently, packets that originate in a VM, and which have been marked for TSO, have the l2_len, etc. fields populated by the 'parse_ethernet' function, called as part of the call stack of the rte_vhost_dequeue_burst function, so that particular overhead is already implicit in the TSO case.

>> Konstantin
>
>s/unavailable/unavoidable/
>sorry for bad typing.
>Konstantin
>
>> >
>> > Thanks,
>> > Mark
>> >
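The header lengths discussed in this message (l2_len, l3_len, l4_len) can be derived in a single pass over the frame. What follows is a minimal standalone sketch in plain C, not the rte_net_get_ptype() implementation: struct hdr_lens and parse_tcp4_lens() are hypothetical names, and the sketch assumes an Ethernet frame with at most one VLAN tag, followed by IPv4 and TCP.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the mbuf l2_len/l3_len/l4_len fields. */
struct hdr_lens {
	uint16_t l2_len; /* Ethernet header plus optional VLAN tag */
	uint16_t l3_len; /* IPv4 header, including options */
	uint16_t l4_len; /* TCP header, including options */
};

/* Returns 0 on success, -1 if the frame is not (VLAN-tagged) TCP/IPv4. */
static int
parse_tcp4_lens(const uint8_t *buf, size_t len, struct hdr_lens *out)
{
	size_t l2 = 14; /* dst MAC (6) + src MAC (6) + EtherType (2) */
	size_t l3, l4;
	uint16_t ether_type;

	if (len < l2)
		return -1;
	ether_type = (uint16_t)((buf[12] << 8) | buf[13]);
	if (ether_type == 0x8100) { /* single 802.1Q VLAN tag */
		l2 += 4;
		if (len < l2)
			return -1;
		ether_type = (uint16_t)((buf[16] << 8) | buf[17]);
	}
	if (ether_type != 0x0800) /* IPv4 only */
		return -1;

	/* A minimal IPv4 header is needed to read version, IHL and protocol. */
	if (len < l2 + 20 || (buf[l2] >> 4) != 4)
		return -1;
	l3 = (size_t)(buf[l2] & 0x0f) * 4; /* IHL is in 32-bit words */
	if (l3 < 20 || buf[l2 + 9] != 6) /* protocol 6 is TCP */
		return -1;

	/* A minimal TCP header is needed to read the data offset. */
	if (len < l2 + l3 + 20)
		return -1;
	l4 = (size_t)(buf[l2 + l3 + 12] >> 4) * 4; /* data offset in words */
	if (l4 < 20 || len < l2 + l3 + l4)
		return -1;

	out->l2_len = (uint16_t)l2;
	out->l3_len = (uint16_t)l3;
	out->l4_len = (uint16_t)l4;
	return 0;
}
```

An application that already has these values (for example, filled in on the vhost dequeue path as described above) would skip this step entirely, which is the point of keeping parsing out of the segmentation call.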


* Re: [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-08-30 10:49     ` Ananyev, Konstantin
@ 2017-08-30 13:32       ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-08-30 13:32 UTC (permalink / raw)
  To: Ananyev, Konstantin, Hu, Jiayu; +Cc: dev, Tan, Jianfeng

>From: Ananyev, Konstantin
>Sent: Wednesday, August 30, 2017 11:49 AM
>To: Hu, Jiayu <jiayu.hu@intel.com>
>Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
><jianfeng.tan@intel.com>
>Subject: RE: [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
>
>
>
>> -----Original Message-----
>> From: Hu, Jiayu
>> Sent: Wednesday, August 30, 2017 8:37 AM
>> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
>> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
>Jianfeng <jianfeng.tan@intel.com>
>> Subject: Re: [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
>>
>> Hi Konstantin,
>>
>> Thanks for your suggestions. Feedback is inline.
>>
>> Thanks,
>> Jiayu
>>
>> On Wed, Aug 30, 2017 at 09:37:42AM +0800, Ananyev, Konstantin wrote:
>> >
>> > Hi Jiayu,
>> > A few questions/comments from me below and in the next few mails.
>> > Thanks
>> > Konstantin
>> >
>> > >
>> > > Generic Segmentation Offload (GSO) is a SW technique to split large
>> > > packets into small ones. Akin to TSO, GSO enables applications to
>> > > operate on large packets, thus reducing per-packet processing overhead.
>> > >
>> > > To enable more flexibility to applications, DPDK GSO is implemented
>> > > as a standalone library. Applications explicitly use the GSO library
>> > > to segment packets. This patch adds GSO support to DPDK for specific
>> > > packet types: specifically, TCP/IPv4, VxLAN, and GRE.
>> > >
>> > > The first patch introduces the GSO API framework. The second patch
>> > > adds GSO support for TCP/IPv4 packets (containing an optional VLAN
>> > > tag). The third patch adds GSO support for VxLAN packets that contain
>> > > outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
>> > > outer VLAN tags). The fourth patch adds GSO support for GRE packets
>> > > that contain outer IPv4, and inner TCP/IPv4 headers (with optional
>> > > outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
>> > > and GRE GSO in testpmd's checksum forwarding engine.
>> > >
>> > > The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
>> > > iperf. Setup for the test is described as follows:
>> > >
>> > > a. Connect 2 x 10Gbps physical ports (P0, P1), together physically.
>> > > b. Launch testpmd with P0 and a vhost-user port, and use csum
>> > >    forwarding engine.
>> > > c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>> > >    checksum calculation for vhost-user port.
>> > > d. Launch a VM with csum and tso offloading enabled.
>> > > e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>> >
>> > Not sure I understand the setup correctly:
>> > So testpmd forwards packets between P0 and vhost-user port, right?
>>
>> Yes.
>>
>> > And who uses P1? iperf-server over linux kernel?
>>
>> P1 is owned by the Linux kernel.
>>
>> > Also is P1 on another box or not?
>>
>> P0 and P1 are in the same machine and are connected physically.
>>
>> >
>> > >
>> > > With GSO enabled for P0 in testpmd, observed iperf throughput is ~9Gbps.
>> >
>> > Ok, and if GSO is disabled what is the throughput?
>> > Another stupid question: if P0 is physical 10G (ixgbe?) we can just enable
>a TSO on it, right?
>> > If so, what would be the TSO numbers here?
>>
>> Here are more detailed experiment information:
>>
>> test1: only enable GSO for p0, GSO size is 1518, use two iperf-clients (i.e.
>"-P 2")
>> test2: only enable TSO for p0, TSO size is 1518, use two iperf-clients
>> test3: disable TSO and GSO, use two iperf-clients
>>
>> test1 throughput: 8.6Gbps
>> test2 throughput: 9.5Gbps
>> test3 throughput: 3Mbps
>
>Ok thanks for detailed explanation.
>I' d suggest you put it into next version cover letter.

Thanks Konstantin - will do.

>
>>
>> >
>> > In fact, could you probably explain a bit more, what supposed to be a main
>usage model for that library?
>>
>> The GSO library is just a SW segmentation method, which can be used by
>applications, like OVS.
>> Currently, most of NICs supports to segment TCP and UDP packets, but not for
>all NICs. So current
>> OVS doesn't enable TSO, as a result of lacking a SW segmentation fallback.
>Besides, the protocol
>> types in HW segmentation are limited. So it's necessary to provide a SW
>segmentation solution.
>>
>> With the GSO library, OVS and other applications are able to receive large
>packets from VMs and
>> process these large packets, instead of standard ones (i.e. 1518B). So the
>per-packet overhead is
>> reduced, since the number of packets needed processing is much fewer.
>
>Ok, just for my curiosity what is the size of the packets coming from VM?
>Konstantin

In the case of TSO (and as a corollary, GSO), I guess that the packet size is bounded to ~64k. In OvS, that packet is dequeued using the rte_vhost_dequeue_burst API, and stored in an mbuf chain. The data capacity of mbufs in OvS is user-defined, up to a limit of 9728B.

Thanks,
Mark

>
>
>>
>> > Is that to perform segmentation on (virtual) devices that doesn't support
>HW TSO or ...?
>>
>> When launch qemu with enabling TSO or GSO, the virtual device doesn't really
>do segmentation.
>> It directly sends large packets. Therefore, testpmd can receive large
>packets from the VM and
>> then perform GSO. The GSO/TSO behavior of virtual devices is different from
>physical NICs.
>>
>> > Again would it be for a termination point (packets were just formed and
>filled) by the caller,
>> > or is that for box in the middle which just forwards packets between
>nodes?
>> > If the later one, then we'll probably already have most of our packets
>segmented properly, no?
>> >
>> > > The experimental data of VxLAN and GRE will be shown later.
>> > >
>> > > Jiayu Hu (3):
>> > >   lib: add Generic Segmentation Offload API framework
>> > >   gso/lib: add TCP/IPv4 GSO support
>> > >   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>> > >
>> > > Mark Kavanagh (2):
>> > >   lib/gso: add VxLAN GSO support
>> > >   lib/gso: add GRE GSO support
>> > >
>> > >  app/test-pmd/cmdline.c                  | 121 +++++++++
>> > >  app/test-pmd/config.c                   |  25 ++
>> > >  app/test-pmd/csumonly.c                 |  68 ++++-
>> > >  app/test-pmd/testpmd.c                  |   9 +
>> > >  app/test-pmd/testpmd.h                  |  10 +
>> > >  config/common_base                      |   5 +
>> > >  lib/Makefile                            |   2 +
>> > >  lib/librte_eal/common/include/rte_log.h |   1 +
>> > >  lib/librte_gso/Makefile                 |  52 ++++
>> > >  lib/librte_gso/gso_common.c             | 431 ++++++++++++++++++++++++++++++++
>> > >  lib/librte_gso/gso_common.h             | 180 +++++++++++++
>> > >  lib/librte_gso/gso_tcp.c                |  82 ++++++
>> > >  lib/librte_gso/gso_tcp.h                |  73 ++++++
>> > >  lib/librte_gso/gso_tunnel.c             |  62 +++++
>> > >  lib/librte_gso/gso_tunnel.h             |  46 ++++
>> > >  lib/librte_gso/rte_gso.c                | 100 ++++++++
>> > >  lib/librte_gso/rte_gso.h                | 122 +++++++++
>> > >  lib/librte_gso/rte_gso_version.map      |   7 +
>> > >  mk/rte.app.mk                           |   1 +
>> > >  19 files changed, 1392 insertions(+), 5 deletions(-)
>> > >  create mode 100644 lib/librte_gso/Makefile
>> > >  create mode 100644 lib/librte_gso/gso_common.c
>> > >  create mode 100644 lib/librte_gso/gso_common.h
>> > >  create mode 100644 lib/librte_gso/gso_tcp.c
>> > >  create mode 100644 lib/librte_gso/gso_tcp.h
>> > >  create mode 100644 lib/librte_gso/gso_tunnel.c
>> > >  create mode 100644 lib/librte_gso/gso_tunnel.h
>> > >  create mode 100644 lib/librte_gso/rte_gso.c
>> > >  create mode 100644 lib/librte_gso/rte_gso.h
>> > >  create mode 100644 lib/librte_gso/rte_gso_version.map
>> > >
>> > > --
>> > > 2.7.4


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-08-30  1:38   ` Ananyev, Konstantin
  2017-08-30  2:55     ` Jiayu Hu
  2017-08-30  9:03     ` Jiayu Hu
@ 2017-09-04  3:31     ` Jiayu Hu
  2017-09-04  9:54       ` Ananyev, Konstantin
  2 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-04  3:31 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

Regarding the IP identifier, I checked the Linux code and have some feedback inline.

On Wed, Aug 30, 2017 at 09:38:33AM +0800, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Thursday, August 24, 2017 3:16 PM
> > To: dev@dpdk.org
> > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, Jianfeng
> > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> > 
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > packets have correct checksums, and doesn't update checksums for output
> > packets (the responsibility for this lies with the application).
> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> > 
> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indrect
> > MBUF, to organize an output packet. Note that we refer to these two
> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of the GSO library requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> > 
> > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSOed segments are freed, the packet is freed
> > automatically.
> > 
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> > +void
> > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> > +		struct rte_mbuf **out_segments)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct tcp_hdr *tcp_hdr;
> > +	struct rte_mbuf *seg;
> > +	uint32_t sent_seq;
> > +	uint16_t offset, i;
> > +	uint16_t tail_seg_idx = nb_segments - 1, id;
> > +
> > +	switch (pkt->packet_type) {
> > +	case ETHER_VLAN_IPv4_TCP_PKT:
> > +	case ETHER_IPv4_TCP_PKT:
> 
> Might be worth to put code below in a separate function:
> update_inner_tcp_hdr(..) or so.
> Then you can reuse it for tunneled cases too.
> 
> > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +				pkt->l2_len);
> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > +
> > +		for (i = 0; i < nb_segments; i++) {
> > +			seg = out_segments[i];
> > +
> > +			offset = seg->l2_len;
> > +			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
> > +					offset, seg->pkt_len, id);
> > +			id++;
> 
> Who would be responsible to make sure that we wouldn't have consecutive packets with the same IPv4 id?
> Would be the upper layer that forms the packet or gso library or ...?

Linux supports two kinds of IP identifier, fixed and incremental, and the upper
protocol module decides which one to use. Specifically, if the protocol module
wants fixed identifiers, it sets SKB_GSO_TCP_FIXEDID in skb->gso_type, and
inet_gso_segment() then keeps the identifier the same across all segments.
Otherwise, the segments get incremental identifiers. The reason for this design
is that some protocols, like TCP, may choose fixed IP identifiers (from RFC791).
This design also shows that Linux ignores the issue of repeated IP identifiers.

From the perspective of DPDK, we need to solve two problems. One is whether to
ignore the issue of repeated IP identifiers. The other is whether the GSO library
should provide an interface that lets upper applications choose fixed or
incremental identifiers, or should simply use incremental IP identifiers.

Do you have any suggestions?

Thanks,
Jiayu

> 
> > +
> > +			offset += seg->l3_len;
> > +			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
> > +					offset, sent_seq, i < tail_seg_idx);
> > +			sent_seq += seg->next->data_len;
> > +		}
> > +		break;
> > +	}
> > +}
> > --
> > 2.7.4


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-09-04  3:31     ` Jiayu Hu
@ 2017-09-04  9:54       ` Ananyev, Konstantin
  2017-09-05  1:09         ` Hu, Jiayu
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-04  9:54 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Jiayu,

> -----Original Message-----
> From: Hu, Jiayu
> Sent: Monday, September 4, 2017 4:32 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> 
> Hi Konstantin,
> 
> About the IP identifier, I check the linux codes and have some feedbacks inline.
> 
> On Wed, Aug 30, 2017 at 09:38:33AM +0800, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Hu, Jiayu
> > > Sent: Thursday, August 24, 2017 3:16 PM
> > > To: dev@dpdk.org
> > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, Jianfeng
> > > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > > Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> > >
> > > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > > packets have correct checksums, and doesn't update checksums for output
> > > packets (the responsibility for this lies with the application).
> > > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> > >
> > > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indrect
> > > MBUF, to organize an output packet. Note that we refer to these two
> > > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > > header, while the indirect mbuf simply points to a location within the
> > > original packet's payload. Consequently, use of the GSO library requires
> > > multi-segment MBUF support in the TX functions of the NIC driver.
> > >
> > > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > > result, when all of its GSOed segments are freed, the packet is freed
> > > automatically.
> > >
> > > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > > ---
> > > +void
> > > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> > > +		struct rte_mbuf **out_segments)
> > > +{
> > > +	struct ipv4_hdr *ipv4_hdr;
> > > +	struct tcp_hdr *tcp_hdr;
> > > +	struct rte_mbuf *seg;
> > > +	uint32_t sent_seq;
> > > +	uint16_t offset, i;
> > > +	uint16_t tail_seg_idx = nb_segments - 1, id;
> > > +
> > > +	switch (pkt->packet_type) {
> > > +	case ETHER_VLAN_IPv4_TCP_PKT:
> > > +	case ETHER_IPv4_TCP_PKT:
> >
> > Might be worth to put code below in a separate function:
> > update_inner_tcp_hdr(..) or so.
> > Then you can reuse it for tunneled cases too.
> >
> > > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > > +				pkt->l2_len);
> > > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > > +		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > > +		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > > +
> > > +		for (i = 0; i < nb_segments; i++) {
> > > +			seg = out_segments[i];
> > > +
> > > +			offset = seg->l2_len;
> > > +			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
> > > +					offset, seg->pkt_len, id);
> > > +			id++;
> >
> > Who would be responsible to make sure that we wouldn't have consecutive packets with the same IPv4 id?
> > Would be the upper layer that forms the packet or gso library or ...?
> 
> Linux supports two kinds of IP identifier: fixed identifier and incremental identifier, and
> which one to use depends on upper protocol modules. Specifically, if the protocol module
> wants fixed identifiers, it will set SKB_GSO_TCP_FIXEDID to skb->gso_type, and then
> inet_gso_segment() will keep identifiers the same. Otherwise, all segments will have
> incremental identifiers. The reason for this design is that some protocols may choose fixed
> IP identifiers, like TCP (from RFC791). This design also shows that linux ignores the issue
> of repeated IP identifiers.
> 
> From the perspective of DPDK, we need to solve two problems. One is if ignore the issue of
> repeated IP identifiers. The other is if the GSO library provides an interface to upper
> applications to enable them to choose fixed or incremental identifiers, or simply uses
> incremental IP identifiers.
> 
> Do you have any suggestions?


Do the same as Linux?
I.e. add some flag RTE_GSO_IPID_FIXED (or so) into gso_ctx?
Konstantin

> 
> Thanks,
> Jiayu
> 
> >
> > > +
> > > +			offset += seg->l3_len;
> > > +			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
> > > +					offset, sent_seq, i < tail_seg_idx);
> > > +			sent_seq += seg->next->data_len;
> > > +		}
> > > +		break;
> > > +	}
> > > +}
> > > --
> > > 2.7.4


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-09-04  9:54       ` Ananyev, Konstantin
@ 2017-09-05  1:09         ` Hu, Jiayu
  2017-09-11 13:04           ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-05  1:09 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Monday, September 4, 2017 5:55 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> 
> Hi Jiayu,
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Monday, September 4, 2017 4:32 AM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng <jianfeng.tan@intel.com>
> > Subject: Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> >
> > Hi Konstantin,
> >
> > About the IP identifier, I check the linux codes and have some feedbacks
> inline.
> >
> > On Wed, Aug 30, 2017 at 09:38:33AM +0800, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Hu, Jiayu
> > > > Sent: Thursday, August 24, 2017 3:16 PM
> > > > To: dev@dpdk.org
> > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Ananyev,
> Konstantin <konstantin.ananyev@intel.com>; Tan, Jianfeng
> > > > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > > > Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> > > >
> > > > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > > > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > > > packets have correct checksums, and doesn't update checksums for
> output
> > > > packets (the responsibility for this lies with the application).
> > > > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> > > >
> > > > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one
> indrect
> > > > MBUF, to organize an output packet. Note that we refer to these two
> > > > chained MBUFs as a two-segment MBUF. The direct MBUF stores the
> packet
> > > > header, while the indirect mbuf simply points to a location within the
> > > > original packet's payload. Consequently, use of the GSO library requires
> > > > multi-segment MBUF support in the TX functions of the NIC driver.
> > > >
> > > > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > > > result, when all of its GSOed segments are freed, the packet is freed
> > > > automatically.
> > > >
> > > > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > > > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > > > ---
> > > > +void
> > > > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> > > > +		struct rte_mbuf **out_segments)
> > > > +{
> > > > +	struct ipv4_hdr *ipv4_hdr;
> > > > +	struct tcp_hdr *tcp_hdr;
> > > > +	struct rte_mbuf *seg;
> > > > +	uint32_t sent_seq;
> > > > +	uint16_t offset, i;
> > > > +	uint16_t tail_seg_idx = nb_segments - 1, id;
> > > > +
> > > > +	switch (pkt->packet_type) {
> > > > +	case ETHER_VLAN_IPv4_TCP_PKT:
> > > > +	case ETHER_IPv4_TCP_PKT:
> > >
> > > Might be worth to put code below in a separate function:
> > > update_inner_tcp_hdr(..) or so.
> > > Then you can reuse it for tunneled cases too.
> > >
> > > > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > > > +				pkt->l2_len);
> > > > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > > > +		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > > > +		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > > > +
> > > > +		for (i = 0; i < nb_segments; i++) {
> > > > +			seg = out_segments[i];
> > > > +
> > > > +			offset = seg->l2_len;
> > > > +			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
> > > > +					offset, seg->pkt_len, id);
> > > > +			id++;
> > >
> > > Who would be responsible to make sure that we wouldn't have
> consecutive packets with the same IPv4 id?
> > > Would be the upper layer that forms the packet or gso library or ...?
> >
> > Linux supports two kinds of IP identifier: fixed identifier and incremental
> identifier, and
> > which one to use depends on upper protocol modules. Specifically, if the
> protocol module
> > wants fixed identifiers, it will set SKB_GSO_TCP_FIXEDID to skb->gso_type,
> and then
> > inet_gso_segment() will keep identifiers the same. Otherwise, all segments
> will have
> > incremental identifiers. The reason for this design is that some protocols
> may choose fixed
> > IP identifiers, like TCP (from RFC791). This design also shows that linux
> ignores the issue
> > of repeated IP identifiers.
> >
> > From the perspective of DPDK, we need to solve two problems. One is if
> ignore the issue of
> > repeated IP identifiers. The other is if the GSO library provides an interface
> to upper
> > applications to enable them to choose fixed or incremental identifiers, or
> simply uses
> > incremental IP identifiers.
> >
> > Do you have any suggestions?
> 
> 
> Do the same as Linux?
> I.e. add some flag RTE_GSO_IPID_FIXED (or so) into gso_ctx?

OK, I see. We can do that.

In the GRO library, we always require the IP identifiers to be incremental. If we
enable fixed IP identifiers in GSO, it seems we also need to change the GRO library:
ignore the IP identifier when merging packets, and don't update the IP identifier
of the merged packet. What do you think?

Thanks,
Jiayu

> Konstantin
> 
> >
> > Thanks,
> > Jiayu
> >
> > >
> > > > +
> > > > +			offset += seg->l3_len;
> > > > +			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
> > > > +					offset, sent_seq, i < tail_seg_idx);
> > > > +			sent_seq += seg->next->data_len;
> > > > +		}
> > > > +		break;
> > > > +	}
> > > > +}
> > > > --
> > > > 2.7.4


* [PATCH v2 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                   ` (5 preceding siblings ...)
  2017-08-30  1:37 ` [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Ananyev, Konstantin
@ 2017-09-05  7:57 ` Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
                     ` (5 more replies)
  6 siblings, 6 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-05  7:57 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To enable more flexibility to applications, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch adds GSO support to DPDK for specific
packet types: specifically, TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine.

The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Physically connect two 10Gbps ports (P0, P1) that are in the same
   machine.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
	to 1518B. Run two iperf clients in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
	two iperf clients in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf clients in the VM.

Throughput of the above three tests:

test-1: ~9Gbps
test-2: 9.5Gbps
test-3: 3Mbps

The experimental data of VxLAN and GRE will be shown later.

Change log
==========
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.
- change the definition of gso_types in struct rte_gso_ctx.
- replace rte_pktmbuf_detach() with rte_pktmbuf_free().
- refactor gso_update_pkt_headers().
- change the return value of rte_gso_segment().
- remove parameter checks in rte_gso_segment().
- use rte_net_get_ptype() in app/test-pmd/csumonly.c to fill
  mbuf->packet_type.
- add a new GSO command in testpmd to show GSO configuration for ports.
- misc: fix typo and optimize function description.

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (2):
  gso: add VxLAN GSO support
  gso: add GRE GSO support

 app/test-pmd/cmdline.c                  | 178 ++++++++++++++++++++
 app/test-pmd/config.c                   |  24 +++
 app/test-pmd/csumonly.c                 |  60 ++++++-
 app/test-pmd/testpmd.c                  |  15 ++
 app/test-pmd/testpmd.h                  |  10 ++
 config/common_base                      |   5 +
 lib/Makefile                            |   2 +
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |  52 ++++++
 lib/librte_gso/gso_common.c             | 281 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 156 ++++++++++++++++++
 lib/librte_gso/gso_tcp.c                |  83 ++++++++++
 lib/librte_gso/gso_tcp.h                |  76 +++++++++
 lib/librte_gso/gso_tunnel.c             |  61 +++++++
 lib/librte_gso/gso_tunnel.h             |  75 +++++++++
 lib/librte_gso/rte_gso.c                |  99 +++++++++++
 lib/librte_gso/rte_gso.h                | 133 +++++++++++++++
 lib/librte_gso/rte_gso_version.map      |   7 +
 mk/rte.app.mk                           |   1 +
 19 files changed, 1315 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp.c
 create mode 100644 lib/librte_gso/gso_tcp.h
 create mode 100644 lib/librte_gso/gso_tunnel.c
 create mode 100644 lib/librte_gso/gso_tunnel.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
2.7.4


* [PATCH v2 1/5] gso: add Generic Segmentation Offload API framework
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
@ 2017-09-05  7:57   ` Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-05  7:57 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch introduces the GSO API framework to DPDK.

The GSO library provides a segmentation API, rte_gso_segment(), for
applications. It splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent should
support transmitting multi-segment packets.
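
The two-segment layout described above can be sketched without DPDK as
follows. This is only an illustrative model, not the library code: struct
sketch_seg, sketch_segment() and SKETCH_MAX_HDR are hypothetical names, a
flat byte array stands in for the input packet, and a plain pointer stands
in for the indirect MBUF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_HDR 128 /* hypothetical upper bound on header size */

/* Hypothetical stand-in for one two-segment output packet. */
struct sketch_seg {
	uint8_t hdr[SKETCH_MAX_HDR]; /* private copy of the packet header */
	size_t hdr_len;
	const uint8_t *pyld;         /* points into the input packet, no copy */
	size_t pyld_len;
};

/* Split pkt into segments; returns the count, or -1 if out[] is too small. */
static int
sketch_segment(const uint8_t *pkt, size_t pkt_len, size_t hdr_len,
		size_t pyld_unit, struct sketch_seg *out, size_t nb_out)
{
	size_t pos = hdr_len; /* current read position in the payload */
	size_t n = 0;

	assert(hdr_len <= SKETCH_MAX_HDR && hdr_len <= pkt_len);
	assert(pyld_unit > 0);

	while (pos < pkt_len) {
		size_t len = pkt_len - pos;

		if (n == nb_out)
			return -1;
		if (len > pyld_unit)
			len = pyld_unit;

		/* the header is copied, the payload is only referenced */
		memcpy(out[n].hdr, pkt, hdr_len);
		out[n].hdr_len = hdr_len;
		out[n].pyld = pkt + pos;
		out[n].pyld_len = len;

		pos += len;
		n++;
	}
	return (int)n;
}
```

Because the payload is referenced rather than copied, the input buffer has
to outlive every output segment, which is the role the refcnt handling
plays in the real library.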

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                 |   5 ++
 lib/Makefile                       |   2 +
 lib/librte_gso/Makefile            |  49 ++++++++++++++
 lib/librte_gso/rte_gso.c           |  48 +++++++++++++
 lib/librte_gso/rte_gso.h           | 133 +++++++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map |   7 ++
 mk/rte.app.mk                      |   1 +
 7 files changed, 245 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 5e97a08..603e340 100644
--- a/config/common_base
+++ b/config/common_base
@@ -652,6 +652,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..fef6725
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,48 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		struct rte_gso_ctx gso_ctx __rte_unused,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..eb4ac4b
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,133 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO IP id flags for the IPv4 header */
+#define RTE_GSO_IPID_FIXED (1 << 0)
+/**< Use fixed IP ids for output GSO segments */
+#define RTE_GSO_IPID_INCREASE (1 << 1)
+/**< Use incremental IP ids for output GSO segments */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint32_t gso_types;
+	/**< packet types to perform GSO. For example, if applications
+	 * want to segment TCP/IPv4 packets, set (RTE_PTYPE_L2_ETHER |
+	 * RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP) to gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+	uint8_t ipid_flag;
+	/**< flag to indicate GSO uses fixed or incremental IP ids for
+	 * IPv4 headers of output GSO segments.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi- segment packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() assumes the input packet
+ * has correct checksums, and it doesn't update checksums for output
+ * GSO segments. Additionally, it doesn't process IP fragment packets.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. 2 MBUFs), the driver of the interface to which
+ * the GSO segments are sent should support transmitting multi-segment
+ * packets.
+ *
+ * If the input packet is GSOed, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails and returns a negative errno value. Otherwise, GSO
+ * succeeds, and this function returns the number of output GSO segments
+ * filled in pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if the MBUF pools run out of memory.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		struct rte_gso_ctx ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
2.7.4


* [PATCH v2 2/5] gso: add TCP/IPv4 GSO support
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
@ 2017-09-05  7:57   ` Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 3/5] gso: add VxLAN " Jiayu Hu
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-05  7:57 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
packets have correct checksums, and doesn't update checksums for output
packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect mbuf simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSOed segments are freed, the packet is freed
automatically.
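
The per-segment arithmetic implied above can be sketched as follows. This
is a standalone model rather than the library code: the sketch_* names are
hypothetical, the header length is assumed to be l2_len + l3_len + l4_len,
and every segment except possibly the last is assumed to carry a full
payload unit.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ETHER_CRC_LEN 4 /* Ethernet FCS length in bytes */

/* Payload budget per segment: gso_size minus the headers and the CRC. */
static uint16_t
sketch_pyld_unit_size(uint16_t gso_size, uint16_t hdr_len)
{
	assert(gso_size > hdr_len + SKETCH_ETHER_CRC_LEN);
	return gso_size - hdr_len - SKETCH_ETHER_CRC_LEN;
}

/* Number of output segments needed to carry payload_len bytes. */
static uint32_t
sketch_nb_segments(uint32_t payload_len, uint16_t unit)
{
	assert(unit > 0);
	return (payload_len + unit - 1) / unit;
}

/* TCP sequence number of the i-th segment (wraps modulo 2^32). */
static uint32_t
sketch_seg_seq(uint32_t first_seq, uint16_t unit, uint32_t i)
{
	return first_seq + i * unit;
}
```

For example, with a gso_size of 1518 and 54 bytes of Ethernet, IPv4 and
TCP headers, each segment carries up to 1460 bytes of payload, so a
4000-byte payload yields three segments.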

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 207 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 107 +++++++++++++++++
 lib/librte_gso/gso_tcp.c                |  83 +++++++++++++
 lib/librte_gso/gso_tcp.h                |  76 ++++++++++++
 lib/librte_gso/rte_gso.c                |  46 ++++++-
 7 files changed, 519 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp.c
 create mode 100644 lib/librte_gso/gso_tcp.h

diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..0f8e38f 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..4d4c3fd
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,207 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* copy mbuf metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* copy packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* allocate direct mbuf */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* fill packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* allocate indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
+
+static inline void
+update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct tcp_hdr *tcp_hdr;
+	struct ipv4_hdr *ipv4_hdr;
+	struct rte_mbuf *seg;
+	uint32_t sent_seq;
+	uint16_t inner_l2_offset;
+	uint16_t id, i;
+
+	inner_l2_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_l2_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+
+	for (i = 0; i < nb_segs; i++) {
+		seg = segs[i];
+		/* update inner IPv4 header */
+		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(seg, char *) +
+				inner_l2_offset);
+		ipv4_hdr->total_length = rte_cpu_to_be_16(seg->pkt_len -
+				inner_l2_offset);
+		ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+		id += ipid_delta;
+
+		/* update inner TCP header */
+		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + seg->l3_len);
+		tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+		if (likely(i < nb_segs - 1))
+			tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+						TCP_HDR_FIN_MASK));
+
+		sent_seq += (seg->pkt_len - seg->data_len);
+	}
+}
+
+void
+gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	switch (pkt->packet_type) {
+	case ETHER_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_TCP_PKT:
+		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+		break;
+	}
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..ce3b955
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,107 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+#define IPV4_HDR_DF_SHIFT 14
+#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define ETHER_IPv4_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4)
+/* TCP/IPv4 packet. */
+#define ETHER_IPv4_TCP_PKT (ETHER_IPv4_PKT | RTE_PTYPE_L4_TCP)
+
+/* TCP/IPv4 packet with VLAN tag. */
+#define ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
+
+/**
+ * Internal function which updates relevant packet headers, following
+ * segmentation. This is required to update, for example, the IPv4
+ * 'total_length' field, to reflect the reduced length of the now-
+ * segmented packet.
+ *
+ * @param pkt
+ *  The original packet.
+ * @param ipid_delta
+ *  The increment unit of IP ids.
+ * @param segs
+ *  Pointer array used for storing mbuf addresses for GSO segments.
+ * @param nb_segs
+ *  The number of GSO segments placed in segs.
+ */
+void gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs);
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment mbuf,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - If no GSO is performed, return 1.
+ *  - If available memory in mempools is insufficient, return -ENOMEM.
+ *  - -EINVAL for invalid parameters
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp.c b/lib/librte_gso/gso_tcp.c
new file mode 100644
index 0000000..d52cf28
--- /dev/null
+++ b/lib/librte_gso/gso_tcp.c
@@ -0,0 +1,83 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#include "gso_common.h"
+#include "gso_tcp.h"
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ether_hdr *eth_hdr;
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size;
+	uint16_t hdr_offset;
+	int ret = 1;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
+
+	/* don't process fragmented packet */
+	if ((ipv4_hdr->fragment_offset &
+				rte_cpu_to_be_16(IPV4_HDR_DF_MASK)) == 0)
+		return ret;
+
+	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) -
+		pkt->l3_len - pkt->l4_len;
+	/* don't process packet without data */
+	if (tcp_dl == 0)
+		return ret;
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
+
+	/* segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+
+	if (ret > 1)
+		gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp.h b/lib/librte_gso/gso_tcp.h
new file mode 100644
index 0000000..a578535
--- /dev/null
+++ b/lib/librte_gso/gso_tcp.h
@@ -0,0 +1,76 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP_H_
+#define _GSO_TCP_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function assumes the input packet has
+ * correct checksums and doesn't update checksums for GSO segments.
+ * Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increasing unit of IP ids.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array, which is used to store mbuf addresses of GSO segments.
+ *  Caller should guarantee that 'pkts_out' is sufficiently large to store
+ *  all GSO segments.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments on success.
+ *   - Return 1 if no GSO is performed.
+ *   - Return -ENOMEM if available memory in mempools is insufficient.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index fef6725..ef03375 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -31,18 +31,58 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include <rte_log.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
-		struct rte_gso_ctx gso_ctx __rte_unused,
+		struct rte_gso_ctx gso_ctx,
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkts_out[0] = pkt;
+	if (gso_ctx.gso_size >= pkt->pkt_len ||
+			(pkt->packet_type & gso_ctx.gso_types) !=
+			pkt->packet_type) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	direct_pool = gso_ctx.direct_pool;
+	indirect_pool = gso_ctx.indirect_pool;
+	gso_size = gso_ctx.gso_size;
+	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
+
+	switch (pkt->packet_type) {
+	case ETHER_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_TCP_PKT:
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+		break;
+	default:
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret == 1)
+		pkts_out[0] = pkt;
 
-	return 1;
+	return ret;
 }
-- 
2.7.4

* [PATCH v2 3/5] gso: add VxLAN GSO support
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-09-05  7:57   ` Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 4/5] gso: add GRE " Jiayu Hu
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-05  7:57 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for VxLAN-encapsulated packets. Supported
VxLAN packets must have an outer IPv4 header (prepended by an optional
VLAN tag), and contain an inner TCP/IPv4 packet (with an optional inner
VLAN tag).
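
The segment sizing for such a packet can be sketched as below. This is a
minimal, standalone illustration, not the library code: struct hdr_lens
and the three helper names are hypothetical stand-ins for the mbuf length
fields, and it assumes the usual DPDK tunnel convention that l2_len covers
the outer UDP header, the VxLAN header and the inner Ethernet header. The
guard against a too-small gso_size is an addition that the patch itself
does not contain.

```c
#include <assert.h>
#include <stdint.h>

#define CRC_LEN 4 /* Ethernet FCS length, same value as ETHER_CRC_LEN */

/* Hypothetical stand-in for the header-length fields of struct rte_mbuf. */
struct hdr_lens {
	uint16_t outer_l2_len; /* outer Ethernet (+ optional VLAN tag) */
	uint16_t outer_l3_len; /* outer IPv4 */
	uint16_t l2_len;       /* assumed: outer UDP + VxLAN + inner Ethernet */
	uint16_t l3_len;       /* inner IPv4 */
	uint16_t l4_len;       /* inner TCP */
};

/* Bytes of headers that are replicated in front of every output segment. */
static uint16_t
tunnel_hdr_offset(const struct hdr_lens *h)
{
	return h->outer_l2_len + h->outer_l3_len + h->l2_len +
		h->l3_len + h->l4_len;
}

/* Max payload per segment; returns 0 if gso_size leaves no payload room. */
static uint16_t
payload_unit_size(uint16_t gso_size, uint16_t hdr_offset)
{
	if (gso_size <= hdr_offset + CRC_LEN)
		return 0;
	return gso_size - hdr_offset - CRC_LEN;
}

/* Number of output segments needed for payload_len bytes (rounds up). */
static uint32_t
nb_segments(uint32_t payload_len, uint16_t unit)
{
	assert(unit > 0);
	return (payload_len + unit - 1) / unit;
}
```

For an untagged VxLAN packet (14 + 20 + 30 + 20 + 20 = 104 header bytes)
and a gso_size of 1518, each segment would carry at most 1410 payload
bytes under these assumptions.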

VxLAN GSO assumes that all input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSOed
segments are freed, the packet is freed automatically.
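
The refcnt behaviour described above can be modelled with a toy reference
counter. This sketch does not use the mbuf API; struct toy_buf and the
toy_* helpers are hypothetical, and it assumes that each output segment
holds exactly one reference on the original packet data through its
indirect part.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the original packet's data buffer. */
struct toy_buf {
	int refcnt;
	bool freed;
};

/* A freshly received packet is owned by the caller only. */
static void
toy_init(struct toy_buf *b)
{
	b->refcnt = 1;
	b->freed = false;
}

/* An output segment's indirect part attaches to the original data. */
static void
toy_attach(struct toy_buf *b)
{
	assert(!b->freed);
	b->refcnt++;
}

/* Drop one reference; the buffer is released when the last one goes. */
static void
toy_release(struct toy_buf *b)
{
	assert(!b->freed && b->refcnt > 0);
	if (--b->refcnt == 0)
		b->freed = true;
}

/*
 * Walk through the lifecycle: segment, drop the caller's reference (the
 * "reduce refcnt by 1" step), then free the segments one by one.
 * Returns true if the original is freed exactly when the last segment is.
 */
static bool
toy_gso_lifecycle(int nb_segs)
{
	struct toy_buf orig;
	int i;

	toy_init(&orig);
	for (i = 0; i < nb_segs; i++)
		toy_attach(&orig);
	toy_release(&orig);
	for (i = 0; i < nb_segs; i++) {
		if (orig.freed)
			return false; /* freed too early */
		toy_release(&orig);
	}
	return orig.freed;
}
```

In this model the caller never frees the original packet explicitly after
a successful segmentation; releasing the last segment does it.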

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 lib/librte_gso/Makefile     |  1 +
 lib/librte_gso/gso_common.c | 50 ++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h | 36 +++++++++++++++++++++-
 lib/librte_gso/gso_tunnel.c | 61 ++++++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel.h | 75 +++++++++++++++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso.c    |  9 ++++++
 6 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_gso/gso_tunnel.c
 create mode 100644 lib/librte_gso/gso_tunnel.h

diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 0f8e38f..a4d1a81 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index 4d4c3fd..1e16c9c 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -39,6 +39,7 @@
 #include <rte_ether.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #include "gso_common.h"
 
@@ -194,11 +195,60 @@ update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
 	}
 }
 
+static inline void
+update_outer_ipv4_header(struct rte_mbuf *pkt, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len -
+			pkt->outer_l2_len);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+static inline void
+update_outer_udp_header(struct rte_mbuf *pkt)
+{
+	struct udp_hdr *udp_hdr;
+	uint16_t length;
+
+	length = pkt->outer_l2_len + pkt->outer_l3_len;
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			length);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - length);
+}
+
+static inline void
+update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t i, id;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	for (i = 0; i < nb_segs; i++) {
+		update_outer_ipv4_header(segs[i], id);
+		id += ipid_delta;
+		update_outer_udp_header(segs[i]);
+	}
+	/* update inner TCP/IPv4 headers */
+	update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+}
+
 void
 gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
 		struct rte_mbuf **segs, uint16_t nb_segs)
 {
 	switch (pkt->packet_type) {
+	case ETHER_VLAN_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+		break;
 	case ETHER_VLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_TCP_PKT:
 		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index ce3b955..3f76fd1 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -44,6 +44,13 @@
 #define TCP_HDR_FIN_MASK ((uint8_t)0x01)
 
 #define ETHER_IPv4_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4)
+#define INNER_ETHER_IPv4_TCP_PKT (RTE_PTYPE_INNER_L2_ETHER | \
+		RTE_PTYPE_INNER_L3_IPV4 | \
+		RTE_PTYPE_INNER_L4_TCP)
+#define INNER_ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_INNER_L2_ETHER_VLAN | \
+		RTE_PTYPE_INNER_L3_IPV4 | \
+		RTE_PTYPE_INNER_L4_TCP)
+
 /* TCP/IPv4 packet. */
 #define ETHER_IPv4_TCP_PKT (ETHER_IPv4_PKT | RTE_PTYPE_L4_TCP)
 
@@ -51,6 +58,33 @@
 #define ETHER_VLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
 		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)
 
+/* VxLAN packet */
+#define ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT (ETHER_IPv4_PKT | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		INNER_ETHER_IPv4_TCP_PKT)
+
+/* VxLAN packet with outer VLAN tag. */
+#define ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4 | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		INNER_ETHER_IPv4_TCP_PKT)
+
+/* VxLAN packet with inner VLAN tag. */
+#define ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT (ETHER_IPv4_PKT | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		INNER_ETHER_VLAN_IPv4_TCP_PKT)
+
+/* VxLAN packet with both outer and inner VLAN tags. */
+#define ETHER_VLAN_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT (\
+		RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4 | \
+		RTE_PTYPE_L4_UDP | \
+		RTE_PTYPE_TUNNEL_VXLAN | \
+		INNER_ETHER_VLAN_IPv4_TCP_PKT)
+
 /**
  * Internal function which updates relevant packet headers, following
  * segmentation. This is required to update, for example, the IPv4
@@ -79,7 +113,7 @@ void gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
  * @param pkt
  *  Packet to segment.
  * @param pkt_hdr_offset
- *  Packet header offset, measured in byte.
+ *  Packet header offset, measured in bytes.
  * @param pyld_unit_size
  *  The max payload length of a GSO segment.
  * @param direct_pool
diff --git a/lib/librte_gso/gso_tunnel.c b/lib/librte_gso/gso_tunnel.c
new file mode 100644
index 0000000..69aa91f
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel.c
@@ -0,0 +1,61 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ether.h>
+
+#include "gso_common.h"
+#include "gso_tunnel.h"
+
+int
+gso_tunnel_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	uint16_t pyld_unit_size, hdr_offset;
+	int ret;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len +
+		pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
+
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel.h b/lib/librte_gso/gso_tunnel.h
new file mode 100644
index 0000000..80bd0c5
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel.h
@@ -0,0 +1,75 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_H_
+#define _GSO_TUNNEL_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneled packet. This function assumes the input packet
+ * has correct checksums and doesn't update checksums for GSO segments.
+ * Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increasing unit of IP ids.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array, which is used to store mbuf addresses of GSO segments.
+ *  Caller should guarantee that 'pkts_out' is sufficiently large to store
+ *  all GSO segments.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments on success.
+ *   - Return 1 if no GSO is performed.
+ *   - Return -ENOMEM if available memory in mempools is insufficient.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index ef03375..0170abc 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -36,6 +36,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp.h"
+#include "gso_tunnel.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -71,6 +72,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
 		break;
+	case ETHER_VLAN_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
+	case ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+		ret = gso_tunnel_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+		break;
 	default:
 		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
 	}
-- 
2.7.4

* [PATCH v2 4/5] gso: add GRE GSO support
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
                     ` (2 preceding siblings ...)
  2017-09-05  7:57   ` [PATCH v2 3/5] gso: add VxLAN " Jiayu Hu
@ 2017-09-05  7:57   ` Jiayu Hu
  2017-09-05  7:57   ` [PATCH v2 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  5 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-05  7:57 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO assumes that all input
packets have correct checksums and doesn't update checksums for output
packets. Additionally, it doesn't process IP fragmented packets.

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.
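
Unlike VxLAN, GRE has no outer UDP length field, so in the diff below only
the outer IPv4 header is rewritten per segment before the inner TCP/IPv4
headers are updated. A standalone sketch of that outer fixup follows;
struct toy_seg and toy_update_outer_ipv4() are hypothetical, and values
are kept in host byte order for readability, whereas the real code
converts to big-endian.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-segment view of the fields touched by the fixup. */
struct toy_seg {
	uint16_t pkt_len;               /* full length of this segment */
	uint16_t outer_l2_len;          /* outer Ethernet (+ VLAN) length */
	uint16_t outer_ip_total_length; /* host byte order in this sketch */
	uint16_t outer_ip_id;           /* host byte order in this sketch */
};

/*
 * Give every segment its own outer IPv4 total length, and an IP ID that
 * grows by ipid_delta per segment (0 keeps the ID fixed). The ID is a
 * 16-bit counter and wraps around naturally.
 */
static void
toy_update_outer_ipv4(struct toy_seg *segs, uint16_t nb_segs,
		uint16_t first_id, uint8_t ipid_delta)
{
	uint16_t i, id = first_id;

	assert(segs);
	for (i = 0; i < nb_segs; i++) {
		assert(segs[i].pkt_len >= segs[i].outer_l2_len);
		segs[i].outer_ip_total_length =
			segs[i].pkt_len - segs[i].outer_l2_len;
		segs[i].outer_ip_id = id;
		id += ipid_delta;
	}
}
```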

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 lib/librte_gso/gso_common.c | 24 ++++++++++++++++++++++++
 lib/librte_gso/gso_common.h | 15 +++++++++++++++
 lib/librte_gso/rte_gso.c    |  2 ++
 3 files changed, 41 insertions(+)

diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index 1e16c9c..668d2d0 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -37,6 +37,7 @@
 #include <rte_malloc.h>
 
 #include <rte_ether.h>
+#include <rte_gre.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
 #include <rte_udp.h>
@@ -238,6 +239,25 @@ update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
 	update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
 }
 
+static inline void
+update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t i, id;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	for (i = 0; i < nb_segs; i++) {
+		update_outer_ipv4_header(segs[i], id);
+		id += ipid_delta;
+	}
+
+	/* update inner TCP/IPv4 headers */
+	update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+}
+
 void
 gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
 		struct rte_mbuf **segs, uint16_t nb_segs)
@@ -249,6 +269,10 @@ gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
 	case ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
 		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, segs, nb_segs);
 		break;
+	case ETHER_VLAN_IPv4_GRE_IPv4_TCP_PKT:
+	case ETHER_IPv4_GRE_IPv4_TCP_PKT:
+		update_ipv4_gre_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+		break;
 	case ETHER_VLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_TCP_PKT:
 		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 3f76fd1..bd53bde 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -85,6 +85,21 @@
 		RTE_PTYPE_TUNNEL_VXLAN | \
 		INNER_ETHER_VLAN_IPv4_TCP_PKT)
 
+/* GRE packet. */
+#define ETHER_IPv4_GRE_IPv4_TCP_PKT (\
+		ETHER_IPv4_PKT          | \
+		RTE_PTYPE_TUNNEL_GRE    | \
+		RTE_PTYPE_INNER_L3_IPV4 | \
+		RTE_PTYPE_INNER_L4_TCP)
+
+/* GRE packet with VLAN tag. */
+#define ETHER_VLAN_IPv4_GRE_IPv4_TCP_PKT (\
+		RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L3_IPV4       | \
+		RTE_PTYPE_TUNNEL_GRE    | \
+		RTE_PTYPE_INNER_L3_IPV4 | \
+		RTE_PTYPE_INNER_L4_TCP)
+
 /**
  * Internal function which updates relevant packet headers, following
  * segmentation. This is required to update, for example, the IPv4
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 0170abc..d40fda9 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -76,6 +76,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	case ETHER_VLAN_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_UDP_VXLAN_VLAN_IPv4_TCP_PKT:
 	case ETHER_IPv4_UDP_VXLAN_IPv4_TCP_PKT:
+	case ETHER_VLAN_IPv4_GRE_IPv4_TCP_PKT:
+	case ETHER_IPv4_GRE_IPv4_TCP_PKT:
 		ret = gso_tunnel_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
-- 
2.7.4

* [PATCH v2 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
                     ` (3 preceding siblings ...)
  2017-09-05  7:57   ` [PATCH v2 4/5] gso: add GRE " Jiayu Hu
@ 2017-09-05  7:57   ` Jiayu Hu
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  5 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-05  7:57 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.
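
The "oversized and supported" decision can be sketched as a small
predicate that mirrors the check at the top of rte_gso_segment() in this
series. The TOY_* flag values are made up for the example and are not the
real RTE_PTYPE_* constants; needs_gso() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Made-up packet-type bits for illustration only. */
#define TOY_L2_ETHER 0x001u
#define TOY_L3_IPV4  0x010u
#define TOY_L4_TCP   0x100u
#define TOY_L4_UDP   0x200u

/*
 * A packet is segmented only if it is longer than gso_size and every
 * bit of its packet type is covered by the enabled GSO types.
 */
static bool
needs_gso(uint32_t packet_type, uint32_t pkt_len,
		uint32_t gso_types, uint16_t gso_size)
{
	assert(gso_size > 0);
	if (gso_size >= pkt_len)
		return false; /* already fits, pass through */
	if ((packet_type & gso_types) != packet_type)
		return false; /* unsupported type, pass through */
	return true;
}
```

Packets for which the predicate is false are forwarded unchanged as a
single output packet.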

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c  | 178 ++++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/config.c   |  24 +++++++
 app/test-pmd/csumonly.c |  60 ++++++++++++++--
 app/test-pmd/testpmd.c  |  15 ++++
 app/test-pmd/testpmd.h  |  10 +++
 5 files changed, 283 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index cd8c358..03b98a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3963,6 +3974,170 @@ cmdline_parse_inst_t cmd_gro_set = {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Please stop forwarding before setting GSO segsz\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size == 0) {
+			printf("gso_size should be larger than 0."
+					" Please input a legal value\n");
+		} else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO segment size: %uB\n"
+					"Supported GSO protocols: TCP/IPv4,"
+					" VxLAN and GRE\n",
+					gso_max_segment_size);
+		} else
+			printf("GSO is not enabled on port %u\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14251,6 +14426,9 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..3434346 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ setup_gro(const char *mode, uint8_t port_id)
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..30ae709 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -66,10 +66,12 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_sctp.h>
+#include <rte_net.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -627,6 +629,9 @@ static void
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -641,6 +646,9 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	struct rte_net_hdr_lens hdr_lens;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -851,13 +859,56 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			/* fill packet_type for the packet to segment */
+			pkts_burst[i]->packet_type = rte_net_get_ptype(
+					pkts_burst[i], &hdr_lens,
+					RTE_PTYPE_ALL_MASK);
+
+			ret = rte_gso_segment(pkts_burst[i], *gso_ctx,
+					&gso_segments[nb_segments],
+					GSO_MAX_PKT_BURST - nb_segments);
+			if (ret >= 1)
+				nb_segments += ret;
+			else if (ret < 0) {
+				/* insufficient MBUFs, stop GSO */
+				memcpy(&gso_segments[nb_segments],
+						&pkts_burst[i],
+						sizeof(struct rte_mbuf *) *
+						(nb_rx - i));
+				nb_segments += (nb_rx - i);
+				break;
+			}
+			if (unlikely(nb_rx - i - 1 >= GSO_MAX_PKT_BURST -
+						nb_segments)) {
+				/*
+				 * low on space in gso_segments: stop GSO and
+				 * pass on the packets not yet segmented.
+				 */
+				memcpy(&gso_segments[nb_segments],
+						&pkts_burst[i + 1],
+						sizeof(struct rte_mbuf *) *
+						(nb_rx - i - 1));
+				nb_segments += (nb_rx - i - 1);
+				break;
+			}
+		}
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +919,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +932,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 7d40139..e83fc95 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ init_config(void)
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,11 @@ init_config(void)
 
 	init_port_config();
 
+	gso_types = RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L2_ETHER |
+		RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP |
+		RTE_PTYPE_L4_UDP | RTE_PTYPE_TUNNEL_VXLAN |
+		RTE_PTYPE_INNER_L3_IPV4 | RTE_PTYPE_INNER_L4_TCP |
+		RTE_PTYPE_TUNNEL_GRE;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +673,12 @@ init_config(void)
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN;
+		fwd_lcores[lc_id]->gso_ctx.ipid_flag = RTE_GSO_IPID_INCREASE;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index c9d7739..725af1a 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -641,6 +650,7 @@ void get_5tuple_filter(uint8_t port_id, uint16_t index);
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
-- 
2.7.4


* Re: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
  2017-09-05  1:09         ` Hu, Jiayu
@ 2017-09-11 13:04           ` Ananyev, Konstantin
  0 siblings, 0 replies; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-11 13:04 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Jiayu,

> > > Linux supports two kinds of IP identifier: fixed identifier and incremental
> > identifier, and
> > > which one to use depends on upper protocol modules. Specifically, if the
> > protocol module
> > > wants fixed identifiers, it will set SKB_GSO_TCP_FIXEDID to skb->gso_type,
> > and then
> > > inet_gso_segment() will keep identifiers the same. Otherwise, all segments
> > will have
> > > incremental identifiers. The reason for this design is that some protocols
> > may choose fixed
> > > IP identifiers, like TCP (from RFC791). This design also shows that linux
> > ignores the issue
> > > of repeated IP identifiers.
> > >
> > > From the perspective of DPDK, we need to solve two problems. One is if
> > ignore the issue of
> > > repeated IP identifiers. The other is if the GSO library provides an interface
> > to upper
> > > applications to enable them to choose fixed or incremental identifiers, or
> > simply uses
> > > incremental IP identifiers.
> > >
> > > Do you have any suggestions?
> >
> >
> > Do the same as Linux?
> > I.E. add some flag RRE_GSO_IPID_FIXED (or so) into gso_ctx?
> 
> OK, I see. We can do that.
> 
> In the GRO library, we check if the IP identifiers are incremental compulsorily. If we
> enable fixed IP identifier in GSO, it seems we also need to change the GRO library.
> I mean ignore IP identifier when merge packets, and don't update the IP identifier
> for the merged packet. What do you think of it?

I suppose we can, if there is a use-case for it.
Konstantin
> 


* [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
                     ` (4 preceding siblings ...)
  2017-09-05  7:57   ` [PATCH v2 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
@ 2017-09-12  2:43   ` Jiayu Hu
  2017-09-12  2:43     ` [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
                       ` (5 more replies)
  5 siblings, 6 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-12  2:43 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch set adds GSO support to DPDK for three
packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine.

The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Physically connect two 10Gbps ports (P0, P1) that are in the same
   machine.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on the virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
	to 1518B. Run two iperf clients in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
	two iperf clients in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf clients in the VM.

Throughput of the above three tests:

test-1: ~9Gbps
test-2: 9.5Gbps
test-3: 3Mbps

The experimental data of VxLAN and GRE will be shown later.

Change log
==========
v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and
  RTE_PTYPE_(INNER_)L3_IPV4_EXT_UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.
- change the definition of gso_types in struct rte_gso_ctx.
- replace rte_pktmbuf_detach() with rte_pktmbuf_free().
- refactor gso_update_pkt_headers().
- change the return value of rte_gso_segment().
- remove parameter checks in rte_gso_segment().
- use rte_net_get_ptype() in app/test-pmd/csumonly.c to fill
  mbuf->packet_type.
- add a new GSO command in testpmd to show GSO configuration for ports.
- misc: fix typo and optimize function description.
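
As an illustration of the v2 item on IP id macros, the sketch below shows
one way the two flags could map to per-segment IPv4 ids. The flag values
are the ones defined in rte_gso.h in this series; the helper
gso_segment_ipid() is hypothetical and is not part of the library, and
the "first id plus segment index" rule is an assumption drawn from the
fixed vs. incremental discussion in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined in rte_gso.h of this series. */
#define RTE_GSO_IPID_FIXED    (1 << 0)
#define RTE_GSO_IPID_INCREASE (1 << 1)

/*
 * Hypothetical helper (not a library function): IPv4 id, in host byte
 * order, of the seg_idx-th output segment of a packet whose original
 * id is first_id.
 */
uint16_t
gso_segment_ipid(uint16_t first_id, uint16_t seg_idx, uint8_t ipid_flag)
{
	/* ipid_flag is expected to be one of the two flags above. */
	assert(ipid_flag == RTE_GSO_IPID_FIXED ||
			ipid_flag == RTE_GSO_IPID_INCREASE);

	if (ipid_flag & RTE_GSO_IPID_FIXED)
		return first_id;	/* every segment repeats the id */

	/* incremental ids; the 16-bit field wraps around */
	return (uint16_t)(first_id + seg_idx);
}
```

With RTE_GSO_IPID_INCREASE, a packet with id 100 would yield segments
with ids 100, 101, 102 and so on; with RTE_GSO_IPID_FIXED all segments
keep id 100.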

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (2):
  gso: add VxLAN GSO support
  gso: add GRE GSO support

 app/test-pmd/cmdline.c                  | 178 +++++++++++++++++++++
 app/test-pmd/config.c                   |  24 +++
 app/test-pmd/csumonly.c                 | 102 +++++++++++-
 app/test-pmd/testpmd.c                  |  16 ++
 app/test-pmd/testpmd.h                  |  10 ++
 config/common_base                      |   5 +
 lib/Makefile                            |   2 +
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |  52 ++++++
 lib/librte_gso/gso_common.c             | 270 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 165 +++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               |  83 ++++++++++
 lib/librte_gso/gso_tcp4.h               |  76 +++++++++
 lib/librte_gso/gso_tunnel_tcp4.c        |  85 ++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h        |  76 +++++++++
 lib/librte_gso/rte_gso.c                |  91 +++++++++++
 lib/librte_gso/rte_gso.h                | 133 ++++++++++++++++
 lib/librte_gso/rte_gso_version.map      |   7 +
 mk/rte.app.mk                           |   1 +
 19 files changed, 1373 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
2.7.4


* [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
@ 2017-09-12  2:43     ` Jiayu Hu
  2017-09-12 10:36       ` Ananyev, Konstantin
  2017-09-14 18:33       ` Ferruh Yigit
  2017-09-12  2:43     ` [PATCH v3 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
                       ` (4 subsequent siblings)
  5 siblings, 2 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-12  2:43 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch introduces the GSO API framework to DPDK.

The GSO library provides a segmentation API, rte_gso_segment(), for
applications. It splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmitting multi-segment packets.
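
To give a feel for how many GSO segments one call may produce, here is a
small arithmetic sketch. It assumes that gso_size bounds the whole output
segment (header plus payload, as documented for struct rte_gso_ctx) and
that the payload is cut into equal units of gso_size minus the header
length. The helper names are hypothetical and are not part of the
library.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of GSO segments for a packet of pkt_len
 * bytes whose headers occupy hdr_len bytes, when no output segment may
 * exceed gso_size bytes in total.
 */
uint32_t
gso_segment_count(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t payload_len, unit;

	assert(hdr_len < gso_size && hdr_len <= pkt_len);

	/* A packet that already fits is not segmented. */
	if (pkt_len <= gso_size)
		return 1;

	payload_len = pkt_len - hdr_len;
	unit = gso_size - hdr_len;	/* payload bytes per segment */

	return (payload_len + unit - 1) / unit;	/* round up */
}

/* Hypothetical helper: total length of the last segment produced. */
uint32_t
gso_last_segment_len(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t nb = gso_segment_count(pkt_len, hdr_len, gso_size);
	uint32_t unit = gso_size - hdr_len;

	if (nb == 1)
		return pkt_len;

	return hdr_len + (pkt_len - hdr_len) - (nb - 1) * unit;
}
```

For example, with a 54-byte header (Ethernet + IPv4 + TCP without
options, an assumption), a 64000-byte payload and gso_size = 1518, each
segment carries at most 1464 payload bytes, giving 44 segments, the last
of which is 1102 bytes long. The pkts_out array passed to
rte_gso_segment() would need room for at least that many entries.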

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                 |   5 ++
 lib/Makefile                       |   2 +
 lib/librte_gso/Makefile            |  49 ++++++++++++++
 lib/librte_gso/rte_gso.c           |  50 ++++++++++++++
 lib/librte_gso/rte_gso.h           | 133 +++++++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map |   7 ++
 mk/rte.app.mk                      |   1 +
 7 files changed, 247 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 5e97a08..603e340 100644
--- a/config/common_base
+++ b/config/common_base
@@ -652,6 +652,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..dda50ee
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,50 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		struct rte_gso_ctx gso_ctx __rte_unused,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..db757d6
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,133 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO IP id flags for the IPv4 header */
+#define RTE_GSO_IPID_FIXED (1 << 0)
+/**< Use fixed IP ids for output GSO segments */
+#define RTE_GSO_IPID_INCREASE (1 << 1)
+/**< Use incremental IP ids for output GSO segments */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint32_t gso_types;
+	/**< packet types to perform GSO on. For example, to segment
+	 * TCP/IPv4 packets, applications may set gso_types to
+	 * (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP).
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+	uint8_t ipid_flag;
+	/**< flag to indicate GSO uses fixed or incremental IP ids for
+	 * IPv4 headers of output GSO segments.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi- segment packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() assumes the input packet
+ * has correct checksums, and it doesn't update checksums for output
+ * GSO segments. Additionally, it doesn't process IP fragment packets.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. 2 MBUFs), the driver of the interface to which
+ * the GSO segments are sent must support transmitting multi-segment
+ * packets.
+ *
+ * If the input packet is GSOed, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if the MBUF pools run out of memory.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		struct rte_gso_ctx ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
2.7.4


* [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  2017-09-12  2:43     ` [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
@ 2017-09-12  2:43     ` Jiayu Hu
  2017-09-12 11:17       ` Ananyev, Konstantin
  2017-09-12 14:17       ` Ananyev, Konstantin
  2017-09-12  2:43     ` [PATCH v3 3/5] gso: add VxLAN " Jiayu Hu
                       ` (3 subsequent siblings)
  5 siblings, 2 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-12  2:43 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
packets have correct checksums, and doesn't update checksums for output
packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect mbuf simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSOed segments are freed, the packet is freed
automatically.
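
The refcnt reasoning above can be pictured with a toy model. This is only
a sketch: struct toy_mbuf and the toy_*() functions are hypothetical
stand-ins, not DPDK APIs, and they model just the reference counting
(each indirect MBUF takes a reference on the input packet when it
attaches, and releases it when it is freed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an mbuf: only what the refcnt argument needs. */
struct toy_mbuf {
	uint16_t refcnt;		/* "freed" once this reaches 0 */
	struct toy_mbuf *attached;	/* buffer an indirect mbuf points into */
};

/* An indirect mbuf attaches to a direct buffer: take one reference. */
void
toy_attach(struct toy_mbuf *indirect, struct toy_mbuf *direct)
{
	indirect->refcnt = 1;
	indirect->attached = direct;
	direct->refcnt++;
}

/* Drop one reference; a freed indirect mbuf also releases its target. */
void
toy_free(struct toy_mbuf *m)
{
	assert(m->refcnt > 0);
	if (--m->refcnt == 0 && m->attached != NULL) {
		toy_free(m->attached);
		m->attached = NULL;
	}
}

/* Model of GSOing pkt into n segments, then dropping one reference. */
void
toy_gso(struct toy_mbuf *pkt, struct toy_mbuf *segs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		toy_attach(&segs[i], pkt);
	toy_free(pkt);	/* the "reduce refcnt by 1" step */
}
```

Starting from an input packet with refcnt 1, GSOing it into three
segments leaves its refcnt at 3 in this model. Each freed segment drops
it by one, so the packet's count reaches zero exactly when the last
segment is freed.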

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 202 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 113 ++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               |  83 +++++++++++++
 lib/librte_gso/gso_tcp4.h               |  76 ++++++++++++
 lib/librte_gso/rte_gso.c                |  41 ++++++-
 7 files changed, 515 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..7c32e03
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,202 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+#include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
+
+static inline void
+update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct tcp_hdr *tcp_hdr;
+	struct ipv4_hdr *ipv4_hdr;
+	struct rte_mbuf *seg;
+	uint32_t sent_seq;
+	uint16_t inner_l2_offset;
+	uint16_t id, i;
+
+	inner_l2_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_l2_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+
+	for (i = 0; i < nb_segs; i++) {
+		seg = segs[i];
+		/* Update the inner IPv4 header */
+		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(seg, char *) +
+				inner_l2_offset);
+		ipv4_hdr->total_length = rte_cpu_to_be_16(seg->pkt_len -
+				inner_l2_offset);
+		ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+		id += ipid_delta;
+
+		/* Update the inner TCP header */
+		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + seg->l3_len);
+		tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+		if (likely(i < nb_segs - 1))
+			tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+						TCP_HDR_FIN_MASK));
+		sent_seq += (seg->pkt_len - seg->data_len);
+	}
+}
+
+void
+gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	if (is_ipv4_tcp(pkt->packet_type))
+		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..3c76520
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,113 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+#define IPV4_HDR_DF_SHIFT 14
+#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define ETHER_TCP_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L4_TCP)
+#define ETHER_VLAN_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L4_TCP)
+static inline uint8_t is_ipv4_tcp(uint32_t ptype)
+{
+	switch (ptype & (~RTE_PTYPE_L3_MASK)) {
+	case ETHER_VLAN_TCP_PKT:
+	case ETHER_TCP_PKT:
+		return RTE_ETH_IS_IPV4_HDR(ptype);
+	default:
+		return 0;
+	}
+}
+
+/**
+ * Internal function which updates relevant packet headers, following
+ * segmentation. This is required to update, for example, the IPv4
+ * 'total_length' field, to reflect the reduced length of the now-
+ * segmented packet.
+ *
+ * @param pkt
+ *  The original packet.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param segs
+ *  Pointer array used for storing mbuf addresses for GSO segments.
+ * @param nb_segs
+ *  The number of GSO segments placed in segs.
+ */
+void gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs);
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if the MBUF pools run out of memory.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..8d4bfb2
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,83 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size;
+	uint16_t hdr_offset;
+	int ret = 1;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	/* Don't process packets whose IPv4 DF bit isn't set */
+	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
+						IPV4_HDR_DF_MASK)) == 0)) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
+		pkt->l4_len;
+	/* Don't process the packet without data */
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..9c07984
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,76 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function assumes the input packet has
+ * correct checksums and doesn't update checksums for GSO segments.
+ * Furthermore, it doesn't process IP fragments.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when gso_tcp4_segment() succeeds. If the memory space in
+ *  pkts_out is insufficient, gso_tcp4_segment() fails and returns
+ *  -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index dda50ee..95f6ea6 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,18 +33,53 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
-		struct rte_gso_ctx gso_ctx __rte_unused,
+		struct rte_gso_ctx gso_ctx,
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkts_out[0] = pkt;
+	if (gso_ctx.gso_size >= pkt->pkt_len ||
+			(pkt->packet_type & gso_ctx.gso_types) !=
+			pkt->packet_type) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	direct_pool = gso_ctx.direct_pool;
+	indirect_pool = gso_ctx.indirect_pool;
+	gso_size = gso_ctx.gso_size;
+	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
+
+	if (is_ipv4_tcp(pkt->packet_type)) {
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	}
 
-	return 1;
+	return ret;
 }
-- 
2.7.4

* [PATCH v3 3/5] gso: add VxLAN GSO support
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  2017-09-12  2:43     ` [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
  2017-09-12  2:43     ` [PATCH v3 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-09-12  2:43     ` Jiayu Hu
  2017-09-12  2:43     ` [PATCH v3 4/5] gso: add GRE " Jiayu Hu
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-12  2:43 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for VxLAN-encapsulated packets. Supported
VxLAN packets must have an outer IPv4 header (optionally preceded by a
VLAN tag), and contain an inner TCP/IPv4 packet (with an optional inner
VLAN tag).

VxLAN GSO assumes that all input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.
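
For reference, the pass-through condition used in the code is the DF
bit of the inner IPv4 header: when it is clear, the packet is returned
unsegmented. A minimal standalone sketch of that test follows. The
helper name and the SKETCH_ prefix are hypothetical and not part of
this patch, and the input is assumed to be the two raw bytes of the
flags/fragment-offset field in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit position as IPV4_HDR_DF_MASK in gso_common.h. */
#define SKETCH_IPV4_DF_MASK (1u << 14)

/*
 * Hypothetical helper: frag_field points at the two bytes of the IPv4
 * flags/fragment-offset field, in network byte order.
 * Returns 1 if the DF bit is set, 0 otherwise.
 */
static int
sketch_ipv4_df_is_set(const uint8_t *frag_field)
{
	uint16_t host;

	assert(frag_field != NULL);
	/* Assemble the big-endian field into a host-order value. */
	host = (uint16_t)((frag_field[0] << 8) | frag_field[1]);
	return (host & SKETCH_IPV4_DF_MASK) != 0;
}
```

A packet for which this returns 0 would be handed back as the single
entry pkts_out[0], with a return value of 1.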

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSOed
segments are freed, the packet is freed automatically.
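
As a rough illustration of how the payload carried by each two-segment
output packet follows from gso_size, the arithmetic in
gso_tunnel_tcp4_segment() can be modelled outside DPDK. The struct and
function names below are hypothetical. The zero return for an
undersized gso_size is an added guard that the patch itself does not
perform, and l2_len is assumed to cover the outer UDP, VxLAN and inner
Ethernet headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETHER_CRC_LEN 4 /* Ethernet FCS length, as ETHER_CRC_LEN */

/* Hypothetical stand-in for the mbuf header-length fields. */
struct sketch_hdr_lens {
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
	uint16_t l2_len; /* assumed: outer UDP + VxLAN + inner Ethernet */
	uint16_t l3_len;
	uint16_t l4_len;
};

/* Total header bytes copied into the direct mbuf of every segment. */
static uint16_t
sketch_hdr_offset(const struct sketch_hdr_lens *h)
{
	assert(h != NULL);
	return (uint16_t)(h->outer_l2_len + h->outer_l3_len + h->l2_len +
			h->l3_len + h->l4_len);
}

/* Max payload per segment; 0 if gso_size cannot even hold the headers. */
static uint16_t
sketch_pyld_unit_size(uint16_t gso_size, uint16_t hdr_offset)
{
	uint32_t overhead = (uint32_t)hdr_offset + SKETCH_ETHER_CRC_LEN;

	if (gso_size <= overhead)
		return 0;
	return (uint16_t)(gso_size - overhead);
}

/* Number of output segments for a payload (ceiling division). */
static uint32_t
sketch_nb_segments(uint32_t payload_len, uint16_t pyld_unit_size)
{
	if (pyld_unit_size == 0)
		return 0;
	return (payload_len + pyld_unit_size - 1) / pyld_unit_size;
}
```

With header lengths of 14, 20, 30, 20 and 20 bytes, the header offset
is 104, so a gso_size of 1518 leaves 1410 payload bytes per segment and
a 5000-byte payload is split into 4 segments.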

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 lib/librte_gso/Makefile          |  1 +
 lib/librte_gso/gso_common.c      | 48 ++++++++++++++++++++++-
 lib/librte_gso/gso_common.h      | 33 ++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.c | 85 ++++++++++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h | 76 +++++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso.c         |  7 +++-
 6 files changed, 248 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index 7c32e03..c6779d0 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -39,6 +39,7 @@
 #include <rte_ether.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #include "gso_common.h"
 
@@ -193,10 +194,55 @@ update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
 	}
 }
 
+static inline void
+update_outer_ipv4_header(struct rte_mbuf *pkt, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len -
+			pkt->outer_l2_len);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+static inline void
+update_outer_udp_header(struct rte_mbuf *pkt)
+{
+	struct udp_hdr *udp_hdr;
+	uint16_t length;
+
+	length = pkt->outer_l2_len + pkt->outer_l3_len;
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			length);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - length);
+}
+
+static inline void
+update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t i, id;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	for (i = 0; i < nb_segs; i++) {
+		update_outer_ipv4_header(segs[i], id);
+		id += ipid_delta;
+		update_outer_udp_header(segs[i]);
+	}
+	/* Update inner TCP/IPv4 headers */
+	update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+}
+
 void
 gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
 		struct rte_mbuf **segs, uint16_t nb_segs)
 {
-	if (is_ipv4_tcp(pkt->packet_type))
+	if (is_ipv4_vxlan_ipv4_tcp(pkt->packet_type))
+		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+	else if (is_ipv4_tcp(pkt->packet_type))
 		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
 }
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 3c76520..2377a1d 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -56,6 +56,39 @@ static inline uint8_t is_ipv4_tcp(uint32_t ptype)
 	}
 }
 
+#define IS_INNER_IPV4_HDR(ptype) (((ptype) == RTE_PTYPE_INNER_L3_IPV4) || \
+			((ptype) == RTE_PTYPE_INNER_L3_IPV4_EXT) || \
+			((ptype) == RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN))
+
+#define ETHER_UDP_VXLAN_ETHER_TCP_PKT (RTE_PTYPE_L2_ETHER | \
+		RTE_PTYPE_L4_UDP | RTE_PTYPE_TUNNEL_VXLAN | \
+		RTE_PTYPE_INNER_L2_ETHER | RTE_PTYPE_INNER_L4_TCP)
+#define ETHER_VLAN_UDP_VXLAN_ETHER_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L4_UDP | RTE_PTYPE_TUNNEL_VXLAN | \
+		RTE_PTYPE_INNER_L2_ETHER | RTE_PTYPE_INNER_L4_TCP)
+#define ETHER_UDP_VXLAN_ETHER_VLAN_TCP_PKT (RTE_PTYPE_L2_ETHER | \
+		RTE_PTYPE_L4_UDP | RTE_PTYPE_TUNNEL_VXLAN | \
+		RTE_PTYPE_INNER_L2_ETHER_VLAN | RTE_PTYPE_INNER_L4_TCP)
+#define ETHER_VLAN_UDP_VXLAN_ETHER_VLAN_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | \
+		RTE_PTYPE_L4_UDP | RTE_PTYPE_TUNNEL_VXLAN | \
+		RTE_PTYPE_INNER_L2_ETHER_VLAN | RTE_PTYPE_INNER_L4_TCP)
+static inline uint8_t is_ipv4_vxlan_ipv4_tcp(uint32_t ptype)
+{
+	uint32_t type;
+
+	type = ptype & (~(RTE_PTYPE_L3_MASK | RTE_PTYPE_INNER_L3_MASK));
+	switch (type) {
+	case ETHER_UDP_VXLAN_ETHER_TCP_PKT:
+	case ETHER_VLAN_UDP_VXLAN_ETHER_TCP_PKT:
+	case ETHER_UDP_VXLAN_ETHER_VLAN_TCP_PKT:
+	case ETHER_VLAN_UDP_VXLAN_ETHER_VLAN_TCP_PKT:
+		return (RTE_ETH_IS_IPV4_HDR(ptype) > 0) ?
+			IS_INNER_IPV4_HDR(ptype & RTE_PTYPE_INNER_L3_MASK) : 0;
+	default:
+		return 0;
+	}
+}
+
 /**
  * Internal function which updates relevant packet headers, following
  * segmentation. This is required to update, for example, the IPv4
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..8ca52d1
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,85 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t tcp_dl;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process the packet whose DF bit of the inner IPv4
+	 * header isn't set.
+	 */
+	if (unlikely((inner_ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
+						IPV4_HDR_DF_MASK)) == 0)) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	tcp_dl = rte_be_to_cpu_16(inner_ipv4_hdr->total_length) -
+		pkt->l3_len - pkt->l4_len;
+	/* Don't process the packet without data */
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..0280b38
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,76 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneled packet with inner TCP/IPv4 headers. This function
+ * assumes the input packet has correct checksums and doesn't update
+ * checksums for GSO segments. Furthermore, it doesn't process IP
+ * fragments.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when gso_tunnel_tcp4_segment() succeeds. If the memory
+ *  space in pkts_out is insufficient, gso_tunnel_tcp4_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 95f6ea6..226c75a 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -38,6 +38,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -66,7 +67,11 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	gso_size = gso_ctx.gso_size;
 	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
 
-	if (is_ipv4_tcp(pkt->packet_type)) {
+	if (is_ipv4_vxlan_ipv4_tcp(pkt->packet_type)) {
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (is_ipv4_tcp(pkt->packet_type)) {
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
-- 
2.7.4


* [PATCH v3 4/5] gso: add GRE GSO support
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                       ` (2 preceding siblings ...)
  2017-09-12  2:43     ` [PATCH v3 3/5] gso: add VxLAN " Jiayu Hu
@ 2017-09-12  2:43     ` Jiayu Hu
  2017-09-12  2:43     ` [PATCH v3 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  5 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-12  2:43 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO assumes that all input
packets have correct checksums and doesn't update checksums for output
packets. Additionally, it doesn't process IP fragmented packets.

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 lib/librte_gso/gso_common.c | 22 ++++++++++++++++++++++
 lib/librte_gso/gso_common.h | 19 +++++++++++++++++++
 lib/librte_gso/rte_gso.c    |  3 ++-
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index c6779d0..bd56924 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -37,6 +37,7 @@
 #include <rte_memcpy.h>
 #include <rte_mempool.h>
 #include <rte_ether.h>
+#include <rte_gre.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
 #include <rte_udp.h>
@@ -237,12 +238,33 @@ update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
 	update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
 }
 
+static inline void
+update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t i, id;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	for (i = 0; i < nb_segs; i++) {
+		update_outer_ipv4_header(segs[i], id);
+		id += ipid_delta;
+	}
+
+	/* Update inner TCP/IPv4 headers */
+	update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+}
+
 void
 gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
 		struct rte_mbuf **segs, uint16_t nb_segs)
 {
 	if (is_ipv4_vxlan_ipv4_tcp(pkt->packet_type))
 		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, segs, nb_segs);
+	else if (is_ipv4_gre_ipv4_tcp(pkt->packet_type))
+		update_ipv4_gre_tcp4_header(pkt, ipid_delta, segs, nb_segs);
 	else if (is_ipv4_tcp(pkt->packet_type))
 		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
 }
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 2377a1d..f6d3238 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -89,6 +89,25 @@ static inline uint8_t is_ipv4_vxlan_ipv4_tcp(uint32_t ptype)
 	}
 }
 
+#define ETHER_GRE_TCP (RTE_PTYPE_L2_ETHER | RTE_PTYPE_TUNNEL_GRE | \
+		RTE_PTYPE_INNER_L4_TCP)
+#define ETHER_VLAN_GRE_TCP (RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_TUNNEL_GRE | \
+		RTE_PTYPE_INNER_L4_TCP)
+static inline uint8_t is_ipv4_gre_ipv4_tcp(uint32_t ptype)
+{
+	uint32_t type;
+
+	type = ptype & (~(RTE_PTYPE_L3_MASK | RTE_PTYPE_INNER_L3_MASK));
+	switch (type) {
+	case ETHER_GRE_TCP:
+	case ETHER_VLAN_GRE_TCP:
+		return (RTE_ETH_IS_IPV4_HDR(ptype) > 0) ?
+			IS_INNER_IPV4_HDR(ptype & RTE_PTYPE_INNER_L3_MASK) : 0;
+	default:
+		return 0;
+	}
+}
+
 /**
  * Internal function which updates relevant packet headers, following
  * segmentation. This is required to update, for example, the IPv4
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 226c75a..e0925ae 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -67,7 +67,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	gso_size = gso_ctx.gso_size;
 	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
 
-	if (is_ipv4_vxlan_ipv4_tcp(pkt->packet_type)) {
+	if (is_ipv4_vxlan_ipv4_tcp(pkt->packet_type) ||
+			is_ipv4_gre_ipv4_tcp(pkt->packet_type)) {
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
-- 
2.7.4


* [PATCH v3 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                       ` (3 preceding siblings ...)
  2017-09-12  2:43     ` [PATCH v3 4/5] gso: add GRE " Jiayu Hu
@ 2017-09-12  2:43     ` Jiayu Hu
  2017-09-14 18:33       ` Ferruh Yigit
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  5 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-12  2:43 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, Jiayu Hu

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c  | 178 ++++++++++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/config.c   |  24 +++++++
 app/test-pmd/csumonly.c | 102 +++++++++++++++++++++++++--
 app/test-pmd/testpmd.c  |  16 +++++
 app/test-pmd/testpmd.h  |  10 +++
 5 files changed, 326 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index cd8c358..03b98a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3963,6 +3974,170 @@ cmdline_parse_inst_t cmd_gro_set = {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please stop forwarding first\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size == 0) {
+			printf("gso_size should be larger than 0."
+					" Please input a legal value\n");
+		} else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO segment size: %uB\n"
+					"Supported GSO protocols: TCP/IPv4,"
+					" VxLAN and GRE\n",
+					gso_max_segment_size);
+		} else
+			printf("Port %u doesn't enable GSO\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14251,6 +14426,9 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..3434346 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ setup_gro(const char *mode, uint8_t port_id)
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..8e9a8a1 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -66,10 +66,12 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_sctp.h>
+#include <rte_net.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -103,6 +105,7 @@ struct testpmd_offload_info {
 	uint16_t tso_segsz;
 	uint16_t tunnel_tso_segsz;
 	uint32_t pkt_len;
+	uint32_t packet_type;
 };
 
 /* simplified GRE header */
@@ -129,10 +132,25 @@ parse_ipv4(struct ipv4_hdr *ipv4_hdr, struct testpmd_offload_info *info)
 	info->l3_len = (ipv4_hdr->version_ihl & 0x0f) * 4;
 	info->l4_proto = ipv4_hdr->next_proto_id;
 
+	if (info->is_tunnel)
+		info->packet_type |= RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN;
+	else
+		info->packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
+
 	/* only fill l4_len for TCP, it's useful for TSO */
 	if (info->l4_proto == IPPROTO_TCP) {
 		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + info->l3_len);
 		info->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
+		if (info->is_tunnel)
+			info->packet_type |= RTE_PTYPE_INNER_L4_TCP;
+		else
+			info->packet_type |= RTE_PTYPE_L4_TCP;
+	} else if (info->l4_proto == IPPROTO_UDP) {
+		if (info->is_tunnel)
+			info->packet_type |= RTE_PTYPE_INNER_L4_UDP;
+		else
+			info->packet_type |= RTE_PTYPE_L4_UDP;
+		info->l4_len = 0;
 	} else
 		info->l4_len = 0;
 }
@@ -146,10 +164,25 @@ parse_ipv6(struct ipv6_hdr *ipv6_hdr, struct testpmd_offload_info *info)
 	info->l3_len = sizeof(struct ipv6_hdr);
 	info->l4_proto = ipv6_hdr->proto;
 
+	if (info->is_tunnel)
+		info->packet_type |= RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN;
+	else
+		info->packet_type |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
+
 	/* only fill l4_len for TCP, it's useful for TSO */
 	if (info->l4_proto == IPPROTO_TCP) {
 		tcp_hdr = (struct tcp_hdr *)((char *)ipv6_hdr + info->l3_len);
 		info->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
+		if (info->is_tunnel)
+			info->packet_type |= RTE_PTYPE_INNER_L4_TCP;
+		else
+			info->packet_type |= RTE_PTYPE_L4_TCP;
+	} else if (info->l4_proto == IPPROTO_UDP) {
+		if (info->is_tunnel)
+			info->packet_type |= RTE_PTYPE_INNER_L4_UDP;
+		else
+			info->packet_type |= RTE_PTYPE_L4_UDP;
+		info->l4_len = 0;
 	} else
 		info->l4_len = 0;
 }
@@ -164,16 +197,26 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct testpmd_offload_info *info)
 {
 	struct ipv4_hdr *ipv4_hdr;
 	struct ipv6_hdr *ipv6_hdr;
+	uint32_t l2_type;
 
 	info->l2_len = sizeof(struct ether_hdr);
 	info->ethertype = eth_hdr->ether_type;
+	if (info->is_tunnel)
+		l2_type = RTE_PTYPE_INNER_L2_ETHER;
+	else
+		l2_type = RTE_PTYPE_L2_ETHER;
 
 	if (info->ethertype == _htons(ETHER_TYPE_VLAN)) {
 		struct vlan_hdr *vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
 
 		info->l2_len  += sizeof(struct vlan_hdr);
 		info->ethertype = vlan_hdr->eth_proto;
+		if (info->is_tunnel)
+			l2_type = RTE_PTYPE_INNER_L2_ETHER_VLAN;
+		else
+			l2_type = RTE_PTYPE_L2_ETHER_VLAN;
 	}
+	info->packet_type |= l2_type;
 
 	switch (info->ethertype) {
 	case _htons(ETHER_TYPE_IPv4):
@@ -212,6 +255,7 @@ parse_vxlan(struct udp_hdr *udp_hdr,
 	info->outer_l2_len = info->l2_len;
 	info->outer_l3_len = info->l3_len;
 	info->outer_l4_proto = info->l4_proto;
+	info->packet_type |= RTE_PTYPE_TUNNEL_VXLAN;
 
 	eth_hdr = (struct ether_hdr *)((char *)udp_hdr +
 		sizeof(struct udp_hdr) +
@@ -245,6 +289,7 @@ parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info *info)
 		info->outer_l2_len = info->l2_len;
 		info->outer_l3_len = info->l3_len;
 		info->outer_l4_proto = info->l4_proto;
+		info->packet_type |= RTE_PTYPE_TUNNEL_GRE;
 
 		ipv4_hdr = (struct ipv4_hdr *)((char *)gre_hdr + gre_len);
 
@@ -258,6 +303,7 @@ parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info *info)
 		info->outer_l2_len = info->l2_len;
 		info->outer_l3_len = info->l3_len;
 		info->outer_l4_proto = info->l4_proto;
+		info->packet_type |= RTE_PTYPE_TUNNEL_GRE;
 
 		ipv6_hdr = (struct ipv6_hdr *)((char *)gre_hdr + gre_len);
 
@@ -271,6 +317,7 @@ parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info *info)
 		info->outer_l2_len = info->l2_len;
 		info->outer_l3_len = info->l3_len;
 		info->outer_l4_proto = info->l4_proto;
+		info->packet_type |= RTE_PTYPE_TUNNEL_GRE;
 
 		eth_hdr = (struct ether_hdr *)((char *)gre_hdr + gre_len);
 
@@ -299,6 +346,7 @@ parse_encap_ip(void *encap_ip, struct testpmd_offload_info *info)
 	info->outer_ethertype = info->ethertype;
 	info->outer_l2_len = info->l2_len;
 	info->outer_l3_len = info->l3_len;
+	info->packet_type |= RTE_PTYPE_TUNNEL_IP;
 
 	if (ip_version == 4) {
 		parse_ipv4(ipv4_hdr, info);
@@ -627,6 +675,9 @@ static void
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -641,6 +692,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -683,6 +736,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		m = pkts_burst[i];
 		info.is_tunnel = 0;
 		info.pkt_len = rte_pktmbuf_pkt_len(m);
+		info.packet_type = 0;
 		tx_ol_flags = 0;
 		rx_ol_flags = m->ol_flags;
 
@@ -790,6 +844,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 			m->tso_segsz = info.tso_segsz;
 		}
 		m->ol_flags = tx_ol_flags;
+		m->packet_type = info.packet_type;
 
 		/* Do split & copy for the packet. */
 		if (tx_pkt_split != TX_PKT_SPLIT_OFF) {
@@ -851,13 +906,51 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			ret = rte_gso_segment(pkts_burst[i], *gso_ctx,
+					&gso_segments[nb_segments],
+					GSO_MAX_PKT_BURST - nb_segments);
+			if (ret >= 1)
+				nb_segments += ret;
+			else if (ret < 0) {
+				/* insufficient MBUFs, stop GSO */
+				memcpy(&gso_segments[nb_segments],
+						&pkts_burst[i],
+						sizeof(struct rte_mbuf *) *
+						(nb_rx - i));
+				nb_segments += (nb_rx - i);
+				break;
+			}
+			if (unlikely(nb_rx - i >= GSO_MAX_PKT_BURST -
+						nb_segments)) {
+				/*
+				 * insufficient space in gso_segments,
+				 * stop GSO.
+				 */
+				memcpy(&gso_segments[nb_segments],
+						&pkts_burst[i],
+						sizeof(struct rte_mbuf *) *
+						(nb_rx - i));
+				nb_segments += (nb_rx - i);
+				break;
+			}
+		}
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +961,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +974,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 7d40139..dd4b365 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ init_config(void)
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,12 @@ init_config(void)
 
 	init_port_config();
 
+	gso_types = RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L2_ETHER |
+		RTE_PTYPE_INNER_L2_ETHER | RTE_PTYPE_INNER_L2_ETHER_VLAN |
+		RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_L4_TCP |
+		RTE_PTYPE_L4_UDP | RTE_PTYPE_TUNNEL_VXLAN |
+		RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_INNER_L4_TCP |
+		RTE_PTYPE_TUNNEL_GRE;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +674,12 @@ init_config(void)
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN;
+		fwd_lcores[lc_id]->gso_ctx.ipid_flag = RTE_GSO_IPID_INCREASE;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index c9d7739..725af1a 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -641,6 +650,7 @@ void get_5tuple_filter(uint8_t port_id, uint16_t index);
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
-- 
2.7.4
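
As a rough guide to what "set gso segsz <length>" implies, the number of
output segments for one packet can be estimated as below. This is a
sketch under assumptions, not the library's code: example_nb_gso_segments()
is a hypothetical helper, and it assumes each segment carries exactly
(segsz - hdr_len) bytes of payload, since the commit message says the
length includes the packet header. The GSO library may reserve additional
bytes per segment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: estimate how many GSO segments a packet of
 * pkt_len bytes yields when every segment repeats hdr_len bytes of
 * headers and may be at most segsz bytes long.
 */
uint32_t
example_nb_gso_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t segsz)
{
	uint32_t payload_len, payload_per_seg;

	assert(segsz > hdr_len && pkt_len >= hdr_len);
	if (pkt_len <= segsz)
		return 1;	/* packet already fits, nothing to segment */

	payload_len = pkt_len - hdr_len;
	payload_per_seg = (uint32_t)segsz - hdr_len;
	/* round up: the last segment may be shorter than the others */
	return (payload_len + payload_per_seg - 1) / payload_per_seg;
}
```

For example, a 9014-byte TCP/IPv4 packet with 54 bytes of headers and a
segsz of 1518 gives 8960 bytes of payload at 1464 bytes per segment, so
seven segments under this model.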


* Re: [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework
  2017-09-12  2:43     ` [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
@ 2017-09-12 10:36       ` Ananyev, Konstantin
  2017-09-13  2:11         ` Jiayu Hu
  2017-09-14 18:33       ` Ferruh Yigit
  1 sibling, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-12 10:36 UTC (permalink / raw)
  To: Hu, Jiayu, dev; +Cc: Kavanagh, Mark B, Tan, Jianfeng

Hi Jiayu,
A few comments from me inline.
Konstantin

> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> new file mode 100644
> index 0000000..dda50ee
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso.c
> @@ -0,0 +1,50 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <errno.h>
> +
> +#include "rte_gso.h"
> +
> +int
> +rte_gso_segment(struct rte_mbuf *pkt,
> +		struct rte_gso_ctx gso_ctx __rte_unused,

No need to pass parameter by value here.
struct rte_gso_ctx *gso_ctx would do.
Even better - const struct rte_gso_ctx *, in case it doesn't need
to be updated inside that function.
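
A minimal sketch of the difference, using a stand-in struct rather than
the real one (struct example_gso_ctx and both functions are hypothetical,
and the mempool pointers are reduced to void * so the snippet compiles
without DPDK headers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_gso_ctx as defined in this version of the patch */
struct example_gso_ctx {
	void *direct_pool;
	void *indirect_pool;
	uint32_t gso_types;
	uint16_t gso_size;
	uint8_t ipid_flag;
};

/* By value: the whole structure is copied for every call */
uint16_t
example_size_by_value(struct example_gso_ctx ctx)
{
	return ctx.gso_size;
}

/* By const pointer: only an address is passed, and the compiler
 * rejects any attempt to modify the context through it
 */
uint16_t
example_size_by_ptr(const struct example_gso_ctx *ctx)
{
	assert(ctx != NULL);
	return ctx->gso_size;
}
```

Both calls read the same field; the pointer form avoids the per-call copy
and documents that the context is read-only.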

> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> +		return -EINVAL;
> +
> +	pkts_out[0] = pkt;
> +
> +	return 1;
> +}
> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> new file mode 100644
> index 0000000..db757d6
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso.h
> @@ -0,0 +1,133 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_GSO_H_
> +#define _RTE_GSO_H_
> +
> +/**
> + * @file
> + * Interface to GSO library
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/* GSO IP id flags for the IPv4 header */
> +#define RTE_GSO_IPID_FIXED (1 << 0)
> +/**< Use fixed IP ids for output GSO segments */
> +#define RTE_GSO_IPID_INCREASE (1 << 1)
> +/**< Use incremental IP ids for output GSO segments */

As values above are mutually exclusive, I think you don't need both flags.
Just one seems enough.
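
A sketch of the single-flag variant, with a hypothetical flag name and a
64-bit flags word (neither is taken from the patch): when the bit is
clear, IP ids increment, and when it is set, they stay fixed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag: set means fixed IP ids, clear means incrementing */
#define EXAMPLE_GSO_FLAG_IPID_FIXED (UINT64_C(1) << 0)

/* Map the flags word to the per-segment IP id increment (0 or 1) */
uint8_t
example_ipid_delta(uint64_t flags)
{
	/* no other flag bits are defined in this sketch */
	assert((flags & ~EXAMPLE_GSO_FLAG_IPID_FIXED) == 0);
	return (flags & EXAMPLE_GSO_FLAG_IPID_FIXED) ? 0 : 1;
}
```

A zero-initialized context then selects incrementing ids, and there is no
state in which both behaviours are requested at once.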


> +
> +/**
> + * GSO context structure.
> + */
> +struct rte_gso_ctx {
> +	struct rte_mempool *direct_pool;
> +	/**< MBUF pool for allocating direct buffers, which are used
> +	 * to store packet headers for GSO segments.
> +	 */
> +	struct rte_mempool *indirect_pool;
> +	/**< MBUF pool for allocating indirect buffers, which are used
> +	 * to locate packet payloads for GSO segments. The indirect
> +	 * buffer doesn't contain any data, but simply points to an
> +	 * offset within the packet to segment.
> +	 */
> +	uint32_t gso_types;
> +	/**< packet types to perform GSO. For example, if applications
> +	 * want to segment TCP/IPv4 packets, may set (RTE_PTYPE_L2_ETHER |
> +	 * RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP) to gso_types.


Actually, on second thought, it probably should not be a ptype mask, but a mask
of the rte_ethdev DEV_TX_OFFLOAD_*_TSO flags that are used to advertise real HW TSO offloads.
Let's say, for GSO that supports TSO over IPv4, it would be:
PKT_TX_TCP_SEG | PKT_TX_IPV4.
That would allow the user to use GSO and TSO in a transparent way.
Also, ptype is not actually a proper bitmask but a set of enums,
so it is not always possible to tell which ptypes are supported just from a bitmask.
Sorry for causing confusion here.
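
To illustrate the enum point, here is a small self-contained sketch. The
EX_PTYPE_* constants are local stand-ins whose values are assumed to
mirror the tunnel field of rte_mbuf_ptype.h; treat them as illustrative
rather than authoritative. Both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins, assumed to match the tunnel values in rte_mbuf_ptype.h */
#define EX_PTYPE_TUNNEL_IP    0x00001000u
#define EX_PTYPE_TUNNEL_GRE   0x00002000u
#define EX_PTYPE_TUNNEL_VXLAN 0x00003000u
#define EX_PTYPE_TUNNEL_MASK  0x0000f000u

/* Naive bitmask test: is this tunnel type "enabled" in gso_types? */
int
example_naive_tunnel_enabled(uint32_t gso_types, uint32_t tunnel)
{
	return (gso_types & tunnel) == tunnel;
}

/* Enum-style test: compare the whole tunnel field against one value */
int
example_tunnel_field_is(uint32_t ptype, uint32_t tunnel)
{
	assert((tunnel & ~EX_PTYPE_TUNNEL_MASK) == 0);
	return (ptype & EX_PTYPE_TUNNEL_MASK) == tunnel;
}
```

Under these values, a gso_types that requests only VxLAN also passes the
naive test for GRE, because the VxLAN value has the GRE bit set. Comparing
the whole field avoids the false match for a single ptype, but a field
holding one value cannot describe a set of supported tunnel types.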

> +	 */
> +	uint16_t gso_size;
> +	/**< maximum size of an output GSO segment, including packet
> +	 * header and payload, measured in bytes.
> +	 */
> +	uint8_t ipid_flag;

I'd suggest uint32_t flags (or even uint64_t).
Who knows what extra flags we'll need in future here.

> +	/**< flag to indicate GSO uses fixed or incremental IP ids for
> +	 * IPv4 headers of output GSO segments.
> +	 */
> +};
> +
> +/**
> + * Segmentation function, which supports processing of both single- and
> + * multi- segment packets.
> + *
> + * Note that we refer to the packets that are segmented from the input
> + * packet as 'GSO segments'. rte_gso_segment() assumes the input packet
> + * has correct checksums, and it doesn't update checksums for output
> + * GSO segments. Additionally, it doesn't process IP fragment packets.
> + *
> + * Each of the newly-created GSO segments is organized as a two-segment
> + * MBUF, where the first segment is a standard MBUF, which stores a copy
> + * of packet header, and the second is an indirect MBUF which points to
> + * a section of data in the input packet. Since each GSO segment has
> + * multiple MBUFs (i.e. 2 MBUFs), the driver of the interface which the
> + * GSO segments are sent to should support to transmit multi-segment
> + * packets.
> + *
> + * If the input packet is GSOed, its mbuf refcnt reduces by 1. Therefore,
> + * when all GSO segments are freed, the input packet is freed automatically.
> + *
> + * If the memory space in pkts_out or MBUF pools is insufficient, this
> + * function fails, and it returns (-1) * errno. Otherwise, GSO successes,
> + * and this function returns the number of output GSO segments filled in
> + * pkts_out.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param ctx
> + *  GSO context object.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when rte_gso_segment() successes.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of GSO segments filled in pkts_out on success.
> + *  - Return -ENOMEM if run out of memory in MBUF pools.
> + *  - Return -EINVAL for invalid parameters.
> + */
> +int rte_gso_segment(struct rte_mbuf *pkt,
> +		struct rte_gso_ctx ctx,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_GSO_H_ */


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-12  2:43     ` [PATCH v3 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-09-12 11:17       ` Ananyev, Konstantin
  2017-09-13  2:48         ` Jiayu Hu
  2017-09-12 14:17       ` Ananyev, Konstantin
  1 sibling, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-12 11:17 UTC (permalink / raw)
  To: Hu, Jiayu, dev; +Cc: Kavanagh, Mark B, Tan, Jianfeng

Hi Jiayu,

> -----Original Message-----
> From: Hu, Jiayu
> Sent: Tuesday, September 12, 2017 3:43 AM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> Subject: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> This patch adds GSO support for TCP/IPv4 packets. Supported packets
> may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> packets have correct checksums, and doesn't update checksums for output
> packets (the responsibility for this lies with the application).

Probably it shouldn't say that the checksums have to be valid, right?
As you don't update the checksum(s) inside the lib, it probably doesn't matter.

> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> 
> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> MBUF, to organize an output packet. Note that we refer to these two
> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> header, while the indirect mbuf simply points to a location within the
> original packet's payload. Consequently, use of the GSO library requires
> multi-segment MBUF support in the TX functions of the NIC driver.
> 
> If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> result, when all of its GSOed segments are freed, the packet is freed
> automatically.
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> ---
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_gso/Makefile                 |   2 +
>  lib/librte_gso/gso_common.c             | 202 ++++++++++++++++++++++++++++++++
>  lib/librte_gso/gso_common.h             | 113 ++++++++++++++++++
>  lib/librte_gso/gso_tcp4.c               |  83 +++++++++++++
>  lib/librte_gso/gso_tcp4.h               |  76 ++++++++++++
>  lib/librte_gso/rte_gso.c                |  41 ++++++-
>  7 files changed, 515 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tcp4.h
> 
> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> index ec8dba7..2fa1199 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
> 
>  /* these log types can be used in an application */
>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index aeaacbc..2be64d1 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -42,6 +42,8 @@ LIBABIVER := 1
> 
>  #source files
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> 
>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> new file mode 100644
> index 0000000..7c32e03
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.c
> @@ -0,0 +1,202 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdbool.h>
> +#include <errno.h>
> +
> +#include <rte_memcpy.h>
> +#include <rte_mempool.h>
> +#include <rte_ether.h>
> +#include <rte_ip.h>
> +#include <rte_tcp.h>
> +
> +#include "gso_common.h"
> +
> +static inline void
> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset)
> +{
> +	/* Copy MBUF metadata */
> +	hdr_segment->nb_segs = 1;
> +	hdr_segment->port = pkt->port;
> +	hdr_segment->ol_flags = pkt->ol_flags;
> +	hdr_segment->packet_type = pkt->packet_type;
> +	hdr_segment->pkt_len = pkt_hdr_offset;
> +	hdr_segment->data_len = pkt_hdr_offset;
> +	hdr_segment->tx_offload = pkt->tx_offload;
> +
> +	/* Copy the packet header */
> +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> +			rte_pktmbuf_mtod(pkt, char *),
> +			pkt_hdr_offset);
> +}
> +
> +static inline void
> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < nb_pkts; i++)
> +		rte_pktmbuf_free(pkts[i]);
> +}
> +
> +int
> +gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct rte_mbuf *pkt_in;
> +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
> +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
> +	uint16_t pyld_len, nb_segs;
> +	bool more_in_pkt, more_out_segs;
> +
> +	pkt_in = pkt;
> +	nb_segs = 0;
> +	more_in_pkt = 1;
> +	pkt_in_data_pos = pkt_hdr_offset;
> +
> +	while (more_in_pkt) {
> +		if (unlikely(nb_segs >= nb_pkts_out)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -EINVAL;
> +		}
> +
> +		/* Allocate a direct MBUF */
> +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
> +		if (unlikely(hdr_segment == NULL)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -ENOMEM;
> +		}
> +		/* Fill the packet header */
> +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> +
> +		prev_segment = hdr_segment;
> +		segment_bytes_remaining = pyld_unit_size;
> +		more_out_segs = 1;
> +
> +		while (more_out_segs && more_in_pkt) {
> +			/* Allocate an indirect MBUF */
> +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> +			if (unlikely(pyld_segment == NULL)) {
> +				rte_pktmbuf_free(hdr_segment);
> +				free_gso_segment(pkts_out, nb_segs);
> +				return -ENOMEM;
> +			}
> +			/* Attach to current MBUF segment of pkt */
> +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> +
> +			prev_segment->next = pyld_segment;
> +			prev_segment = pyld_segment;
> +
> +			pyld_len = segment_bytes_remaining;
> +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
> +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
> +
> +			pyld_segment->data_off = pkt_in_data_pos +
> +				pkt_in->data_off;
> +			pyld_segment->data_len = pyld_len;
> +
> +			/* Update header segment */
> +			hdr_segment->pkt_len += pyld_len;
> +			hdr_segment->nb_segs++;
> +
> +			pkt_in_data_pos += pyld_len;
> +			segment_bytes_remaining -= pyld_len;
> +
> +			/* Finish processing a MBUF segment of pkt */
> +			if (pkt_in_data_pos == pkt_in->data_len) {
> +				pkt_in = pkt_in->next;
> +				pkt_in_data_pos = 0;
> +				if (pkt_in == NULL)
> +					more_in_pkt = 0;
> +			}
> +
> +			/* Finish generating a GSO segment */
> +			if (segment_bytes_remaining == 0)
> +				more_out_segs = 0;
> +		}
> +		pkts_out[nb_segs++] = hdr_segment;
> +	}
> +	return nb_segs;
> +}
> +
> +static inline void
> +update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs)
> +{
> +	struct tcp_hdr *tcp_hdr;
> +	struct ipv4_hdr *ipv4_hdr;
> +	struct rte_mbuf *seg;
> +	uint32_t sent_seq;
> +	uint16_t inner_l2_offset;
> +	uint16_t id, i;
> +
> +	inner_l2_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;

Shouldn't it be: pkt->l2_len here?
Or probably even better to pass l2_len as an input parameter.

> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			inner_l2_offset);
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +
> +	for (i = 0; i < nb_segs; i++) {
> +		seg = segs[i];
> +		/* Update the inner IPv4 header */
> +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(seg, char *) +
> +				inner_l2_offset);
> +		ipv4_hdr->total_length = rte_cpu_to_be_16(seg->pkt_len -
> +				inner_l2_offset);
> +		ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> +		id += ipid_delta;
> +
> +		/* Update the inner TCP header */
> +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + seg->l3_len);
> +		tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> +		if (likely(i < nb_segs - 1))
> +			tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
> +						TCP_HDR_FIN_MASK));
> +		sent_seq += (seg->pkt_len - seg->data_len);
> +	}
> +}
> +
> +void
> +gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs)
> +{
> +	if (is_ipv4_tcp(pkt->packet_type))
> +		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
> +}
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> new file mode 100644
> index 0000000..3c76520
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.h
> @@ -0,0 +1,113 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_COMMON_H_
> +#define _GSO_COMMON_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +#define IPV4_HDR_DF_SHIFT 14

We have that already defined in librte_net/rte_ip.h

> +#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
> +
> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> +
> +#define ETHER_TCP_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L4_TCP)
> +#define ETHER_VLAN_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L4_TCP)
> +static inline uint8_t is_ipv4_tcp(uint32_t ptype)
> +{
> +	switch (ptype & (~RTE_PTYPE_L3_MASK)) {
> +	case ETHER_VLAN_TCP_PKT:
> +	case ETHER_TCP_PKT:

Why not just:
return RTE_ETH_IS_IPV4_HDR(ptype) && (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP;
?
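
Spelled out as a standalone sketch (the PT_* names are local stand-ins for
the RTE_PTYPE_* constants, with values assumed from rte_mbuf_ptype.h, so
that this builds without DPDK; pt_is_ipv4() is meant to mirror what
RTE_ETH_IS_IPV4_HDR() does):

```c
#include <stdint.h>

/* Local stand-ins for RTE_PTYPE_*; values assumed from rte_mbuf_ptype.h. */
#define PT_L2_ETHER      0x00000001u
#define PT_L2_ETHER_VLAN 0x00000006u
#define PT_L3_IPV4       0x00000010u
#define PT_L3_IPV6       0x00000040u
#define PT_L4_TCP        0x00000100u
#define PT_L4_UDP        0x00000200u
#define PT_L4_MASK       0x00000f00u

/* All IPv4 L3 ptypes share the PT_L3_IPV4 bit; IPv6 ones do not. */
static int
pt_is_ipv4(uint32_t ptype)
{
	return (ptype & PT_L3_IPV4) != 0;
}

/* IPv4 + TCP, independent of which L2 ptype is reported. */
static int
is_ipv4_tcp_sketch(uint32_t ptype)
{
	return pt_is_ipv4(ptype) && (ptype & PT_L4_MASK) == PT_L4_TCP;
}
```

This accepts IPv4/TCP whatever the L2 ptype is, so no per-L2 case labels
are needed.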

> +		return RTE_ETH_IS_IPV4_HDR(ptype);
> +	default:
> +		return 0;
> +	}
> +}
> +
> +/**
> + * Internal function which updates relevant packet headers, following
> + * segmentation. This is required to update, for example, the IPv4
> + * 'total_length' field, to reflect the reduced length of the now-
> + * segmented packet.
> + *
> + * @param pkt
> + *  The original packet.
> + * @param ipid_delta
> + *  The increasing uint of IP ids.
> + * @param segs
> + *  Pointer array used for storing mbuf addresses for GSO segments.
> + * @param nb_segs
> + *  The number of GSO segments placed in segs.
> + */
> +void gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs);
> +
> +/**
> + * Internal function which divides the input packet into small segments.
> + * Each of the newly-created segments is organized as a two-segment MBUF,
> + * where the first segment is a standard mbuf, which stores a copy of
> + * packet header, and the second is an indirect mbuf which points to a
> + * section of data in the input packet.
> + *
> + * @param pkt
> + *  Packet to segment.
> + * @param pkt_hdr_offset
> + *  Packet header offset, measured in bytes.
> + * @param pyld_unit_size
> + *  The max payload length of a GSO segment.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to keep the mbuf addresses of output segments. If
> + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
> + *  and returns -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of segments created in the event of success.
> + *  - Return -ENOMEM if run out of memory in MBUF pools.
> + *  - Return -EINVAL for invalid parameters.
> + */
> +int gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
> new file mode 100644
> index 0000000..8d4bfb2
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.c
> @@ -0,0 +1,83 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +
> +#include <rte_ether.h>
> +#include <rte_ip.h>
> +
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
> +
> +int
> +gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	uint16_t tcp_dl;
> +	uint16_t pyld_unit_size;
> +	uint16_t hdr_offset;
> +	int ret = 1;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->l2_len);
> +	/* Don't process the fragmented packet */
> +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> +						IPV4_HDR_DF_MASK)) == 0)) {


This is not a check for a fragmented packet; it checks whether the DF bit is clear,
i.e. that fragmentation is allowed for that packet.
Should be IPV4_HDR_DF_MASK - 1, I think.
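
A minimal sketch of the check I have in mind, assuming the standard IPv4
layout of the 16-bit flags/fragment-offset field (DF = bit 14, MF = bit 13,
offset = low 13 bits). The macro and function names are hypothetical; the
argument is the field already converted to host order, e.g. with
rte_be_to_cpu_16().

```c
#include <stdint.h>

#define SK_DF_FLAG     0x4000u /* Don't Fragment */
#define SK_MF_FLAG     0x2000u /* More Fragments */
#define SK_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/*
 * A packet is a fragment if MF is set or the offset is non-zero.
 * MF | offset mask is 0x3fff, i.e. the DF mask minus one.
 */
static int
ipv4_is_fragment(uint16_t frag_off_host)
{
	return (frag_off_host & (SK_MF_FLAG | SK_OFFSET_MASK)) != 0;
}
```

Note that a packet with only DF set, or with no flags at all, is not a
fragment under this test, whereas the check in the patch skips every packet
that has DF clear.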

> +		pkts_out[0] = pkt;
> +		return ret;
> +	}
> +
> +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> +		pkt->l4_len;

Why not use pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len?

> +	/* Don't process the packet without data */
> +	if (unlikely(tcp_dl == 0)) {
> +		pkts_out[0] = pkt;
> +		return ret;
> +	}
> +
> +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;

Hmm, why do we need to count CRC_LEN here?
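
For reference, the arithmetic in question as a standalone sketch. The
function name and the wrap-around guard are my additions, not part of the
patch, and the sketch does not decide whether gso_size is meant to include
the CRC.

```c
#include <stdint.h>

/*
 * Max TCP payload per output segment, given the max segment size, the
 * header length (l2_len + l3_len + l4_len) and the CRC bytes to reserve.
 * Returns 0 if the headers alone do not fit, to avoid unsigned wrap-around.
 */
static uint16_t
pyld_unit_size_sketch(uint16_t gso_size, uint16_t hdr_len, uint16_t crc_len)
{
	uint32_t overhead = (uint32_t)hdr_len + crc_len;

	return gso_size > overhead ? (uint16_t)(gso_size - overhead) : 0;
}
```

With gso_size = 1514 and 54 bytes of headers (14 + 20 + 20), this gives
1460 bytes of payload per segment without the CRC, and 1456 if 4 bytes of
CRC are subtracted as the patch does.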

> +
> +	/* Segment the payload */
> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> +			indirect_pool, pkts_out, nb_pkts_out);
> +	if (ret > 1)
> +		gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret);
> +
> +	return ret;
> +}
> diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
> new file mode 100644
> index 0000000..9c07984
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.h
> @@ -0,0 +1,76 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_TCP4_H_
> +#define _GSO_TCP4_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * Segment an IPv4/TCP packet. This function assumes the input packet has
> + * correct checksums and doesn't update checksums for GSO segment.
> + * Furthermore, it doesn't process IP fragment packets.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param gso_size
> + *  The max length of a GSO segment, measured in bytes.
> + * @param ipid_delta
> + *  The increasing uint of IP ids.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when gso_tcp4_segment() successes. If the memory space in
> + *  pkts_out is insufficient, gso_tcp4_segment() fails and returns
> + *  -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that 'pkts_out' can keep.
> + *
> + * @return
> + *   - The number of GSO segments filled in pkts_out on success.
> + *   - Return -ENOMEM if run out of memory in MBUF pools.
> + *   - Return -EINVAL for invalid parameters.
> + */
> +int gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ip_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +
> +#endif
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index dda50ee..95f6ea6 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -33,18 +33,53 @@
> 
>  #include <errno.h>
> 
> +#include <rte_log.h>
> +
>  #include "rte_gso.h"
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
> 
>  int
>  rte_gso_segment(struct rte_mbuf *pkt,
> -		struct rte_gso_ctx gso_ctx __rte_unused,
> +		struct rte_gso_ctx gso_ctx,
>  		struct rte_mbuf **pkts_out,
>  		uint16_t nb_pkts_out)
>  {
> +	struct rte_mempool *direct_pool, *indirect_pool;
> +	struct rte_mbuf *pkt_seg;
> +	uint16_t gso_size;
> +	uint8_t ipid_delta;
> +	int ret = 1;
> +
>  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
>  		return -EINVAL;
> 
> -	pkts_out[0] = pkt;
> +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> +			(pkt->packet_type & gso_ctx.gso_types) !=
> +			pkt->packet_type) {
> +		pkts_out[0] = pkt;
> +		return ret;
> +	}
> +
> +	direct_pool = gso_ctx.direct_pool;
> +	indirect_pool = gso_ctx.indirect_pool;
> +	gso_size = gso_ctx.gso_size;
> +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> +
> +	if (is_ipv4_tcp(pkt->packet_type)) {

We probably need something like this here:
if (is_ipv4_tcp(pkt->packet_type) && (gso_ctx.gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...

> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> +				direct_pool, indirect_pool,
> +				pkts_out, nb_pkts_out);
> +	} else
> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");

Shouldn't we do pkts_out[0] = pkt; here?

> +
> +	if (ret > 1) {
> +		pkt_seg = pkt;
> +		while (pkt_seg) {
> +			rte_mbuf_refcnt_update(pkt_seg, -1);
> +			pkt_seg = pkt_seg->next;
> +		}
> +	}
> 
> -	return 1;
> +	return ret;
>  }
> --
> 2.7.4


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-12  2:43     ` [PATCH v3 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
  2017-09-12 11:17       ` Ananyev, Konstantin
@ 2017-09-12 14:17       ` Ananyev, Konstantin
  2017-09-13 10:44         ` Jiayu Hu
  1 sibling, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-12 14:17 UTC (permalink / raw)
  To: Hu, Jiayu, dev; +Cc: Kavanagh, Mark B, Tan, Jianfeng



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, September 12, 2017 12:18 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> Hi Jiayu,
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Tuesday, September 12, 2017 3:43 AM
> > To: dev@dpdk.org
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
> > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > Subject: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > packets have correct checksums, and doesn't update checksums for output
> > packets (the responsibility for this lies with the application).
> 
> Probably it shouldn't say that the checksums have to be valid, right?
> As you don't update the checksum(s) inside the lib, it probably doesn't matter.
> 
> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> >
> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> > MBUF, to organize an output packet. Note that we refer to these two
> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of the GSO library requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> >
> > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSOed segments are freed, the packet is freed
> > automatically.
> >
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> >  lib/librte_eal/common/include/rte_log.h |   1 +
> >  lib/librte_gso/Makefile                 |   2 +
> >  lib/librte_gso/gso_common.c             | 202 ++++++++++++++++++++++++++++++++
> >  lib/librte_gso/gso_common.h             | 113 ++++++++++++++++++
> >  lib/librte_gso/gso_tcp4.c               |  83 +++++++++++++
> >  lib/librte_gso/gso_tcp4.h               |  76 ++++++++++++
> >  lib/librte_gso/rte_gso.c                |  41 ++++++-
> >  7 files changed, 515 insertions(+), 3 deletions(-)
> >  create mode 100644 lib/librte_gso/gso_common.c
> >  create mode 100644 lib/librte_gso/gso_common.h
> >  create mode 100644 lib/librte_gso/gso_tcp4.c
> >  create mode 100644 lib/librte_gso/gso_tcp4.h
> >
> > diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> > index ec8dba7..2fa1199 100644
> > --- a/lib/librte_eal/common/include/rte_log.h
> > +++ b/lib/librte_eal/common/include/rte_log.h
> > @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
> >  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
> >  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
> >  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> > +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
> >
> >  /* these log types can be used in an application */
> >  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> > index aeaacbc..2be64d1 100644
> > --- a/lib/librte_gso/Makefile
> > +++ b/lib/librte_gso/Makefile
> > @@ -42,6 +42,8 @@ LIBABIVER := 1
> >
> >  #source files
> >  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> >
> >  # install this header file
> >  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> > diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> > new file mode 100644
> > index 0000000..7c32e03
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_common.c
> > @@ -0,0 +1,202 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include <stdbool.h>
> > +#include <errno.h>
> > +
> > +#include <rte_memcpy.h>
> > +#include <rte_mempool.h>
> > +#include <rte_ether.h>
> > +#include <rte_ip.h>
> > +#include <rte_tcp.h>
> > +
> > +#include "gso_common.h"
> > +
> > +static inline void
> > +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset)
> > +{
> > +	/* Copy MBUF metadata */
> > +	hdr_segment->nb_segs = 1;
> > +	hdr_segment->port = pkt->port;
> > +	hdr_segment->ol_flags = pkt->ol_flags;
> > +	hdr_segment->packet_type = pkt->packet_type;
> > +	hdr_segment->pkt_len = pkt_hdr_offset;
> > +	hdr_segment->data_len = pkt_hdr_offset;
> > +	hdr_segment->tx_offload = pkt->tx_offload;
> > +
> > +	/* Copy the packet header */
> > +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> > +			rte_pktmbuf_mtod(pkt, char *),
> > +			pkt_hdr_offset);
> > +}
> > +
> > +static inline void
> > +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> > +{
> > +	uint16_t i;
> > +
> > +	for (i = 0; i < nb_pkts; i++)
> > +		rte_pktmbuf_free(pkts[i]);
> > +}
> > +
> > +int
> > +gso_do_segment(struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset,
> > +		uint16_t pyld_unit_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct rte_mbuf *pkt_in;
> > +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
> > +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
> > +	uint16_t pyld_len, nb_segs;
> > +	bool more_in_pkt, more_out_segs;
> > +
> > +	pkt_in = pkt;
> > +	nb_segs = 0;
> > +	more_in_pkt = 1;
> > +	pkt_in_data_pos = pkt_hdr_offset;
> > +
> > +	while (more_in_pkt) {
> > +		if (unlikely(nb_segs >= nb_pkts_out)) {
> > +			free_gso_segment(pkts_out, nb_segs);
> > +			return -EINVAL;
> > +		}
> > +
> > +		/* Allocate a direct MBUF */
> > +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
> > +		if (unlikely(hdr_segment == NULL)) {
> > +			free_gso_segment(pkts_out, nb_segs);
> > +			return -ENOMEM;
> > +		}
> > +		/* Fill the packet header */
> > +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> > +
> > +		prev_segment = hdr_segment;
> > +		segment_bytes_remaining = pyld_unit_size;
> > +		more_out_segs = 1;
> > +
> > +		while (more_out_segs && more_in_pkt) {
> > +			/* Allocate an indirect MBUF */
> > +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> > +			if (unlikely(pyld_segment == NULL)) {
> > +				rte_pktmbuf_free(hdr_segment);
> > +				free_gso_segment(pkts_out, nb_segs);
> > +				return -ENOMEM;
> > +			}
> > +			/* Attach to current MBUF segment of pkt */
> > +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> > +
> > +			prev_segment->next = pyld_segment;
> > +			prev_segment = pyld_segment;
> > +
> > +			pyld_len = segment_bytes_remaining;
> > +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
> > +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
> > +
> > +			pyld_segment->data_off = pkt_in_data_pos +
> > +				pkt_in->data_off;
> > +			pyld_segment->data_len = pyld_len;
> > +
> > +			/* Update header segment */
> > +			hdr_segment->pkt_len += pyld_len;
> > +			hdr_segment->nb_segs++;
> > +
> > +			pkt_in_data_pos += pyld_len;
> > +			segment_bytes_remaining -= pyld_len;
> > +
> > +			/* Finish processing a MBUF segment of pkt */
> > +			if (pkt_in_data_pos == pkt_in->data_len) {
> > +				pkt_in = pkt_in->next;
> > +				pkt_in_data_pos = 0;
> > +				if (pkt_in == NULL)
> > +					more_in_pkt = 0;
> > +			}
> > +
> > +			/* Finish generating a GSO segment */
> > +			if (segment_bytes_remaining == 0)
> > +				more_out_segs = 0;
> > +		}
> > +		pkts_out[nb_segs++] = hdr_segment;
> > +	}
> > +	return nb_segs;
> > +}
> > +
> > +static inline void
> > +update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs)
> > +{
> > +	struct tcp_hdr *tcp_hdr;
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct rte_mbuf *seg;
> > +	uint32_t sent_seq;
> > +	uint16_t inner_l2_offset;
> > +	uint16_t id, i;
> > +
> > +	inner_l2_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> 
> Shouldn't it be: pkt->l2_len here?
> Or probably even better to pass l2_len as an input parameter.
> 
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			inner_l2_offset);
> > +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > +
> > +	for (i = 0; i < nb_segs; i++) {
> > +		seg = segs[i];
> > +		/* Update the inner IPv4 header */
> > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(seg, char *) +
> > +				inner_l2_offset);
> > +		ipv4_hdr->total_length = rte_cpu_to_be_16(seg->pkt_len -
> > +				inner_l2_offset);
> > +		ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> > +		id += ipid_delta;
> > +
> > +		/* Update the inner TCP header */
> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + seg->l3_len);
> > +		tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> > +		if (likely(i < nb_segs - 1))
> > +			tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
> > +						TCP_HDR_FIN_MASK));
> > +		sent_seq += (seg->pkt_len - seg->data_len);
> > +	}
> > +}
> > +
> > +void
> > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs)
> > +{
> > +	if (is_ipv4_tcp(pkt->packet_type))
> > +		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
> > +}
> > diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> > new file mode 100644
> > index 0000000..3c76520
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_common.h
> > @@ -0,0 +1,113 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _GSO_COMMON_H_
> > +#define _GSO_COMMON_H_
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +#define IPV4_HDR_DF_SHIFT 14
> 
> We have that already defined in librte_net/rte_ip.h
> 
> > +#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
> > +
> > +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> > +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> > +
> > +#define ETHER_TCP_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L4_TCP)
> > +#define ETHER_VLAN_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L4_TCP)
> > +static inline uint8_t is_ipv4_tcp(uint32_t ptype)
> > +{
> > +	switch (ptype & (~RTE_PTYPE_L3_MASK)) {
> > +	case ETHER_VLAN_TCP_PKT:
> > +	case ETHER_TCP_PKT:
> 
> Why not just:
> return RTE_ETH_IS_IPV4_HDR(ptype) && (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP;
> ?
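
To make the difference between the two checks concrete, here is a minimal
standalone sketch. The PTYPE_* macros and IS_IPV4_HDR() are local stand-ins
for the RTE_PTYPE_* constants and RTE_ETH_IS_IPV4_HDR(); their values are
assumed to mirror rte_mbuf_ptype.h and should be treated as illustrative only.

```c
#include <stdint.h>

/* Local stand-ins for the RTE_PTYPE_* constants. The values are assumptions
 * that mirror the usual layout: L2 in bits 0-3, L3 in bits 4-7, L4 in
 * bits 8-11. */
#define PTYPE_L2_ETHER      0x00000001u
#define PTYPE_L2_ETHER_VLAN 0x00000006u
#define PTYPE_L2_ETHER_QINQ 0x00000007u
#define PTYPE_L3_IPV4       0x00000010u
#define PTYPE_L3_IPV6       0x00000040u
#define PTYPE_L3_MASK       0x000000f0u
#define PTYPE_L4_TCP        0x00000100u
#define PTYPE_L4_UDP        0x00000200u
#define PTYPE_L4_MASK       0x00000f00u

/* Stand-in for RTE_ETH_IS_IPV4_HDR(): tests the IPv4 bit of the L3 field. */
#define IS_IPV4_HDR(ptype) (((ptype) & PTYPE_L3_IPV4) != 0)

/* Switch-based check as in the patch: enumerates the L2 encapsulations. */
static inline int is_ipv4_tcp_switch(uint32_t ptype)
{
	switch (ptype & ~PTYPE_L3_MASK) {
	case PTYPE_L2_ETHER_VLAN | PTYPE_L4_TCP:
	case PTYPE_L2_ETHER | PTYPE_L4_TCP:
		return IS_IPV4_HDR(ptype);
	default:
		return 0;
	}
}

/* Mask-based check: looks only at the L3 and L4 fields. */
static inline int is_ipv4_tcp_mask(uint32_t ptype)
{
	return IS_IPV4_HDR(ptype) &&
		(ptype & PTYPE_L4_MASK) == PTYPE_L4_TCP;
}
```

With these stand-in values, a QinQ TCP/IPv4 ptype is rejected by the switch
version (its L2 code is not one of the listed cases) but accepted by the mask
version, which does not depend on the L2 field at all.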
> 
> > +		return RTE_ETH_IS_IPV4_HDR(ptype);
> > +	default:
> > +		return 0;
> > +	}
> > +}
> > +
> > +/**
> > + * Internal function which updates relevant packet headers, following
> > + * segmentation. This is required to update, for example, the IPv4
> > + * 'total_length' field, to reflect the reduced length of the now-
> > + * segmented packet.
> > + *
> > + * @param pkt
> > + *  The original packet.
> > + * @param ipid_delta
> > + *  The increasing uint of IP ids.
> > + * @param segs
> > + *  Pointer array used for storing mbuf addresses for GSO segments.
> > + * @param nb_segs
> > + *  The number of GSO segments placed in segs.
> > + */
> > +void gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs);
> > +
> > +/**
> > + * Internal function which divides the input packet into small segments.
> > + * Each of the newly-created segments is organized as a two-segment MBUF,
> > + * where the first segment is a standard mbuf, which stores a copy of
> > + * packet header, and the second is an indirect mbuf which points to a
> > + * section of data in the input packet.
> > + *
> > + * @param pkt
> > + *  Packet to segment.
> > + * @param pkt_hdr_offset
> > + *  Packet header offset, measured in bytes.
> > + * @param pyld_unit_size
> > + *  The max payload length of a GSO segment.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array used to keep the mbuf addresses of output segments. If
> > + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
> > + *  and returns -EINVAL.
> > + * @param nb_pkts_out
> > + *  The max number of items that pkts_out can keep.
> > + *
> > + * @return
> > + *  - The number of segments created in the event of success.
> > + *  - Return -ENOMEM if run out of memory in MBUF pools.
> > + *  - Return -EINVAL for invalid parameters.
> > + */
> > +int gso_do_segment(struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset,
> > +		uint16_t pyld_unit_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +#endif
> > diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
> > new file mode 100644
> > index 0000000..8d4bfb2
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tcp4.c
> > @@ -0,0 +1,83 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +
> > +#include <rte_ether.h>
> > +#include <rte_ip.h>
> > +
> > +#include "gso_common.h"
> > +#include "gso_tcp4.h"
> > +
> > +int
> > +gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		uint8_t ipid_delta,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	uint16_t tcp_dl;
> > +	uint16_t pyld_unit_size;
> > +	uint16_t hdr_offset;
> > +	int ret = 1;
> > +
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			pkt->l2_len);
> > +	/* Don't process the fragmented packet */
> > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> > +						IPV4_HDR_DF_MASK)) == 0)) {
> 
> 
> It is not a check for fragmented packet - it is a check that fragmentation is allowed for that packet.
> Should be IPV4_HDR_DF_MASK - 1,  I think.
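
A small standalone sketch of the distinction, using the RFC 791 layout of the
flags/fragment-offset word. The FRAG_* names are local to this sketch, and the
value is taken in host byte order for simplicity (the patch compares against a
big-endian constant instead).

```c
#include <stdint.h>

/* Bit layout of the 16-bit IPv4 flags/fragment-offset word in host byte
 * order, per RFC 791. The macro names are local to this sketch. */
#define FRAG_DF_FLAG     0x4000u /* Don't Fragment */
#define FRAG_MF_FLAG     0x2000u /* More Fragments */
#define FRAG_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/* The check as written in the patch: true whenever DF is clear. */
static inline int df_is_clear(uint16_t frag_off_host)
{
	return (frag_off_host & FRAG_DF_FLAG) == 0;
}

/* A packet is a fragment if MF is set or the offset is non-zero.
 * (FRAG_DF_FLAG - 1) == 0x3fff covers exactly those 14 bits. */
static inline int is_fragment(uint16_t frag_off_host)
{
	return (frag_off_host & (FRAG_DF_FLAG - 1)) != 0;
}
```

An unfragmented packet with DF clear (word == 0) makes df_is_clear() return 1
even though is_fragment() returns 0, which is the mismatch pointed out above.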
> 
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> > +		pkt->l4_len;
> 
> Why not use pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len?
> 
> > +	/* Don't process the packet without data */
> > +	if (unlikely(tcp_dl == 0)) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> 
> Hmm, why do we need to count CRC_LEN here?
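
To see what the CRC term changes in practice, here is a rough arithmetic
sketch. The helper names are hypothetical and take plain integers rather than
mbuf fields; it only illustrates the length computation and the resulting
segment count, not the library code.

```c
#include <stdint.h>

/* TCP payload length derived from mbuf-style lengths rather than from the
 * IPv4 total_length field. The caller must ensure pkt_len covers all three
 * headers, otherwise the unsigned subtraction wraps. */
static inline uint32_t tcp_payload_len(uint32_t pkt_len, uint16_t l2_len,
		uint16_t l3_len, uint16_t l4_len)
{
	return pkt_len - l2_len - l3_len - l4_len;
}

/* Number of output segments needed for payload_len bytes when each segment
 * carries at most pyld_unit_size bytes (ceiling division).
 * Returns 0 if pyld_unit_size is 0. */
static inline uint32_t nb_gso_segments(uint32_t payload_len,
		uint16_t pyld_unit_size)
{
	if (pyld_unit_size == 0)
		return 0;
	return (payload_len + pyld_unit_size - 1) / pyld_unit_size;
}
```

For example, with 14 + 20 + 20 = 54 header bytes and gso_size = 1514, a packet
carrying 7300 payload bytes needs 5 segments with a 1460-byte payload unit,
but 6 segments if 4 more bytes are subtracted for the CRC (1456-byte unit).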
> 
> > +
> > +	/* Segment the payload */
> > +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> > +			indirect_pool, pkts_out, nb_pkts_out);
> > +	if (ret > 1)
> > +		gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret);
> > +
> > +	return ret;
> > +}
> > diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
> > new file mode 100644
> > index 0000000..9c07984
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tcp4.h
> > @@ -0,0 +1,76 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _GSO_TCP4_H_
> > +#define _GSO_TCP4_H_
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +/**
> > + * Segment an IPv4/TCP packet. This function assumes the input packet has
> > + * correct checksums and doesn't update checksums for GSO segment.
> > + * Furthermore, it doesn't process IP fragment packets.
> > + *
> > + * @param pkt
> > + *  The packet mbuf to segment.
> > + * @param gso_size
> > + *  The max length of a GSO segment, measured in bytes.
> > + * @param ipid_delta
> > + *  The increasing unit of IP ids.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array used to store the MBUF addresses of output GSO
> > + *  segments, when gso_tcp4_segment() succeeds. If the memory space in
> > + *  pkts_out is insufficient, gso_tcp4_segment() fails and returns
> > + *  -EINVAL.
> > + * @param nb_pkts_out
> > + *  The max number of items that 'pkts_out' can keep.
> > + *
> > + * @return
> > + *   - The number of GSO segments filled in pkts_out on success.
> > + *   - Return -ENOMEM if run out of memory in MBUF pools.
> > + *   - Return -EINVAL for invalid parameters.
> > + */
> > +int gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		uint8_t ipid_delta,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +
> > +#endif
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > index dda50ee..95f6ea6 100644
> > --- a/lib/librte_gso/rte_gso.c
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -33,18 +33,53 @@
> >
> >  #include <errno.h>
> >
> > +#include <rte_log.h>
> > +
> >  #include "rte_gso.h"
> > +#include "gso_common.h"
> > +#include "gso_tcp4.h"
> >
> >  int
> >  rte_gso_segment(struct rte_mbuf *pkt,
> > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > +		struct rte_gso_ctx gso_ctx,
> >  		struct rte_mbuf **pkts_out,
> >  		uint16_t nb_pkts_out)
> >  {
> > +	struct rte_mempool *direct_pool, *indirect_pool;
> > +	struct rte_mbuf *pkt_seg;
> > +	uint16_t gso_size;
> > +	uint8_t ipid_delta;
> > +	int ret = 1;
> > +
> >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> >  		return -EINVAL;
> >
> > -	pkts_out[0] = pkt;
> > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > +			pkt->packet_type) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	direct_pool = gso_ctx.direct_pool;
> > +	indirect_pool = gso_ctx.indirect_pool;
> > +	gso_size = gso_ctx.gso_size;
> > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > +
> > +	if (is_ipv4_tcp(pkt->packet_type)) {
> 
> Probably we need here:
> if (is_ipv4_tcp(pkt->packet_type) && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...

Sorry, actually it probably should be:
if ((pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) ==
      (PKT_TX_TCP_SEG | PKT_TX_IPV4) &&
      (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...

Konstantin
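
A minimal standalone sketch of such a flag test, assuming the intent is that
both the TCP-segmentation and IPv4 flags are set on the packet. The flag
values below are hypothetical stand-ins, not the real PKT_TX_* or
DEV_TX_OFFLOAD_* definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits standing in for PKT_TX_TCP_SEG, PKT_TX_IPV4 and
 * DEV_TX_OFFLOAD_TCP_TSO; the real values live in the DPDK headers. */
#define TX_TCP_SEG      (1ULL << 50)
#define TX_IPV4         (1ULL << 55)
#define OFFLOAD_TCP_TSO (1u << 5)

/* Both the "segment this TCP packet" and the "IPv4" flags must be set, and
 * the context must have TCP segmentation enabled.
 * The parentheses around the `&` matter: `==` binds tighter than `&` in C. */
static inline int wants_tcp4_gso(uint64_t ol_flags, uint32_t gso_types)
{
	const uint64_t need = TX_TCP_SEG | TX_IPV4;

	return (ol_flags & need) == need &&
		(gso_types & OFFLOAD_TCP_TSO) != 0;
}
```

Keying the decision on ol_flags rather than packet_type would let the
application mark packets for SW GSO the same way it marks them for HW TSO.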

> 
> > +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > +				direct_pool, indirect_pool,
> > +				pkts_out, nb_pkts_out);
> > +	} else
> > +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> 
> Shouldn't we do pkts_out[0] = pkt; here?
> 
> > +
> > +	if (ret > 1) {
> > +		pkt_seg = pkt;
> > +		while (pkt_seg) {
> > +			rte_mbuf_refcnt_update(pkt_seg, -1);
> > +			pkt_seg = pkt_seg->next;
> > +		}
> > +	}
> >
> > -	return 1;
> > +	return ret;
> >  }
> > --
> > 2.7.4


* Re: [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework
  2017-09-12 10:36       ` Ananyev, Konstantin
@ 2017-09-13  2:11         ` Jiayu Hu
  0 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-13  2:11 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

Thanks for your quick feedback. Replies are inline.

Thanks,
Jiayu

On Tue, Sep 12, 2017 at 06:36:41PM +0800, Ananyev, Konstantin wrote:
> Hi Jiayu,
> A few comments from me inline.
> Konstantin
> 
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > new file mode 100644
> > index 0000000..dda50ee
> > --- /dev/null
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -0,0 +1,50 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include <errno.h>
> > +
> > +#include "rte_gso.h"
> > +
> > +int
> > +rte_gso_segment(struct rte_mbuf *pkt,
> > +		struct rte_gso_ctx gso_ctx __rte_unused,
> 
> No need to pass parameter by value here.
> struct rte_gso_ctx *gso_ctx would do.
> Even better - const struct rte_gso_ctx *, in case it doesn't need
> to be updated inside that function.

Yes, agree. I will use rte_gso_ctx *gso_ctx instead.
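
As a rough illustration of the pointer-based signature, here is a standalone
sketch. struct gso_ctx and gso_needs_segmentation() are simplified,
hypothetical stand-ins and not the library's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <errno.h>

/* Simplified stand-in for struct rte_gso_ctx; only two fields are modelled. */
struct gso_ctx {
	uint32_t gso_types;
	uint16_t gso_size;
};

/* Taking the context by const pointer avoids copying the struct on every
 * call and documents that the function does not modify it.
 * Returns 1 if pkt_len exceeds gso_size, 0 if not, -EINVAL on NULL ctx. */
static inline int gso_needs_segmentation(const struct gso_ctx *ctx,
		uint32_t pkt_len)
{
	if (ctx == NULL)
		return -EINVAL;
	return pkt_len > ctx->gso_size;
}
```

A pointer parameter does add a NULL case that the by-value version did not
have, so the existing parameter check would need to cover it.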

> 
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > +		return -EINVAL;
> > +
> > +	pkts_out[0] = pkt;
> > +
> > +	return 1;
> > +}
> > diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> > new file mode 100644
> > index 0000000..db757d6
> > --- /dev/null
> > +++ b/lib/librte_gso/rte_gso.h
> > @@ -0,0 +1,133 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _RTE_GSO_H_
> > +#define _RTE_GSO_H_
> > +
> > +/**
> > + * @file
> > + * Interface to GSO library
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +/* GSO IP id flags for the IPv4 header */
> > +#define RTE_GSO_IPID_FIXED (1 << 0)
> > +/**< Use fixed IP ids for output GSO segments */
> > +#define RTE_GSO_IPID_INCREASE (1 << 1)
> > +/**< Use incremental IP ids for output GSO segments */
> 
> As values above are mutually exclusive, I think you don't need both flags.
> Just one seems enough.

Agree, I will remove RTE_GSO_IPID_INCREASE.
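
With a single flag, the per-segment IP id step can be derived directly from
it. The sketch below is only an illustration under that assumption;
GSO_FLAG_IPID_FIXED and both helpers are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical single flag: set means "keep the same IP id in every
 * segment"; clear means "increment the IP id per segment". */
#define GSO_FLAG_IPID_FIXED (1ULL << 0)

/* Per-segment IP id step derived from the flag word. */
static inline uint8_t ipid_delta_from_flags(uint64_t flags)
{
	return (flags & GSO_FLAG_IPID_FIXED) ? 0 : 1;
}

/* IP id of the i-th output segment, with 16-bit wrap-around. */
static inline uint16_t segment_ip_id(uint16_t first_id, uint16_t i,
		uint8_t ipid_delta)
{
	return (uint16_t)(first_id + i * ipid_delta);
}
```

Leaving the flag clear then gives incrementing ids by default, and an
application that wants fixed ids sets the one bit.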

> 
> 
> > +
> > +/**
> > + * GSO context structure.
> > + */
> > +struct rte_gso_ctx {
> > +	struct rte_mempool *direct_pool;
> > +	/**< MBUF pool for allocating direct buffers, which are used
> > +	 * to store packet headers for GSO segments.
> > +	 */
> > +	struct rte_mempool *indirect_pool;
> > +	/**< MBUF pool for allocating indirect buffers, which are used
> > +	 * to locate packet payloads for GSO segments. The indirect
> > +	 * buffer doesn't contain any data, but simply points to an
> > +	 * offset within the packet to segment.
> > +	 */
> > +	uint32_t gso_types;
> > +	/**< packet types to perform GSO. For example, if applications
> > +	 * want to segment TCP/IPv4 packets, may set (RTE_PTYPE_L2_ETHER |
> > +	 * RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP) to gso_types.
> 
> 
> Actually after another thought - it probably should not be a ptype mask, but a mask
> of rte_ethdev DEV_TX_OFFLOAD_*_TSO flags that are used to advertise real HW TSO offloads.
> Let's say for GSO that supports TSO over IPv4 it would be:
> PKT_TX_TCP_SEG | PKT_TX_IPV4.
> That would allow the user to use GSO and TSO in a transparent way,
> plus ptype is not actually a proper bitmask, but a set of enums,
> so it is not always possible to distinguish what ptype is supported just by bitmask.
> Sorry for causing confusion here.

Yes, agree. Reusing packet_type indeed introduces lots of macros to applications.
DEV_TX_OFFLOAD_*_TSO is a better choice, and it can also make HW offload and SW
offload consistent.
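
The "set of enums" problem can be shown with a few lines of standalone C.
The PTYPE_L2_* values are local stand-ins, assumed to mirror the consecutive
L2 codes in rte_mbuf_ptype.h; ptype_accepted() reproduces the subset test
that the v3 patch applies to gso_types.

```c
#include <stdint.h>

/* Local stand-ins for L2 packet-type codes. They are consecutive values in a
 * 4-bit field (assumed to mirror rte_mbuf_ptype.h), not independent bits. */
#define PTYPE_L2_ETHER      0x1u
#define PTYPE_L2_ETHER_ARP  0x3u
#define PTYPE_L2_ETHER_VLAN 0x6u

/* The subset test used in the patch: every bit set in ptype must also be
 * set in gso_types. */
static inline int ptype_accepted(uint32_t ptype, uint32_t gso_types)
{
	return (ptype & gso_types) == ptype;
}
```

OR-ing PTYPE_L2_ETHER with PTYPE_L2_ETHER_VLAN yields 0x7, and under the
subset test that mask also accepts PTYPE_L2_ETHER_ARP (0x3), which the
application never asked for. Independent offload flag bits do not have this
aliasing.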

> 
> > +	 */
> > +	uint16_t gso_size;
> > +	/**< maximum size of an output GSO segment, including packet
> > +	 * header and payload, measured in bytes.
> > +	 */
> > +	uint8_t ipid_flag;
> 
> I'd suggest uint32_t flags (or even uint64_t).
> Who knows what extra flags we'll need in future here.

Makes sense. I will use uint64_t.

> 
> > +	/**< flag to indicate GSO uses fixed or incremental IP ids for
> > +	 * IPv4 headers of output GSO segments.
> > +	 */
> > +};
> > +
> > +/**
> > + * Segmentation function, which supports processing of both single- and
> > + * multi- segment packets.
> > + *
> > + * Note that we refer to the packets that are segmented from the input
> > + * packet as 'GSO segments'. rte_gso_segment() assumes the input packet
> > + * has correct checksums, and it doesn't update checksums for output
> > + * GSO segments. Additionally, it doesn't process IP fragment packets.
> > + *
> > + * Each of the newly-created GSO segments is organized as a two-segment
> > + * MBUF, where the first segment is a standard MBUF, which stores a copy
> > + * of packet header, and the second is an indirect MBUF which points to
> > + * a section of data in the input packet. Since each GSO segment has
> > + * multiple MBUFs (i.e. 2 MBUFs), the driver of the interface which the
> > + * GSO segments are sent to should support to transmit multi-segment
> > + * packets.
> > + *
> > + * If the input packet is GSOed, its mbuf refcnt reduces by 1. Therefore,
> > + * when all GSO segments are freed, the input packet is freed automatically.
> > + *
> > + * If the memory space in pkts_out or MBUF pools is insufficient, this
> > + * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
> > + * and this function returns the number of output GSO segments filled in
> > + * pkts_out.
> > + *
> > + * @param pkt
> > + *  The packet mbuf to segment.
> > + * @param ctx
> > + *  GSO context object.
> > + * @param pkts_out
> > + *  Pointer array used to store the MBUF addresses of output GSO
> > + *  segments, when rte_gso_segment() succeeds.
> > + * @param nb_pkts_out
> > + *  The max number of items that pkts_out can keep.
> > + *
> > + * @return
> > + *  - The number of GSO segments filled in pkts_out on success.
> > + *  - Return -ENOMEM if run out of memory in MBUF pools.
> > + *  - Return -EINVAL for invalid parameters.
> > + */
> > +int rte_gso_segment(struct rte_mbuf *pkt,
> > +		struct rte_gso_ctx ctx,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_GSO_H_ */


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-12 11:17       ` Ananyev, Konstantin
@ 2017-09-13  2:48         ` Jiayu Hu
  2017-09-13  9:38           ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-13  2:48 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

On Tue, Sep 12, 2017 at 07:17:49PM +0800, Ananyev, Konstantin wrote:
> Hi Jiayu,
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Tuesday, September 12, 2017 3:43 AM
> > To: dev@dpdk.org
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
> > <jianfeng.tan@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > Subject: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > 
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > packets have correct checksums, and doesn't update checksums for output
> > packets (the responsibility for this lies with the application).
> 
> Probably it shouldn't say that checksum have to be valid, right?
> As you don't update checksum(s) inside the lib - it probably doesn't matter.

Yes, you are right. It's better to use:
"TCP/IPv4 GSO doesn't check if checksums are correct and doesn't update
checksums for output packets".

> 
> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> > 
> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indrect
> > MBUF, to organize an output packet. Note that we refer to these two
> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of the GSO library requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> > 
> > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSOed segments are freed, the packet is freed
> > automatically.
> > 
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> >  lib/librte_eal/common/include/rte_log.h |   1 +
> >  lib/librte_gso/Makefile                 |   2 +
> >  lib/librte_gso/gso_common.c             | 202 ++++++++++++++++++++++++++++++++
> >  lib/librte_gso/gso_common.h             | 113 ++++++++++++++++++
> >  lib/librte_gso/gso_tcp4.c               |  83 +++++++++++++
> >  lib/librte_gso/gso_tcp4.h               |  76 ++++++++++++
> >  lib/librte_gso/rte_gso.c                |  41 ++++++-
> >  7 files changed, 515 insertions(+), 3 deletions(-)
> >  create mode 100644 lib/librte_gso/gso_common.c
> >  create mode 100644 lib/librte_gso/gso_common.h
> >  create mode 100644 lib/librte_gso/gso_tcp4.c
> >  create mode 100644 lib/librte_gso/gso_tcp4.h
> > 
> > diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> > index ec8dba7..2fa1199 100644
> > --- a/lib/librte_eal/common/include/rte_log.h
> > +++ b/lib/librte_eal/common/include/rte_log.h
> > @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
> >  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
> >  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
> >  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> > +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
> > 
> >  /* these log types can be used in an application */
> >  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> > index aeaacbc..2be64d1 100644
> > --- a/lib/librte_gso/Makefile
> > +++ b/lib/librte_gso/Makefile
> > @@ -42,6 +42,8 @@ LIBABIVER := 1
> > 
> >  #source files
> >  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> > 
> >  # install this header file
> >  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> > diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> > new file mode 100644
> > index 0000000..7c32e03
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_common.c
> > @@ -0,0 +1,202 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include <stdbool.h>
> > +#include <errno.h>
> > +
> > +#include <rte_memcpy.h>
> > +#include <rte_mempool.h>
> > +#include <rte_ether.h>
> > +#include <rte_ip.h>
> > +#include <rte_tcp.h>
> > +
> > +#include "gso_common.h"
> > +
> > +static inline void
> > +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset)
> > +{
> > +	/* Copy MBUF metadata */
> > +	hdr_segment->nb_segs = 1;
> > +	hdr_segment->port = pkt->port;
> > +	hdr_segment->ol_flags = pkt->ol_flags;
> > +	hdr_segment->packet_type = pkt->packet_type;
> > +	hdr_segment->pkt_len = pkt_hdr_offset;
> > +	hdr_segment->data_len = pkt_hdr_offset;
> > +	hdr_segment->tx_offload = pkt->tx_offload;
> > +
> > +	/* Copy the packet header */
> > +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> > +			rte_pktmbuf_mtod(pkt, char *),
> > +			pkt_hdr_offset);
> > +}
> > +
> > +static inline void
> > +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> > +{
> > +	uint16_t i;
> > +
> > +	for (i = 0; i < nb_pkts; i++)
> > +		rte_pktmbuf_free(pkts[i]);
> > +}
> > +
> > +int
> > +gso_do_segment(struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset,
> > +		uint16_t pyld_unit_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct rte_mbuf *pkt_in;
> > +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
> > +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
> > +	uint16_t pyld_len, nb_segs;
> > +	bool more_in_pkt, more_out_segs;
> > +
> > +	pkt_in = pkt;
> > +	nb_segs = 0;
> > +	more_in_pkt = 1;
> > +	pkt_in_data_pos = pkt_hdr_offset;
> > +
> > +	while (more_in_pkt) {
> > +		if (unlikely(nb_segs >= nb_pkts_out)) {
> > +			free_gso_segment(pkts_out, nb_segs);
> > +			return -EINVAL;
> > +		}
> > +
> > +		/* Allocate a direct MBUF */
> > +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
> > +		if (unlikely(hdr_segment == NULL)) {
> > +			free_gso_segment(pkts_out, nb_segs);
> > +			return -ENOMEM;
> > +		}
> > +		/* Fill the packet header */
> > +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> > +
> > +		prev_segment = hdr_segment;
> > +		segment_bytes_remaining = pyld_unit_size;
> > +		more_out_segs = 1;
> > +
> > +		while (more_out_segs && more_in_pkt) {
> > +			/* Allocate an indirect MBUF */
> > +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> > +			if (unlikely(pyld_segment == NULL)) {
> > +				rte_pktmbuf_free(hdr_segment);
> > +				free_gso_segment(pkts_out, nb_segs);
> > +				return -ENOMEM;
> > +			}
> > +			/* Attach to current MBUF segment of pkt */
> > +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> > +
> > +			prev_segment->next = pyld_segment;
> > +			prev_segment = pyld_segment;
> > +
> > +			pyld_len = segment_bytes_remaining;
> > +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
> > +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
> > +
> > +			pyld_segment->data_off = pkt_in_data_pos +
> > +				pkt_in->data_off;
> > +			pyld_segment->data_len = pyld_len;
> > +
> > +			/* Update header segment */
> > +			hdr_segment->pkt_len += pyld_len;
> > +			hdr_segment->nb_segs++;
> > +
> > +			pkt_in_data_pos += pyld_len;
> > +			segment_bytes_remaining -= pyld_len;
> > +
> > +			/* Finish processing a MBUF segment of pkt */
> > +			if (pkt_in_data_pos == pkt_in->data_len) {
> > +				pkt_in = pkt_in->next;
> > +				pkt_in_data_pos = 0;
> > +				if (pkt_in == NULL)
> > +					more_in_pkt = 0;
> > +			}
> > +
> > +			/* Finish generating a GSO segment */
> > +			if (segment_bytes_remaining == 0)
> > +				more_out_segs = 0;
> > +		}
> > +		pkts_out[nb_segs++] = hdr_segment;
> > +	}
> > +	return nb_segs;
> > +}
> > +
> > +static inline void
> > +update_inner_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs)
> > +{
> > +	struct tcp_hdr *tcp_hdr;
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct rte_mbuf *seg;
> > +	uint32_t sent_seq;
> > +	uint16_t inner_l2_offset;
> > +	uint16_t id, i;
> > +
> > +	inner_l2_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> 
> Shouldn't it be: pkt->l2_len here?
> Or probably even better to pass l2_len as an input parameter.

Oh, yes. Applications won't guarantee outer_l2_len and outer_l3_len are 0
for non-tunnelling packets. I will add l2_len as a parameter instead.

> 
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			inner_l2_offset);
> > +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > +
> > +	for (i = 0; i < nb_segs; i++) {
> > +		seg = segs[i];
> > +		/* Update the inner IPv4 header */
> > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(seg, char *) +
> > +				inner_l2_offset);
> > +		ipv4_hdr->total_length = rte_cpu_to_be_16(seg->pkt_len -
> > +				inner_l2_offset);
> > +		ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> > +		id += ipid_delta;
> > +
> > +		/* Update the inner TCP header */
> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + seg->l3_len);
> > +		tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> > +		if (likely(i < nb_segs - 1))
> > +			tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
> > +						TCP_HDR_FIN_MASK));
> > +		sent_seq += (seg->pkt_len - seg->data_len);
> > +	}
> > +}
> > +
> > +void
> > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs)
> > +{
> > +	if (is_ipv4_tcp(pkt->packet_type))
> > +		update_inner_tcp4_header(pkt, ipid_delta, segs, nb_segs);
> > +}
> > diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> > new file mode 100644
> > index 0000000..3c76520
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_common.h
> > @@ -0,0 +1,113 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _GSO_COMMON_H_
> > +#define _GSO_COMMON_H_
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +#define IPV4_HDR_DF_SHIFT 14
> 
> We have that already defined in librte_net/rte_ip.h

Yes. I will remove it here.

> 
> > +#define IPV4_HDR_DF_MASK (1 << IPV4_HDR_DF_SHIFT)
> > +
> > +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> > +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> > +
> > +#define ETHER_TCP_PKT (RTE_PTYPE_L2_ETHER | RTE_PTYPE_L4_TCP)
> > +#define ETHER_VLAN_TCP_PKT (RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L4_TCP)
> > +static inline uint8_t is_ipv4_tcp(uint32_t ptype)
> > +{
> > +	switch (ptype & (~RTE_PTYPE_L3_MASK)) {
> > +	case ETHER_VLAN_TCP_PKT:
> > +	case ETHER_TCP_PKT:
> 
> Why not just:
> return RTE_ETH_IS_IPV4_HDR(ptype) && (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP;
> ?

Yes, we don't need to check whether the packet is VLAN-encapsulated.

> 
> > +		return RTE_ETH_IS_IPV4_HDR(ptype);
> > +	default:
> > +		return 0;
> > +	}
> > +}
> > +
> > +/**
> > + * Internal function which updates relevant packet headers, following
> > + * segmentation. This is required to update, for example, the IPv4
> > + * 'total_length' field, to reflect the reduced length of the now-
> > + * segmented packet.
> > + *
> > + * @param pkt
> > + *  The original packet.
> > + * @param ipid_delta
> > + *  The increasing uint of IP ids.
> > + * @param segs
> > + *  Pointer array used for storing mbuf addresses for GSO segments.
> > + * @param nb_segs
> > + *  The number of GSO segments placed in segs.
> > + */
> > +void gso_update_pkt_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs);
> > +
> > +/**
> > + * Internal function which divides the input packet into small segments.
> > + * Each of the newly-created segments is organized as a two-segment MBUF,
> > + * where the first segment is a standard mbuf, which stores a copy of
> > + * packet header, and the second is an indirect mbuf which points to a
> > + * section of data in the input packet.
> > + *
> > + * @param pkt
> > + *  Packet to segment.
> > + * @param pkt_hdr_offset
> > + *  Packet header offset, measured in bytes.
> > + * @param pyld_unit_size
> > + *  The max payload length of a GSO segment.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array used to keep the mbuf addresses of output segments. If
> > + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
> > + *  and returns -EINVAL.
> > + * @param nb_pkts_out
> > + *  The max number of items that pkts_out can keep.
> > + *
> > + * @return
> > + *  - The number of segments created in the event of success.
> > + *  - Return -ENOMEM if run out of memory in MBUF pools.
> > + *  - Return -EINVAL for invalid parameters.
> > + */
> > +int gso_do_segment(struct rte_mbuf *pkt,
> > +		uint16_t pkt_hdr_offset,
> > +		uint16_t pyld_unit_size,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +#endif
> > diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
> > new file mode 100644
> > index 0000000..8d4bfb2
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tcp4.c
> > @@ -0,0 +1,83 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +
> > +#include <rte_ether.h>
> > +#include <rte_ip.h>
> > +
> > +#include "gso_common.h"
> > +#include "gso_tcp4.h"
> > +
> > +int
> > +gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		uint8_t ipid_delta,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	uint16_t tcp_dl;
> > +	uint16_t pyld_unit_size;
> > +	uint16_t hdr_offset;
> > +	int ret = 1;
> > +
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			pkt->l2_len);
> > +	/* Don't process the fragmented packet */
> > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> > +						IPV4_HDR_DF_MASK)) == 0)) {
> 
> 
> It is not a check for fragmented packet - it is a check that fragmentation is allowed for that packet.
> Should be IPV4_HDR_DF_MASK - 1,  I think.

IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit. It's a
little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
So the value of DF bit should be "ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.

> 
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> > +		pkt->l4_len;
> 
> Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?

Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len here.

> 
> > +	/* Don't process the packet without data */
> > +	if (unlikely(tcp_dl == 0)) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> 
> Hmm, why do we need to count CRC_LEN here?

Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
included in gso_size.

> 
> > +
> > +	/* Segment the payload */
> > +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> > +			indirect_pool, pkts_out, nb_pkts_out);
> > +	if (ret > 1)
> > +		gso_update_pkt_headers(pkt, ipid_delta, pkts_out, ret);
> > +
> > +	return ret;
> > +}
> > diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
> > new file mode 100644
> > index 0000000..9c07984
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tcp4.h
> > @@ -0,0 +1,76 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _GSO_TCP4_H_
> > +#define _GSO_TCP4_H_
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +/**
> > + * Segment an IPv4/TCP packet. This function assumes the input packet has
> > + * correct checksums and doesn't update checksums for GSO segment.
> > + * Furthermore, it doesn't process IP fragment packets.
> > + *
> > + * @param pkt
> > + *  The packet mbuf to segment.
> > + * @param gso_size
> > + *  The max length of a GSO segment, measured in bytes.
> > + * @param ipid_delta
> > + *  The increasing uint of IP ids.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array used to store the MBUF addresses of output GSO
> > + *  segments, when gso_tcp4_segment() successes. If the memory space in
> > + *  pkts_out is insufficient, gso_tcp4_segment() fails and returns
> > + *  -EINVAL.
> > + * @param nb_pkts_out
> > + *  The max number of items that 'pkts_out' can keep.
> > + *
> > + * @return
> > + *   - The number of GSO segments filled in pkts_out on success.
> > + *   - Return -ENOMEM if run out of memory in MBUF pools.
> > + *   - Return -EINVAL for invalid parameters.
> > + */
> > +int gso_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		uint8_t ip_delta,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out);
> > +
> > +#endif
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > index dda50ee..95f6ea6 100644
> > --- a/lib/librte_gso/rte_gso.c
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -33,18 +33,53 @@
> > 
> >  #include <errno.h>
> > 
> > +#include <rte_log.h>
> > +
> >  #include "rte_gso.h"
> > +#include "gso_common.h"
> > +#include "gso_tcp4.h"
> > 
> >  int
> >  rte_gso_segment(struct rte_mbuf *pkt,
> > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > +		struct rte_gso_ctx gso_ctx,
> >  		struct rte_mbuf **pkts_out,
> >  		uint16_t nb_pkts_out)
> >  {
> > +	struct rte_mempool *direct_pool, *indirect_pool;
> > +	struct rte_mbuf *pkt_seg;
> > +	uint16_t gso_size;
> > +	uint8_t ipid_delta;
> > +	int ret = 1;
> > +
> >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> >  		return -EINVAL;
> > 
> > -	pkts_out[0] = pkt;
> > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > +			pkt->packet_type) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> > +	}
> > +
> > +	direct_pool = gso_ctx.direct_pool;
> > +	indirect_pool = gso_ctx.indirect_pool;
> > +	gso_size = gso_ctx.gso_size;
> > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > +
> > +	if (is_ipv4_tcp(pkt->packet_type)) {
> 
> Probably we need here:
> If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> 
> > +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > +				direct_pool, indirect_pool,
> > +				pkts_out, nb_pkts_out);
> > +	} else
> > +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> 
> Shouldn't we do pkt_out[0] = pkt; here?

Yes, we need to add it here. Thanks for the reminder.

> 
> > +
> > +	if (ret > 1) {
> > +		pkt_seg = pkt;
> > +		while (pkt_seg) {
> > +			rte_mbuf_refcnt_update(pkt_seg, -1);
> > +			pkt_seg = pkt_seg->next;
> > +		}
> > +	}
> > 
> > -	return 1;
> > +	return ret;
> >  }
> > --
> > 2.7.4


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-13  2:48         ` Jiayu Hu
@ 2017-09-13  9:38           ` Ananyev, Konstantin
  2017-09-13 10:23             ` Hu, Jiayu
  2017-09-13 14:52             ` Kavanagh, Mark B
  0 siblings, 2 replies; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-13  9:38 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng



> > > +
> > > +int
> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
> > > +		uint16_t gso_size,
> > > +		uint8_t ipid_delta,
> > > +		struct rte_mempool *direct_pool,
> > > +		struct rte_mempool *indirect_pool,
> > > +		struct rte_mbuf **pkts_out,
> > > +		uint16_t nb_pkts_out)
> > > +{
> > > +	struct ipv4_hdr *ipv4_hdr;
> > > +	uint16_t tcp_dl;
> > > +	uint16_t pyld_unit_size;
> > > +	uint16_t hdr_offset;
> > > +	int ret = 1;
> > > +
> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > > +			pkt->l2_len);
> > > +	/* Don't process the fragmented packet */
> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> > > +						IPV4_HDR_DF_MASK)) == 0)) {
> >
> >
> > It is not a check for fragmented packet - it is a check that fragmentation is allowed for that packet.
> > Should be IPV4_HDR_DF_MASK - 1,  I think.

The DF bit doesn't indicate whether a packet is fragmented.
It forbids fragmenting the packet any further.
To check whether a packet is already fragmented, you have to check the MF bit and frag_offset.
Both have to be zero for unfragmented packets.
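
As a minimal sketch of that check (not taken from the patch): the macro and
function names below are hypothetical, and the argument is assumed to be the
16-bit flags/fragment-offset word exactly as it sits in the header, i.e. in
network byte order, like ipv4_hdr->fragment_offset.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>	/* ntohs() */

/*
 * Layout of the IPv4 flags/fragment-offset word (RFC 791), host order.
 * Hypothetical names, defined here only to keep the sketch self-contained.
 */
#define SKETCH_IPV4_MF_FLAG     0x2000u	/* More Fragments bit */
#define SKETCH_IPV4_OFFSET_MASK 0x1fffu	/* 13-bit fragment offset */

/*
 * Return true if the packet is an IP fragment, i.e. MF is set or the
 * fragment offset is non-zero. The DF bit (0x4000) is deliberately ignored.
 */
static inline bool
sketch_ipv4_is_fragmented(uint16_t frag_off_be)
{
	uint16_t v = ntohs(frag_off_be);

	return (v & (SKETCH_IPV4_MF_FLAG | SKETCH_IPV4_OFFSET_MASK)) != 0;
}
```

A packet with only DF set, or with no flags at all, is reported as
unfragmented; a first fragment (MF set, offset 0) and a last fragment
(MF clear, offset non-zero) are both reported as fragmented.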

> 
> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit. It's a
> little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
> So the value of DF bit should be "ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.
> 
> >
> > > +		pkts_out[0] = pkt;
> > > +		return ret;
> > > +	}
> > > +
> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> > > +		pkt->l4_len;
> >
> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
> 
> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len here.
> 
> >
> > > +	/* Don't process the packet without data */
> > > +	if (unlikely(tcp_dl == 0)) {
> > > +		pkts_out[0] = pkt;
> > > +		return ret;
> > > +	}
> > > +
> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> >
> > Hmm, why do we need to count CRC_LEN here?
> 
> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
> included in gso_size.

Why?
What is the point of accounting for the CRC length in this computation?
Why not just assume that gso_size is already max_frame_size - crc_len?
As I remember, when we RX a packet the CRC bytes are already stripped, and
when the user populates a packet, he doesn't care about the CRC bytes either.

Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-13  9:38           ` Ananyev, Konstantin
@ 2017-09-13 10:23             ` Hu, Jiayu
  2017-09-13 14:52             ` Kavanagh, Mark B
  1 sibling, 0 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-13 10:23 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Wednesday, September 13, 2017 5:38 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> 
> 
> > > > +
> > > > +int
> > > > +gso_tcp4_segment(struct rte_mbuf *pkt,
> > > > +		uint16_t gso_size,
> > > > +		uint8_t ipid_delta,
> > > > +		struct rte_mempool *direct_pool,
> > > > +		struct rte_mempool *indirect_pool,
> > > > +		struct rte_mbuf **pkts_out,
> > > > +		uint16_t nb_pkts_out)
> > > > +{
> > > > +	struct ipv4_hdr *ipv4_hdr;
> > > > +	uint16_t tcp_dl;
> > > > +	uint16_t pyld_unit_size;
> > > > +	uint16_t hdr_offset;
> > > > +	int ret = 1;
> > > > +
> > > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > > > +			pkt->l2_len);
> > > > +	/* Don't process the fragmented packet */
> > > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> > > > > +						IPV4_HDR_DF_MASK)) == 0)) {
> > >
> > >
> > > > It is not a check for fragmented packet - it is a check that fragmentation is allowed for that packet.
> > > Should be IPV4_HDR_DF_MASK - 1,  I think.
> 
> DF bit doesn't indicate is packet fragmented or not.
> It forbids to fragment packet any further.
> To check is packet already fragmented or not, you have to check MF bit and
> frag_offset.
> Both have to be zero for un-fragmented packets.

Yes, you are right. I checked the RFC and I had misunderstood the meaning of the DF bit.
When the DF bit is set to 1, the packet isn't IP fragmented. When the DF bit is 0, the packet
may or may not be fragmented. So it can't indicate whether the packet is an IP fragment.
Only when both MF and the offset are 0 is the packet not fragmented.

> 
> >
> > IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit. It's a
> > little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
> > So the value of DF bit should be "ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> > IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.
> >
> > >
> > > > +		pkts_out[0] = pkt;
> > > > +		return ret;
> > > > +	}
> > > > +
> > > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> > > > +		pkt->l4_len;
> > >
> > > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
> >
> > Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len here.
> >
> > >
> > > > +	/* Don't process the packet without data */
> > > > +	if (unlikely(tcp_dl == 0)) {
> > > > +		pkts_out[0] = pkt;
> > > > +		return ret;
> > > > +	}
> > > > +
> > > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> > > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> > >
> > > Hmm, why do we need to count CRC_LEN here?
> >
> > Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
> > included in gso_size.
> 
> Why?
> What is the point to account crc len into this computation?
> Why not just assume that gso_size is already a max_frame_size - crc_len
> As I remember, when we RX packet crc bytes will be already stripped,
> when user populates the packet, he doesn't care about crc bytes too.

Sorry, maybe I didn't make it clear. I don't mean that applications must count
the CRC when setting gso_segsz. Whether to count the CRC in gso_segsz depends on
the specific scenario, IMO. The GSO library shouldn't be aware of the CRC; it
should just use gso_segsz to split packets.
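
A rough sketch of what I mean, assuming the library treats gso_segsz as the
full length of an output segment. The helper name is hypothetical and this is
not the code of the next patch version:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: payload bytes carried by each output segment when the
 * library knows nothing about the CRC and only subtracts the headers. */
static uint16_t
sketch_pyld_unit_size(uint16_t gso_size, uint16_t l2_len, uint16_t l3_len,
		uint16_t l4_len)
{
	uint16_t hdr_offset = l2_len + l3_len + l4_len;

	/* gso_size must leave room for at least one payload byte */
	assert(gso_size > hdr_offset);
	return gso_size - hdr_offset;
}
```

For Ethernet + IPv4 + TCP without options (14 + 20 + 20 bytes of headers), a
gso_segsz of 1514 gives 1460 payload bytes per segment; whether 1514 or 1518
is the right value is left to the application.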

Thanks,
Jiayu
> 
> Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-12 14:17       ` Ananyev, Konstantin
@ 2017-09-13 10:44         ` Jiayu Hu
  2017-09-13 22:10           ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-13 10:44 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

On Tue, Sep 12, 2017 at 10:17:27PM +0800, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Tuesday, September 12, 2017 12:18 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > 
> > > result, when all of its GSOed segments are freed, the packet is freed
> > > automatically.
> > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > index dda50ee..95f6ea6 100644
> > > --- a/lib/librte_gso/rte_gso.c
> > > +++ b/lib/librte_gso/rte_gso.c
> > > @@ -33,18 +33,53 @@
> > >
> > >  #include <errno.h>
> > >
> > > +#include <rte_log.h>
> > > +
> > >  #include "rte_gso.h"
> > > +#include "gso_common.h"
> > > +#include "gso_tcp4.h"
> > >
> > >  int
> > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > +		struct rte_gso_ctx gso_ctx,
> > >  		struct rte_mbuf **pkts_out,
> > >  		uint16_t nb_pkts_out)
> > >  {
> > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > +	struct rte_mbuf *pkt_seg;
> > > +	uint16_t gso_size;
> > > +	uint8_t ipid_delta;
> > > +	int ret = 1;
> > > +
> > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > >  		return -EINVAL;
> > >
> > > -	pkts_out[0] = pkt;
> > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > > +			pkt->packet_type) {
> > > +		pkts_out[0] = pkt;
> > > +		return ret;
> > > +	}
> > > +
> > > +	direct_pool = gso_ctx.direct_pool;
> > > +	indirect_pool = gso_ctx.indirect_pool;
> > > +	gso_size = gso_ctx.gso_size;
> > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > > +
> > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > 
> > Probably we need here:
> > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> 
> Sorry, actually it probably should be:
> If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4 &&
>       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...

I don't quite understand why the GSO library should be aware of whether the TSO
flag is set or not. Applications can query device TSO capability before
they call the GSO library. Do I misunderstand anything?

Additionally, don't we need to check whether the packet is a TCP/IPv4 packet here?

Thanks,
Jiayu
> 
> Konstantin
> 
> > 
> > > +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > +				direct_pool, indirect_pool,
> > > +				pkts_out, nb_pkts_out);
> > > +	} else
> > > +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> > 
> > Shouldn't we do pkt_out[0] = pkt; here?
> > 
> > > +
> > > +	if (ret > 1) {
> > > +		pkt_seg = pkt;
> > > +		while (pkt_seg) {
> > > +			rte_mbuf_refcnt_update(pkt_seg, -1);
> > > +			pkt_seg = pkt_seg->next;
> > > +		}
> > > +	}
> > >
> > > -	return 1;
> > > +	return ret;
> > >  }
> > > --
> > > 2.7.4

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-13  9:38           ` Ananyev, Konstantin
  2017-09-13 10:23             ` Hu, Jiayu
@ 2017-09-13 14:52             ` Kavanagh, Mark B
  2017-09-13 15:13               ` Ananyev, Konstantin
  1 sibling, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-09-13 14:52 UTC (permalink / raw)
  To: Ananyev, Konstantin, Hu, Jiayu; +Cc: dev, Tan, Jianfeng

>From: Ananyev, Konstantin
>Sent: Wednesday, September 13, 2017 10:38 AM
>To: Hu, Jiayu <jiayu.hu@intel.com>
>Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
><jianfeng.tan@intel.com>
>Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>
>
>
>> > > +
>> > > +int
>> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
>> > > +		uint16_t gso_size,
>> > > +		uint8_t ipid_delta,
>> > > +		struct rte_mempool *direct_pool,
>> > > +		struct rte_mempool *indirect_pool,
>> > > +		struct rte_mbuf **pkts_out,
>> > > +		uint16_t nb_pkts_out)
>> > > +{
>> > > +	struct ipv4_hdr *ipv4_hdr;
>> > > +	uint16_t tcp_dl;
>> > > +	uint16_t pyld_unit_size;
>> > > +	uint16_t hdr_offset;
>> > > +	int ret = 1;
>> > > +
>> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> > > +			pkt->l2_len);
>> > > +	/* Don't process the fragmented packet */
>> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
>> > > +						IPV4_HDR_DF_MASK)) == 0)) {
>> >
>> >
>> > It is not a check for fragmented packet - it is a check that fragmentation
>is allowed for that packet.
>> > Should be IPV4_HDR_DF_MASK - 1,  I think.
>
>DF bit doesn't indicate is packet fragmented or not.
>It forbids to fragment packet any further.
>To check is packet already fragmented or not, you have to check MF bit and
>frag_offset.
>Both have to be zero for un-fragmented packets.
>
>>
>> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit. It's a
>> little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
>> So the value of DF bit should be "ipv4_hdr->fragment_offset &
>rte_cpu_to_be_16(
>> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.
>>
>> >
>> > > +		pkts_out[0] = pkt;
>> > > +		return ret;
>> > > +	}
>> > > +
>> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
>> > > +		pkt->l4_len;
>> >
>> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
>>
>> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len here.
>>
>> >
>> > > +	/* Don't process the packet without data */
>> > > +	if (unlikely(tcp_dl == 0)) {
>> > > +		pkts_out[0] = pkt;
>> > > +		return ret;
>> > > +	}
>> > > +
>> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
>> >
>> > Hmm, why do we need to count CRC_LEN here?
>>
>> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
>> included in gso_size.
>
>Why?
>What is the point to account crc len into this computation?
>Why not just assume that gso_size is already a max_frame_size - crc_len
>As I remember, when we RX packet crc bytes will be already stripped,
>when user populates the packet, he doesn't care about crc bytes too.

Hi Konstantin,

When a packet is tx'd, the 4B for the CRC are added back into the packet; if the payload is already at max capacity, then the actual segment size will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
To prevent that from happening, we account for the CRC len in this calculation.
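
To put numbers on it, here is a small sketch of the arithmetic. The helper
names and the SKETCH_ETHER_CRC_LEN macro are illustrative only, and the 54B
header length assumes Ethernet + IPv4 + TCP without options.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ETHER_CRC_LEN 4	/* Ethernet CRC length in bytes */

/* Payload bytes per segment when the CRC is reserved inside gso_size,
 * as in the v3 calculation quoted above. */
static uint16_t
sketch_pyld_with_crc_reserved(uint16_t gso_size, uint16_t hdr_offset)
{
	assert(gso_size > hdr_offset + SKETCH_ETHER_CRC_LEN);
	return gso_size - hdr_offset - SKETCH_ETHER_CRC_LEN;
}

/* Length of a full segment on the wire, after the 4B CRC is appended at TX. */
static uint32_t
sketch_wire_len(uint16_t hdr_offset, uint16_t pyld_len)
{
	return (uint32_t)hdr_offset + pyld_len + SKETCH_ETHER_CRC_LEN;
}
```

With gso_size = 1518 and the CRC reserved, each segment carries 1460B of
payload and is 1518B on the wire; without the reservation the payload would be
1464B and the frame on the wire would be 1522B.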

If I've missed anything, please do let me know!

Thanks,
Mark 

>
>Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-13 14:52             ` Kavanagh, Mark B
@ 2017-09-13 15:13               ` Ananyev, Konstantin
  2017-09-14  0:59                 ` Hu, Jiayu
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-13 15:13 UTC (permalink / raw)
  To: Kavanagh, Mark B, Hu, Jiayu; +Cc: dev, Tan, Jianfeng

Hi Mark,

> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Wednesday, September 13, 2017 3:52 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> >From: Ananyev, Konstantin
> >Sent: Wednesday, September 13, 2017 10:38 AM
> >To: Hu, Jiayu <jiayu.hu@intel.com>
> >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
> ><jianfeng.tan@intel.com>
> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> >
> >
> >> > > +
> >> > > +int
> >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
> >> > > +		uint16_t gso_size,
> >> > > +		uint8_t ipid_delta,
> >> > > +		struct rte_mempool *direct_pool,
> >> > > +		struct rte_mempool *indirect_pool,
> >> > > +		struct rte_mbuf **pkts_out,
> >> > > +		uint16_t nb_pkts_out)
> >> > > +{
> >> > > +	struct ipv4_hdr *ipv4_hdr;
> >> > > +	uint16_t tcp_dl;
> >> > > +	uint16_t pyld_unit_size;
> >> > > +	uint16_t hdr_offset;
> >> > > +	int ret = 1;
> >> > > +
> >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> >> > > +			pkt->l2_len);
> >> > > +	/* Don't process the fragmented packet */
> >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> >> > > +						IPV4_HDR_DF_MASK)) == 0)) {
> >> >
> >> >
> >> > It is not a check for fragmented packet - it is a check that fragmentation
> >is allowed for that packet.
> >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
> >
> >DF bit doesn't indicate is packet fragmented or not.
> >It forbids to fragment packet any further.
> >To check is packet already fragmented or not, you have to check MF bit and
> >frag_offset.
> >Both have to be zero for un-fragmented packets.
> >
> >>
> >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit. It's a
> >> little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
> >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
> >rte_cpu_to_be_16(
> >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.
> >>
> >> >
> >> > > +		pkts_out[0] = pkt;
> >> > > +		return ret;
> >> > > +	}
> >> > > +
> >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> >> > > +		pkt->l4_len;
> >> >
> >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
> >>
> >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len here.
> >>
> >> >
> >> > > +	/* Don't process the packet without data */
> >> > > +	if (unlikely(tcp_dl == 0)) {
> >> > > +		pkts_out[0] = pkt;
> >> > > +		return ret;
> >> > > +	}
> >> > > +
> >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> >> >
> >> > Hmm, why do we need to count CRC_LEN here?
> >>
> >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
> >> included in gso_size.
> >
> >Why?
> >What is the point to account crc len into this computation?
> >Why not just assume that gso_size is already a max_frame_size - crc_len
> >As I remember, when we RX packet crc bytes will be already stripped,
> >when user populates the packet, he doesn't care about crc bytes too.
> 
> Hi Konstantin,
> 
> When packet is tx'd, the 4B for CRC are added back into the packet; if the payload is already at max capacity, then the actual segment size
> will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
> To prevent that from happening, we account for the CRC len in this calculation.


Ok, and what prevents you from setting gso_ctx.gso_size = 1514;  /* ether frame size without crc bytes */ ?
Konstantin 

> 
> If I've missed anything, please do let me know!
> 
> Thanks,
> Mark
> 
> >
> >Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-13 10:44         ` Jiayu Hu
@ 2017-09-13 22:10           ` Ananyev, Konstantin
  2017-09-14  6:07             ` Jiayu Hu
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-13 22:10 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng


Hi Jiayu,

> >
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > > > result, when all of its GSOed segments are freed, the packet is freed
> > > > automatically.
> > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > index dda50ee..95f6ea6 100644
> > > > --- a/lib/librte_gso/rte_gso.c
> > > > +++ b/lib/librte_gso/rte_gso.c
> > > > @@ -33,18 +33,53 @@
> > > >
> > > >  #include <errno.h>
> > > >
> > > > +#include <rte_log.h>
> > > > +
> > > >  #include "rte_gso.h"
> > > > +#include "gso_common.h"
> > > > +#include "gso_tcp4.h"
> > > >
> > > >  int
> > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > > +		struct rte_gso_ctx gso_ctx,
> > > >  		struct rte_mbuf **pkts_out,
> > > >  		uint16_t nb_pkts_out)
> > > >  {
> > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > > +	struct rte_mbuf *pkt_seg;
> > > > +	uint16_t gso_size;
> > > > +	uint8_t ipid_delta;
> > > > +	int ret = 1;
> > > > +
> > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > > >  		return -EINVAL;
> > > >
> > > > -	pkts_out[0] = pkt;
> > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > > > +			pkt->packet_type) {
> > > > +		pkts_out[0] = pkt;
> > > > +		return ret;
> > > > +	}
> > > > +
> > > > +	direct_pool = gso_ctx.direct_pool;
> > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > +	gso_size = gso_ctx.gso_size;
> > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > > > +
> > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > >
> > > Probably we need here:
> > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> >
> > Sorry, actually it probably should be:
> > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4 &&
> >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> 
> I don't quite understand why the GSO library should be aware if the TSO
> flag is set or not. Applications can query device TSO capability before
> they call the GSO library. Do I misundertsand anything?
> 
> Additionally, we don't need to check if the packet is a TCP/IPv4 packet here?

Well, right now the PMD doesn't rely on ptype to figure out what type of packet it has and
what TX offload has to be performed.
Instead, it looks at the TX part of ol_flags.
My thought was that, as what we are doing is actually TSO in SW, it would be good
to use the same API here too.
Also, with that approach, by setting ol_flags properly the user can use the same gso_ctx and still
specify what segmentation to perform on a per-packet basis.
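
Something along these lines, as a sketch only. The bit values below are
placeholders and not the real mbuf/ethdev definitions, the function name is
made up, and it assumes the intent is that both TX flags must be set on the
packet:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only; the real definitions live in
 * rte_mbuf.h and rte_ethdev.h and are not reproduced here. */
#define SKETCH_TX_TCP_SEG	(1ULL << 0)	/* stands in for PKT_TX_TCP_SEG */
#define SKETCH_TX_IPV4		(1ULL << 1)	/* stands in for PKT_TX_IPV4 */
#define SKETCH_OFFLOAD_TCP_TSO	(1ULL << 0)	/* stands in for DEV_TX_OFFLOAD_TCP_TSO */

/* Decide per packet whether TCP/IPv4 segmentation should be done in SW. */
static bool
sketch_want_tcp4_gso(uint64_t ol_flags, uint64_t gso_types)
{
	const uint64_t need = SKETCH_TX_TCP_SEG | SKETCH_TX_IPV4;

	/* Note the parentheses: == binds tighter than & in C. */
	return (ol_flags & need) == need &&
		(gso_types & SKETCH_OFFLOAD_TCP_TSO) != 0;
}
```

A packet that only carries the IPv4 flag, or a context whose gso_types does
not enable TCP segmentation, would be passed through unsegmented.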

The alternative is to rely on ptype to decide whether segmentation should be performed on that packet or not.
The only advantage I see here is that if someone would like to add GSO for some new protocol,
he wouldn't need to introduce a new TX flag value for mbuf.ol_flags.
Though he would still need to update the TX_OFFLOAD_* capabilities and probably the packet_type definitions.

So from my perspective the first variant (use the HW TSO API) is preferable.
I wonder what your and Mark's opinions are here?
Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-13 15:13               ` Ananyev, Konstantin
@ 2017-09-14  0:59                 ` Hu, Jiayu
  2017-09-14  8:35                   ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-14  0:59 UTC (permalink / raw)
  To: Ananyev, Konstantin, Kavanagh, Mark B; +Cc: dev, Tan, Jianfeng

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Wednesday, September 13, 2017 11:13 PM
> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> Hi Mark,
> 
> > -----Original Message-----
> > From: Kavanagh, Mark B
> > Sent: Wednesday, September 13, 2017 3:52 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu
> <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > >From: Ananyev, Konstantin
> > >Sent: Wednesday, September 13, 2017 10:38 AM
> > >To: Hu, Jiayu <jiayu.hu@intel.com>
> > >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> Tan, Jianfeng
> > ><jianfeng.tan@intel.com>
> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > >
> > >
> > >> > > +
> > >> > > +int
> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
> > >> > > +		uint16_t gso_size,
> > >> > > +		uint8_t ipid_delta,
> > >> > > +		struct rte_mempool *direct_pool,
> > >> > > +		struct rte_mempool *indirect_pool,
> > >> > > +		struct rte_mbuf **pkts_out,
> > >> > > +		uint16_t nb_pkts_out)
> > >> > > +{
> > >> > > +	struct ipv4_hdr *ipv4_hdr;
> > >> > > +	uint16_t tcp_dl;
> > >> > > +	uint16_t pyld_unit_size;
> > >> > > +	uint16_t hdr_offset;
> > >> > > +	int ret = 1;
> > >> > > +
> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > >> > > +			pkt->l2_len);
> > >> > > +	/* Don't process the fragmented packet */
> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> > >> > > +						IPV4_HDR_DF_MASK)) == 0)) {
> > >> >
> > >> >
> > >> > It is not a check for fragmented packet - it is a check that
> fragmentation
> > >is allowed for that packet.
> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
> > >
> > >DF bit doesn't indicate is packet fragmented or not.
> > >It forbids to fragment packet any further.
> > >To check is packet already fragmented or not, you have to check MF bit
> and
> > >frag_offset.
> > >Both have to be zero for un-fragmented packets.
> > >
> > >>
> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit.
> It's a
> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
> > >rte_cpu_to_be_16(
> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.
> > >>
> > >> >
> > >> > > +		pkts_out[0] = pkt;
> > >> > > +		return ret;
> > >> > > +	}
> > >> > > +
> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
> > >> > > +		pkt->l4_len;
> > >> >
> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
> > >>
> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len here.
> > >>
> > >> >
> > >> > > +	/* Don't process the packet without data */
> > >> > > +	if (unlikely(tcp_dl == 0)) {
> > >> > > +		pkts_out[0] = pkt;
> > >> > > +		return ret;
> > >> > > +	}
> > >> > > +
> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> > >> >
> > >> > Hmm, why do we need to count CRC_LEN here?
> > >>
> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
> > >> included in gso_size.
> > >
> > >Why?
> > >What is the point to account crc len into this computation?
> > >Why not just assume that gso_size is already a max_frame_size - crc_len
> > >As I remember, when we RX packet crc bytes will be already stripped,
> > >when user populates the packet, he doesn't care about crc bytes too.
> >
> > Hi Konstantin,
> >
> > When packet is tx'd, the 4B for CRC are added back into the packet; if the
> payload is already at max capacity, then the actual segment size
> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
> > To prevent that from happening, we account for the CRC len in this
> calculation.
> 
> 
> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether frame
> size without crc bytes */
> ?

Exactly, applications can set gso_segsz to 1514 instead of 1518 if the lower layer
will add the CRC to the packet.

Jiayu

> Konstantin
> 
> >
> > If I've missed anything, please do let me know!
> >
> > Thanks,
> > Mark
> >
> > >
> > >Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-13 22:10           ` Ananyev, Konstantin
@ 2017-09-14  6:07             ` Jiayu Hu
  2017-09-14  8:47               ` Ananyev, Konstantin
  2017-09-14  8:51               ` Kavanagh, Mark B
  0 siblings, 2 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-14  6:07 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
> 
> Hi Jiayu,
> 
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin
> > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >
> > > > > result, when all of its GSOed segments are freed, the packet is freed
> > > > > automatically.
> > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > index dda50ee..95f6ea6 100644
> > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > @@ -33,18 +33,53 @@
> > > > >
> > > > >  #include <errno.h>
> > > > >
> > > > > +#include <rte_log.h>
> > > > > +
> > > > >  #include "rte_gso.h"
> > > > > +#include "gso_common.h"
> > > > > +#include "gso_tcp4.h"
> > > > >
> > > > >  int
> > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > > > +		struct rte_gso_ctx gso_ctx,
> > > > >  		struct rte_mbuf **pkts_out,
> > > > >  		uint16_t nb_pkts_out)
> > > > >  {
> > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > > > +	struct rte_mbuf *pkt_seg;
> > > > > +	uint16_t gso_size;
> > > > > +	uint8_t ipid_delta;
> > > > > +	int ret = 1;
> > > > > +
> > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > > > >  		return -EINVAL;
> > > > >
> > > > > -	pkts_out[0] = pkt;
> > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > > > > +			pkt->packet_type) {
> > > > > +		pkts_out[0] = pkt;
> > > > > +		return ret;
> > > > > +	}
> > > > > +
> > > > > +	direct_pool = gso_ctx.direct_pool;
> > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > > +	gso_size = gso_ctx.gso_size;
> > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > > > > +
> > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > >
> > > > Probably we need here:
> > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > >
> > > Sorry, actually it probably should be:
> > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4 &&
> > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > 
> > I don't quite understand why the GSO library should be aware if the TSO
> > flag is set or not. Applications can query device TSO capability before
> > they call the GSO library. Do I misundertsand anything?
> > 
> > Additionally, we don't need to check if the packet is a TCP/IPv4 packet here?
> 
> Well, right now  PMD we doesn't rely on ptype to figure out what type of packet and
> what TX offload have to be performed.
> Instead it looks at TX part of ol_flags, and 
> My thought was that as what we doing is actually TSO in SW, it would be good
> to use the same API here too.
> Also with that approach, by setting ol_flags properly user can use the same gso_ctx and still
> specify what segmentation to perform on a per-packet basis.
> 
> Alternative way is to rely on ptype to distinguish should segmentation be performed on that package or not.
> The only advantage I see here is that if someone would like to add GSO for some new protocol,
> he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
> Though he still would need to update TX_OFFLOAD_* capabilities and probably packet_type definitions.
>     
> So from my perspective first variant (use HW TSO API) is more plausible.
> Wonder what is your and Mark opinions here?

With the first choice, you mean:
the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a specific GSO
segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for each input packet.
Applications should parse the packet type, and set exactly the correct DEV_TX_OFFLOAD_*_TSO
flag in gso_types and ol_flags according to the packet type. That is, the value of gso_types
is set on a per-packet basis. gso_ctx->gso_types and mbuf->ol_flags are used at the same time
because DEV_TX_OFFLOAD_*_TSO only tells the tunnelling type and the inner L4 type, and
we need ol_flags to know the L3 type. With this design, HW segmentation and SW segmentation
are indeed consistent.

If I understand it correctly, applications need to set 'ol_flags = PKT_TX_IPV4' and
'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a "ether+ipv4+udp+vxlan+ether+ipv4+
tcp+payload" packet. But PKT_TX_IPV4 only represents the inner L3 type of a tunneled packet.
What about the outer L3 type? Do we always assume the inner and the outer L3 types are the same?

Jiayu
> Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  0:59                 ` Hu, Jiayu
@ 2017-09-14  8:35                   ` Kavanagh, Mark B
  2017-09-14  8:39                     ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-09-14  8:35 UTC (permalink / raw)
  To: Hu, Jiayu, Ananyev, Konstantin; +Cc: dev, Tan, Jianfeng

>From: Hu, Jiayu
>Sent: Thursday, September 14, 2017 2:00 AM
>To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>
>Hi Konstantin,
>
>> -----Original Message-----
>> From: Ananyev, Konstantin
>> Sent: Wednesday, September 13, 2017 11:13 PM
>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
>> <jiayu.hu@intel.com>
>> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>>
>> Hi Mark,
>>
>> > -----Original Message-----
>> > From: Kavanagh, Mark B
>> > Sent: Wednesday, September 13, 2017 3:52 PM
>> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu
>> <jiayu.hu@intel.com>
>> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >
>> > >From: Ananyev, Konstantin
>> > >Sent: Wednesday, September 13, 2017 10:38 AM
>> > >To: Hu, Jiayu <jiayu.hu@intel.com>
>> > >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
>> Tan, Jianfeng
>> > ><jianfeng.tan@intel.com>
>> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> > >
>> > >
>> > >
>> > >> > > +
>> > >> > > +int
>> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
>> > >> > > +		uint16_t gso_size,
>> > >> > > +		uint8_t ipid_delta,
>> > >> > > +		struct rte_mempool *direct_pool,
>> > >> > > +		struct rte_mempool *indirect_pool,
>> > >> > > +		struct rte_mbuf **pkts_out,
>> > >> > > +		uint16_t nb_pkts_out)
>> > >> > > +{
>> > >> > > +	struct ipv4_hdr *ipv4_hdr;
>> > >> > > +	uint16_t tcp_dl;
>> > >> > > +	uint16_t pyld_unit_size;
>> > >> > > +	uint16_t hdr_offset;
>> > >> > > +	int ret = 1;
>> > >> > > +
>> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> > >> > > +			pkt->l2_len);
>> > >> > > +	/* Don't process the fragmented packet */
>> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
>> > >> > > +						IPV4_HDR_DF_MASK)) == 0)) {
>> > >> >
>> > >> >
>> > >> > It is not a check for fragmented packet - it is a check that
>> fragmentation
>> > >is allowed for that packet.
>> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
>> > >
>> > >DF bit doesn't indicate is packet fragmented or not.
>> > >It forbids to fragment packet any further.
>> > >To check is packet already fragmented or not, you have to check MF bit
>> and
>> > >frag_offset.
>> > >Both have to be zero for un-fragmented packets.
>> > >
>> > >>
>> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit.
>> It's a
>> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
>> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
>> > >rte_cpu_to_be_16(
>> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.
>> > >>
>> > >> >
>> > >> > > +		pkts_out[0] = pkt;
>> > >> > > +		return ret;
>> > >> > > +	}
>> > >> > > +
>> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
>> > >> > > +		pkt->l4_len;
>> > >> >
>> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
>> > >>
>> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len
>here.
>> > >>
>> > >> >
>> > >> > > +	/* Don't process the packet without data */
>> > >> > > +	if (unlikely(tcp_dl == 0)) {
>> > >> > > +		pkts_out[0] = pkt;
>> > >> > > +		return ret;
>> > >> > > +	}
>> > >> > > +
>> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
>> > >> >
>> > >> > Hmm, why do we need to count CRC_LEN here?
>> > >>
>> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
>> > >> included in gso_size.
>> > >
>> > >Why?
>> > >What is the point to account crc len into this computation?
>> > >Why not just assume that gso_size is already a max_frame_size - crc_len
>> > >As I remember, when we RX packet crc bytes will be already stripped,
>> > >when user populates the packet, he doesn't care about crc bytes too.
>> >
>> > Hi Konstantin,
>> >
>> > When packet is tx'd, the 4B for CRC are added back into the packet; if the
>> payload is already at max capacity, then the actual segment size
>> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
>> > To prevent that from happening, we account for the CRC len in this
>> calculation.
>>
>>
>> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether frame
>> size without crc bytes */
>> ?

Hey Konstantin,

If the user sets the gso_size to 1514, the resultant output segments' size should be 1514, and not 1518. Consequently, the payload capacity of each segment would be reduced accordingly.
The user only cares about the output segment size (i.e. gso_ctx.gso_size); we need to ensure that the size of the segments that are produced is consistent with that. As a result, we need to ensure that any packet overhead is accounted for in the segment size, before we can determine how much space remains for data.
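
As a rough illustration of that point (a sketch, not the library code; the
helper names are hypothetical and 'overhead' stands for whatever per-segment
bytes are counted against gso_size before payload):

```c
#include <assert.h>
#include <stdint.h>

/* Number of output segments needed for pyld_len payload bytes when every
 * segment, overhead included, must fit in gso_size. */
static uint32_t
sketch_nb_segments(uint32_t pyld_len, uint16_t gso_size, uint16_t overhead)
{
	uint32_t unit;

	assert(gso_size > overhead);
	unit = (uint32_t)(gso_size - overhead);
	return (pyld_len + unit - 1) / unit;	/* round up */
}

/* Total length (overhead + payload) of segment idx, counting from 0. */
static uint32_t
sketch_segment_len(uint32_t pyld_len, uint16_t gso_size, uint16_t overhead,
		uint32_t idx)
{
	uint32_t unit = (uint32_t)(gso_size - overhead);
	uint32_t remaining = pyld_len - idx * unit;

	return overhead + (remaining < unit ? remaining : unit);
}
```

For 4000B of payload, gso_size = 1514 and 54B of overhead, this yields three
segments of 1514B, 1514B and 1134B, so no segment exceeds the size the user
asked for.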

Hope this makes sense.

Thanks,
Mark
 
>
>Exactly, applications can set 1514 to gso_segsz instead of 1518, if the lower
>layer
>will add CRC to the packet.
>
>Jiayu
>
>> Konstantin
>>
>> >
>> > If I've missed anything, please do let me know!
>> >
>> > Thanks,
>> > Mark
>> >
>> > >
>> > >Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  8:35                   ` Kavanagh, Mark B
@ 2017-09-14  8:39                     ` Ananyev, Konstantin
  2017-09-14  9:00                       ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-14  8:39 UTC (permalink / raw)
  To: Kavanagh, Mark B, Hu, Jiayu; +Cc: dev, Tan, Jianfeng



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, September 14, 2017 9:35 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> >From: Hu, Jiayu
> >Sent: Thursday, September 14, 2017 2:00 AM
> >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
> ><mark.b.kavanagh@intel.com>
> >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> >Hi Konstantin,
> >
> >> -----Original Message-----
> >> From: Ananyev, Konstantin
> >> Sent: Wednesday, September 13, 2017 11:13 PM
> >> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> >> <jiayu.hu@intel.com>
> >> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >>
> >> Hi Mark,
> >>
> >> > -----Original Message-----
> >> > From: Kavanagh, Mark B
> >> > Sent: Wednesday, September 13, 2017 3:52 PM
> >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu
> >> <jiayu.hu@intel.com>
> >> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >
> >> > >From: Ananyev, Konstantin
> >> > >Sent: Wednesday, September 13, 2017 10:38 AM
> >> > >To: Hu, Jiayu <jiayu.hu@intel.com>
> >> > >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> >> Tan, Jianfeng
> >> > ><jianfeng.tan@intel.com>
> >> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> > >
> >> > >
> >> > >
> >> > >> > > +
> >> > >> > > +int
> >> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
> >> > >> > > +		uint16_t gso_size,
> >> > >> > > +		uint8_t ipid_delta,
> >> > >> > > +		struct rte_mempool *direct_pool,
> >> > >> > > +		struct rte_mempool *indirect_pool,
> >> > >> > > +		struct rte_mbuf **pkts_out,
> >> > >> > > +		uint16_t nb_pkts_out)
> >> > >> > > +{
> >> > >> > > +	struct ipv4_hdr *ipv4_hdr;
> >> > >> > > +	uint16_t tcp_dl;
> >> > >> > > +	uint16_t pyld_unit_size;
> >> > >> > > +	uint16_t hdr_offset;
> >> > >> > > +	int ret = 1;
> >> > >> > > +
> >> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *)
> >> +
> >> > >> > > +			pkt->l2_len);
> >> > >> > > +	/* Don't process the fragmented packet */
> >> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> >> > >> > > +
> >> 	IPV4_HDR_DF_MASK)) == 0)) {
> >> > >> >
> >> > >> >
> >> > >> > It is not a check for fragmented packet - it is a check that
> >> fragmentation
> >> > >is allowed for that packet.
> >> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
> >> > >
> >> > >DF bit doesn't indicate is packet fragmented or not.
> >> > >It forbids to fragment packet any further.
> >> > >To check is packet already fragmented or not, you have to check MF bit
> >> and
> >> > >frag_offset.
> >> > >Both have to be zero for un-fragmented packets.
> >> > >
> >> > >>
> >> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF bit.
> >> It's a
> >> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian order.
> >> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
> >> > >rte_cpu_to_be_16(
> >> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is fragmented.
> >> > >>
> >> > >> >
> >> > >> > > +		pkts_out[0] = pkt;
> >> > >> > > +		return ret;
> >> > >> > > +	}
> >> > >> > > +
> >> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt-
> >> >l3_len -
> >> > >> > > +		pkt->l4_len;
> >> > >> >
> >> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
> >> > >>
> >> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len
> >here.
> >> > >>
> >> > >> >
> >> > >> > > +	/* Don't process the packet without data */
> >> > >> > > +	if (unlikely(tcp_dl == 0)) {
> >> > >> > > +		pkts_out[0] = pkt;
> >> > >> > > +		return ret;
> >> > >> > > +	}
> >> > >> > > +
> >> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> >> > >> >
> >> > >> > Hmm, why do we need to count CRC_LEN here?
> >> > >>
> >> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
> >> > >> included in gso_size.
> >> > >
> >> > >Why?
> >> > >What is the point to account crc len into this computation?
> >> > >Why not just assume that gso_size is already a max_frame_size - crc_len
> >> > >As I remember, when we RX packet crc bytes will be already stripped,
> >> > >when user populates the packet, he doesn't care about crc bytes too.
> >> >
> >> > Hi Konstantin,
> >> >
> >> > When packet is tx'd, the 4B for CRC are added back into the packet; if the
> >> payload is already at max capacity, then the actual segment size
> >> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
> >> > To prevent that from happening, we account for the CRC len in this
> >> calculation.
> >>
> >>
> >> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether frame
> >> size without crc bytes */
> >> ?
> 
> Hey Konstantin,
> 
> If the user sets the gso_size to 1514, the resultant output segments' size should be 1514, and not 1518.

Yes, and then the NIC HW will add the CRC bytes for you.
You are not filling in the CRC bytes yourself, and the size you provide to the HW to send is the payload size
(CRC bytes are not counted).
Konstantin
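
A small sketch of that relationship, assuming the NIC appends the 4-byte FCS on transmit. The names below are invented for illustration and are not DPDK APIs; EX_ETHER_CRC_LEN stands in for ETHER_CRC_LEN.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for DPDK's ETHER_CRC_LEN. */
#define EX_ETHER_CRC_LEN 4u

/* Frame length on the wire (preamble and inter-frame gap not counted)
 * for a packet whose software-visible length is pkt_len. */
static inline uint32_t
ex_wire_frame_len(uint32_t pkt_len)
{
	return pkt_len + EX_ETHER_CRC_LEN;
}

/* Return 1 if a packet of pkt_len bytes still fits in max_frame_len
 * once the hardware has appended the CRC, 0 otherwise. */
static inline int
ex_fits_max_frame(uint32_t pkt_len, uint32_t max_frame_len)
{
	assert(max_frame_len >= EX_ETHER_CRC_LEN);
	return ex_wire_frame_len(pkt_len) <= max_frame_len;
}
```

With a 1518-byte maximum frame, a 1514-byte segment fits and a 1518-byte segment does not.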

> Consequently, the payload capacity
> of each segment would be reduced accordingly.
> The user only cares about the output segment size (i.e. gso_ctx.gso_size); we need to ensure that the size of the segments that are
> produced is consistent with that. As a result, we need to ensure that any packet overhead is accounted for in the segment size, before we
> can determine how much space remains for data.
> 
> Hope this makes sense.
> 
> Thanks,
> Mark
> 
> >
> >Exactly, applications can set 1514 to gso_segsz instead of 1518, if the lower
> >layer
> >will add CRC to the packet.
> >
> >Jiayu
> >
> >> Konstantin
> >>
> >> >
> >> > If I've missed anything, please do let me know!
> >> >
> >> > Thanks,
> >> > Mark
> >> >
> >> > >
> >> > >Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  6:07             ` Jiayu Hu
@ 2017-09-14  8:47               ` Ananyev, Konstantin
  2017-09-14  9:29                 ` Hu, Jiayu
  2017-09-14  8:51               ` Kavanagh, Mark B
  1 sibling, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-14  8:47 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Jiayu,

> -----Original Message-----
> From: Hu, Jiayu
> Sent: Thursday, September 14, 2017 7:07 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> Hi Konstantin,
> 
> On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
> >
> > Hi Jiayu,
> >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin
> > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >
> > > > > > result, when all of its GSOed segments are freed, the packet is freed
> > > > > > automatically.
> > > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > > index dda50ee..95f6ea6 100644
> > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > @@ -33,18 +33,53 @@
> > > > > >
> > > > > >  #include <errno.h>
> > > > > >
> > > > > > +#include <rte_log.h>
> > > > > > +
> > > > > >  #include "rte_gso.h"
> > > > > > +#include "gso_common.h"
> > > > > > +#include "gso_tcp4.h"
> > > > > >
> > > > > >  int
> > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > > > > +		struct rte_gso_ctx gso_ctx,
> > > > > >  		struct rte_mbuf **pkts_out,
> > > > > >  		uint16_t nb_pkts_out)
> > > > > >  {
> > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > > > > +	struct rte_mbuf *pkt_seg;
> > > > > > +	uint16_t gso_size;
> > > > > > +	uint8_t ipid_delta;
> > > > > > +	int ret = 1;
> > > > > > +
> > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > > > > >  		return -EINVAL;
> > > > > >
> > > > > > -	pkts_out[0] = pkt;
> > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > > > > > +			pkt->packet_type) {
> > > > > > +		pkts_out[0] = pkt;
> > > > > > +		return ret;
> > > > > > +	}
> > > > > > +
> > > > > > +	direct_pool = gso_ctx.direct_pool;
> > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > > > +	gso_size = gso_ctx.gso_size;
> > > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > > > > > +
> > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > > >
> > > > > Probably we need here:
> > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > >
> > > > Sorry, actually it probably should be:
> > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4 &&
> > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > >
> > > I don't quite understand why the GSO library should be aware if the TSO
> > > flag is set or not. Applications can query device TSO capability before
> > > they call the GSO library. Do I misundertsand anything?
> > >
> > > Additionally, we don't need to check if the packet is a TCP/IPv4 packet here?
> >
> > Well, right now  PMD we doesn't rely on ptype to figure out what type of packet and
> > what TX offload have to be performed.
> > Instead it looks at TX part of ol_flags, and
> > My thought was that as what we doing is actually TSO in SW, it would be good
> > to use the same API here too.
> > Also with that approach, by setting ol_flags properly user can use the same gso_ctx and still
> > specify what segmentation to perform on a per-packet basis.
> >
> > Alternative way is to rely on ptype to distinguish should segmentation be performed on that package or not.
> > The only advantage I see here is that if someone would like to add GSO for some new protocol,
> > he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
> > Though he still would need to update TX_OFFLOAD_* capabilities and probably packet_type definitions.
> >
> > So from my perspective first variant (use HW TSO API) is more plausible.
> > Wonder what is your and Mark opinions here?
> 
> In the first choice, you mean:
> the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a specific GSO
> segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for each input packet.
> Applications should parse the packet type, and set an exactly correct DEV_TX_OFFLOAD_*_TSO
> flag to gso_types and ol_flags according to the packet type. That is, the value of gso_types
> is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags at the same time
> is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and the inner L4 type, and
> we need to know L3 type by ol_flags. With this design, HW segmentation and SW segmentation
> are indeed consistent.
> 
> If I understand it correctly, applications need to set 'ol_flags = PKT_TX_IPV4' and
> 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a "ether+ipv4+udp+vxlan+ether+ipv4+
> tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type for tunneled packet.
> How about the outer L3 type? Always assume the inner and the outer L3 type are the same?

I think that for that case you'll have to set in ol_flags:

PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN | PKT_TX_TCP_SEG
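
As a rough sketch of how such a flag combination could be tested: the EX_* bit positions below are placeholders chosen for this example, NOT the values from rte_mbuf.h, and the helper name is invented. In the real headers the tunnel type may be encoded as a multi-bit field rather than a single bit, in which case a real check would need to mask it first.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bits standing in for PKT_TX_*; not the real DPDK values. */
#define EX_TX_TCP_SEG      (1ULL << 0)
#define EX_TX_IPV4         (1ULL << 1)
#define EX_TX_OUTER_IPV4   (1ULL << 2)
#define EX_TX_TUNNEL_VXLAN (1ULL << 3)

/* Flags for ether+ipv4+udp+vxlan+ether+ipv4+tcp+payload. */
#define EX_VXLAN_TCP4_GSO_FLAGS \
	(EX_TX_IPV4 | EX_TX_OUTER_IPV4 | EX_TX_TUNNEL_VXLAN | EX_TX_TCP_SEG)

/* Return 1 if all flags needed for VxLAN + TCP/IPv4 segmentation are set. */
static inline int
ex_is_vxlan_tcp4_gso(uint64_t ol_flags)
{
	/* The parentheses around the mask test are required:
	 * in C, == binds more tightly than &. */
	return (ol_flags & EX_VXLAN_TCP4_GSO_FLAGS) == EX_VXLAN_TCP4_GSO_FLAGS;
}
```

Comparing against the full mask, rather than testing for non-zero, rejects packets that carry only some of the required flags.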

Konstantin

> 
> Jiayu
> > Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  6:07             ` Jiayu Hu
  2017-09-14  8:47               ` Ananyev, Konstantin
@ 2017-09-14  8:51               ` Kavanagh, Mark B
  2017-09-14  9:45                 ` Hu, Jiayu
  1 sibling, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-09-14  8:51 UTC (permalink / raw)
  To: Hu, Jiayu, Ananyev, Konstantin; +Cc: dev, Tan, Jianfeng

>From: Hu, Jiayu
>Sent: Thursday, September 14, 2017 7:07 AM
>To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
>Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
><jianfeng.tan@intel.com>
>Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>
>Hi Konstantin,
>
>On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
>>
>> Hi Jiayu,
>>
>> > >
>> > >
>> > > > -----Original Message-----
>> > > > From: Ananyev, Konstantin
>> > > > Sent: Tuesday, September 12, 2017 12:18 PM
>> > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
>> > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
><jianfeng.tan@intel.com>
>> > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> > > >
>> > > > > result, when all of its GSOed segments are freed, the packet is
>freed
>> > > > > automatically.
>> > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> > > > > index dda50ee..95f6ea6 100644
>> > > > > --- a/lib/librte_gso/rte_gso.c
>> > > > > +++ b/lib/librte_gso/rte_gso.c
>> > > > > @@ -33,18 +33,53 @@
>> > > > >
>> > > > >  #include <errno.h>
>> > > > >
>> > > > > +#include <rte_log.h>
>> > > > > +
>> > > > >  #include "rte_gso.h"
>> > > > > +#include "gso_common.h"
>> > > > > +#include "gso_tcp4.h"
>> > > > >
>> > > > >  int
>> > > > >  rte_gso_segment(struct rte_mbuf *pkt,
>> > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
>> > > > > +		struct rte_gso_ctx gso_ctx,
>> > > > >  		struct rte_mbuf **pkts_out,
>> > > > >  		uint16_t nb_pkts_out)
>> > > > >  {
>> > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
>> > > > > +	struct rte_mbuf *pkt_seg;
>> > > > > +	uint16_t gso_size;
>> > > > > +	uint8_t ipid_delta;
>> > > > > +	int ret = 1;
>> > > > > +
>> > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
>> > > > >  		return -EINVAL;
>> > > > >
>> > > > > -	pkts_out[0] = pkt;
>> > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
>> > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
>> > > > > +			pkt->packet_type) {
>> > > > > +		pkts_out[0] = pkt;
>> > > > > +		return ret;
>> > > > > +	}
>> > > > > +
>> > > > > +	direct_pool = gso_ctx.direct_pool;
>> > > > > +	indirect_pool = gso_ctx.indirect_pool;
>> > > > > +	gso_size = gso_ctx.gso_size;
>> > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
>> > > > > +
>> > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
>> > > >
>> > > > Probably we need here:
>> > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types &
>DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
>> > >
>> > > Sorry, actually it probably should be:
>> > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4 &&
>> > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
>> >
>> > I don't quite understand why the GSO library should be aware if the TSO
>> > flag is set or not. Applications can query device TSO capability before
>> > they call the GSO library. Do I misundertsand anything?
>> >
>> > Additionally, we don't need to check if the packet is a TCP/IPv4 packet
>here?
>>
>> Well, right now  PMD we doesn't rely on ptype to figure out what type of
>packet and
>> what TX offload have to be performed.
>> Instead it looks at TX part of ol_flags, and
>> My thought was that as what we doing is actually TSO in SW, it would be good
>> to use the same API here too.
>> Also with that approach, by setting ol_flags properly user can use the same
>gso_ctx and still
>> specify what segmentation to perform on a per-packet basis.
>>
>> Alternative way is to rely on ptype to distinguish should segmentation be
>performed on that package or not.
>> The only advantage I see here is that if someone would like to add GSO for
>some new protocol,
>> he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
>> Though he still would need to update TX_OFFLOAD_* capabilities and probably
>packet_type definitions.
>>
>> So from my perspective first variant (use HW TSO API) is more plausible.
>> Wonder what is your and Mark opinions here?
>
>In the first choice, you mean:
>the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a specific
>GSO
>segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for each
>input packet.
>Applications should parse the packet type, and set an exactly correct
>DEV_TX_OFFLOAD_*_TSO
>flag to gso_types and ol_flags according to the packet type. That is, the
>value of gso_types
>is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags at the
>same time
>is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and the inner
>L4 type, and
>we need to know L3 type by ol_flags. With this design, HW segmentation and SW
>segmentation
>are indeed consistent.
>
>If I understand it correctly, applications need to set 'ol_flags =
>PKT_TX_IPV4' and
>'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
>"ether+ipv4+udp+vxlan+ether+ipv4+
>tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type for
>tunneled packet.
>How about the outer L3 type? Always assume the inner and the outer L3 type are
>the same?

Hi Jiayu, 

If I'm not mistaken, I think what Konstantin is suggesting is as follows: 

- The DEV_TX_OFFLOAD_*_TSO flags are currently used to describe a NIC's TSO capabilities; the GSO capabilities may also be described using the same macros, to provide a consistent view of segmentation capabilities across the HW and SW implementations.

- As part of segmentation, it's still a case of checking the packet type, but then setting the appropriate ol_flags in the mbuf, which the GSO library can use to segment the packet.
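
The two points above could be sketched roughly as follows. This is an illustration only: the EX_* macros are placeholders for PKT_TX_* and DEV_TX_OFFLOAD_TCP_TSO (with made-up bit values), and ex_gso_select() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder per-packet flags (stand-ins for PKT_TX_*). */
#define EX_TX_TCP_SEG (1ULL << 0)
#define EX_TX_IPV4    (1ULL << 1)

/* Placeholder capability bit (stand-in for DEV_TX_OFFLOAD_TCP_TSO). */
#define EX_OFFLOAD_TCP_TSO (1u << 0)

enum ex_gso_action { EX_GSO_PASS_THROUGH, EX_GSO_TCP4 };

/* Segment only when the packet asks for TCP/IPv4 segmentation via its
 * ol_flags AND the GSO context advertises that capability. */
static inline enum ex_gso_action
ex_gso_select(uint64_t ol_flags, uint32_t gso_types)
{
	const uint64_t need = EX_TX_TCP_SEG | EX_TX_IPV4;

	if ((ol_flags & need) == need &&
			(gso_types & EX_OFFLOAD_TCP_TSO) != 0)
		return EX_GSO_TCP4;
	return EX_GSO_PASS_THROUGH;
}
```

With this split, one GSO context can serve all packets, and the per-packet ol_flags decide which of them actually get segmented.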

Thanks,
Mark

>
>Jiayu
>> Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  8:39                     ` Ananyev, Konstantin
@ 2017-09-14  9:00                       ` Kavanagh, Mark B
  2017-09-14  9:10                         ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-09-14  9:00 UTC (permalink / raw)
  To: Ananyev, Konstantin, Hu, Jiayu; +Cc: dev, Tan, Jianfeng

>From: Ananyev, Konstantin
>Sent: Thursday, September 14, 2017 9:40 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
><jiayu.hu@intel.com>
>Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Thursday, September 14, 2017 9:35 AM
>> To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
><konstantin.ananyev@intel.com>
>> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>>
>> >From: Hu, Jiayu
>> >Sent: Thursday, September 14, 2017 2:00 AM
>> >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
>> ><mark.b.kavanagh@intel.com>
>> >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >
>> >Hi Konstantin,
>> >
>> >> -----Original Message-----
>> >> From: Ananyev, Konstantin
>> >> Sent: Wednesday, September 13, 2017 11:13 PM
>> >> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
>> >> <jiayu.hu@intel.com>
>> >> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >>
>> >> Hi Mark,
>> >>
>> >> > -----Original Message-----
>> >> > From: Kavanagh, Mark B
>> >> > Sent: Wednesday, September 13, 2017 3:52 PM
>> >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu
>> >> <jiayu.hu@intel.com>
>> >> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> >
>> >> > >From: Ananyev, Konstantin
>> >> > >Sent: Wednesday, September 13, 2017 10:38 AM
>> >> > >To: Hu, Jiayu <jiayu.hu@intel.com>
>> >> > >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
>> >> Tan, Jianfeng
>> >> > ><jianfeng.tan@intel.com>
>> >> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> > >
>> >> > >
>> >> > >
>> >> > >> > > +
>> >> > >> > > +int
>> >> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
>> >> > >> > > +		uint16_t gso_size,
>> >> > >> > > +		uint8_t ipid_delta,
>> >> > >> > > +		struct rte_mempool *direct_pool,
>> >> > >> > > +		struct rte_mempool *indirect_pool,
>> >> > >> > > +		struct rte_mbuf **pkts_out,
>> >> > >> > > +		uint16_t nb_pkts_out)
>> >> > >> > > +{
>> >> > >> > > +	struct ipv4_hdr *ipv4_hdr;
>> >> > >> > > +	uint16_t tcp_dl;
>> >> > >> > > +	uint16_t pyld_unit_size;
>> >> > >> > > +	uint16_t hdr_offset;
>> >> > >> > > +	int ret = 1;
>> >> > >> > > +
>> >> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *)
>> >> +
>> >> > >> > > +			pkt->l2_len);
>> >> > >> > > +	/* Don't process the fragmented packet */
>> >> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
>> >> > >> > > +
>> >> 	IPV4_HDR_DF_MASK)) == 0)) {
>> >> > >> >
>> >> > >> >
>> >> > >> > It is not a check for fragmented packet - it is a check that
>> >> fragmentation
>> >> > >is allowed for that packet.
>> >> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
>> >> > >
>> >> > >DF bit doesn't indicate is packet fragmented or not.
>> >> > >It forbids to fragment packet any further.
>> >> > >To check is packet already fragmented or not, you have to check MF bit
>> >> and
>> >> > >frag_offset.
>> >> > >Both have to be zero for un-fragmented packets.
>> >> > >
>> >> > >>
>> >> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF
>bit.
>> >> It's a
>> >> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian
>order.
>> >> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
>> >> > >rte_cpu_to_be_16(
>> >> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is
>fragmented.
>> >> > >>
>> >> > >> >
>> >> > >> > > +		pkts_out[0] = pkt;
>> >> > >> > > +		return ret;
>> >> > >> > > +	}
>> >> > >> > > +
>> >> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt-
>> >> >l3_len -
>> >> > >> > > +		pkt->l4_len;
>> >> > >> >
>> >> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
>> >> > >>
>> >> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len
>> >here.
>> >> > >>
>> >> > >> >
>> >> > >> > > +	/* Don't process the packet without data */
>> >> > >> > > +	if (unlikely(tcp_dl == 0)) {
>> >> > >> > > +		pkts_out[0] = pkt;
>> >> > >> > > +		return ret;
>> >> > >> > > +	}
>> >> > >> > > +
>> >> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> >> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
>> >> > >> >
>> >> > >> > Hmm, why do we need to count CRC_LEN here?
>> >> > >>
>> >> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
>> >> > >> included in gso_size.
>> >> > >
>> >> > >Why?
>> >> > >What is the point to account crc len into this computation?
>> >> > >Why not just assume that gso_size is already a max_frame_size -
>crc_len
>> >> > >As I remember, when we RX packet crc bytes will be already stripped,
>> >> > >when user populates the packet, he doesn't care about crc bytes too.
>> >> >
>> >> > Hi Konstantin,
>> >> >
>> >> > When packet is tx'd, the 4B for CRC are added back into the packet; if
>the
>> >> payload is already at max capacity, then the actual segment size
>> >> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
>> >> > To prevent that from happening, we account for the CRC len in this
>> >> calculation.
>> >>
>> >>
>> >> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether frame
>> >> size without crc bytes */
>> >> ?
>>
>> Hey Konstantin,
>>
>> If the user sets the gso_size to 1514, the resultant output segments' size
>should be 1514, and not 1518.

Just to clarify - I meant here that the final output segment, including CRC len, should be 1514. I think this is where we're crossing wires ;)

>
>Yes and then NIC HW will add CRC bytes for you.
>You are not filling CRC bytes in HW, and when providing to the HW size to send
>- it is a payload size
>(CRC bytes are not accounted).
>Konstantin

Yes, exactly - in that case though, the gso_size specified by the user is not the actual final output segment size, but (segment size - 4B), right?

We can set that expectation in documentation, but from an application's/user's perspective, do you think that this might be confusing/misleading?

Thanks again,
Mark  

>
> Consequently, the payload capacity
>> of each segment would be reduced accordingly.
>> The user only cares about the output segment size (i.e. gso_ctx.gso_size);
>we need to ensure that the size of the segments that are
>> produced is consistent with that. As a result, we need to ensure that any
>packet overhead is accounted for in the segment size, before we
>> can determine how much space remains for data.
>>
>> Hope this makes sense.
>>
>> Thanks,
>> Mark
>>
>> >
>> >Exactly, applications can set 1514 to gso_segsz instead of 1518, if the
>lower
>> >layer
>> >will add CRC to the packet.
>> >
>> >Jiayu
>> >
>> >> Konstantin
>> >>
>> >> >
>> >> > If I've missed anything, please do let me know!
>> >> >
>> >> > Thanks,
>> >> > Mark
>> >> >
>> >> > >
>> >> > >Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  9:00                       ` Kavanagh, Mark B
@ 2017-09-14  9:10                         ` Ananyev, Konstantin
  2017-09-14  9:35                           ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-14  9:10 UTC (permalink / raw)
  To: Kavanagh, Mark B, Hu, Jiayu; +Cc: dev, Tan, Jianfeng



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, September 14, 2017 10:01 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> >From: Ananyev, Konstantin
> >Sent: Thursday, September 14, 2017 9:40 AM
> >To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> ><jiayu.hu@intel.com>
> >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> >
> >
> >> -----Original Message-----
> >> From: Kavanagh, Mark B
> >> Sent: Thursday, September 14, 2017 9:35 AM
> >> To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
> ><konstantin.ananyev@intel.com>
> >> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >>
> >> >From: Hu, Jiayu
> >> >Sent: Thursday, September 14, 2017 2:00 AM
> >> >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
> >> ><mark.b.kavanagh@intel.com>
> >> >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >
> >> >Hi Konstantin,
> >> >
> >> >> -----Original Message-----
> >> >> From: Ananyev, Konstantin
> >> >> Sent: Wednesday, September 13, 2017 11:13 PM
> >> >> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> >> >> <jiayu.hu@intel.com>
> >> >> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >>
> >> >> Hi Mark,
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: Kavanagh, Mark B
> >> >> > Sent: Wednesday, September 13, 2017 3:52 PM
> >> >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu
> >> >> <jiayu.hu@intel.com>
> >> >> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >> >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >> >
> >> >> > >From: Ananyev, Konstantin
> >> >> > >Sent: Wednesday, September 13, 2017 10:38 AM
> >> >> > >To: Hu, Jiayu <jiayu.hu@intel.com>
> >> >> > >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> >> >> Tan, Jianfeng
> >> >> > ><jianfeng.tan@intel.com>
> >> >> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > >> > > +
> >> >> > >> > > +int
> >> >> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
> >> >> > >> > > +		uint16_t gso_size,
> >> >> > >> > > +		uint8_t ipid_delta,
> >> >> > >> > > +		struct rte_mempool *direct_pool,
> >> >> > >> > > +		struct rte_mempool *indirect_pool,
> >> >> > >> > > +		struct rte_mbuf **pkts_out,
> >> >> > >> > > +		uint16_t nb_pkts_out)
> >> >> > >> > > +{
> >> >> > >> > > +	struct ipv4_hdr *ipv4_hdr;
> >> >> > >> > > +	uint16_t tcp_dl;
> >> >> > >> > > +	uint16_t pyld_unit_size;
> >> >> > >> > > +	uint16_t hdr_offset;
> >> >> > >> > > +	int ret = 1;
> >> >> > >> > > +
> >> >> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *)
> >> >> +
> >> >> > >> > > +			pkt->l2_len);
> >> >> > >> > > +	/* Don't process the fragmented packet */
> >> >> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> >> >> > >> > > +
> >> >> 	IPV4_HDR_DF_MASK)) == 0)) {
> >> >> > >> >
> >> >> > >> >
> >> >> > >> > It is not a check for fragmented packet - it is a check that
> >> >> fragmentation
> >> >> > >is allowed for that packet.
> >> >> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
> >> >> > >
> >> >> > >DF bit doesn't indicate is packet fragmented or not.
> >> >> > >It forbids to fragment packet any further.
> >> >> > >To check is packet already fragmented or not, you have to check MF bit
> >> >> and
> >> >> > >frag_offset.
> >> >> > >Both have to be zero for un-fragmented packets.
> >> >> > >
> >> >> > >>
> >> >> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF
> >bit.
> >> >> It's a
> >> >> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian
> >order.
> >> >> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
> >> >> > >rte_cpu_to_be_16(
> >> >> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is
> >fragmented.
> >> >> > >>
> >> >> > >> >
> >> >> > >> > > +		pkts_out[0] = pkt;
> >> >> > >> > > +		return ret;
> >> >> > >> > > +	}
> >> >> > >> > > +
> >> >> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt-
> >> >> >l3_len -
> >> >> > >> > > +		pkt->l4_len;
> >> >> > >> >
> >> >> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
> >> >> > >>
> >> >> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len
> >> >here.
> >> >> > >>
> >> >> > >> >
> >> >> > >> > > +	/* Don't process the packet without data */
> >> >> > >> > > +	if (unlikely(tcp_dl == 0)) {
> >> >> > >> > > +		pkts_out[0] = pkt;
> >> >> > >> > > +		return ret;
> >> >> > >> > > +	}
> >> >> > >> > > +
> >> >> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >> >> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> >> >> > >> >
> >> >> > >> > Hmm, why do we need to count CRC_LEN here?
> >> >> > >>
> >> >> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
> >> >> > >> included in gso_size.
> >> >> > >
> >> >> > >Why?
> >> >> > >What is the point to account crc len into this computation?
> >> >> > >Why not just assume that gso_size is already a max_frame_size -
> >crc_len
> >> >> > >As I remember, when we RX packet crc bytes will be already stripped,
> >> >> > >when user populates the packet, he doesn't care about crc bytes too.
> >> >> >
> >> >> > Hi Konstantin,
> >> >> >
> >> >> > When packet is tx'd, the 4B for CRC are added back into the packet; if
> >the
> >> >> payload is already at max capacity, then the actual segment size
> >> >> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
> >> >> > To prevent that from happening, we account for the CRC len in this
> >> >> calculation.
> >> >>
> >> >>
> >> >> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether frame
> >> >> size without crc bytes */
> >> >> ?
> >>
> >> Hey Konstantin,
> >>
> >> If the user sets the gso_size to 1514, the resultant output segments' size
> >should be 1514, and not 1518.
> 
> Just to clarify - I meant here that the final output segment, including CRC len, should be 1514. I think this is where we're crossing wires ;)
> 
> >
> >Yes and then NIC HW will add CRC bytes for you.
> >You are not filling CRC bytes in HW, and when providing to the HW size to send
> >- it is a payload size
> >(CRC bytes are not accounted).
> >Konstantin
> 
> Yes, exactly - in that case though, the gso_size specified by the user is not the actual final output segment size, but (segment size - 4B),
> right?

The CRC bytes will be added by HW; this is totally transparent to the user.

> 
> We can set that expectation in documentation, but from an application's/user's perspective, do you think that this might be
> confusing/misleading?

I think it would be much more confusing to make the user account for the CRC bytes.
For example, when you form a packet in DPDK and send it out via rte_eth_tx_burst(),
you specify only your payload size, not the payload size plus the CRC bytes that HW will add for you.
Konstantin
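
To make the arithmetic above concrete, here is a minimal sketch in plain C,
assuming gso_size is the frame length without the CRC. The name
ex_pyld_unit_size() and the header lengths in the note below are illustrative
only and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Payload bytes that fit into one output segment when gso_size is the
 * frame length *without* the CRC. The CRC is not subtracted here, on the
 * assumption that the NIC appends it on transmit.
 * Returns 0 if the headers alone do not fit into gso_size.
 * (Hypothetical helper, for illustration only.)
 */
static uint16_t
ex_pyld_unit_size(uint16_t gso_size, uint16_t l2_len, uint16_t l3_len,
		uint16_t l4_len)
{
	uint32_t hdr_offset = (uint32_t)l2_len + l3_len + l4_len;

	if (gso_size <= hdr_offset)
		return 0;

	return (uint16_t)(gso_size - hdr_offset);
}
```

With gso_size = 1514 and Ethernet, IPv4 and TCP headers of 14, 20 and 20 bytes
(no options), the helper returns 1460. The HW then appends the 4 CRC bytes,
which gives 1518 bytes on the wire.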

> 
> Thanks again,
> Mark
> 
> >
> > Consequently, the payload capacity
> >> of each segment would be reduced accordingly.
> >> The user only cares about the output segment size (i.e. gso_ctx.gso_size);
> >we need to ensure that the size of the segments that are
> >> produced is consistent with that. As a result, we need to ensure that any
> >packet overhead is accounted for in the segment size, before we
> >> can determine how much space remains for data.
> >>
> >> Hope this makes sense.
> >>
> >> Thanks,
> >> Mark
> >>
> >> >
> >> >Exactly, applications can set 1514 to gso_segsz instead of 1518, if the
> >lower
> >> >layer
> >> >will add CRC to the packet.
> >> >
> >> >Jiayu
> >> >
> >> >> Konstantin
> >> >>
> >> >> >
> >> >> > If I've missed anything, please do let me know!
> >> >> >
> >> >> > Thanks,
> >> >> > Mark
> >> >> >
> >> >> > >
> >> >> > >Konstantin

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  8:47               ` Ananyev, Konstantin
@ 2017-09-14  9:29                 ` Hu, Jiayu
  2017-09-14  9:35                   ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-14  9:29 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Thursday, September 14, 2017 4:47 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> Hi Jiayu,
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Thursday, September 14, 2017 7:07 AM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng <jianfeng.tan@intel.com>
> > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > Hi Konstantin,
> >
> > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
> > >
> > > Hi Jiayu,
> > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Ananyev, Konstantin
> > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>
> > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > >
> > > > > > > result, when all of its GSOed segments are freed, the packet is
> freed
> > > > > > > automatically.
> > > > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > > > index dda50ee..95f6ea6 100644
> > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > > @@ -33,18 +33,53 @@
> > > > > > >
> > > > > > >  #include <errno.h>
> > > > > > >
> > > > > > > +#include <rte_log.h>
> > > > > > > +
> > > > > > >  #include "rte_gso.h"
> > > > > > > +#include "gso_common.h"
> > > > > > > +#include "gso_tcp4.h"
> > > > > > >
> > > > > > >  int
> > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > > > > > +		struct rte_gso_ctx gso_ctx,
> > > > > > >  		struct rte_mbuf **pkts_out,
> > > > > > >  		uint16_t nb_pkts_out)
> > > > > > >  {
> > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > > > > > +	struct rte_mbuf *pkt_seg;
> > > > > > > +	uint16_t gso_size;
> > > > > > > +	uint8_t ipid_delta;
> > > > > > > +	int ret = 1;
> > > > > > > +
> > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > > > > > >  		return -EINVAL;
> > > > > > >
> > > > > > > -	pkts_out[0] = pkt;
> > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > > > > > > +			pkt->packet_type) {
> > > > > > > +		pkts_out[0] = pkt;
> > > > > > > +		return ret;
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	direct_pool = gso_ctx.direct_pool;
> > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > > > > +	gso_size = gso_ctx.gso_size;
> > > > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > > > > > > +
> > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > > > >
> > > > > > Probably we need here:
> > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types &
> DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > > >
> > > > > Sorry, actually it probably should be:
> > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4
> &&
> > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > >
> > > > I don't quite understand why the GSO library should be aware if the TSO
> > > > flag is set or not. Applications can query device TSO capability before
> > > > they call the GSO library. Do I misunderstand anything?
> > > >
> > > > Additionally, we don't need to check if the packet is a TCP/IPv4 packet
> here?
> > >
> > > Well, right now the PMD doesn't rely on ptype to figure out what type of
> packet and
> > > what TX offload have to be performed.
> > > Instead it looks at TX part of ol_flags, and
> > > My thought was that as what we doing is actually TSO in SW, it would be
> good
> > > to use the same API here too.
> > > Also with that approach, by setting ol_flags properly user can use the
> same gso_ctx and still
> > > specify what segmentation to perform on a per-packet basis.
> > >
> > > Alternative way is to rely on ptype to distinguish should segmentation be
> performed on that package or not.
> > > The only advantage I see here is that if someone would like to add GSO
> for some new protocol,
> > > he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
> > > Though he still would need to update TX_OFFLOAD_* capabilities and
> probably packet_type definitions.
> > >
> > > So from my perspective first variant (use HW TSO API) is more plausible.
> > > Wonder what is your and Mark opinions here?
> >
> > In the first choice, you mean:
> > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a
> specific GSO
> > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for
> each input packet.
> > Applications should parse the packet type, and set an exactly correct
> DEV_TX_OFFLOAD_*_TSO
> > flag to gso_types and ol_flags according to the packet type. That is, the
> value of gso_types
> > is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags at
> the same time
> > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and the
> inner L4 type, and
> > we need to know L3 type by ol_flags. With this design, HW segmentation
> and SW segmentation
> > are indeed consistent.
> >
> > If I understand it correctly, applications need to set 'ol_flags =
> PKT_TX_IPV4' and
> > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> "ether+ipv4+udp+vxlan+ether+ipv4+
> > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type for
> tunneled packet.
> > How about the outer L3 type? Always assume the inner and the outer L3
> type are the same?
> 
> I think that for that case you'll have to set in ol_flags:
> 
> PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
> PKT_TX_TCP_SEG

OK, so PKT_TX_TCP_SEG is also used for tunneled TSO. In that case, the
GSO library doesn't need gso_types anymore.

The first choice makes HW and SW segmentation fully consistent.
Applications just need to parse the packet and set the proper ol_flags, and
the GSO library uses ol_flags to decide which segmentation function to use.
I think it's better than the second choice, which depends on ptype to
choose the segmentation function.

Jiayu
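
A rough sketch of what such ol_flags-based selection could look like follows.
It is plain C with stand-in flag bits (the EX_TX_* names and their values are
made up for illustration; real code would use the PKT_TX_* definitions from
rte_mbuf.h), and ex_classify() is a hypothetical helper, not part of the patch.
For brevity the tunnel type is modelled as a single bit.

```c
#include <stdint.h>

/* Stand-in flag bits, for illustration only (not the real PKT_TX_* values). */
#define EX_TX_TCP_SEG      (1ULL << 0)
#define EX_TX_IPV4         (1ULL << 1)
#define EX_TX_OUTER_IPV4   (1ULL << 2)
#define EX_TX_TUNNEL_VXLAN (1ULL << 3)

enum ex_gso_kind {
	EX_GSO_NONE,       /* pass the packet through unsegmented */
	EX_GSO_TCP4,       /* plain TCP/IPv4 segmentation */
	EX_GSO_VXLAN_TCP4, /* VxLAN with outer IPv4 and inner TCP/IPv4 */
};

/* Pick a segmentation kind from the TX offload flags alone. */
static enum ex_gso_kind
ex_classify(uint64_t ol_flags)
{
	const uint64_t tcp4 = EX_TX_TCP_SEG | EX_TX_IPV4;
	const uint64_t vxlan_tcp4 = tcp4 | EX_TX_OUTER_IPV4 |
		EX_TX_TUNNEL_VXLAN;

	/*
	 * Test the most specific combination first. The masked value is
	 * parenthesised because == binds tighter than & in C.
	 */
	if ((ol_flags & vxlan_tcp4) == vxlan_tcp4)
		return EX_GSO_VXLAN_TCP4;
	if ((ol_flags & tcp4) == tcp4)
		return EX_GSO_TCP4;

	return EX_GSO_NONE;
}
```

A packet carrying only EX_TX_IPV4 (no segmentation request) is classified as
EX_GSO_NONE, so the same context can be shared while the decision is still made
per packet.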
> 
> Konstantin
> 
> >
> > Jiayu
> > > Konstantin

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  9:29                 ` Hu, Jiayu
@ 2017-09-14  9:35                   ` Ananyev, Konstantin
  2017-09-14 10:01                     ` Hu, Jiayu
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-14  9:35 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Kavanagh, Mark B, Tan, Jianfeng



> -----Original Message-----
> From: Hu, Jiayu
> Sent: Thursday, September 14, 2017 10:29 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> Hi Konstantin,
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Thursday, September 14, 2017 4:47 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> > Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > Hi Jiayu,
> >
> > > -----Original Message-----
> > > From: Hu, Jiayu
> > > Sent: Thursday, September 14, 2017 7:07 AM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> > Jianfeng <jianfeng.tan@intel.com>
> > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > > Hi Konstantin,
> > >
> > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
> > > >
> > > > Hi Jiayu,
> > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Ananyev, Konstantin
> > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
> > <jianfeng.tan@intel.com>
> > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > > >
> > > > > > > > result, when all of its GSOed segments are freed, the packet is
> > freed
> > > > > > > > automatically.
> > > > > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > > > > index dda50ee..95f6ea6 100644
> > > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > > > @@ -33,18 +33,53 @@
> > > > > > > >
> > > > > > > >  #include <errno.h>
> > > > > > > >
> > > > > > > > +#include <rte_log.h>
> > > > > > > > +
> > > > > > > >  #include "rte_gso.h"
> > > > > > > > +#include "gso_common.h"
> > > > > > > > +#include "gso_tcp4.h"
> > > > > > > >
> > > > > > > >  int
> > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > > > > > > +		struct rte_gso_ctx gso_ctx,
> > > > > > > >  		struct rte_mbuf **pkts_out,
> > > > > > > >  		uint16_t nb_pkts_out)
> > > > > > > >  {
> > > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > > > > > > +	struct rte_mbuf *pkt_seg;
> > > > > > > > +	uint16_t gso_size;
> > > > > > > > +	uint8_t ipid_delta;
> > > > > > > > +	int ret = 1;
> > > > > > > > +
> > > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > > > > > > >  		return -EINVAL;
> > > > > > > >
> > > > > > > > -	pkts_out[0] = pkt;
> > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > > > > > > > +			pkt->packet_type) {
> > > > > > > > +		pkts_out[0] = pkt;
> > > > > > > > +		return ret;
> > > > > > > > +	}
> > > > > > > > +
> > > > > > > > +	direct_pool = gso_ctx.direct_pool;
> > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > > > > > +	gso_size = gso_ctx.gso_size;
> > > > > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > > > > > > > +
> > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > > > > >
> > > > > > > Probably we need here:
> > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types &
> > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > > > >
> > > > > > Sorry, actually it probably should be:
> > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4
> > &&
> > > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > > >
> > > > > I don't quite understand why the GSO library should be aware if the TSO
> > > > > flag is set or not. Applications can query device TSO capability before
> > > > > they call the GSO library. Do I misunderstand anything?
> > > > >
> > > > > Additionally, we don't need to check if the packet is a TCP/IPv4 packet
> > here?
> > > >
> > > > Well, right now the PMD doesn't rely on ptype to figure out what type of
> > packet and
> > > > what TX offload have to be performed.
> > > > Instead it looks at TX part of ol_flags, and
> > > > My thought was that as what we doing is actually TSO in SW, it would be
> > good
> > > > to use the same API here too.
> > > > Also with that approach, by setting ol_flags properly user can use the
> > same gso_ctx and still
> > > > specify what segmentation to perform on a per-packet basis.
> > > >
> > > > Alternative way is to rely on ptype to distinguish should segmentation be
> > performed on that package or not.
> > > > The only advantage I see here is that if someone would like to add GSO
> > for some new protocol,
> > > > he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
> > > > Though he still would need to update TX_OFFLOAD_* capabilities and
> > probably packet_type definitions.
> > > >
> > > > So from my perspective first variant (use HW TSO API) is more plausible.
> > > > Wonder what is your and Mark opinions here?
> > >
> > > In the first choice, you mean:
> > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a
> > specific GSO
> > > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for
> > each input packet.
> > > Applications should parse the packet type, and set an exactly correct
> > DEV_TX_OFFLOAD_*_TSO
> > > flag to gso_types and ol_flags according to the packet type. That is, the
> > value of gso_types
> > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags at
> > the same time
> > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and the
> > inner L4 type, and
> > > we need to know L3 type by ol_flags. With this design, HW segmentation
> > and SW segmentation
> > > are indeed consistent.
> > >
> > > If I understand it correctly, applications need to set 'ol_flags =
> > PKT_TX_IPV4' and
> > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> > "ether+ipv4+udp+vxlan+ether+ipv4+
> > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type for
> > tunneled packet.
> > > How about the outer L3 type? Always assume the inner and the outer L3
> > type are the same?
> >
> > I think that for that case you'll have to set in ol_flags:
> >
> > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
> > PKT_TX_TCP_SEG
> 
> OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO. The
> GSO library doesn't need gso_types anymore.

You might still need gso_ctx.gso_types to let the user limit which types of
segmentation that particular gso_ctx supports.
An alternative would be to assume that each gso_ctx supports all
currently implemented segmentations.
That is possible too, but probably not very convenient for the user.
Konstantin
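
As a sketch of the first option, a per-context gso_types mask could gate
segmentation as shown below. This is plain C under stated assumptions: the
EX_OFFLOAD_* bits stand in for the DEV_TX_OFFLOAD_*_TSO capability flags, and
struct ex_gso_ctx and ex_gso_type_enabled() are hypothetical, reduced to the
one field under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in capability bits, for illustration only. */
#define EX_OFFLOAD_TCP_TSO       (1u << 0)
#define EX_OFFLOAD_VXLAN_TNL_TSO (1u << 1)
#define EX_OFFLOAD_GRE_TNL_TSO   (1u << 2)

/* Hypothetical, reduced context: only the field discussed here. */
struct ex_gso_ctx {
	uint32_t gso_types; /* segmentation types this context allows */
};

/* True only if every type in 'required' is enabled in the context. */
static bool
ex_gso_type_enabled(const struct ex_gso_ctx *ctx, uint32_t required)
{
	return required != 0 && (ctx->gso_types & required) == required;
}
```

A context created with only EX_OFFLOAD_TCP_TSO set would then segment plain
TCP/IPv4 packets and pass VxLAN or GRE packets through untouched, whatever
their ol_flags request.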

> 
> The first choice makes HW and SW segmentation are totally the same.
> Applications just need to parse the packet and set proper ol_flags, and
> the GSO library uses ol_flags to decide which segmentation function to use.
> I think it's better than the second choice which depending on ptype to
> choose segmentation function.
> 
> Jiayu
> >
> > Konstantin
> >
> > >
> > > Jiayu
> > > > Konstantin

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  9:10                         ` Ananyev, Konstantin
@ 2017-09-14  9:35                           ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-09-14  9:35 UTC (permalink / raw)
  To: Ananyev, Konstantin, Hu, Jiayu; +Cc: dev, Tan, Jianfeng

>From: Ananyev, Konstantin
>Sent: Thursday, September 14, 2017 10:11 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
><jiayu.hu@intel.com>
>Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Thursday, September 14, 2017 10:01 AM
>> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu
><jiayu.hu@intel.com>
>> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>>
>> >From: Ananyev, Konstantin
>> >Sent: Thursday, September 14, 2017 9:40 AM
>> >To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
>> ><jiayu.hu@intel.com>
>> >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: Kavanagh, Mark B
>> >> Sent: Thursday, September 14, 2017 9:35 AM
>> >> To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
>> ><konstantin.ananyev@intel.com>
>> >> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >>
>> >> >From: Hu, Jiayu
>> >> >Sent: Thursday, September 14, 2017 2:00 AM
>> >> >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
>> >> ><mark.b.kavanagh@intel.com>
>> >> >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> >
>> >> >Hi Konstantin,
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Ananyev, Konstantin
>> >> >> Sent: Wednesday, September 13, 2017 11:13 PM
>> >> >> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
>> >> >> <jiayu.hu@intel.com>
>> >> >> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> >>
>> >> >> Hi Mark,
>> >> >>
>> >> >> > -----Original Message-----
>> >> >> > From: Kavanagh, Mark B
>> >> >> > Sent: Wednesday, September 13, 2017 3:52 PM
>> >> >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu
>> >> >> <jiayu.hu@intel.com>
>> >> >> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>> >> >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> >> >
>> >> >> > >From: Ananyev, Konstantin
>> >> >> > >Sent: Wednesday, September 13, 2017 10:38 AM
>> >> >> > >To: Hu, Jiayu <jiayu.hu@intel.com>
>> >> >> > >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
>> >> >> Tan, Jianfeng
>> >> >> > ><jianfeng.tan@intel.com>
>> >> >> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > >> > > +
>> >> >> > >> > > +int
>> >> >> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
>> >> >> > >> > > +		uint16_t gso_size,
>> >> >> > >> > > +		uint8_t ipid_delta,
>> >> >> > >> > > +		struct rte_mempool *direct_pool,
>> >> >> > >> > > +		struct rte_mempool *indirect_pool,
>> >> >> > >> > > +		struct rte_mbuf **pkts_out,
>> >> >> > >> > > +		uint16_t nb_pkts_out)
>> >> >> > >> > > +{
>> >> >> > >> > > +	struct ipv4_hdr *ipv4_hdr;
>> >> >> > >> > > +	uint16_t tcp_dl;
>> >> >> > >> > > +	uint16_t pyld_unit_size;
>> >> >> > >> > > +	uint16_t hdr_offset;
>> >> >> > >> > > +	int ret = 1;
>> >> >> > >> > > +
>> >> >> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt,
>char *)
>> >> >> +
>> >> >> > >> > > +			pkt->l2_len);
>> >> >> > >> > > +	/* Don't process the fragmented packet */
>> >> >> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset &
>rte_cpu_to_be_16(
>> >> >> > >> > > +
>> >> >> 	IPV4_HDR_DF_MASK)) == 0)) {
>> >> >> > >> >
>> >> >> > >> >
>> >> >> > >> > It is not a check for fragmented packet - it is a check that
>> >> >> fragmentation
>> >> >> > >is allowed for that packet.
>> >> >> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
>> >> >> > >
>> >> >> > >DF bit doesn't indicate is packet fragmented or not.
>> >> >> > >It forbids to fragment packet any further.
>> >> >> > >To check is packet already fragmented or not, you have to check MF
>bit
>> >> >> and
>> >> >> > >frag_offset.
>> >> >> > >Both have to be zero for un-fragmented packets.
>> >> >> > >
>> >> >> > >>
>> >> >> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF
>> >bit.
>> >> >> It's a
>> >> >> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian
>> >order.
>> >> >> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
>> >> >> > >rte_cpu_to_be_16(
>> >> >> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is
>> >fragmented.
>> >> >> > >>
>> >> >> > >> >
>> >> >> > >> > > +		pkts_out[0] = pkt;
>> >> >> > >> > > +		return ret;
>> >> >> > >> > > +	}
>> >> >> > >> > > +
>> >> >> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt-
>> >> >> >l3_len -
>> >> >> > >> > > +		pkt->l4_len;
>> >> >> > >> >
>> >> >> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len -
>pkt_l4_len?
>> >> >> > >>
>> >> >> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len -
>pkt_l4_len
>> >> >here.
>> >> >> > >>
>> >> >> > >> >
>> >> >> > >> > > +	/* Don't process the packet without data */
>> >> >> > >> > > +	if (unlikely(tcp_dl == 0)) {
>> >> >> > >> > > +		pkts_out[0] = pkt;
>> >> >> > >> > > +		return ret;
>> >> >> > >> > > +	}
>> >> >> > >> > > +
>> >> >> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> >> >> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
>> >> >> > >> >
>> >> >> > >> > Hmm, why do we need to count CRC_LEN here?
>> >> >> > >>
>> >> >> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
>> >> >> > >> included in gso_size.
>> >> >> > >
>> >> >> > >Why?
>> >> >> > >What is the point to account crc len into this computation?
>> >> >> > >Why not just assume that gso_size is already a max_frame_size -
>> >crc_len
>> >> >> > >As I remember, when we RX packet crc bytes will be already
>stripped,
>> >> >> > >when user populates the packet, he doesn't care about crc bytes
>too.
>> >> >> >
>> >> >> > Hi Konstantin,
>> >> >> >
>> >> >> > When packet is tx'd, the 4B for CRC are added back into the packet;
>if
>> >the
>> >> >> payload is already at max capacity, then the actual segment size
>> >> >> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
>> >> >> > To prevent that from happening, we account for the CRC len in this
>> >> >> calculation.
>> >> >>
>> >> >>
>> >> >> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether
>frame
>> >> >> size without crc bytes */
>> >> >> ?
>> >>
>> >> Hey Konstantin,
>> >>
>> >> If the user sets the gso_size to 1514, the resultant output segments'
>size
>> >should be 1514, and not 1518.
>>
>> Just to clarify - I meant here that the final output segment, including CRC
>len, should be 1514. I think this is where we're crossing wires ;)
>>
>> >
>> >Yes and then NIC HW will add CRC bytes for you.
>> >You are not filling CRC bytes in HW, and when providing to the HW size to
>send
>> >- it is a payload size
>> >(CRC bytes are not accounted).
>> >Konstantin
>>
>> Yes, exactly - in that case though, the gso_size specified by the user is
>not the actual final output segment size, but (segment size - 4B),
>> right?
>
>CRC bytes will be add by HW, it is totally transparent for user.

Yes - I completely agree/understand.

>
>>
>> We can set that expectation in documentation, but from an
>application's/user's perspective, do you think that this might be
>> confusing/misleading?
>
>I think it would be much more confusing to make user account for CRC bytes.
>Let say when in DPDK you form a packet and send it out via rte_eth_tx_burst()
>you specify only your payload size, not payload size plus crc bytes that HW
>will add for you.
>Konstantin

I guess I've just been looking at it from a different perspective (i.e. the user wants to decide the final total packet size); using the example of rte_eth_tx_burst above, I see where you're coming from though.
Thanks for clarifying,
Mark

>
>>
>> Thanks again,
>> Mark
>>
>> >
>> > Consequently, the payload capacity
>> >> of each segment would be reduced accordingly.
>> >> The user only cares about the output segment size (i.e.
>gso_ctx.gso_size);
>> >we need to ensure that the size of the segments that are
>> >> produced is consistent with that. As a result, we need to ensure that any
>> >packet overhead is accounted for in the segment size, before we
>> >> can determine how much space remains for data.
>> >>
>> >> Hope this makes sense.
>> >>
>> >> Thanks,
>> >> Mark
>> >>
>> >> >
>> >> >Exactly, applications can set 1514 to gso_segsz instead of 1518, if the
>> >lower
>> >> >layer
>> >> >will add CRC to the packet.
>> >> >
>> >> >Jiayu
>> >> >
>> >> >> Konstantin
>> >> >>
>> >> >> >
>> >> >> > If I've missed anything, please do let me know!
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Mark
>> >> >> >
>> >> >> > >
>> >> >> > >Konstantin

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  8:51               ` Kavanagh, Mark B
@ 2017-09-14  9:45                 ` Hu, Jiayu
  0 siblings, 0 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-14  9:45 UTC (permalink / raw)
  To: Kavanagh, Mark B, Ananyev, Konstantin; +Cc: dev, Tan, Jianfeng

Hi Mark,

> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, September 14, 2017 4:52 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> >From: Hu, Jiayu
> >Sent: Thursday, September 14, 2017 7:07 AM
> >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> >Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng
> ><jianfeng.tan@intel.com>
> >Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> >Hi Konstantin,
> >
> >On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
> >>
> >> Hi Jiayu,
> >>
> >> > >
> >> > >
> >> > > > -----Original Message-----
> >> > > > From: Ananyev, Konstantin
> >> > > > Sent: Tuesday, September 12, 2017 12:18 PM
> >> > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> >> > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan, Jianfeng
> ><jianfeng.tan@intel.com>
> >> > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> > > >
> >> > > > > result, when all of its GSOed segments are freed, the packet is
> >freed
> >> > > > > automatically.
> >> > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> >> > > > > index dda50ee..95f6ea6 100644
> >> > > > > --- a/lib/librte_gso/rte_gso.c
> >> > > > > +++ b/lib/librte_gso/rte_gso.c
> >> > > > > @@ -33,18 +33,53 @@
> >> > > > >
> >> > > > >  #include <errno.h>
> >> > > > >
> >> > > > > +#include <rte_log.h>
> >> > > > > +
> >> > > > >  #include "rte_gso.h"
> >> > > > > +#include "gso_common.h"
> >> > > > > +#include "gso_tcp4.h"
> >> > > > >
> >> > > > >  int
> >> > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> >> > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> >> > > > > +		struct rte_gso_ctx gso_ctx,
> >> > > > >  		struct rte_mbuf **pkts_out,
> >> > > > >  		uint16_t nb_pkts_out)
> >> > > > >  {
> >> > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> >> > > > > +	struct rte_mbuf *pkt_seg;
> >> > > > > +	uint16_t gso_size;
> >> > > > > +	uint8_t ipid_delta;
> >> > > > > +	int ret = 1;
> >> > > > > +
> >> > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> >> > > > >  		return -EINVAL;
> >> > > > >
> >> > > > > -	pkts_out[0] = pkt;
> >> > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> >> > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> >> > > > > +			pkt->packet_type) {
> >> > > > > +		pkts_out[0] = pkt;
> >> > > > > +		return ret;
> >> > > > > +	}
> >> > > > > +
> >> > > > > +	direct_pool = gso_ctx.direct_pool;
> >> > > > > +	indirect_pool = gso_ctx.indirect_pool;
> >> > > > > +	gso_size = gso_ctx.gso_size;
> >> > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> >> > > > > +
> >> > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> >> > > >
> >> > > > Probably we need here:
> >> > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types &
> >DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> >> > >
> >> > > Sorry, actually it probably should be:
> >> > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) == PKT_TX_IPV4
> &&
> >> > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> >> >
> >> > I don't quite understand why the GSO library should be aware if the TSO
> >> > flag is set or not. Applications can query device TSO capability before
> >> > they call the GSO library. Do I misunderstand anything?
> >> >
> >> > Additionally, we don't need to check if the packet is a TCP/IPv4 packet
> >here?
> >>
> >> Well, right now the PMD doesn't rely on ptype to figure out what type of
> >packet and
> >> what TX offload have to be performed.
> >> Instead it looks at TX part of ol_flags, and
> >> My thought was that as what we doing is actually TSO in SW, it would be
> good
> >> to use the same API here too.
> >> Also with that approach, by setting ol_flags properly user can use the
> same
> >gso_ctx and still
> >> specify what segmentation to perform on a per-packet basis.
> >>
> >> Alternative way is to rely on ptype to distinguish should segmentation be
> >performed on that package or not.
> >> The only advantage I see here is that if someone would like to add GSO
> for
> >some new protocol,
> >> he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
> >> Though he still would need to update TX_OFFLOAD_* capabilities and
> probably
> >packet_type definitions.
> >>
> >> So from my perspective first variant (use HW TSO API) is more plausible.
> >> Wonder what is your and Mark opinions here?
> >
> >In the first choice, you mean:
> >the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a
> specific
> >GSO
> >segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for
> each
> >input packet.
> >Applications should parse the packet type, and set an exactly correct
> >DEV_TX_OFFLOAD_*_TSO
> >flag to gso_types and ol_flags according to the packet type. That is, the
> >value of gso_types
> >is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags at
> the
> >same time
> >is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and the
> inner
> >L4 type, and
> >we need to know L3 type by ol_flags. With this design, HW segmentation
> and SW
> >segmentation
> >are indeed consistent.
> >
> >If I understand it correctly, applications need to set 'ol_flags =
> >PKT_TX_IPV4' and
> >'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> >"ether+ipv4+udp+vxlan+ether+ipv4+
> >tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type for
> >tunneled packet.
> >How about the outer L3 type? Always assume the inner and the outer L3
> type are
> >the same?
> 
> Hi Jiayu,
> 
> If I'm not mistaken, I think what Konstantin is suggesting is as follows:
> 
> - The DEV_TX_OFFLOAD_*_TSO flags are currently used to describe a NIC's
> TSO capabilities; the GSO capabilities may also be described using the same
> macros, to provide a consistent view of segmentation capabilities across the
> HW and SW implementations.

Yes, the DEV_TX_OFFLOAD_*_TSO flags stored in gso_types are used by applications
to tell the GSO library which GSO types are required. The GSO library uses ol_flags
to decide which segmentation function to use.
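
A minimal sketch of that split, under stated assumptions: gso_types limits
what the context allows, and ol_flags selects the segmentation function per
packet. The flag values below are placeholders rather than the real
definitions from rte_mbuf.h and rte_ethdev.h, the tunnel type is simplified
to a single bit, and gso_select() and enum gso_action are hypothetical names
used only for illustration. Note that in C `==` binds more tightly than `&`,
so the mask test has to be parenthesized explicitly.

```c
#include <stdint.h>

/* Placeholder values for illustration only; the real macros are
 * defined in rte_mbuf.h / rte_ethdev.h with different values. */
#define PKT_TX_TCP_SEG               (1ULL << 0)
#define PKT_TX_IPV4                  (1ULL << 1)
#define PKT_TX_OUTER_IPV4            (1ULL << 2)
#define PKT_TX_TUNNEL_VXLAN          (1ULL << 3)

#define DEV_TX_OFFLOAD_TCP_TSO       (1u << 0)
#define DEV_TX_OFFLOAD_VXLAN_TNL_TSO (1u << 1)

/* Hypothetical result type: which segmentation routine to call. */
enum gso_action { GSO_NONE, GSO_TCP4, GSO_VXLAN_TCP4 };

static enum gso_action
gso_select(uint64_t ol_flags, uint32_t gso_types)
{
	const uint64_t tcp4 = PKT_TX_TCP_SEG | PKT_TX_IPV4;
	const uint64_t vxlan_tcp4 = tcp4 | PKT_TX_OUTER_IPV4 |
			PKT_TX_TUNNEL_VXLAN;
	/* Look only at the bits this sketch knows about. */
	const uint64_t flags = ol_flags & vxlan_tcp4;

	/* Tunneled TCP/IPv4: all four flags set, and the context allows it. */
	if (flags == vxlan_tcp4 &&
			(gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO) != 0)
		return GSO_VXLAN_TCP4;

	/* Plain TCP/IPv4: no tunnel bits set, and the context allows it. */
	if (flags == tcp4 && (gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0)
		return GSO_TCP4;

	/* Anything else is passed through unsegmented. */
	return GSO_NONE;
}
```

With this shape, a packet that carries only PKT_TX_IPV4 (no PKT_TX_TCP_SEG)
is left alone, and a VxLAN packet is left alone when the context enables
only DEV_TX_OFFLOAD_TCP_TSO.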

Thanks,
Jiayu
> 
> - As part of segmentation, it's still a case of checking the packet type, but
> then setting the appropriate ol_flags in the mbuf, which the GSO library can
> use to segment the packet.
> 
> Thanks,
> Mark
> 
> >
> >Jiayu
> >> Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14  9:35                   ` Ananyev, Konstantin
@ 2017-09-14 10:01                     ` Hu, Jiayu
  2017-09-14 15:42                       ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-14 10:01 UTC (permalink / raw)
  To: Ananyev, Konstantin, Kavanagh, Mark B; +Cc: dev, Tan, Jianfeng

Hi Konstantin and Mark,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Thursday, September 14, 2017 5:36 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> 
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Thursday, September 14, 2017 10:29 AM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > Hi Konstantin,
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Thursday, September 14, 2017 4:47 PM
> > > To: Hu, Jiayu <jiayu.hu@intel.com>
> > > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> Tan,
> > > Jianfeng <jianfeng.tan@intel.com>
> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > > Hi Jiayu,
> > >
> > > > -----Original Message-----
> > > > From: Hu, Jiayu
> > > > Sent: Thursday, September 14, 2017 7:07 AM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> Tan,
> > > Jianfeng <jianfeng.tan@intel.com>
> > > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >
> > > > Hi Konstantin,
> > > >
> > > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
> > > > >
> > > > > Hi Jiayu,
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Ananyev, Konstantin
> > > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> Jianfeng
> > > <jianfeng.tan@intel.com>
> > > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > > > >
> > > > > > > > > result, when all of its GSOed segments are freed, the packet is
> > > freed
> > > > > > > > > automatically.
> > > > > > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > > > > > index dda50ee..95f6ea6 100644
> > > > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > > > > @@ -33,18 +33,53 @@
> > > > > > > > >
> > > > > > > > >  #include <errno.h>
> > > > > > > > >
> > > > > > > > > +#include <rte_log.h>
> > > > > > > > > +
> > > > > > > > >  #include "rte_gso.h"
> > > > > > > > > +#include "gso_common.h"
> > > > > > > > > +#include "gso_tcp4.h"
> > > > > > > > >
> > > > > > > > >  int
> > > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > > > > > > > +		struct rte_gso_ctx gso_ctx,
> > > > > > > > >  		struct rte_mbuf **pkts_out,
> > > > > > > > >  		uint16_t nb_pkts_out)
> > > > > > > > >  {
> > > > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > > > > > > > +	struct rte_mbuf *pkt_seg;
> > > > > > > > > +	uint16_t gso_size;
> > > > > > > > > +	uint8_t ipid_delta;
> > > > > > > > > +	int ret = 1;
> > > > > > > > > +
> > > > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> > > > > > > > >  		return -EINVAL;
> > > > > > > > >
> > > > > > > > > -	pkts_out[0] = pkt;
> > > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > > > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> > > > > > > > > +			pkt->packet_type) {
> > > > > > > > > +		pkts_out[0] = pkt;
> > > > > > > > > +		return ret;
> > > > > > > > > +	}
> > > > > > > > > +
> > > > > > > > > +	direct_pool = gso_ctx.direct_pool;
> > > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > > > > > > +	gso_size = gso_ctx.gso_size;
> > > > > > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> > > > > > > > > +
> > > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > > > > > >
> > > > > > > > Probably we need here:
> > > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types &
> > > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > > > > >
> > > > > > > Sorry, actually it probably should be:
> > > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) ==
> PKT_TX_IPV4
> > > &&
> > > > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > > > >
> > > > > > I don't quite understand why the GSO library should be aware if the
> TSO
> > > > > > flag is set or not. Applications can query device TSO capability
> before
> > > > > > they call the GSO library. Do I misundertsand anything?
> > > > > >
> > > > > > Additionally, we don't need to check if the packet is a TCP/IPv4
> packet
> > > here?
> > > > >
> > > > > Well, right now  PMD we doesn't rely on ptype to figure out what type
> of
> > > packet and
> > > > > what TX offload have to be performed.
> > > > > Instead it looks at TX part of ol_flags, and
> > > > > My thought was that as what we doing is actually TSO in SW, it would
> be
> > > good
> > > > > to use the same API here too.
> > > > > Also with that approach, by setting ol_flags properly user can use the
> > > same gso_ctx and still
> > > > > specify what segmentation to perform on a per-packet basis.
> > > > >
> > > > > Alternative way is to rely on ptype to distinguish should segmentation
> be
> > > performed on that package or not.
> > > > > The only advantage I see here is that if someone would like to add
> GSO
> > > for some new protocol,
> > > > > he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
> > > > > Though he still would need to update TX_OFFLOAD_* capabilities and
> > > probably packet_type definitions.
> > > > >
> > > > > So from my perspective first variant (use HW TSO API) is more
> plausible.
> > > > > Wonder what is your and Mark opinions here?
> > > >
> > > > In the first choice, you mean:
> > > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a
> > > specific GSO
> > > > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for
> > > each input packet.
> > > > Applications should parse the packet type, and set an exactly correct
> > > DEV_TX_OFFLOAD_*_TSO
> > > > flag to gso_types and ol_flags according to the packet type. That is, the
> > > value of gso_types
> > > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags
> at
> > > the same time
> > > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and
> the
> > > inner L4 type, and
> > > > we need to know L3 type by ol_flags. With this design, HW
> segmentation
> > > and SW segmentation
> > > > are indeed consistent.
> > > >
> > > > If I understand it correctly, applications need to set 'ol_flags =
> > > PKT_TX_IPV4' and
> > > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> > > "ether+ipv4+udp+vxlan+ether+ipv4+
> > > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type for
> > > tunneled packet.
> > > > How about the outer L3 type? Always assume the inner and the outer L3
> > > type are the same?
> > >
> > > It think that for that case you'll have to set in ol_flags:
> > >
> > > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
> > > PKT_TX_TCP_SEG
> >
> > OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO. The
> > GSO library doesn't need gso_types anymore.
> 
> You still might need gso_ctx.gso_types to let user limit what types of
> segmentation
> that particular gso_ctx supports.
> An alternative would be to assume that each gso_ctx supports all
> currently implemented segmentations.
> This is possible too, but probably not very convenient to the user.

Hmm, makes sense.

One thing to confirm: should the value of gso_types use DEV_TX_OFFLOAD_*_TSO,
or new macros?

Jiayu
> Konstantin
> 
> >
> > The first choice makes HW and SW segmentation are totally the same.
> > Applications just need to parse the packet and set proper ol_flags, and
> > the GSO library uses ol_flags to decide which segmentation function to use.
> > I think it's better than the second choice which depending on ptype to
> > choose segmentation function.
> >
> > Jiayu
> > >
> > > Konstantin
> > >
> > > >
> > > > Jiayu
> > > > > Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14 10:01                     ` Hu, Jiayu
@ 2017-09-14 15:42                       ` Kavanagh, Mark B
  2017-09-14 18:38                         ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-09-14 15:42 UTC (permalink / raw)
  To: Hu, Jiayu, Ananyev, Konstantin; +Cc: dev, Tan, Jianfeng

>From: Hu, Jiayu
>Sent: Thursday, September 14, 2017 11:01 AM
>To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
>Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>
>Hi Konstantin and Mark,
>
>> -----Original Message-----
>> From: Ananyev, Konstantin
>> Sent: Thursday, September 14, 2017 5:36 PM
>> To: Hu, Jiayu <jiayu.hu@intel.com>
>> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
>> Jianfeng <jianfeng.tan@intel.com>
>> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>>
>>
>>
>> > -----Original Message-----
>> > From: Hu, Jiayu
>> > Sent: Thursday, September 14, 2017 10:29 AM
>> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
>> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
>> Jianfeng <jianfeng.tan@intel.com>
>> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >
>> > Hi Konstantin,
>> >
>> > > -----Original Message-----
>> > > From: Ananyev, Konstantin
>> > > Sent: Thursday, September 14, 2017 4:47 PM
>> > > To: Hu, Jiayu <jiayu.hu@intel.com>
>> > > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
>> Tan,
>> > > Jianfeng <jianfeng.tan@intel.com>
>> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> > >
>> > > Hi Jiayu,
>> > >
>> > > > -----Original Message-----
>> > > > From: Hu, Jiayu
>> > > > Sent: Thursday, September 14, 2017 7:07 AM
>> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
>> > > > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
>> Tan,
>> > > Jianfeng <jianfeng.tan@intel.com>
>> > > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> > > >
>> > > > Hi Konstantin,
>> > > >
>> > > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
>> > > > >
>> > > > > Hi Jiayu,
>> > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > > -----Original Message-----
>> > > > > > > > From: Ananyev, Konstantin
>> > > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
>> > > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
>> > > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
>> Jianfeng
>> > > <jianfeng.tan@intel.com>
>> > > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> > > > > > > >
>> > > > > > > > > result, when all of its GSOed segments are freed, the packet
>is
>> > > freed
>> > > > > > > > > automatically.
>> > > > > > > > > diff --git a/lib/librte_gso/rte_gso.c
>b/lib/librte_gso/rte_gso.c
>> > > > > > > > > index dda50ee..95f6ea6 100644
>> > > > > > > > > --- a/lib/librte_gso/rte_gso.c
>> > > > > > > > > +++ b/lib/librte_gso/rte_gso.c
>> > > > > > > > > @@ -33,18 +33,53 @@
>> > > > > > > > >
>> > > > > > > > >  #include <errno.h>
>> > > > > > > > >
>> > > > > > > > > +#include <rte_log.h>
>> > > > > > > > > +
>> > > > > > > > >  #include "rte_gso.h"
>> > > > > > > > > +#include "gso_common.h"
>> > > > > > > > > +#include "gso_tcp4.h"
>> > > > > > > > >
>> > > > > > > > >  int
>> > > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
>> > > > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
>> > > > > > > > > +		struct rte_gso_ctx gso_ctx,
>> > > > > > > > >  		struct rte_mbuf **pkts_out,
>> > > > > > > > >  		uint16_t nb_pkts_out)
>> > > > > > > > >  {
>> > > > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
>> > > > > > > > > +	struct rte_mbuf *pkt_seg;
>> > > > > > > > > +	uint16_t gso_size;
>> > > > > > > > > +	uint8_t ipid_delta;
>> > > > > > > > > +	int ret = 1;
>> > > > > > > > > +
>> > > > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
>> > > > > > > > >  		return -EINVAL;
>> > > > > > > > >
>> > > > > > > > > -	pkts_out[0] = pkt;
>> > > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
>> > > > > > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
>> > > > > > > > > +			pkt->packet_type) {
>> > > > > > > > > +		pkts_out[0] = pkt;
>> > > > > > > > > +		return ret;
>> > > > > > > > > +	}
>> > > > > > > > > +
>> > > > > > > > > +	direct_pool = gso_ctx.direct_pool;
>> > > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
>> > > > > > > > > +	gso_size = gso_ctx.gso_size;
>> > > > > > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
>> > > > > > > > > +
>> > > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
>> > > > > > > >
>> > > > > > > > Probably we need here:
>> > > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types &
>> > > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
>> > > > > > >
>> > > > > > > Sorry, actually it probably should be:
>> > > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) ==
>> PKT_TX_IPV4
>> > > &&
>> > > > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
>> > > > > >
>> > > > > > I don't quite understand why the GSO library should be aware if
>the
>> TSO
>> > > > > > flag is set or not. Applications can query device TSO capability
>> before
>> > > > > > they call the GSO library. Do I misundertsand anything?
>> > > > > >
>> > > > > > Additionally, we don't need to check if the packet is a TCP/IPv4
>> packet
>> > > here?
>> > > > >
>> > > > > Well, right now  PMD we doesn't rely on ptype to figure out what
>type
>> of
>> > > packet and
>> > > > > what TX offload have to be performed.
>> > > > > Instead it looks at TX part of ol_flags, and
>> > > > > My thought was that as what we doing is actually TSO in SW, it would
>> be
>> > > good
>> > > > > to use the same API here too.
>> > > > > Also with that approach, by setting ol_flags properly user can use
>the
>> > > same gso_ctx and still
>> > > > > specify what segmentation to perform on a per-packet basis.
>> > > > >
>> > > > > Alternative way is to rely on ptype to distinguish should
>segmentation
>> be
>> > > performed on that package or not.
>> > > > > The only advantage I see here is that if someone would like to add
>> GSO
>> > > for some new protocol,
>> > > > > he wouldn't need to introduce new TX flag value for mbuf.ol_flags.
>> > > > > Though he still would need to update TX_OFFLOAD_* capabilities and
>> > > probably packet_type definitions.
>> > > > >
>> > > > > So from my perspective first variant (use HW TSO API) is more
>> plausible.
>> > > > > Wonder what is your and Mark opinions here?
>> > > >
>> > > > In the first choice, you mean:
>> > > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a
>> > > specific GSO
>> > > > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for
>> > > each input packet.
>> > > > Applications should parse the packet type, and set an exactly correct
>> > > DEV_TX_OFFLOAD_*_TSO
>> > > > flag to gso_types and ol_flags according to the packet type. That is,
>the
>> > > value of gso_types
>> > > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags
>> at
>> > > the same time
>> > > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and
>> the
>> > > inner L4 type, and
>> > > > we need to know L3 type by ol_flags. With this design, HW
>> segmentation
>> > > and SW segmentation
>> > > > are indeed consistent.
>> > > >
>> > > > If I understand it correctly, applications need to set 'ol_flags =
>> > > PKT_TX_IPV4' and
>> > > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
>> > > "ether+ipv4+udp+vxlan+ether+ipv4+
>> > > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type
>for
>> > > tunneled packet.
>> > > > How about the outer L3 type? Always assume the inner and the outer L3
>> > > type are the same?
>> > >
>> > > It think that for that case you'll have to set in ol_flags:
>> > >
>> > > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
>> > > PKT_TX_TCP_SEG
>> >
>> > OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO. The
>> > GSO library doesn't need gso_types anymore.
>>
>> You still might need gso_ctx.gso_types to let user limit what types of
>> segmentation
>> that particular gso_ctx supports.
>> An alternative would be to assume that each gso_ctx supports all
>> currently implemented segmentations.
>> This is possible too, but probably not very convenient to the user.
>
>Hmm, make sense.
>
>One thing to confirm: the value of gso_types should be DEV_TX_OFFLOAD_*_TSO,
>or new macros?

Hi Jiayu, Konstantin,

I think that the existing macros are fine, as they provide a consistent view of segmentation capabilities to the application/user.

I was initially concerned that they might be too coarse-grained (i.e. only IPv4 is currently supported, and not IPv6), but as per Konstantin's previous example, the DEV_TX_OFFLOAD_*_TSO macros can be used in concert with the packet type to determine whether a packet should be segmented or not.
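
A rough sketch of the application side of this, assuming the application
has already parsed the headers: it sets the ol_flags combination Konstantin
listed only for the cases currently supported (IPv4, optionally inside
VxLAN) and leaves everything else, such as IPv6, unmarked. The flag values
are placeholders rather than the real rte_mbuf.h definitions, and
struct parsed_hdrs and mark_for_gso() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; the real macros are
 * defined in rte_mbuf.h with different values. */
#define PKT_TX_TCP_SEG      (1ULL << 0)
#define PKT_TX_IPV4         (1ULL << 1)
#define PKT_TX_OUTER_IPV4   (1ULL << 2)
#define PKT_TX_TUNNEL_VXLAN (1ULL << 3)

/* Hypothetical summary of what the application's parser found. */
struct parsed_hdrs {
	bool vxlan;      /* packet is VxLAN-encapsulated */
	bool outer_ipv4; /* outer L3 is IPv4 (meaningful only if vxlan) */
	bool inner_ipv4; /* inner (or only) L3 is IPv4 */
	bool tcp;        /* inner (or only) L4 is TCP */
};

/* Return the ol_flags to set on the mbuf, or 0 to request no GSO. */
static uint64_t
mark_for_gso(const struct parsed_hdrs *h)
{
	uint64_t flags;

	/* Only TCP over IPv4 is segmented; IPv6 stays unmarked. */
	if (!h->tcp || !h->inner_ipv4)
		return 0;

	flags = PKT_TX_TCP_SEG | PKT_TX_IPV4;

	if (h->vxlan) {
		/* An IPv6 outer header is not supported either. */
		if (!h->outer_ipv4)
			return 0;
		flags |= PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN;
	}

	return flags;
}
```

The returned value would be OR-ed into mbuf->ol_flags before the packet is
handed to the GSO library.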

Thanks,
Mark

>
>Jiayu
>> Konstantin
>>
>> >
>> > The first choice makes HW and SW segmentation are totally the same.
>> > Applications just need to parse the packet and set proper ol_flags, and
>> > the GSO library uses ol_flags to decide which segmentation function to
>use.
>> > I think it's better than the second choice which depending on ptype to
>> > choose segmentation function.
>> >
>> > Jiayu
>> > >
>> > > Konstantin
>> > >
>> > > >
>> > > > Jiayu
>> > > > > Konstantin


* Re: [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework
  2017-09-12  2:43     ` [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
  2017-09-12 10:36       ` Ananyev, Konstantin
@ 2017-09-14 18:33       ` Ferruh Yigit
  2017-09-15  1:12         ` Hu, Jiayu
  1 sibling, 1 reply; 157+ messages in thread
From: Ferruh Yigit @ 2017-09-14 18:33 UTC (permalink / raw)
  To: Jiayu Hu, dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan

On 9/12/2017 3:43 AM, Jiayu Hu wrote:
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To enable more flexibility to applications, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. This patch introduces the GSO API framework to DPDK.
> 
> The GSO library provides a segmentation API, rte_gso_segment(), for
> applications. It splits an input packet into small ones in each
> invocation. The GSO library refers to these small packets generated
> by rte_gso_segment() as GSO segments. Each of the newly-created GSO
> segments is organized as a two-segment MBUF, where the first segment is a
> standard MBUF, which stores a copy of packet header, and the second is an
> indirect MBUF which points to a section of data in the input packet.
> rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
> when all GSO segments are freed, the input packet is freed automatically.
> Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
> the driver of the interface which the GSO segments are sent to should
> support to transmit multi-segment packets.
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> ---
>  config/common_base                 |   5 ++
>  lib/Makefile                       |   2 +
>  lib/librte_gso/Makefile            |  49 ++++++++++++++
>  lib/librte_gso/rte_gso.c           |  50 ++++++++++++++
>  lib/librte_gso/rte_gso.h           | 133 +++++++++++++++++++++++++++++++++++++
>  lib/librte_gso/rte_gso_version.map |   7 ++
>  mk/rte.app.mk                      |   1 +

Can you please update the documentation for the new library:

- library documentation: "doc/guides/prog_guide/xxx.rst"
- API documentation: doc/api/doxy-api.conf, doc/api/doxy-api-index.md
- release notes: announce the new library
- release notes: add the new library to the "Shared Library Versions" section

<...>


* Re: [PATCH v3 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-09-12  2:43     ` [PATCH v3 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
@ 2017-09-14 18:33       ` Ferruh Yigit
  2017-09-15  1:13         ` Hu, Jiayu
  0 siblings, 1 reply; 157+ messages in thread
From: Ferruh Yigit @ 2017-09-14 18:33 UTC (permalink / raw)
  To: Jiayu Hu, dev; +Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan

On 9/12/2017 3:43 AM, Jiayu Hu wrote:
> This patch adds GSO support to the csum forwarding engine. Oversized
> packets transmitted over a GSO-enabled port will undergo segmentation
> (with the exception of packet-types unsupported by the GSO library).
> GSO support is disabled by default.
> 
> GSO support may be toggled on a per-port basis, using the command:
> 
>         "set port <port_id> gso on|off"
> 
> The maximum packet length (including the packet header and payload) for
> GSO segments may be set with the command:
> 
>         "set gso segsz <length>"
> 
> Show GSO configuration for a given port with the command:
> 
> 	"show port <port_id> gso"
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> ---
>  app/test-pmd/cmdline.c  | 178 ++++++++++++++++++++++++++++++++++++++++++++++++
>  app/test-pmd/config.c   |  24 +++++++
>  app/test-pmd/csumonly.c | 102 +++++++++++++++++++++++++--
>  app/test-pmd/testpmd.c  |  16 +++++
>  app/test-pmd/testpmd.h  |  10 +++

Can you please update the testpmd documentation (testpmd_funcs.rst) with the new commands?
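
When documenting "set gso segsz <length>", a worked calculation may help
readers pick a value. The sketch below estimates how many GSO segments one
input packet produces, assuming the segment size counts header plus payload
(as the commit message above states) and that every output segment carries
a full copy of the header. gso_nb_segments() is a hypothetical helper, not
part of the proposed API.

```c
#include <stdint.h>

/*
 * Estimate the number of output segments for one input packet.
 * pkt_len:  total input packet length, headers included
 * hdr_len:  length of the headers copied into every segment
 * gso_size: maximum output segment length, headers included
 * Returns 0 when gso_size leaves no room for payload.
 */
static uint32_t
gso_nb_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t payload, per_seg;

	/* Packets that already fit are passed through unchanged. */
	if (pkt_len <= gso_size)
		return 1;

	/* Each segment must hold the header plus at least one byte. */
	if (gso_size <= hdr_len)
		return 0;

	payload = pkt_len - hdr_len;
	per_seg = (uint32_t)gso_size - hdr_len;

	/* Round up: the last segment may be shorter than the rest. */
	return (payload + per_seg - 1) / per_seg;
}
```

For example, a 9014-byte packet with 54 bytes of Ethernet/IPv4/TCP headers
and a segment size of 1514 carries 8960 bytes of payload at 1460 bytes per
segment, which gives 7 segments. If each segment is a two-MBUF chain as
described in patch 1/5, that is 7 direct and 7 indirect MBUFs.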

<...>


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14 15:42                       ` Kavanagh, Mark B
@ 2017-09-14 18:38                         ` Ananyev, Konstantin
  2017-09-15  7:54                           ` Hu, Jiayu
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-14 18:38 UTC (permalink / raw)
  To: Kavanagh, Mark B, Hu, Jiayu; +Cc: dev, Tan, Jianfeng



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, September 14, 2017 4:42 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> >From: Hu, Jiayu
> >Sent: Thursday, September 14, 2017 11:01 AM
> >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B
> ><mark.b.kavanagh@intel.com>
> >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> >Hi Konstantin and Mark,
> >
> >> -----Original Message-----
> >> From: Ananyev, Konstantin
> >> Sent: Thursday, September 14, 2017 5:36 PM
> >> To: Hu, Jiayu <jiayu.hu@intel.com>
> >> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> >> Jianfeng <jianfeng.tan@intel.com>
> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >>
> >>
> >>
> >> > -----Original Message-----
> >> > From: Hu, Jiayu
> >> > Sent: Thursday, September 14, 2017 10:29 AM
> >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> >> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> >> Jianfeng <jianfeng.tan@intel.com>
> >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >
> >> > Hi Konstantin,
> >> >
> >> > > -----Original Message-----
> >> > > From: Ananyev, Konstantin
> >> > > Sent: Thursday, September 14, 2017 4:47 PM
> >> > > To: Hu, Jiayu <jiayu.hu@intel.com>
> >> > > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> >> Tan,
> >> > > Jianfeng <jianfeng.tan@intel.com>
> >> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> > >
> >> > > Hi Jiayu,
> >> > >
> >> > > > -----Original Message-----
> >> > > > From: Hu, Jiayu
> >> > > > Sent: Thursday, September 14, 2017 7:07 AM
> >> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> >> > > > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> >> Tan,
> >> > > Jianfeng <jianfeng.tan@intel.com>
> >> > > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> > > >
> >> > > > Hi Konstantin,
> >> > > >
> >> > > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin wrote:
> >> > > > >
> >> > > > > Hi Jiayu,
> >> > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > > -----Original Message-----
> >> > > > > > > > From: Ananyev, Konstantin
> >> > > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> >> > > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> >> > > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> >> Jianfeng
> >> > > <jianfeng.tan@intel.com>
> >> > > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> > > > > > > >
> >> > > > > > > > > result, when all of its GSOed segments are freed, the packet
> >is
> >> > > freed
> >> > > > > > > > > automatically.
> >> > > > > > > > > diff --git a/lib/librte_gso/rte_gso.c
> >b/lib/librte_gso/rte_gso.c
> >> > > > > > > > > index dda50ee..95f6ea6 100644
> >> > > > > > > > > --- a/lib/librte_gso/rte_gso.c
> >> > > > > > > > > +++ b/lib/librte_gso/rte_gso.c
> >> > > > > > > > > @@ -33,18 +33,53 @@
> >> > > > > > > > >
> >> > > > > > > > >  #include <errno.h>
> >> > > > > > > > >
> >> > > > > > > > > +#include <rte_log.h>
> >> > > > > > > > > +
> >> > > > > > > > >  #include "rte_gso.h"
> >> > > > > > > > > +#include "gso_common.h"
> >> > > > > > > > > +#include "gso_tcp4.h"
> >> > > > > > > > >
> >> > > > > > > > >  int
> >> > > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> >> > > > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> >> > > > > > > > > +		struct rte_gso_ctx gso_ctx,
> >> > > > > > > > >  		struct rte_mbuf **pkts_out,
> >> > > > > > > > >  		uint16_t nb_pkts_out)
> >> > > > > > > > >  {
> >> > > > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> >> > > > > > > > > +	struct rte_mbuf *pkt_seg;
> >> > > > > > > > > +	uint16_t gso_size;
> >> > > > > > > > > +	uint8_t ipid_delta;
> >> > > > > > > > > +	int ret = 1;
> >> > > > > > > > > +
> >> > > > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
> >> > > > > > > > >  		return -EINVAL;
> >> > > > > > > > >
> >> > > > > > > > > -	pkts_out[0] = pkt;
> >> > > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> >> > > > > > > > > +			(pkt->packet_type & gso_ctx.gso_types) !=
> >> > > > > > > > > +			pkt->packet_type) {
> >> > > > > > > > > +		pkts_out[0] = pkt;
> >> > > > > > > > > +		return ret;
> >> > > > > > > > > +	}
> >> > > > > > > > > +
> >> > > > > > > > > +	direct_pool = gso_ctx.direct_pool;
> >> > > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> >> > > > > > > > > +	gso_size = gso_ctx.gso_size;
> >> > > > > > > > > +	ipid_delta = gso_ctx.ipid_flag == RTE_GSO_IPID_INCREASE;
> >> > > > > > > > > +
> >> > > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> >> > > > > > > >
> >> > > > > > > > Probably we need here:
> >> > > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types &
> >> > > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> >> > > > > > >
> >> > > > > > > Sorry, actually it probably should be:
> >> > > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) ==
> >> PKT_TX_IPV4
> >> > > &&
> >> > > > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> >> > > > > >
> >> > > > > > I don't quite understand why the GSO library should be aware if
> >the
> >> TSO
> >> > > > > > flag is set or not. Applications can query device TSO capability
> >> before
> >> > > > > > they call the GSO library. Do I misundertsand anything?
> >> > > > > >
> >> > > > > > Additionally, we don't need to check if the packet is a TCP/IPv4
> >> packet
> >> > > here?
> >> > > > >
> >> > > > > Well, right now the PMD doesn't rely on ptype to figure out the
> >> > > > > packet type and which TX offloads have to be performed.
> >> > > > > Instead it looks at the TX part of ol_flags.
> >> > > > > My thought was that, since what we are doing is actually TSO in SW,
> >> > > > > it would be good to use the same API here too.
> >> > > > > Also, with that approach, by setting ol_flags properly the user can
> >> > > > > use the same gso_ctx and still specify what segmentation to perform
> >> > > > > on a per-packet basis.
> >> > > > >
> >> > > > > The alternative is to rely on ptype to decide whether segmentation
> >> > > > > should be performed on that packet.
> >> > > > > The only advantage I see here is that if someone would like to add
> >> > > > > GSO for some new protocol, he wouldn't need to introduce a new TX
> >> > > > > flag value for mbuf.ol_flags.
> >> > > > > He would still need to update the TX_OFFLOAD_* capabilities and
> >> > > > > probably the packet_type definitions.
> >> > > > >
> >> > > > > So from my perspective the first variant (use the HW TSO API) is
> >> > > > > more plausible.
> >> > > > > What are your and Mark's opinions here?
> >> > > >
> >> > > > In the first choice, you mean:
> >> > > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call a
> >> > > specific GSO
> >> > > > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx()) for
> >> > > each input packet.
> >> > > > Applications should parse the packet type, and set an exactly correct
> >> > > DEV_TX_OFFLOAD_*_TSO
> >> > > > flag to gso_types and ol_flags according to the packet type. That is,
> >the
> >> > > value of gso_types
> >> > > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf->ol_flags
> >> at
> >> > > the same time
> >> > > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type and
> >> the
> >> > > inner L4 type, and
> >> > > > we need to know L3 type by ol_flags. With this design, HW
> >> segmentation
> >> > > and SW segmentation
> >> > > > are indeed consistent.
> >> > > >
> >> > > > If I understand it correctly, applications need to set 'ol_flags =
> >> > > PKT_TX_IPV4' and
> >> > > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> >> > > "ether+ipv4+udp+vxlan+ether+ipv4+
> >> > > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3 type
> >for
> >> > > tunneled packet.
> >> > > > How about the outer L3 type? Always assume the inner and the outer L3
> >> > > type are the same?
> >> > >
> >> > > I think that for that case you'll have to set in ol_flags:
> >> > >
> >> > > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
> >> > > PKT_TX_TCP_SEG
> >> >
> >> > OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO. The
> >> > GSO library doesn't need gso_types anymore.
> >>
> >> You still might need gso_ctx.gso_types to let user limit what types of
> >> segmentation
> >> that particular gso_ctx supports.
> >> An alternative would be to assume that each gso_ctx supports all
> >> currently implemented segmentations.
> >> This is possible too, but probably not very convenient to the user.
> >
> >Hmm, make sense.
> >
> >One thing to confirm: the value of gso_types should be DEV_TX_OFFLOAD_*_TSO,
> >or new macros?
> 
> Hi Jiayu, Konstantin,
> 
> I think that the existing macros are fine, as they provide a consistent view of segmentation capabilities to the application/user.

+1
I also think it is better to re-use DEV_TX_OFFLOAD_*_TSO.
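
To make the combination concrete, here is a minimal sketch of how a
per-context gso_types filter and the per-packet ol_flags might be checked
together. It is only an illustration: the sel_* names and the flag values
are stand-ins defined locally (they are not the rte_mbuf.h or rte_ethdev.h
definitions), and it assumes the intent is that both the TCP_SEG and IPV4
bits must be set on the packet.

```c
#include <stdint.h>

/* Stand-in bit values for illustration only; the real PKT_TX_* and
 * DEV_TX_OFFLOAD_* macros are defined by DPDK and differ from these. */
#define SEL_PKT_TX_TCP_SEG         (1ULL << 0)
#define SEL_PKT_TX_IPV4            (1ULL << 1)
#define SEL_DEV_TX_OFFLOAD_TCP_TSO (1ULL << 0)

/* Hypothetical, cut-down views of the mbuf and the GSO context. */
struct sel_mbuf {
	uint64_t ol_flags;   /* per-packet TX offload request */
};

struct sel_gso_ctx {
	uint64_t gso_types;  /* segmentation types this context allows */
};

/* Return 1 if the packet should go through TCP/IPv4 GSO, 0 otherwise. */
static int
sel_want_tcp4_gso(const struct sel_mbuf *pkt, const struct sel_gso_ctx *ctx)
{
	const uint64_t req = SEL_PKT_TX_TCP_SEG | SEL_PKT_TX_IPV4;

	/* The inner parentheses matter: == binds tighter than & in C. */
	return (pkt->ol_flags & req) == req &&
		(ctx->gso_types & SEL_DEV_TX_OFFLOAD_TCP_TSO) != 0;
}
```

With this shape, one gso_ctx can be shared by many flows: ol_flags selects
the segmentation per packet, while gso_types only limits what the context
is willing to do.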

> 
> I was initially concerned that they might be too coarse-grained (i.e. only IPv4 is currently supported, and not IPv6), but as per Konstantin's
> previous example, the DEV_TX_OFFLOAD_*_TSO macros can be used in concert with the packet type to determine whether a packet should
> be fragmented or not.
> 
> Thanks,
> Mark
> 
> >
> >Jiayu
> >> Konstantin
> >>
> >> >
> >> > The first choice makes HW and SW segmentation are totally the same.
> >> > Applications just need to parse the packet and set proper ol_flags, and
> >> > the GSO library uses ol_flags to decide which segmentation function to
> >use.
> >> > I think it's better than the second choice which depending on ptype to
> >> > choose segmentation function.
> >> >
> >> > Jiayu
> >> > >
> >> > > Konstantin
> >> > >
> >> > > >
> >> > > > Jiayu
> >> > > > > Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework
  2017-09-14 18:33       ` Ferruh Yigit
@ 2017-09-15  1:12         ` Hu, Jiayu
  0 siblings, 0 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-15  1:12 UTC (permalink / raw)
  To: Yigit, Ferruh, dev; +Cc: Ananyev, Konstantin, Kavanagh, Mark B, Tan, Jianfeng

Hi Ferruh,

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, September 15, 2017 2:33 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark
> B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 1/5] gso: add Generic Segmentation
> Offload API framework
> 
> On 9/12/2017 3:43 AM, Jiayu Hu wrote:
> > Generic Segmentation Offload (GSO) is a SW technique to split large
> > packets into small ones. Akin to TSO, GSO enables applications to
> > operate on large packets, thus reducing per-packet processing overhead.
> >
> > To enable more flexibility to applications, DPDK GSO is implemented
> > as a standalone library. Applications explicitly use the GSO library
> > to segment packets. This patch introduces the GSO API framework to DPDK.
> >
> > The GSO library provides a segmentation API, rte_gso_segment(), for
> > applications. It splits an input packet into small ones in each
> > invocation. The GSO library refers to these small packets generated
> > by rte_gso_segment() as GSO segments. Each of the newly-created GSO
> > segments is organized as a two-segment MBUF, where the first segment is a
> > standard MBUF, which stores a copy of packet header, and the second is an
> > indirect MBUF which points to a section of data in the input packet.
> > rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
> > when all GSO segments are freed, the input packet is freed automatically.
> > Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
> > the driver of the interface which the GSO segments are sent to should
> > support to transmit multi-segment packets.
> >
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> >  config/common_base                 |   5 ++
> >  lib/Makefile                       |   2 +
> >  lib/librte_gso/Makefile            |  49 ++++++++++++++
> >  lib/librte_gso/rte_gso.c           |  50 ++++++++++++++
> >  lib/librte_gso/rte_gso.h           | 133
> +++++++++++++++++++++++++++++++++++++
> >  lib/librte_gso/rte_gso_version.map |   7 ++
> >  mk/rte.app.mk                      |   1 +
> 
> Can you please update documentation for new library:
> 
> - library documentation "doc/guides/prog_guide/xxx.rst"
> - api documentation: doc/api/doxy-api.conf, doc/api/doxy-api-index.md
> - release notes update to announce new library
> - release notes, "Shared Library Versions" section with new library.

Thanks for the reminder. I will update them soon.

Jiayu
> 
> <...>

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-09-14 18:33       ` Ferruh Yigit
@ 2017-09-15  1:13         ` Hu, Jiayu
  0 siblings, 0 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-15  1:13 UTC (permalink / raw)
  To: Yigit, Ferruh, dev; +Cc: Ananyev, Konstantin, Kavanagh, Mark B, Tan, Jianfeng

Hi Ferruh,

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, September 15, 2017 2:33 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark
> B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 5/5] app/testpmd: enable TCP/IPv4,
> VxLAN and GRE GSO
> 
> On 9/12/2017 3:43 AM, Jiayu Hu wrote:
> > This patch adds GSO support to the csum forwarding engine. Oversized
> > packets transmitted over a GSO-enabled port will undergo segmentation
> > (with the exception of packet-types unsupported by the GSO library).
> > GSO support is disabled by default.
> >
> > GSO support may be toggled on a per-port basis, using the command:
> >
> >         "set port <port_id> gso on|off"
> >
> > The maximum packet length (including the packet header and payload) for
> > GSO segments may be set with the command:
> >
> >         "set gso segsz <length>"
> >
> > Show GSO configuration for a given port with the command:
> >
> > 	"show port <port_id> gso"
> >
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > ---
> >  app/test-pmd/cmdline.c  | 178
> ++++++++++++++++++++++++++++++++++++++++++++++++
> >  app/test-pmd/config.c   |  24 +++++++
> >  app/test-pmd/csumonly.c | 102 +++++++++++++++++++++++++--
> >  app/test-pmd/testpmd.c  |  16 +++++
> >  app/test-pmd/testpmd.h  |  10 +++
> Can you please update the testpmd document (testpmd_funcs.rst) with the new
> commands?

Thanks, I will update it soon.

Jiayu
> 
> <...>

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-14 18:38                         ` Ananyev, Konstantin
@ 2017-09-15  7:54                           ` Hu, Jiayu
  2017-09-15  8:15                             ` Ananyev, Konstantin
  2017-09-15  8:17                             ` Ananyev, Konstantin
  0 siblings, 2 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-15  7:54 UTC (permalink / raw)
  To: Ananyev, Konstantin, Kavanagh, Mark B; +Cc: dev, Tan, Jianfeng

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Friday, September 15, 2017 2:39 AM
> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> 
> 
> > -----Original Message-----
> > From: Kavanagh, Mark B
> > Sent: Thursday, September 14, 2017 4:42 PM
> > To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > >From: Hu, Jiayu
> > >Sent: Thursday, September 14, 2017 11:01 AM
> > >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh,
> Mark B
> > ><mark.b.kavanagh@intel.com>
> > >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > >Hi Konstantin and Mark,
> > >
> > >> -----Original Message-----
> > >> From: Ananyev, Konstantin
> > >> Sent: Thursday, September 14, 2017 5:36 PM
> > >> To: Hu, Jiayu <jiayu.hu@intel.com>
> > >> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> Tan,
> > >> Jianfeng <jianfeng.tan@intel.com>
> > >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >>
> > >>
> > >>
> > >> > -----Original Message-----
> > >> > From: Hu, Jiayu
> > >> > Sent: Thursday, September 14, 2017 10:29 AM
> > >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > >> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> Tan,
> > >> Jianfeng <jianfeng.tan@intel.com>
> > >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >> >
> > >> > Hi Konstantin,
> > >> >
> > >> > > -----Original Message-----
> > >> > > From: Ananyev, Konstantin
> > >> > > Sent: Thursday, September 14, 2017 4:47 PM
> > >> > > To: Hu, Jiayu <jiayu.hu@intel.com>
> > >> > > Cc: dev@dpdk.org; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>;
> > >> Tan,
> > >> > > Jianfeng <jianfeng.tan@intel.com>
> > >> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >> > >
> > >> > > Hi Jiayu,
> > >> > >
> > >> > > > -----Original Message-----
> > >> > > > From: Hu, Jiayu
> > >> > > > Sent: Thursday, September 14, 2017 7:07 AM
> > >> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > >> > > > Cc: dev@dpdk.org; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>;
> > >> Tan,
> > >> > > Jianfeng <jianfeng.tan@intel.com>
> > >> > > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >> > > >
> > >> > > > Hi Konstantin,
> > >> > > >
> > >> > > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin
> wrote:
> > >> > > > >
> > >> > > > > Hi Jiayu,
> > >> > > > >
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > > -----Original Message-----
> > >> > > > > > > > From: Ananyev, Konstantin
> > >> > > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > >> > > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > >> > > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> > >> Jianfeng
> > >> > > <jianfeng.tan@intel.com>
> > >> > > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >> > > > > > > >
> > >> > > > > > > > > result, when all of its GSOed segments are freed, the
> packet
> > >is
> > >> > > freed
> > >> > > > > > > > > automatically.
> > >> > > > > > > > > diff --git a/lib/librte_gso/rte_gso.c
> > >b/lib/librte_gso/rte_gso.c
> > >> > > > > > > > > index dda50ee..95f6ea6 100644
> > >> > > > > > > > > --- a/lib/librte_gso/rte_gso.c
> > >> > > > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > >> > > > > > > > > @@ -33,18 +33,53 @@
> > >> > > > > > > > >
> > >> > > > > > > > >  #include <errno.h>
> > >> > > > > > > > >
> > >> > > > > > > > > +#include <rte_log.h>
> > >> > > > > > > > > +
> > >> > > > > > > > >  #include "rte_gso.h"
> > >> > > > > > > > > +#include "gso_common.h"
> > >> > > > > > > > > +#include "gso_tcp4.h"
> > >> > > > > > > > >
> > >> > > > > > > > >  int
> > >> > > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > >> > > > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > >> > > > > > > > > +		struct rte_gso_ctx gso_ctx,
> > >> > > > > > > > >  		struct rte_mbuf **pkts_out,
> > >> > > > > > > > >  		uint16_t nb_pkts_out)
> > >> > > > > > > > >  {
> > >> > > > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > >> > > > > > > > > +	struct rte_mbuf *pkt_seg;
> > >> > > > > > > > > +	uint16_t gso_size;
> > >> > > > > > > > > +	uint8_t ipid_delta;
> > >> > > > > > > > > +	int ret = 1;
> > >> > > > > > > > > +
> > >> > > > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out
> < 1)
> > >> > > > > > > > >  		return -EINVAL;
> > >> > > > > > > > >
> > >> > > > > > > > > -	pkts_out[0] = pkt;
> > >> > > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > >> > > > > > > > > +			(pkt->packet_type &
> gso_ctx.gso_types) !=
> > >> > > > > > > > > +			pkt->packet_type) {
> > >> > > > > > > > > +		pkts_out[0] = pkt;
> > >> > > > > > > > > +		return ret;
> > >> > > > > > > > > +	}
> > >> > > > > > > > > +
> > >> > > > > > > > > +	direct_pool = gso_ctx.direct_pool;
> > >> > > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > >> > > > > > > > > +	gso_size = gso_ctx.gso_size;
> > >> > > > > > > > > +	ipid_delta = gso_ctx.ipid_flag ==
> RTE_GSO_IPID_INCREASE;
> > >> > > > > > > > > +
> > >> > > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > >> > > > > > > >
> > >> > > > > > > > Probably we need here:
> > >> > > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types
> &
> > >> > > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > >> > > > > > >
> > >> > > > > > > Sorry, actually it probably should be:
> > >> > > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) ==
> > >> PKT_TX_IPV4
> > >> > > &&
> > >> > > > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0)
> {...
> > >> > > > > >
> > >> > > > > > I don't quite understand why the GSO library should be aware if
> > >the
> > >> TSO
> > >> > > > > > flag is set or not. Applications can query device TSO capability
> > >> before
> > >> > > > > > they call the GSO library. Do I misundertsand anything?
> > >> > > > > >
> > >> > > > > > Additionally, we don't need to check if the packet is a TCP/IPv4
> > >> packet
> > >> > > here?
> > >> > > > >
> > >> > > > > Well, right now  PMD we doesn't rely on ptype to figure out what
> > >type
> > >> of
> > >> > > packet and
> > >> > > > > what TX offload have to be performed.
> > >> > > > > Instead it looks at TX part of ol_flags, and
> > >> > > > > My thought was that as what we doing is actually TSO in SW, it
> would
> > >> be
> > >> > > good
> > >> > > > > to use the same API here too.
> > >> > > > > Also with that approach, by setting ol_flags properly user can use
> > >the
> > >> > > same gso_ctx and still
> > >> > > > > specify what segmentation to perform on a per-packet basis.
> > >> > > > >
> > >> > > > > Alternative way is to rely on ptype to distinguish should
> > >segmentation
> > >> be
> > >> > > performed on that package or not.
> > >> > > > > The only advantage I see here is that if someone would like to
> add
> > >> GSO
> > >> > > for some new protocol,
> > >> > > > > he wouldn't need to introduce new TX flag value for
> mbuf.ol_flags.
> > >> > > > > Though he still would need to update TX_OFFLOAD_* capabilities
> and
> > >> > > probably packet_type definitions.
> > >> > > > >
> > >> > > > > So from my perspective first variant (use HW TSO API) is more
> > >> plausible.
> > >> > > > > Wonder what is your and Mark opinions here?
> > >> > > >
> > >> > > > In the first choice, you mean:
> > >> > > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call
> a
> > >> > > specific GSO
> > >> > > > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx())
> for
> > >> > > each input packet.
> > >> > > > Applications should parse the packet type, and set an exactly
> correct
> > >> > > DEV_TX_OFFLOAD_*_TSO
> > >> > > > flag to gso_types and ol_flags according to the packet type. That is,
> > >the
> > >> > > value of gso_types
> > >> > > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf-
> >ol_flags
> > >> at
> > >> > > the same time
> > >> > > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type
> and
> > >> the
> > >> > > inner L4 type, and
> > >> > > > we need to know L3 type by ol_flags. With this design, HW
> > >> segmentation
> > >> > > and SW segmentation
> > >> > > > are indeed consistent.
> > >> > > >
> > >> > > > If I understand it correctly, applications need to set 'ol_flags =
> > >> > > PKT_TX_IPV4' and
> > >> > > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> > >> > > "ether+ipv4+udp+vxlan+ether+ipv4+
> > >> > > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3
> type
> > >for
> > >> > > tunneled packet.
> > >> > > > How about the outer L3 type? Always assume the inner and the
> outer L3
> > >> > > type are the same?
> > >> > >
> > >> > > It think that for that case you'll have to set in ol_flags:
> > >> > >
> > >> > > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
> > >> > > PKT_TX_TCP_SEG
> > >> >
> > >> > OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO. The
> > >> > GSO library doesn't need gso_types anymore.
> > >>
> > >> You still might need gso_ctx.gso_types to let user limit what types of
> > >> segmentation
> > >> that particular gso_ctx supports.
> > >> An alternative would be to assume that each gso_ctx supports all
> > >> currently implemented segmentations.
> > >> This is possible too, but probably not very convenient to the user.
> > >
> > >Hmm, make sense.
> > >
> > >One thing to confirm: the value of gso_types should be
> DEV_TX_OFFLOAD_*_TSO,
> > >or new macros?
> >
> > Hi Jiayu, Konstantin,
> >
> > I think that the existing macros are fine, as they provide a consistent view
> of segmentation capabilities to the application/user.
> 
> +1
> I also think it is better to re-use DEV_TX_OFFLOAD_*_TSO.

There might be an issue if we use PKT_TX_TCP_SEG to tell the GSO
library whether to segment a packet. Consider an application that
only wants to do GSO and doesn't want to use TSO. The application sets
'mbuf->ol_flags = PKT_TX_TCP_SEG' and doesn't set mbuf->tso_segsz. The
GSO library then segments the packet, and all output GSO segments have
the same ol_flags as the input packet (in the current GSO library
design). The output GSO segments are then passed to
rte_eth_tx_prepare(). If the NIC is i40e, its TX prepare function,
i40e_prep_pkts(), checks that mbuf->tso_segsz is within the range
[I40E_MIN_TSO_MSS, I40E_MAX_TSO_MSS] when PKT_TX_TCP_SEG is set. So an
error occurs in this scenario, since tso_segsz is 0.
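
Roughly, the failing check looks like the sketch below. This is a
simplified illustration and not the i40e driver code: the prep_* names,
the flag value, the MSS bounds and the return convention are all
placeholders chosen for the example.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in flag value; the real PKT_TX_TCP_SEG is defined in rte_mbuf.h. */
#define PREP_PKT_TX_TCP_SEG (1ULL << 0)

/* Placeholder bounds for illustration; consult the driver for the
 * real I40E_MIN_TSO_MSS and I40E_MAX_TSO_MSS values. */
#define PREP_MIN_TSO_MSS 256
#define PREP_MAX_TSO_MSS 9674

/* Hypothetical, cut-down mbuf with only the fields the check needs. */
struct prep_mbuf {
	uint64_t ol_flags;
	uint16_t tso_segsz;
};

/* Return 0 if the packet passes the TSO sanity check, -EINVAL if not. */
static int
prep_check_tso(const struct prep_mbuf *m)
{
	/* Packets that do not request TSO are not checked at all. */
	if ((m->ol_flags & PREP_PKT_TX_TCP_SEG) == 0)
		return 0;

	/* A GSO segment that still carries the flag but has
	 * tso_segsz == 0 is rejected here. */
	if (m->tso_segsz < PREP_MIN_TSO_MSS ||
			m->tso_segsz > PREP_MAX_TSO_MSS)
		return -EINVAL;

	return 0;
}
```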
 
In fact, setting PKT_TX_TCP_SEG without wanting TSO may confuse the PMD.
One solution is for the GSO library to remove the PKT_TX_TCP_SEG flag
from all GSO segments once segmentation finishes. What do you and Mark
think?
 
Thanks,
Jiayu
> 
> >
> > I was initially concerned that they might be too coarse-grained (i.e. only
> IPv4 is currently supported, and not IPv6), but as per Konstantin's
> > previous example, the DEV_TX_OFFLOAD_*_TSO macros can be used in
> concert with the packet type to determine whether a packet should
> > be fragmented or not.
> >
> > Thanks,
> > Mark
> >
> > >
> > >Jiayu
> > >> Konstantin
> > >>
> > >> >
> > >> > The first choice makes HW and SW segmentation are totally the same.
> > >> > Applications just need to parse the packet and set proper ol_flags, and
> > >> > the GSO library uses ol_flags to decide which segmentation function to
> > >use.
> > >> > I think it's better than the second choice which depending on ptype to
> > >> > choose segmentation function.
> > >> >
> > >> > Jiayu
> > >> > >
> > >> > > Konstantin
> > >> > >
> > >> > > >
> > >> > > > Jiayu
> > >> > > > > Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-15  7:54                           ` Hu, Jiayu
@ 2017-09-15  8:15                             ` Ananyev, Konstantin
  2017-09-15  8:17                             ` Ananyev, Konstantin
  1 sibling, 0 replies; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-15  8:15 UTC (permalink / raw)
  To: Hu, Jiayu, Kavanagh, Mark B; +Cc: dev, Tan, Jianfeng

Hi Jiayu,

> -----Original Message-----
> From: Hu, Jiayu
> Sent: Friday, September 15, 2017 8:55 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> Hi Konstantin,
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Friday, September 15, 2017 2:39 AM
> > To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> > <jiayu.hu@intel.com>
> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> >
> >
> > > -----Original Message-----
> > > From: Kavanagh, Mark B
> > > Sent: Thursday, September 14, 2017 4:42 PM
> > > To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>
> > > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > > >From: Hu, Jiayu
> > > >Sent: Thursday, September 14, 2017 11:01 AM
> > > >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh,
> > Mark B
> > > ><mark.b.kavanagh@intel.com>
> > > >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >
> > > >Hi Konstantin and Mark,
> > > >
> > > >> -----Original Message-----
> > > >> From: Ananyev, Konstantin
> > > >> Sent: Thursday, September 14, 2017 5:36 PM
> > > >> To: Hu, Jiayu <jiayu.hu@intel.com>
> > > >> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> > Tan,
> > > >> Jianfeng <jianfeng.tan@intel.com>
> > > >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >>
> > > >>
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: Hu, Jiayu
> > > >> > Sent: Thursday, September 14, 2017 10:29 AM
> > > >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > >> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> > Tan,
> > > >> Jianfeng <jianfeng.tan@intel.com>
> > > >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >> >
> > > >> > Hi Konstantin,
> > > >> >
> > > >> > > -----Original Message-----
> > > >> > > From: Ananyev, Konstantin
> > > >> > > Sent: Thursday, September 14, 2017 4:47 PM
> > > >> > > To: Hu, Jiayu <jiayu.hu@intel.com>
> > > >> > > Cc: dev@dpdk.org; Kavanagh, Mark B
> > <mark.b.kavanagh@intel.com>;
> > > >> Tan,
> > > >> > > Jianfeng <jianfeng.tan@intel.com>
> > > >> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >> > >
> > > >> > > Hi Jiayu,
> > > >> > >
> > > >> > > > -----Original Message-----
> > > >> > > > From: Hu, Jiayu
> > > >> > > > Sent: Thursday, September 14, 2017 7:07 AM
> > > >> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > >> > > > Cc: dev@dpdk.org; Kavanagh, Mark B
> > <mark.b.kavanagh@intel.com>;
> > > >> Tan,
> > > >> > > Jianfeng <jianfeng.tan@intel.com>
> > > >> > > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >> > > >
> > > >> > > > Hi Konstantin,
> > > >> > > >
> > > >> > > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin
> > wrote:
> > > >> > > > >
> > > >> > > > > Hi Jiayu,
> > > >> > > > >
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > > -----Original Message-----
> > > >> > > > > > > > From: Ananyev, Konstantin
> > > >> > > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > >> > > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > >> > > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> > > >> Jianfeng
> > > >> > > <jianfeng.tan@intel.com>
> > > >> > > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >> > > > > > > >
> > > >> > > > > > > > > result, when all of its GSOed segments are freed, the
> > packet
> > > >is
> > > >> > > freed
> > > >> > > > > > > > > automatically.
> > > >> > > > > > > > > diff --git a/lib/librte_gso/rte_gso.c
> > > >b/lib/librte_gso/rte_gso.c
> > > >> > > > > > > > > index dda50ee..95f6ea6 100644
> > > >> > > > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > >> > > > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > >> > > > > > > > > @@ -33,18 +33,53 @@
> > > >> > > > > > > > >
> > > >> > > > > > > > >  #include <errno.h>
> > > >> > > > > > > > >
> > > >> > > > > > > > > +#include <rte_log.h>
> > > >> > > > > > > > > +
> > > >> > > > > > > > >  #include "rte_gso.h"
> > > >> > > > > > > > > +#include "gso_common.h"
> > > >> > > > > > > > > +#include "gso_tcp4.h"
> > > >> > > > > > > > >
> > > >> > > > > > > > >  int
> > > >> > > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > >> > > > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > >> > > > > > > > > +		struct rte_gso_ctx gso_ctx,
> > > >> > > > > > > > >  		struct rte_mbuf **pkts_out,
> > > >> > > > > > > > >  		uint16_t nb_pkts_out)
> > > >> > > > > > > > >  {
> > > >> > > > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > >> > > > > > > > > +	struct rte_mbuf *pkt_seg;
> > > >> > > > > > > > > +	uint16_t gso_size;
> > > >> > > > > > > > > +	uint8_t ipid_delta;
> > > >> > > > > > > > > +	int ret = 1;
> > > >> > > > > > > > > +
> > > >> > > > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out
> > < 1)
> > > >> > > > > > > > >  		return -EINVAL;
> > > >> > > > > > > > >
> > > >> > > > > > > > > -	pkts_out[0] = pkt;
> > > >> > > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > >> > > > > > > > > +			(pkt->packet_type &
> > gso_ctx.gso_types) !=
> > > >> > > > > > > > > +			pkt->packet_type) {
> > > >> > > > > > > > > +		pkts_out[0] = pkt;
> > > >> > > > > > > > > +		return ret;
> > > >> > > > > > > > > +	}
> > > >> > > > > > > > > +
> > > >> > > > > > > > > +	direct_pool = gso_ctx.direct_pool;
> > > >> > > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > >> > > > > > > > > +	gso_size = gso_ctx.gso_size;
> > > >> > > > > > > > > +	ipid_delta = gso_ctx.ipid_flag ==
> > RTE_GSO_IPID_INCREASE;
> > > >> > > > > > > > > +
> > > >> > > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > >> > > > > > > >
> > > >> > > > > > > > Probably we need here:
> > > >> > > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types
> > &
> > > >> > > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > >> > > > > > >
> > > >> > > > > > > Sorry, actually it probably should be:
> > > >> > > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) ==
> > > >> PKT_TX_IPV4
> > > >> > > &&
> > > >> > > > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0)
> > {...
> > > >> > > > > >
> > > >> > > > > > I don't quite understand why the GSO library should be aware if
> > > >the
> > > >> TSO
> > > >> > > > > > flag is set or not. Applications can query device TSO capability
> > > >> before
> > > >> > > > > > they call the GSO library. Do I misundertsand anything?
> > > >> > > > > >
> > > >> > > > > > Additionally, we don't need to check if the packet is a TCP/IPv4
> > > >> packet
> > > >> > > here?
> > > >> > > > >
> > > >> > > > > Well, right now  PMD we doesn't rely on ptype to figure out what
> > > >type
> > > >> of
> > > >> > > packet and
> > > >> > > > > what TX offload have to be performed.
> > > >> > > > > Instead it looks at TX part of ol_flags, and
> > > >> > > > > My thought was that as what we doing is actually TSO in SW, it
> > would
> > > >> be
> > > >> > > good
> > > >> > > > > to use the same API here too.
> > > >> > > > > Also with that approach, by setting ol_flags properly user can use
> > > >the
> > > >> > > same gso_ctx and still
> > > >> > > > > specify what segmentation to perform on a per-packet basis.
> > > >> > > > >
> > > >> > > > > Alternative way is to rely on ptype to distinguish should
> > > >segmentation
> > > >> be
> > > >> > > performed on that package or not.
> > > >> > > > > The only advantage I see here is that if someone would like to
> > add
> > > >> GSO
> > > >> > > for some new protocol,
> > > >> > > > > he wouldn't need to introduce new TX flag value for
> > mbuf.ol_flags.
> > > >> > > > > Though he still would need to update TX_OFFLOAD_* capabilities
> > and
> > > >> > > probably packet_type definitions.
> > > >> > > > >
> > > >> > > > > So from my perspective first variant (use HW TSO API) is more
> > > >> plausible.
> > > >> > > > > Wonder what is your and Mark opinions here?
> > > >> > > >
> > > >> > > > In the first choice, you mean:
> > > >> > > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call
> > a
> > > >> > > specific GSO
> > > >> > > > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx())
> > for
> > > >> > > each input packet.
> > > >> > > > Applications should parse the packet type, and set an exactly
> > correct
> > > >> > > DEV_TX_OFFLOAD_*_TSO
> > > >> > > > flag to gso_types and ol_flags according to the packet type. That is,
> > > >the
> > > >> > > value of gso_types
> > > >> > > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf-
> > >ol_flags
> > > >> at
> > > >> > > the same time
> > > >> > > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type
> > and
> > > >> the
> > > >> > > inner L4 type, and
> > > >> > > > we need to know L3 type by ol_flags. With this design, HW
> > > >> segmentation
> > > >> > > and SW segmentation
> > > >> > > > are indeed consistent.
> > > >> > > >
> > > >> > > > If I understand it correctly, applications need to set 'ol_flags =
> > > >> > > PKT_TX_IPV4' and
> > > >> > > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> > > >> > > "ether+ipv4+udp+vxlan+ether+ipv4+
> > > >> > > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3
> > type
> > > >for
> > > >> > > tunneled packet.
> > > >> > > > How about the outer L3 type? Always assume the inner and the
> > outer L3
> > > >> > > type are the same?
> > > >> > >
> > > >> > > It think that for that case you'll have to set in ol_flags:
> > > >> > >
> > > >> > > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
> > > >> > > PKT_TX_TCP_SEG
> > > >> >
> > > >> > OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO. The
> > > >> > GSO library doesn't need gso_types anymore.
> > > >>
> > > >> You still might need gso_ctx.gso_types to let user limit what types of
> > > >> segmentation
> > > >> that particular gso_ctx supports.
> > > >> An alternative would be to assume that each gso_ctx supports all
> > > >> currently implemented segmentations.
> > > >> This is possible too, but probably not very convenient to the user.
> > > >
> > > >Hmm, make sense.
> > > >
> > > >One thing to confirm: the value of gso_types should be
> > DEV_TX_OFFLOAD_*_TSO,
> > > >or new macros?
> > >
> > > Hi Jiayu, Konstantin,
> > >
> > > I think that the existing macros are fine, as they provide a consistent view
> > of segmentation capabilities to the application/user.
> >
> > +1
> > I also think it is better to re-use DEV_TX_OFFLOAD_*_TSO.
> 
> There might be an 'issue', if we use 'PKT_TX_TCP_SEG' to tell the
> GSO library to segment a packet or not. Given the scenario that
> an application only wants to do GSO and doesn't want to use TSO.
> The application sets 'mbuf->ol_flags=PKT_TX_TCP_SEG' and doesn't
> set mbuf->tso_segsz. Then the GSO library segments the packet, and
> all output GSO segments have the same ol_flags as the input packet
> (in current GSO library design). Then the output GSO segments are
> transmitted to rte_eth_tx_prepare(). If the NIC is i40e, its TX prepare function,
> i40e_prep_pkts, checks if mbuf->tso_segsz is in the range of I40E_MIN_TSO_MSS
> and I40E_MAX_TSO_MSS, when PKT_TX_TCP_SEG is set. So an error happens in
> this scenario, since tso_segsz is 0.
> 
> In fact, it may confuse the PMD driver when set PKT_TX_TCP_SEG but don't want
> to do TSO. One solution is that the GSO library removes the PKT_TX_TCP_SEG flag
> for all GSO segments after finishes segmenting.

Yes, that was my thought too: after successful segmentation we probably
need to clean up the related ol_flags.
Konstantin

> Wonder you and Mark's opinion.
> 
> Thanks,
> Jiayu
> >
> > >
> > > I was initially concerned that they might be too coarse-grained (i.e. only
> > IPv4 is currently supported, and not IPv6), but as per Konstantin's
> > > previous example, the DEV_TX_OFFLOAD_*_TSO macros can be used in
> > concert with the packet type to determine whether a packet should
> > > be fragmented or not.
> > >
> > > Thanks,
> > > Mark
> > >
> > > >
> > > >Jiayu
> > > >> Konstantin
> > > >>
> > > >> >
> > > >> > The first choice makes HW and SW segmentation are totally the same.
> > > >> > Applications just need to parse the packet and set proper ol_flags, and
> > > >> > the GSO library uses ol_flags to decide which segmentation function to
> > > >use.
> > > >> > I think it's better than the second choice which depending on ptype to
> > > >> > choose segmentation function.
> > > >> >
> > > >> > Jiayu
> > > >> > >
> > > >> > > Konstantin
> > > >> > >
> > > >> > > >
> > > >> > > > Jiayu
> > > >> > > > > Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-15  7:54                           ` Hu, Jiayu
  2017-09-15  8:15                             ` Ananyev, Konstantin
@ 2017-09-15  8:17                             ` Ananyev, Konstantin
  2017-09-15  8:38                               ` Hu, Jiayu
  1 sibling, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-09-15  8:17 UTC (permalink / raw)
  To: Hu, Jiayu, Kavanagh, Mark B; +Cc: dev, Tan, Jianfeng



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Friday, September 15, 2017 9:16 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> Hi Jiayu,
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Friday, September 15, 2017 8:55 AM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > Hi Konstantin,
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Friday, September 15, 2017 2:39 AM
> > > To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> > > <jiayu.hu@intel.com>
> > > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Kavanagh, Mark B
> > > > Sent: Thursday, September 14, 2017 4:42 PM
> > > > To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>
> > > > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >
> > > > >From: Hu, Jiayu
> > > > >Sent: Thursday, September 14, 2017 11:01 AM
> > > > >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh,
> > > Mark B
> > > > ><mark.b.kavanagh@intel.com>
> > > > >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >
> > > > >Hi Konstantin and Mark,
> > > > >
> > > > >> -----Original Message-----
> > > > >> From: Ananyev, Konstantin
> > > > >> Sent: Thursday, September 14, 2017 5:36 PM
> > > > >> To: Hu, Jiayu <jiayu.hu@intel.com>
> > > > >> Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> > > Tan,
> > > > >> Jianfeng <jianfeng.tan@intel.com>
> > > > >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >>
> > > > >>
> > > > >>
> > > > >> > -----Original Message-----
> > > > >> > From: Hu, Jiayu
> > > > >> > Sent: Thursday, September 14, 2017 10:29 AM
> > > > >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > >> > Cc: dev@dpdk.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> > > Tan,
> > > > >> Jianfeng <jianfeng.tan@intel.com>
> > > > >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >> >
> > > > >> > Hi Konstantin,
> > > > >> >
> > > > >> > > -----Original Message-----
> > > > >> > > From: Ananyev, Konstantin
> > > > >> > > Sent: Thursday, September 14, 2017 4:47 PM
> > > > >> > > To: Hu, Jiayu <jiayu.hu@intel.com>
> > > > >> > > Cc: dev@dpdk.org; Kavanagh, Mark B
> > > <mark.b.kavanagh@intel.com>;
> > > > >> Tan,
> > > > >> > > Jianfeng <jianfeng.tan@intel.com>
> > > > >> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >> > >
> > > > >> > > Hi Jiayu,
> > > > >> > >
> > > > >> > > > -----Original Message-----
> > > > >> > > > From: Hu, Jiayu
> > > > >> > > > Sent: Thursday, September 14, 2017 7:07 AM
> > > > >> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > >> > > > Cc: dev@dpdk.org; Kavanagh, Mark B
> > > <mark.b.kavanagh@intel.com>;
> > > > >> Tan,
> > > > >> > > Jianfeng <jianfeng.tan@intel.com>
> > > > >> > > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >> > > >
> > > > >> > > > Hi Konstantin,
> > > > >> > > >
> > > > >> > > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev, Konstantin
> > > wrote:
> > > > >> > > > >
> > > > >> > > > > Hi Jiayu,
> > > > >> > > > >
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > > -----Original Message-----
> > > > >> > > > > > > > From: Ananyev, Konstantin
> > > > >> > > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > > >> > > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > > >> > > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Tan,
> > > > >> Jianfeng
> > > > >> > > <jianfeng.tan@intel.com>
> > > > >> > > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >> > > > > > > >
> > > > >> > > > > > > > > result, when all of its GSOed segments are freed, the
> > > packet
> > > > >is
> > > > >> > > freed
> > > > >> > > > > > > > > automatically.
> > > > >> > > > > > > > > diff --git a/lib/librte_gso/rte_gso.c
> > > > >b/lib/librte_gso/rte_gso.c
> > > > >> > > > > > > > > index dda50ee..95f6ea6 100644
> > > > >> > > > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > >> > > > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > >> > > > > > > > > @@ -33,18 +33,53 @@
> > > > >> > > > > > > > >
> > > > >> > > > > > > > >  #include <errno.h>
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > +#include <rte_log.h>
> > > > >> > > > > > > > > +
> > > > >> > > > > > > > >  #include "rte_gso.h"
> > > > >> > > > > > > > > +#include "gso_common.h"
> > > > >> > > > > > > > > +#include "gso_tcp4.h"
> > > > >> > > > > > > > >
> > > > >> > > > > > > > >  int
> > > > >> > > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > >> > > > > > > > > -		struct rte_gso_ctx gso_ctx __rte_unused,
> > > > >> > > > > > > > > +		struct rte_gso_ctx gso_ctx,
> > > > >> > > > > > > > >  		struct rte_mbuf **pkts_out,
> > > > >> > > > > > > > >  		uint16_t nb_pkts_out)
> > > > >> > > > > > > > >  {
> > > > >> > > > > > > > > +	struct rte_mempool *direct_pool, *indirect_pool;
> > > > >> > > > > > > > > +	struct rte_mbuf *pkt_seg;
> > > > >> > > > > > > > > +	uint16_t gso_size;
> > > > >> > > > > > > > > +	uint8_t ipid_delta;
> > > > >> > > > > > > > > +	int ret = 1;
> > > > >> > > > > > > > > +
> > > > >> > > > > > > > >  	if (pkt == NULL || pkts_out == NULL || nb_pkts_out
> > > < 1)
> > > > >> > > > > > > > >  		return -EINVAL;
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > -	pkts_out[0] = pkt;
> > > > >> > > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > >> > > > > > > > > +			(pkt->packet_type &
> > > gso_ctx.gso_types) !=
> > > > >> > > > > > > > > +			pkt->packet_type) {
> > > > >> > > > > > > > > +		pkts_out[0] = pkt;
> > > > >> > > > > > > > > +		return ret;
> > > > >> > > > > > > > > +	}
> > > > >> > > > > > > > > +
> > > > >> > > > > > > > > +	direct_pool = gso_ctx.direct_pool;
> > > > >> > > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > >> > > > > > > > > +	gso_size = gso_ctx.gso_size;
> > > > >> > > > > > > > > +	ipid_delta = gso_ctx.ipid_flag ==
> > > RTE_GSO_IPID_INCREASE;
> > > > >> > > > > > > > > +
> > > > >> > > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > > >> > > > > > > >
> > > > >> > > > > > > > Probably we need here:
> > > > >> > > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx->gso_types
> > > &
> > > > >> > > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > > >> > > > > > >
> > > > >> > > > > > > Sorry, actually it probably should be:
> > > > >> > > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) ==
> > > > >> PKT_TX_IPV4
> > > > >> > > &&
> > > > >> > > > > > >       (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) != 0)
> > > {...
> > > > >> > > > > >
> > > > >> > > > > > I don't quite understand why the GSO library should be aware if
> > > > >the
> > > > >> TSO
> > > > >> > > > > > flag is set or not. Applications can query device TSO capability
> > > > >> before
> > > > >> > > > > > they call the GSO library. Do I misundertsand anything?
> > > > >> > > > > >
> > > > >> > > > > > Additionally, we don't need to check if the packet is a TCP/IPv4
> > > > >> packet
> > > > >> > > here?
> > > > >> > > > >
> > > > >> > > > > Well, right now  PMD we doesn't rely on ptype to figure out what
> > > > >type
> > > > >> of
> > > > >> > > packet and
> > > > >> > > > > what TX offload have to be performed.
> > > > >> > > > > Instead it looks at TX part of ol_flags, and
> > > > >> > > > > My thought was that as what we doing is actually TSO in SW, it
> > > would
> > > > >> be
> > > > >> > > good
> > > > >> > > > > to use the same API here too.
> > > > >> > > > > Also with that approach, by setting ol_flags properly user can use
> > > > >the
> > > > >> > > same gso_ctx and still
> > > > >> > > > > specify what segmentation to perform on a per-packet basis.
> > > > >> > > > >
> > > > >> > > > > Alternative way is to rely on ptype to distinguish should
> > > > >segmentation
> > > > >> be
> > > > >> > > performed on that package or not.
> > > > >> > > > > The only advantage I see here is that if someone would like to
> > > add
> > > > >> GSO
> > > > >> > > for some new protocol,
> > > > >> > > > > he wouldn't need to introduce new TX flag value for
> > > mbuf.ol_flags.
> > > > >> > > > > Though he still would need to update TX_OFFLOAD_* capabilities
> > > and
> > > > >> > > probably packet_type definitions.
> > > > >> > > > >
> > > > >> > > > > So from my perspective first variant (use HW TSO API) is more
> > > > >> plausible.
> > > > >> > > > > Wonder what is your and Mark opinions here?
> > > > >> > > >
> > > > >> > > > In the first choice, you mean:
> > > > >> > > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags to call
> > > a
> > > > >> > > specific GSO
> > > > >> > > > segmentation function (e.g. gso_tcp4_segment(), gso_tunnel_xxx())
> > > for
> > > > >> > > each input packet.
> > > > >> > > > Applications should parse the packet type, and set an exactly
> > > correct
> > > > >> > > DEV_TX_OFFLOAD_*_TSO
> > > > >> > > > flag to gso_types and ol_flags according to the packet type. That is,
> > > > >the
> > > > >> > > value of gso_types
> > > > >> > > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf-
> > > >ol_flags
> > > > >> at
> > > > >> > > the same time
> > > > >> > > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling type
> > > and
> > > > >> the
> > > > >> > > inner L4 type, and
> > > > >> > > > we need to know L3 type by ol_flags. With this design, HW
> > > > >> segmentation
> > > > >> > > and SW segmentation
> > > > >> > > > are indeed consistent.
> > > > >> > > >
> > > > >> > > > If I understand it correctly, applications need to set 'ol_flags =
> > > > >> > > PKT_TX_IPV4' and
> > > > >> > > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> > > > >> > > "ether+ipv4+udp+vxlan+ether+ipv4+
> > > > >> > > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner L3
> > > type
> > > > >for
> > > > >> > > tunneled packet.
> > > > >> > > > How about the outer L3 type? Always assume the inner and the
> > > outer L3
> > > > >> > > type are the same?
> > > > >> > >
> > > > >> > > It think that for that case you'll have to set in ol_flags:
> > > > >> > >
> > > > >> > > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN |
> > > > >> > > PKT_TX_TCP_SEG
> > > > >> >
> > > > >> > OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO. The
> > > > >> > GSO library doesn't need gso_types anymore.
> > > > >>
> > > > >> You still might need gso_ctx.gso_types to let user limit what types of
> > > > >> segmentation
> > > > >> that particular gso_ctx supports.
> > > > >> An alternative would be to assume that each gso_ctx supports all
> > > > >> currently implemented segmentations.
> > > > >> This is possible too, but probably not very convenient to the user.
> > > > >
> > > > >Hmm, make sense.
> > > > >
> > > > >One thing to confirm: the value of gso_types should be
> > > DEV_TX_OFFLOAD_*_TSO,
> > > > >or new macros?
> > > >
> > > > Hi Jiayu, Konstantin,
> > > >
> > > > I think that the existing macros are fine, as they provide a consistent view
> > > of segmentation capabilities to the application/user.
> > >
> > > +1
> > > I also think it is better to re-use DEV_TX_OFFLOAD_*_TSO.
> >
> > There might be an 'issue', if we use 'PKT_TX_TCP_SEG' to tell the
> > GSO library to segment a packet or not. Given the scenario that
> > an application only wants to do GSO and doesn't want to use TSO.
> > The application sets 'mbuf->ol_flags=PKT_TX_TCP_SEG' and doesn't
> > set mbuf->tso_segsz. Then the GSO library segments the packet, and
> > all output GSO segments have the same ol_flags as the input packet
> > (in current GSO library design). Then the output GSO segments are
> > transmitted to rte_eth_tx_prepare(). If the NIC is i40e, its TX prepare function,
> > i40e_prep_pkts, checks if mbuf->tso_segsz is in the range of I40E_MIN_TSO_MSS
> > and I40E_MAX_TSO_MSS, when PKT_TX_TCP_SEG is set. So an error happens in
> > this scenario, since tso_segsz is 0.
> >
> > In fact, it may confuse the PMD driver when set PKT_TX_TCP_SEG but don't want
> > to do TSO. One solution is that the GSO library removes the PKT_TX_TCP_SEG flag
> > for all GSO segments after finishes segmenting.
> 
> Yes, that was my thought too: after successful segmentation we probably
> need to cleanup related ol_flags.

In fact, we just don't need to set these flags in our newly created segments.

> Konstantin
> 
> > Wonder you and Mark's opinion.
> >
> > Thanks,
> > Jiayu
> > >
> > > >
> > > > I was initially concerned that they might be too coarse-grained (i.e. only
> > > IPv4 is currently supported, and not IPv6), but as per Konstantin's
> > > > previous example, the DEV_TX_OFFLOAD_*_TSO macros can be used in
> > > concert with the packet type to determine whether a packet should
> > > > be fragmented or not.
> > > >
> > > > Thanks,
> > > > Mark
> > > >
> > > > >
> > > > >Jiayu
> > > > >> Konstantin
> > > > >>
> > > > >> >
> > > > >> > The first choice makes HW and SW segmentation are totally the same.
> > > > >> > Applications just need to parse the packet and set proper ol_flags, and
> > > > >> > the GSO library uses ol_flags to decide which segmentation function to
> > > > >use.
> > > > >> > I think it's better than the second choice which depending on ptype to
> > > > >> > choose segmentation function.
> > > > >> >
> > > > >> > Jiayu
> > > > >> > >
> > > > >> > > Konstantin
> > > > >> > >
> > > > >> > > >
> > > > >> > > > Jiayu
> > > > >> > > > > Konstantin


* Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
  2017-09-15  8:17                             ` Ananyev, Konstantin
@ 2017-09-15  8:38                               ` Hu, Jiayu
  0 siblings, 0 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-15  8:38 UTC (permalink / raw)
  To: Ananyev, Konstantin, Kavanagh, Mark B; +Cc: dev, Tan, Jianfeng



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Friday, September 15, 2017 4:17 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>
> Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Friday, September 15, 2017 9:16 AM
> > To: Hu, Jiayu <jiayu.hu@intel.com>; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>
> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> > Hi Jiayu,
> >
> > > -----Original Message-----
> > > From: Hu, Jiayu
> > > Sent: Friday, September 15, 2017 8:55 AM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh,
> Mark B <mark.b.kavanagh@intel.com>
> > > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > >
> > > Hi Konstantin,
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin
> > > > Sent: Friday, September 15, 2017 2:39 AM
> > > > To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Hu, Jiayu
> > > > <jiayu.hu@intel.com>
> > > > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Kavanagh, Mark B
> > > > > Sent: Thursday, September 14, 2017 4:42 PM
> > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > >
> > > > > >From: Hu, Jiayu
> > > > > >Sent: Thursday, September 14, 2017 11:01 AM
> > > > > >To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> Kavanagh,
> > > > Mark B
> > > > > ><mark.b.kavanagh@intel.com>
> > > > > >Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > > > > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > >
> > > > > >Hi Konstantin and Mark,
> > > > > >
> > > > > >> -----Original Message-----
> > > > > >> From: Ananyev, Konstantin
> > > > > >> Sent: Thursday, September 14, 2017 5:36 PM
> > > > > >> To: Hu, Jiayu <jiayu.hu@intel.com>
> > > > > >> Cc: dev@dpdk.org; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>;
> > > > Tan,
> > > > > >> Jianfeng <jianfeng.tan@intel.com>
> > > > > >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> > -----Original Message-----
> > > > > >> > From: Hu, Jiayu
> > > > > >> > Sent: Thursday, September 14, 2017 10:29 AM
> > > > > >> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > >> > Cc: dev@dpdk.org; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>;
> > > > Tan,
> > > > > >> Jianfeng <jianfeng.tan@intel.com>
> > > > > >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > >> >
> > > > > >> > Hi Konstantin,
> > > > > >> >
> > > > > >> > > -----Original Message-----
> > > > > >> > > From: Ananyev, Konstantin
> > > > > >> > > Sent: Thursday, September 14, 2017 4:47 PM
> > > > > >> > > To: Hu, Jiayu <jiayu.hu@intel.com>
> > > > > >> > > Cc: dev@dpdk.org; Kavanagh, Mark B
> > > > <mark.b.kavanagh@intel.com>;
> > > > > >> Tan,
> > > > > >> > > Jianfeng <jianfeng.tan@intel.com>
> > > > > >> > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > >> > >
> > > > > >> > > Hi Jiayu,
> > > > > >> > >
> > > > > >> > > > -----Original Message-----
> > > > > >> > > > From: Hu, Jiayu
> > > > > >> > > > Sent: Thursday, September 14, 2017 7:07 AM
> > > > > >> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > >> > > > Cc: dev@dpdk.org; Kavanagh, Mark B
> > > > <mark.b.kavanagh@intel.com>;
> > > > > >> Tan,
> > > > > >> > > Jianfeng <jianfeng.tan@intel.com>
> > > > > >> > > > Subject: Re: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> > > > > >> > > >
> > > > > >> > > > Hi Konstantin,
> > > > > >> > > >
> > > > > >> > > > On Thu, Sep 14, 2017 at 06:10:37AM +0800, Ananyev,
> Konstantin
> > > > wrote:
> > > > > >> > > > >
> > > > > >> > > > > Hi Jiayu,
> > > > > >> > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > > > -----Original Message-----
> > > > > >> > > > > > > > From: Ananyev, Konstantin
> > > > > >> > > > > > > > Sent: Tuesday, September 12, 2017 12:18 PM
> > > > > >> > > > > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> > > > > >> > > > > > > > Cc: Kavanagh, Mark B <mark.b.kavanagh@intel.com>;
> Tan,
> > > > > >> Jianfeng
> > > > > >> > > <jianfeng.tan@intel.com>
> > > > > >> > > > > > > > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO
> support
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > > result, when all of its GSOed segments are freed, the
> > > > packet
> > > > > >is
> > > > > >> > > freed
> > > > > >> > > > > > > > > automatically.
> > > > > >> > > > > > > > > diff --git a/lib/librte_gso/rte_gso.c
> > > > > >b/lib/librte_gso/rte_gso.c
> > > > > >> > > > > > > > > index dda50ee..95f6ea6 100644
> > > > > >> > > > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > >> > > > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > >> > > > > > > > > @@ -33,18 +33,53 @@
> > > > > >> > > > > > > > >
> > > > > >> > > > > > > > >  #include <errno.h>
> > > > > >> > > > > > > > >
> > > > > >> > > > > > > > > +#include <rte_log.h>
> > > > > >> > > > > > > > > +
> > > > > >> > > > > > > > >  #include "rte_gso.h"
> > > > > >> > > > > > > > > +#include "gso_common.h"
> > > > > >> > > > > > > > > +#include "gso_tcp4.h"
> > > > > >> > > > > > > > >
> > > > > >> > > > > > > > >  int
> > > > > >> > > > > > > > >  rte_gso_segment(struct rte_mbuf *pkt,
> > > > > >> > > > > > > > > -		struct rte_gso_ctx gso_ctx
> __rte_unused,
> > > > > >> > > > > > > > > +		struct rte_gso_ctx gso_ctx,
> > > > > >> > > > > > > > >  		struct rte_mbuf **pkts_out,
> > > > > >> > > > > > > > >  		uint16_t nb_pkts_out)
> > > > > >> > > > > > > > >  {
> > > > > >> > > > > > > > > +	struct rte_mempool *direct_pool,
> *indirect_pool;
> > > > > >> > > > > > > > > +	struct rte_mbuf *pkt_seg;
> > > > > >> > > > > > > > > +	uint16_t gso_size;
> > > > > >> > > > > > > > > +	uint8_t ipid_delta;
> > > > > >> > > > > > > > > +	int ret = 1;
> > > > > >> > > > > > > > > +
> > > > > >> > > > > > > > >  	if (pkt == NULL || pkts_out == NULL ||
> nb_pkts_out
> > > > < 1)
> > > > > >> > > > > > > > >  		return -EINVAL;
> > > > > >> > > > > > > > >
> > > > > >> > > > > > > > > -	pkts_out[0] = pkt;
> > > > > >> > > > > > > > > +	if (gso_ctx.gso_size >= pkt->pkt_len ||
> > > > > >> > > > > > > > > +			(pkt->packet_type &
> > > > gso_ctx.gso_types) !=
> > > > > >> > > > > > > > > +			pkt->packet_type) {
> > > > > >> > > > > > > > > +		pkts_out[0] = pkt;
> > > > > >> > > > > > > > > +		return ret;
> > > > > >> > > > > > > > > +	}
> > > > > >> > > > > > > > > +
> > > > > >> > > > > > > > > +	direct_pool = gso_ctx.direct_pool;
> > > > > >> > > > > > > > > +	indirect_pool = gso_ctx.indirect_pool;
> > > > > >> > > > > > > > > +	gso_size = gso_ctx.gso_size;
> > > > > >> > > > > > > > > +	ipid_delta = gso_ctx.ipid_flag ==
> > > > RTE_GSO_IPID_INCREASE;
> > > > > >> > > > > > > > > +
> > > > > >> > > > > > > > > +	if (is_ipv4_tcp(pkt->packet_type)) {
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Probably we need here:
> > > > > >> > > > > > > > If (is_ipv4_tcp(pkt->packet_type)  && (gso_ctx-
> >gso_types
> > > > &
> > > > > >> > > DEV_TX_OFFLOAD_TCP_TSO) != 0) {...
> > > > > >> > > > > > >
> > > > > >> > > > > > > Sorry, actually it probably should be:
> > > > > >> > > > > > > If (pkt->ol_flags & (PKT_TX_TCP_SEG | PKT_TX_IPV4) ==
> > > > > >> PKT_TX_IPV4
> > > > > >> > > &&
> > > > > >> > > > > > >       (gso_ctx->gso_types &
> DEV_TX_OFFLOAD_TCP_TSO) != 0)
> > > > {...
> > > > > >> > > > > >
> > > > > >> > > > > > I don't quite understand why the GSO library should be
> aware if
> > > > > >the
> > > > > >> TSO
> > > > > >> > > > > > flag is set or not. Applications can query device TSO
> capability
> > > > > >> before
> > > > > >> > > > > > they call the GSO library. Do I misundertsand anything?
> > > > > >> > > > > >
> > > > > >> > > > > > Additionally, we don't need to check if the packet is a
> TCP/IPv4
> > > > > >> packet
> > > > > >> > > here?
> > > > > >> > > > >
> > > > > >> > > > > Well, right now  PMD we doesn't rely on ptype to figure out
> what
> > > > > >type
> > > > > >> of
> > > > > >> > > packet and
> > > > > >> > > > > what TX offload have to be performed.
> > > > > >> > > > > Instead it looks at TX part of ol_flags, and
> > > > > >> > > > > My thought was that as what we doing is actually TSO in SW,
> it
> > > > would
> > > > > >> be
> > > > > >> > > good
> > > > > >> > > > > to use the same API here too.
> > > > > >> > > > > Also with that approach, by setting ol_flags properly user
> can use
> > > > > >the
> > > > > >> > > same gso_ctx and still
> > > > > >> > > > > specify what segmentation to perform on a per-packet
> basis.
> > > > > >> > > > >
> > > > > >> > > > > Alternative way is to rely on ptype to distinguish should
> > > > > >segmentation
> > > > > >> be
> > > > > >> > > performed on that package or not.
> > > > > >> > > > > The only advantage I see here is that if someone would like
> to
> > > > add
> > > > > >> GSO
> > > > > >> > > for some new protocol,
> > > > > >> > > > > he wouldn't need to introduce new TX flag value for
> > > > mbuf.ol_flags.
> > > > > >> > > > > Though he still would need to update TX_OFFLOAD_*
> capabilities
> > > > and
> > > > > >> > > probably packet_type definitions.
> > > > > >> > > > >
> > > > > >> > > > > So from my perspective first variant (use HW TSO API) is
> more
> > > > > >> plausible.
> > > > > >> > > > > Wonder what is your and Mark opinions here?
> > > > > >> > > >
> > > > > >> > > > In the first choice, you mean:
> > > > > >> > > > the GSO library uses gso_ctx->gso_types and mbuf->ol_flags
> to call
> > > > a
> > > > > >> > > specific GSO
> > > > > >> > > > segmentation function (e.g. gso_tcp4_segment(),
> gso_tunnel_xxx())
> > > > for
> > > > > >> > > each input packet.
> > > > > >> > > > Applications should parse the packet type, and set an exactly
> > > > correct
> > > > > >> > > DEV_TX_OFFLOAD_*_TSO
> > > > > >> > > > flag to gso_types and ol_flags according to the packet type.
> That is,
> > > > > >the
> > > > > >> > > value of gso_types
> > > > > >> > > > is on a per-packet basis. Using gso_ctx->gso_types and mbuf-
> > > > >ol_flags
> > > > > >> at
> > > > > >> > > the same time
> > > > > >> > > > is because that DEV_TX_OFFLOAD_*_TSO only tells tunnelling
> type
> > > > and
> > > > > >> the
> > > > > >> > > inner L4 type, and
> > > > > >> > > > we need to know L3 type by ol_flags. With this design, HW
> > > > > >> segmentation
> > > > > >> > > and SW segmentation
> > > > > >> > > > are indeed consistent.
> > > > > >> > > >
> > > > > >> > > > If I understand it correctly, applications need to set 'ol_flags =
> > > > > >> > > PKT_TX_IPV4' and
> > > > > >> > > > 'gso_types = DEV_TX_OFFLOAD_VXLAN_TNL_TSO' for a
> > > > > >> > > "ether+ipv4+udp+vxlan+ether+ipv4+
> > > > > >> > > > tcp+payload" packet. But PKT_TX_IPV4 just present the inner
> L3
> > > > type
> > > > > >for
> > > > > >> > > tunneled packet.
> > > > > >> > > > How about the outer L3 type? Always assume the inner and
> the
> > > > outer L3
> > > > > >> > > type are the same?
> > > > > >> > >
> > > > > >> > > It think that for that case you'll have to set in ol_flags:
> > > > > >> > >
> > > > > >> > > PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 |
> PKT_TX_TUNNEL_VXLAN |
> > > > > >> > > PKT_TX_TCP_SEG
> > > > > >> >
> > > > > >> > OK, so it means PKT_TX_TCP_SEG is also used for tunneled TSO.
> The
> > > > > >> > GSO library doesn't need gso_types anymore.
> > > > > >>
> > > > > >> You still might need gso_ctx.gso_types to let user limit what types
> of
> > > > > >> segmentation
> > > > > >> that particular gso_ctx supports.
> > > > > >> An alternative would be to assume that each gso_ctx supports all
> > > > > >> currently implemented segmentations.
> > > > > >> This is possible too, but probably not very convenient to the user.
> > > > > >
> > > > > >Hmm, make sense.
> > > > > >
> > > > > >One thing to confirm: the value of gso_types should be
> > > > DEV_TX_OFFLOAD_*_TSO,
> > > > > >or new macros?
> > > > >
> > > > > Hi Jiayu, Konstantin,
> > > > >
> > > > > I think that the existing macros are fine, as they provide a consistent
> view
> > > > of segmentation capabilities to the application/user.
> > > >
> > > > +1
> > > > I also think it is better to re-use DEV_TX_OFFLOAD_*_TSO.
> > >
> > > There might be an 'issue', if we use 'PKT_TX_TCP_SEG' to tell the
> > > GSO library to segment a packet or not. Given the scenario that
> > > an application only wants to do GSO and doesn't want to use TSO.
> > > The application sets 'mbuf->ol_flags=PKT_TX_TCP_SEG' and doesn't
> > > set mbuf->tso_segsz. Then the GSO library segments the packet, and
> > > all output GSO segments have the same ol_flags as the input packet
> > > (in current GSO library design). Then the output GSO segments are
> > > transmitted to rte_eth_tx_prepare(). If the NIC is i40e, its TX prepare
> function,
> > > i40e_prep_pkts, checks if mbuf->tso_segsz is in the range of
> I40E_MIN_TSO_MSS
> > > and I40E_MAX_TSO_MSS, when PKT_TX_TCP_SEG is set. So an error
> happens in
> > > this scenario, since tso_segsz is 0.
> > >
> > > In fact, it may confuse the PMD driver when set PKT_TX_TCP_SEG but
> don't want
> > > to do TSO. One solution is that the GSO library removes the
> PKT_TX_TCP_SEG flag
> > > for all GSO segments after finishes segmenting.
> >
> > Yes, that was my thought too: after successful segmentation we probably
> > need to cleanup related ol_flags.
> 
> In fact, we just don't need to set these flags in our newly created segments.

+1. PKT_TX_TCP_SEG is not needed, but other flags, like PKT_TX_IPV4, should be
kept, since they may also be used by other HW offloads, such as checksum.

Thanks,
Jiayu
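
A minimal sketch of the flag handling agreed on above, under stated assumptions: the struct, the function name, and the SEG_* constants are hypothetical and defined locally for illustration, and they are not the GSO library's actual code or the real mbuf definitions. Each output segment takes the input packet's TX flags with only the segmentation request masked out, so flags such as the IPv4 and checksum ones remain available to later HW offloads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in flag values for this sketch only; NOT taken from DPDK headers. */
#define SEG_TX_TCP_SEG   (1ULL << 50)  /* segmentation request */
#define SEG_TX_IP_CKSUM  (1ULL << 54)  /* IP checksum offload request */
#define SEG_TX_IPV4      (1ULL << 55)  /* packet is IPv4 */

/* Hypothetical, heavily reduced view of an mbuf. */
struct seg_mbuf {
	uint64_t ol_flags;
};

/*
 * Give every output segment the input packet's TX flags, except the
 * segmentation request, which has already been satisfied in software.
 * The input packet itself is left untouched.
 */
void
seg_set_output_flags(const struct seg_mbuf *pkt,
		struct seg_mbuf *segs, size_t nb_segs)
{
	size_t i;

	assert(pkt != NULL);
	assert(segs != NULL || nb_segs == 0);

	for (i = 0; i < nb_segs; i++)
		segs[i].ol_flags = pkt->ol_flags & ~SEG_TX_TCP_SEG;
}
```

Masking a single bit, rather than zeroing ol_flags, is the design point here: the segments stay eligible for checksum offload while no longer looking like TSO requests to the driver.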
> 
> > Konstantin
> >
> > > Wonder you and Mark's opinion.
> > >
> > > Thanks,
> > > Jiayu
> > > >
> > > > >
> > > > > I was initially concerned that they might be too coarse-grained (i.e.
> only
> > > > IPv4 is currently supported, and not IPv6), but as per Konstantin's
> > > > > previous example, the DEV_TX_OFFLOAD_*_TSO macros can be used
> in
> > > > concert with the packet type to determine whether a packet should
> > > > > be fragmented or not.
> > > > >
> > > > > Thanks,
> > > > > Mark
> > > > >
> > > > > >
> > > > > >Jiayu
> > > > > >> Konstantin
> > > > > >>
> > > > > >> >
> > > > > >> > The first choice makes HW and SW segmentation are totally the
> same.
> > > > > >> > Applications just need to parse the packet and set proper ol_flags,
> and
> > > > > >> > the GSO library uses ol_flags to decide which segmentation
> function to
> > > > > >use.
> > > > > >> > I think it's better than the second choice which depending on
> ptype to
> > > > > >> > choose segmentation function.
> > > > > >> >
> > > > > >> > Jiayu
> > > > > >> > >
> > > > > >> > > Konstantin
> > > > > >> > >
> > > > > >> > > >
> > > > > >> > > > Jiayu
> > > > > >> > > > > Konstantin


* [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                       ` (4 preceding siblings ...)
  2017-09-12  2:43     ` [PATCH v3 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
@ 2017-09-19  7:32     ` Jiayu Hu
  2017-09-19  7:32       ` [PATCH v4 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
                         ` (12 more replies)
  5 siblings, 13 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-19  7:32 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, ferruh.yigit,
	thomas, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch set adds GSO support to DPDK for
specific packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The last patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine.

The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. P1 is assigned to the Linux kernel, with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
	to 1518B. Run two iperf-client in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
	two iperf-client in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.

Throughput of the above three tests:

test-1: 9.4Gbps
test-2: 9.5Gbps
test-3: 3Mbps

Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
length of tunnelled packets from VMs is 1514B. So the current experiment
method can't be used to measure VxLAN and GRE GSO performance; instead,
we test the functionality by setting a small GSO segment length (e.g. 500B).

To test VxLAN GSO functionality, we use the following setup:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
   engine with "retry".
c. Testpmd commands:
	- csum parse-tunnel on "P0"
	- csum parse-tunnel on "vhost-user port"
	- csum set outer-ip hw "P0"
	- csum set ip hw "P0"
	- csum set tcp hw "P0"
	- csum set tcp hw "vhost-user port"
	- set port "P0" gso on
	- set gso segsz 500
d. Launch a VM with csum and tso offloading enabled.
e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
   on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
   max packet length is 1514B.
f. P1 is assigned to the Linux kernel, with kernel GRO disabled. Similarly,
   create a VxLAN port for P1, and run iperf-server on the VxLAN port.

In testpmd, we can see the length of all packets sent from P0 is smaller
than or equal to 500B. Additionally, the packets arriving at P1 are
encapsulated and are smaller than or equal to 500B.

The experimental data of GRE GSO will be shown later, and the prog_guide
will be added later.

Change log
==========
v4:
- use ol_flags instead of packet_type to decide which segmentation
  function to use.
- use MF and offset to check if a packet is IP fragmented, instead of
  using DF.
- remove ETHER_CRC_LEN from gso segment payload length calculation.
- refactor internal header update and other functions.
- remove RTE_GSO_IPID_INCREASE.
- add some GSO documentation.
- set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
  packets sent from GSO-enabled ports in testpmd.
v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
  UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.
- change the definition of gso_types in struct rte_gso_ctx.
- replace rte_pktmbuf_detach() with rte_pktmbuf_free().
- refactor gso_update_pkt_headers().
- change the return value of rte_gso_segment().
- remove parameter checks in rte_gso_segment().
- use rte_net_get_ptype() in app/test-pmd/csumonly.c to fill
  mbuf->packet_type.
- add a new GSO command in testpmd to show GSO configuration for ports.
- misc: fix typo and optimize function description.

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (2):
  gso: add VxLAN GSO support
  gso: add GRE GSO support

 app/test-pmd/cmdline.c                      | 178 +++++++++++++++++
 app/test-pmd/config.c                       |  24 +++
 app/test-pmd/csumonly.c                     |  69 ++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 +
 config/common_base                          |   5 +
 doc/api/doxy-api-index.md                   |   1 +
 doc/api/doxy-api.conf                       |   1 +
 doc/guides/rel_notes/release_17_11.rst      |  19 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  41 ++++
 lib/Makefile                                |   2 +
 lib/librte_eal/common/include/rte_log.h     |   1 +
 lib/librte_gso/Makefile                     |  52 +++++
 lib/librte_gso/gso_common.c                 | 291 ++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h                 | 157 +++++++++++++++
 lib/librte_gso/gso_tcp4.c                   |  82 ++++++++
 lib/librte_gso/gso_tcp4.h                   |  76 ++++++++
 lib/librte_gso/gso_tunnel_tcp4.c            |  89 +++++++++
 lib/librte_gso/gso_tunnel_tcp4.h            |  76 ++++++++
 lib/librte_gso/rte_gso.c                    | 107 ++++++++++
 lib/librte_gso/rte_gso.h                    | 144 ++++++++++++++
 lib/librte_gso/rte_gso_version.map          |   7 +
 mk/rte.app.mk                               |   1 +
 23 files changed, 1441 insertions(+), 5 deletions(-)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
2.7.4


* [PATCH v4 1/5] gso: add Generic Segmentation Offload API framework
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
@ 2017-09-19  7:32       ` Jiayu Hu
  2017-09-19  7:32       ` [PATCH v4 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
                         ` (11 subsequent siblings)
  12 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-19  7:32 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, ferruh.yigit,
	thomas, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set the proper flags in mbuf->ol_flags; these are the same flags
as those used for TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmitting multi-segment packets.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                     |   5 ++
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf                  |   1 +
 doc/guides/rel_notes/release_17_11.rst |   1 +
 lib/Makefile                           |   2 +
 lib/librte_gso/Makefile                |  49 +++++++++++
 lib/librte_gso/rte_gso.c               |  52 ++++++++++++
 lib/librte_gso/rte_gso.h               | 144 +++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map     |   7 ++
 mk/rte.app.mk                          |   1 +
 10 files changed, 263 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 5e97a08..603e340 100644
--- a/config/common_base
+++ b/config/common_base
@@ -652,6 +652,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..6512918 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -101,6 +101,7 @@ The public API headers are grouped by topics:
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
   [GRO]                (@ref rte_gro.h),
+  [GSO]                (@ref rte_gso.h),
   [frag/reass]         (@ref rte_ip_frag.h),
   [LPM IPv4 route]     (@ref rte_lpm.h),
   [LPM IPv6 route]     (@ref rte_lpm6.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..408f2e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_ether \
                           lib/librte_eventdev \
                           lib/librte_gro \
+                          lib/librte_gso \
                           lib/librte_hash \
                           lib/librte_ip_frag \
                           lib/librte_jobstats \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 8bf91bd..7508be7 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_ethdev.so.7
      librte_eventdev.so.2
      librte_gro.so.1
+   + librte_gso.so.1
      librte_hash.so.2
      librte_ip_frag.so.1
      librte_jobstats.so.1
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b773636
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
+			nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..b5bd848
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,144 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO IP id flags for the IPv4 header */
+#define RTE_GSO_IPID_FIXED (1ULL << 0)
+/**< Use fixed IP ids for output GSO segments. Setting
+ * !RTE_GSO_IPID_FIXED indicates using incremental IP ids.
+ */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t ipid_flag;
+	/**< flag to indicate GSO uses fixed or incremental IP ids for
+	 * IPv4 headers of output GSO segments. If applications want
+	 * fixed IP ids, set RTE_GSO_IPID_FIXED to ipid_flag.
+	 */
+	uint32_t gso_types;
+	/**< the bit mask of required GSO types. The GSO library
+	 * uses the same macros as that of describing device TX
+	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
+	 * gso_types.
+	 *
+	 * For example, if applications want to segment TCP/IPv4
+	 * packets, set DEV_TX_OFFLOAD_TCP_TSO to gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi- MBUF packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
+ * input packet has correct checksums, and doesn't update checksums for
+ * output GSO segments. Additionally, it doesn't process IP fragment
+ * packets.
+ *
+ * Before calling rte_gso_segment(), applications must set proper ol_flags
+ * for the packet. The GSO library uses the same macros as that of TSO.
+ * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 to ol_flags to segment
+ * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
+ * flag is removed for all GSO segments and the input packet.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. 2 MBUFs), the driver of the interface to which
+ * the GSO segments are sent must support transmitting multi-segment
+ * packets.
+ *
+ * If the input packet is GSOed, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object pointer.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
2.7.4


* [PATCH v4 2/5] gso: add TCP/IPv4 GSO support
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  2017-09-19  7:32       ` [PATCH v4 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
@ 2017-09-19  7:32       ` Jiayu Hu
  2017-09-20  7:03         ` Yao, Lei A
  2017-09-19  7:32       ` [PATCH v4 3/5] gso: add VxLAN " Jiayu Hu
                         ` (10 subsequent siblings)
  12 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-19  7:32 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, ferruh.yigit,
	thomas, Jiayu Hu

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect mbuf simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSOed segments are freed, the packet is freed
automatically.

TCP/IPv4 GSO clears the PKT_TX_TCP_SEG flag for the input packet and
GSO segments in the event of success.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 ++
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 202 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 107 +++++++++++++++++
 lib/librte_gso/gso_tcp4.c               |  82 +++++++++++++
 lib/librte_gso/gso_tcp4.h               |  76 ++++++++++++
 lib/librte_gso/rte_gso.c                |  52 +++++++-
 8 files changed, 531 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 7508be7..7453bb0 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,18 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. MSS is 64KB) into small
+  ones (e.g. MTU is 1500B). Supported packet types are:
+
+  * TCP/IPv4 packets, which may include a single VLAN tag.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..b2c84f6
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,202 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+#include <rte_ether.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
+
+static inline void
+__update_ipv4_tcp_header(struct rte_mbuf *pkt, uint16_t l2_len, uint16_t id,
+		uint32_t sent_seq, uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l2_len);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len -
+			l2_len);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+void
+update_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct tcp_hdr *tcp_hdr;
+	struct ipv4_hdr *ipv4_hdr;
+	uint32_t sent_seq;
+	uint16_t id, l2_len, tail_idx, i;
+
+	l2_len = pkt->l2_len;
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l2_len);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		__update_ipv4_tcp_header(segs[i], l2_len, id, sent_seq,
+				i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..2a01cd0
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,107 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates relevant packet headers for TCP/IPv4
+ * packets, following segmentation. This is required to update, for
+ * example, the IPv4 'total_length' field, to reflect the reduced length
+ * of the now-segmented packet.
+ *
+ * @param pkt
+ *  The original packet.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param segs
+ *  Pointer array used for storing mbuf addresses for GSO segments.
+ * @param nb_segs
+ *  The number of GSO segments placed in segs.
+ */
+void update_tcp4_header(struct rte_mbuf *pkt,
+		uint8_t ipid_delta,
+		struct rte_mbuf **segs,
+		uint16_t nb_segs);
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..e3b9dab
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,82 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret = 1;
+
+	/* Don't process the fragmented packet */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		update_tcp4_header(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..84fa72f
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,76 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when gso_tcp4_segment() succeeds. If the memory space in
+ *  pkts_out is insufficient, gso_tcp4_segment() fails and returns
+ *  -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if run out of memory in MBUF pools.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..c65b9ae 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,53 @@ rte_gso_segment(struct rte_mbuf *pkt,
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
+				DEV_TX_OFFLOAD_TCP_TSO) !=
+			gso_ctx->gso_types) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		return ret;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
-- 
2.7.4


* [PATCH v4 3/5] gso: add VxLAN GSO support
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
  2017-09-19  7:32       ` [PATCH v4 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
  2017-09-19  7:32       ` [PATCH v4 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-09-19  7:32       ` Jiayu Hu
  2017-09-20  3:11         ` Tan, Jianfeng
  2017-09-19  7:32       ` [PATCH v4 4/5] gso: add GRE " Jiayu Hu
                         ` (9 subsequent siblings)
  12 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-19  7:32 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, ferruh.yigit,
	thomas, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for VxLAN-encapsulated packets. Supported
VxLAN packets must have an outer IPv4 header (prepended by an optional
VLAN tag), and contain an inner TCP/IPv4 packet (with an optional inner
VLAN tag).
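As a rough illustration of the layout above, the sketch below computes
the three length fields that have to be rewritten in every VxLAN output
segment: outer IPv4 total_length, outer UDP dgram_len and inner IPv4
total_length. It is a standalone model, not code from this patch. The
struct and function names are hypothetical, and it assumes that l2_len
for a tunneled packet covers the outer UDP header, the VxLAN header and
the inner Ethernet header, which is how the offsets in the patch appear
to be used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the mbuf length fields the patch reads. */
struct vxlan_lens {
	uint16_t outer_l2_len; /* outer Ethernet (+ optional VLAN tag) */
	uint16_t outer_l3_len; /* outer IPv4 header */
	uint16_t l2_len;       /* assumed: outer UDP + VxLAN + inner Ethernet */
	uint16_t l3_len;       /* inner IPv4 header */
	uint16_t l4_len;       /* inner TCP header */
};

/* Length fields to write into one output segment (host byte order). */
struct vxlan_len_fields {
	uint16_t outer_ip_total_length;
	uint16_t outer_udp_dgram_len;
	uint16_t inner_ip_total_length;
};

/* Total number of header bytes copied into every output segment. */
static uint16_t
vxlan_hdr_len(const struct vxlan_lens *l)
{
	return (uint16_t)(l->outer_l2_len + l->outer_l3_len + l->l2_len +
			l->l3_len + l->l4_len);
}

/* Derive the per-segment length fields from the segment's total length. */
static struct vxlan_len_fields
vxlan_seg_len_fields(const struct vxlan_lens *l, uint16_t seg_pkt_len)
{
	struct vxlan_len_fields f;
	uint16_t inner_ip_off = (uint16_t)(l->outer_l2_len +
			l->outer_l3_len + l->l2_len);

	/* A segment can never be shorter than its headers. */
	assert(seg_pkt_len >= vxlan_hdr_len(l));

	f.outer_ip_total_length = (uint16_t)(seg_pkt_len - l->outer_l2_len);
	f.outer_udp_dgram_len = (uint16_t)(seg_pkt_len - l->outer_l2_len -
			l->outer_l3_len);
	f.inner_ip_total_length = (uint16_t)(seg_pkt_len - inner_ip_off);
	return f;
}
```

For an untagged packet with 14/20/30/20/20 byte headers and a 1500-byte
segment, this gives 1486, 1466 and 1436 respectively. A real
implementation would convert each value to network byte order before
storing it.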

VxLAN GSO doesn't check if all input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSOed segments
are freed, the packet is freed automatically.

VxLAN GSO clears the PKT_TX_TCP_SEG flag for the input packet and GSO
segments in the event of success.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |  3 ++
 lib/librte_gso/Makefile                |  1 +
 lib/librte_gso/gso_common.c            | 58 +++++++++++++++++++++++
 lib/librte_gso/gso_common.h            | 25 ++++++++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 87 ++++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h       | 76 +++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso.c               | 13 +++--
 7 files changed, 260 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 7453bb0..2dc6b89 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -48,6 +48,9 @@ New Features
   ones (e.g. MTU is 1500B). Supported packet types are:
 
   * TCP/IPv4 packets, which may include a single VLAN tag.
+  * VxLAN packets, which must have an outer IPv4 header (prepended by
+    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
+    an optional VLAN tag).
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index b2c84f6..90fcb2a 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -39,6 +39,7 @@
 #include <rte_ether.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #include "gso_common.h"
 
@@ -200,3 +201,60 @@ update_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
 		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
 	}
 }
+
+static inline void
+__update_outer_ipv4_header(struct rte_mbuf *pkt, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len -
+			pkt->outer_l2_len);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+static inline void
+__update_outer_udp_header(struct rte_mbuf *pkt)
+{
+	struct udp_hdr *udp_hdr;
+	uint16_t length;
+
+	length = pkt->outer_l2_len + pkt->outer_l3_len;
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			length);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - length);
+}
+
+void
+update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t l2_len, outer_id, inner_id, tail_idx, i;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	l2_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l2_len);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		__update_outer_ipv4_header(segs[i], outer_id);
+		outer_id += ipid_delta;
+		__update_outer_udp_header(segs[i]);
+
+		__update_ipv4_tcp_header(segs[i], l2_len, inner_id, sent_seq,
+				i < tail_idx);
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 2a01cd0..0b0d8ed 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -48,6 +48,11 @@
 #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
 
+#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_VXLAN))
+
 /**
  * Internal function which updates relevant packet headers for TCP/IPv4
  * packets, following segmentation. This is required to update, for
@@ -69,6 +74,26 @@ void update_tcp4_header(struct rte_mbuf *pkt,
 		uint16_t nb_segs);
 
 /**
+ * Internal function which updates relevant packet headers for VxLAN
+ * packets, following segmentation. This is required to update, for
+ * example, the IPv4 'total_length' field, to reflect the reduced length
+ * of the now-segmented packet.
+ *
+ * @param pkt
+ *  The original packet.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param segs
+ *  Pointer array used for storing mbuf addresses for GSO segments.
+ * @param nb_segs
+ *  The number of GSO segments placed in segs.
+ */
+void update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt,
+		uint8_t ipid_delta,
+		struct rte_mbuf **segs,
+		uint16_t nb_segs);
+
+/**
  * Internal function which divides the input packet into small segments.
  * Each of the newly-created segments is organized as a two-segment MBUF,
  * where the first segment is a standard mbuf, which stores a copy of
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..cc017bd
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,87 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ether.h>
+#include <rte_ip.h>
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t tcp_dl, frag_off;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process the packet if the MF bit or the fragment offset
+	 * in the inner IPv4 header is non-zero.
+	 */
+	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - hdr_offset - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return ret;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret <= 1)
+		return ret;
+
+	if (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN)
+		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..a848a2e
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,76 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneling packet with inner TCP/IPv4 headers. This function
+ * doesn't check if the input packet has correct checksums, and doesn't
+ * update checksums for output GSO segments. Furthermore, it doesn't
+ * process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when gso_tunnel_tcp4_segment() succeeds. If the memory
+ *  space in pkts_out is insufficient, gso_tunnel_tcp4_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if run out of memory in MBUF pools.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index c65b9ae..d1a723b 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -39,6 +39,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -58,8 +59,9 @@ rte_gso_segment(struct rte_mbuf *pkt,
 		return -EINVAL;
 
 	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
-				DEV_TX_OFFLOAD_TCP_TSO) !=
-			gso_ctx->gso_types) {
+				(DEV_TX_OFFLOAD_TCP_TSO |
+				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
+				gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
 		return ret;
@@ -71,7 +73,12 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_TCP(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (IS_IPV4_TCP(pkt->ol_flags)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
2.7.4


* [PATCH v4 4/5] gso: add GRE GSO support
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (2 preceding siblings ...)
  2017-09-19  7:32       ` [PATCH v4 3/5] gso: add VxLAN " Jiayu Hu
@ 2017-09-19  7:32       ` Jiayu Hu
  2017-09-20  2:53         ` Tan, Jianfeng
  2017-09-19  7:32       ` [PATCH v4 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
                         ` (8 subsequent siblings)
  12 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-19  7:32 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, ferruh.yigit,
	thomas, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.
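
As a rough illustration of how a supported GRE packet can be recognised
from its TX offload flags, here is a minimal standalone sketch. The
DEMO_* names and bit values are placeholders invented for this example;
they are not the definitions from rte_mbuf.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; not the rte_mbuf.h values. */
#define DEMO_TX_TCP_SEG     (1ULL << 0)
#define DEMO_TX_IPV4        (1ULL << 1)
#define DEMO_TX_OUTER_IPV4  (1ULL << 2)
#define DEMO_TX_TUNNEL_GRE  (1ULL << 3)

#define DEMO_GRE_TCP4_MASK (DEMO_TX_TCP_SEG | DEMO_TX_IPV4 | \
		DEMO_TX_OUTER_IPV4 | DEMO_TX_TUNNEL_GRE)

/* True only when every required flag is set; unrelated flags are ignored. */
static bool
demo_is_ipv4_gre_tcp4(uint64_t ol_flags)
{
	return (ol_flags & DEMO_GRE_TCP4_MASK) == DEMO_GRE_TCP4_MASK;
}
```

Comparing the masked value against the full mask, rather than testing
for non-zero, is what rejects packets that carry only some of the
required flags.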

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.
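
The reference-count behaviour can be pictured with a toy model. This is
only a sketch under the assumption that each output segment holds one
reference on the input packet; struct demo_buf and the demo_* helpers
are hypothetical and do not exist in DPDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for an mbuf; only the reference count is modelled. */
struct demo_buf {
	uint16_t refcnt;
	bool freed;
};

/* Drop one reference; the buffer is released when the count hits zero. */
static void
demo_put(struct demo_buf *b)
{
	if (b->refcnt > 0 && --b->refcnt == 0)
		b->freed = true;
}

/*
 * Model segmentation: each output segment takes one reference on the
 * input buffer, then the segmenter drops the caller's original reference.
 */
static void
demo_segment(struct demo_buf *in, uint16_t nb_segs)
{
	in->refcnt += nb_segs;
	demo_put(in);
}
```

Starting from a count of 1, segmenting into three pieces leaves a count
of 3, so the input is released only when the third segment is freed.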

GRE GSO clears the PKT_TX_TCP_SEG flag for the input packet and GSO
segments in the event of success.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |  3 +++
 lib/librte_gso/gso_common.c            | 31 +++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h            | 25 +++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.c       |  2 ++
 lib/librte_gso/rte_gso.c               |  8 +++++---
 5 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 2dc6b89..119f662 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -51,6 +51,9 @@ New Features
   * VxLAN packets, which must have an outer IPv4 header (prepended by
     an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
     an optional VLAN tag).
+  * GRE packets, which must contain an outer IPv4 header (prepended by
+    an optional VLAN tag), and inner TCP/IPv4 headers (with an optional
+    VLAN tag).
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
index 90fcb2a..3629625 100644
--- a/lib/librte_gso/gso_common.c
+++ b/lib/librte_gso/gso_common.c
@@ -37,6 +37,7 @@
 #include <rte_memcpy.h>
 #include <rte_mempool.h>
 #include <rte_ether.h>
+#include <rte_gre.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
 #include <rte_udp.h>
@@ -258,3 +259,33 @@ update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
 		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
 	}
 }
+
+void
+update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t l2_len, outer_id, inner_id, tail_idx, i;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->outer_l2_len);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	l2_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + l2_len);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+	for (i = 0; i < nb_segs; i++) {
+		__update_outer_ipv4_header(segs[i], outer_id);
+		outer_id += ipid_delta;
+
+		__update_ipv4_tcp_header(segs[i], l2_len, inner_id, sent_seq,
+				i < tail_idx);
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 0b0d8ed..433e952 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -53,6 +53,11 @@
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
 		 PKT_TX_TUNNEL_VXLAN))
 
+#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_GRE))
+
 /**
  * Internal function which updates relevant packet headers for TCP/IPv4
  * packets, following segmentation. This is required to update, for
@@ -94,6 +99,26 @@ void update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt,
 		uint16_t nb_segs);
 
 /**
+ * Internal function which updates relevant packet headers for GRE
+ * packets, following segmentation. This is required to update, for
+ * example, the IPv4 'total_length' field, to reflect the reduced
+ * length of the now-segmented packet.
+ *
+ * @param pkt
+ *  The original packet.
+ * @param ipid_delta
+ *  The increment of IP IDs between consecutive segments.
+ * @param segs
+ *  Pointer array used for storing mbuf addresses for GSO segments.
+ * @param nb_segs
+ *  The number of GSO segments placed in segs.
+ */
+void update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt,
+		uint8_t ipid_delta,
+		struct rte_mbuf **segs,
+		uint16_t nb_segs);
+
+/**
  * Internal function which divides the input packet into small segments.
  * Each of the newly-created segments is organized as a two-segment MBUF,
  * where the first segment is a standard mbuf, which stores a copy of
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
index cc017bd..5d5930a 100644
--- a/lib/librte_gso/gso_tunnel_tcp4.c
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -82,6 +82,8 @@ gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
 
 	if (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN)
 		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, pkts_out, ret);
+	else if (pkt->ol_flags & PKT_TX_TUNNEL_GRE)
+		update_ipv4_gre_tcp4_header(pkt, ipid_delta, pkts_out, ret);
 
 	return ret;
 }
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index d1a723b..5464831 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -60,8 +60,9 @@ rte_gso_segment(struct rte_mbuf *pkt,
 
 	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
 				(DEV_TX_OFFLOAD_TCP_TSO |
-				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
-				gso_ctx->gso_types) {
+				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+				 DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
+				 gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
 		return ret;
@@ -73,7 +74,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) ||
+			IS_IPV4_GRE_TCP4(pkt->ol_flags)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
2.7.4


* [PATCH v4 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (3 preceding siblings ...)
  2017-09-19  7:32       ` [PATCH v4 4/5] gso: add GRE " Jiayu Hu
@ 2017-09-19  7:32       ` Jiayu Hu
  2017-09-28 22:13       ` [PATCH v5 0/6] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Mark Kavanagh
                         ` (7 subsequent siblings)
  12 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-09-19  7:32 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, mark.b.kavanagh, jianfeng.tan, ferruh.yigit,
	thomas, Jiayu Hu

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

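Because the segment length covers headers as well as payload, the number
of output segments depends on how much payload fits after the headers.
A minimal sketch of that arithmetic follows; demo_gso_nb_segs() is a
hypothetical helper written for this example, not a library function,
and it assumes every segment repeats the full header stack.

```c
#include <stdint.h>

/*
 * Estimate how many segments a packet yields for a given segment size.
 * Returns 0 if segsz leaves no room for payload.
 */
static uint32_t
demo_gso_nb_segs(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
	uint32_t payload, per_seg;

	if (segsz <= hdr_len || pkt_len < hdr_len)
		return 0;
	if (pkt_len <= segsz)
		return 1;	/* already fits; nothing to split */
	payload = pkt_len - hdr_len;
	per_seg = segsz - hdr_len;
	return (payload + per_seg - 1) / per_seg;	/* round up */
}
```

For example, with 54 bytes of Ethernet/IPv4/TCP headers and a segment
length of 1514, each segment carries up to 1460 bytes of payload, so a
4434-byte packet (4380 bytes of payload) splits into 3 segments.
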
Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c                      | 178 ++++++++++++++++++++++++++++
 app/test-pmd/config.c                       |  24 ++++
 app/test-pmd/csumonly.c                     |  69 ++++++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  41 +++++++
 6 files changed, 330 insertions(+), 5 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ccdf239..2f308ed 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3967,6 +3978,170 @@ cmdline_parse_inst_t cmd_gro_set = {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please stop forwarding first\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size == 0) {
+			printf("gso_size should be larger than 0."
+					" Please input a legal value\n");
+		} else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO segment size: %uB\n"
+					"Supported GSO protocols: TCP/IPv4,"
+					" VxLAN and GRE\n",
+					gso_max_segment_size);
+		} else
+			printf("Port %u doesn't enable GSO\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14255,6 +14430,9 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..3434346 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ setup_gro(const char *mode, uint8_t port_id)
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..0a064f4 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -66,10 +66,12 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_sctp.h>
+#include <rte_net.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -91,6 +93,7 @@
 /* structure that caches offload info for the current packet */
 struct testpmd_offload_info {
 	uint16_t ethertype;
+	uint8_t gso_enable;
 	uint16_t l2_len;
 	uint16_t l3_len;
 	uint16_t l4_len;
@@ -381,6 +384,8 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
 		}
+		if (info->gso_enable)
+			ol_flags |= PKT_TX_TCP_SEG;
 	} else if (info->l4_proto == IPPROTO_SCTP) {
 		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
 		sctp_hdr->cksum = 0;
@@ -627,6 +632,9 @@ static void
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -634,13 +642,15 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	uint16_t nb_rx;
 	uint16_t nb_tx;
 	uint16_t nb_prep;
-	uint16_t i;
+	uint16_t i, j;
 	uint64_t rx_ol_flags, tx_ol_flags;
 	uint16_t testpmd_ol_flags;
 	uint32_t retry;
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -674,6 +684,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	memset(&info, 0, sizeof(info));
 	info.tso_segsz = txp->tso_segsz;
 	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
+	if (gso_ports[fs->tx_port].enable)
+		info.gso_enable = 1;
 
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
@@ -851,13 +863,59 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			if (unlikely(nb_rx - i >= GSO_MAX_PKT_BURST -
+						nb_segments)) {
+				/*
+				 * insufficient space in gso_segments,
+				 * stop GSO.
+				 */
+				for (j = i; j < nb_rx && nb_segments <
+						GSO_MAX_PKT_BURST; j++) {
+					pkts_burst[j]->ol_flags &=
+						(~PKT_TX_TCP_SEG);
+					gso_segments[nb_segments++] =
+						pkts_burst[j];
+				}
+				for (; j < nb_rx; j++)
+					rte_pktmbuf_free(pkts_burst[j]);
+				break;
+			}
+			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
+					&gso_segments[nb_segments],
+					GSO_MAX_PKT_BURST - nb_segments);
+			if (ret >= 1)
+				nb_segments += ret;
+			else if (ret < 0) {
+				/*
+				 * insufficient MBUFs or space in
+				 * gso_segments, stop GSO.
+				 */
+				for (j = i; j < nb_rx; j++) {
+					pkts_burst[j]->ol_flags &=
+						(~PKT_TX_TCP_SEG);
+					gso_segments[nb_segments++] =
+						pkts_burst[j];
+				}
+				break;
+			}
+		}
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +926,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +939,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e097ee0..97e349d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ init_config(void)
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,8 @@ init_config(void)
 
 	init_port_config();
 
+	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+		DEV_TX_OFFLOAD_GRE_TNL_TSO;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +670,13 @@ init_config(void)
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
+			ETHER_CRC_LEN;
+		fwd_lcores[lc_id]->gso_ctx.ipid_flag = !RTE_GSO_IPID_FIXED;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..ff842a1 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -642,6 +651,7 @@ void get_5tuple_filter(uint8_t port_id, uint16_t index);
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 2ed62f5..81e3199 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -932,6 +932,47 @@ number of packets a GRO table can store.
 If current packet number is greater than or equal to the max value, GRO
 will stop processing incoming packets.
 
+set port - gso
+~~~~~~~~~~~~~~
+
+Toggle per-port GSO support in ``csum`` forwarding engine::
+
+   testpmd> set port <port_id> gso on|off
+
+If enabled, the csum forwarding engine will perform GSO on supported IPv4
+packets, transmitted on the given port.
+
+If disabled, packets transmitted on the given port will not undergo GSO.
+By default, GSO is disabled for all ports.
+
+.. note::
+
+   When GSO is enabled on a port, supported IPv4 packets transmitted on that
+   port undergo GSO. Afterwards, the segmented packets are represented by
+   multi-segment mbufs; however, the csum forwarding engine doesn't support
+   calculation of TCP checksums for multi-segment mbufs in SW. As a result, TCP
+   (and as a corollary, IP) HW checksum calculation should also be enabled for
+   GSO-enabled ports.
+
+   testpmd> csum set <port_id> ip hw on
+
+   testpmd> csum set <port_id> tcp hw on
+
+set gso segsz
+~~~~~~~~~~~~~
+
+Set the maximum GSO segment size (measured in bytes), which includes the
+packet header and the packet payload for GSO-enabled ports (global)::
+
+   testpmd> set gso segsz <length>
+
+show port - gso
+~~~~~~~~~~~~~~~
+
+Display the status of Generic Segmentation Offload for a given port::
+
+   testpmd> show port <port_id> gso
+
 mac_addr add
 ~~~~~~~~~~~~
 
-- 
2.7.4


* Re: [PATCH v4 4/5] gso: add GRE GSO support
  2017-09-19  7:32       ` [PATCH v4 4/5] gso: add GRE " Jiayu Hu
@ 2017-09-20  2:53         ` Tan, Jianfeng
  2017-09-20  6:01           ` Hu, Jiayu
  0 siblings, 1 reply; 157+ messages in thread
From: Tan, Jianfeng @ 2017-09-20  2:53 UTC (permalink / raw)
  To: Jiayu Hu, dev; +Cc: konstantin.ananyev, mark.b.kavanagh, ferruh.yigit, thomas

Hi,


On 9/19/2017 3:32 PM, Jiayu Hu wrote:
> From: Mark Kavanagh <mark.b.kavanagh@intel.com>
>
> This patch adds GSO support for GRE-tunneled packets. Supported GRE
> packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
> They may also contain a single VLAN tag. GRE GSO doesn't check if all
> input packets have correct checksums and doesn't update checksums for
> output packets. Additionally, it doesn't process IP fragmented packets.
>
> As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
> output packet, which requires multi-segment mbuf support in the TX
> functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
> its MBUF refcnt by 1. As a result, when all of its GSOed segments are
> freed, the packet is freed automatically.
>
> GRE GSO clears the PKT_TX_TCP_SEG flag for the input packet and GSO
> segments on the event of success.
>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> ---
>   doc/guides/rel_notes/release_17_11.rst |  3 +++
>   lib/librte_gso/gso_common.c            | 31 +++++++++++++++++++++++++++++++
>   lib/librte_gso/gso_common.h            | 25 +++++++++++++++++++++++++
>   lib/librte_gso/gso_tunnel_tcp4.c       |  2 ++
>   lib/librte_gso/rte_gso.c               |  8 +++++---
>   5 files changed, 66 insertions(+), 3 deletions(-)
>
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index 2dc6b89..119f662 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -51,6 +51,9 @@ New Features
>     * VxLAN packets, which must have an outer IPv4 header (prepended by
>       an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
>       an optional VLAN tag).
> +  * GRE packets, which must contain an outer IPv4 header (prepended by
> +    an optional VLAN tag), and inner TCP/IPv4 headers (with an optional
> +    VLAN tag).
>   
>     The GSO library doesn't check if the input packets have correct
>     checksums, and doesn't update checksums for output packets.
> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> index 90fcb2a..3629625 100644
> --- a/lib/librte_gso/gso_common.c
> +++ b/lib/librte_gso/gso_common.c
> @@ -37,6 +37,7 @@
>   #include <rte_memcpy.h>
>   #include <rte_mempool.h>
>   #include <rte_ether.h>
> +#include <rte_gre.h>
>   #include <rte_ip.h>
>   #include <rte_tcp.h>
>   #include <rte_udp.h>
> @@ -258,3 +259,33 @@ update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
>   		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
>   	}
>   }
> +
> +void
> +update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs)
> +{

This function seems to duplicate much of the code in the function
above; can we merge them?

> +	struct ipv4_hdr *ipv4_hdr;
> +	struct tcp_hdr *tcp_hdr;
> +	uint32_t sent_seq;
> +	uint16_t l2_len, outer_id, inner_id, tail_idx, i;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->outer_l2_len);
> +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +
> +	l2_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + l2_len);
> +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +	tail_idx = nb_segs - 1;
> +	for (i = 0; i < nb_segs; i++) {
> +		__update_outer_ipv4_header(segs[i], outer_id);
> +		outer_id += ipid_delta;

Should we update both the outer and inner IP IDs? Could you add a spec
reference here?

> +
> +		__update_ipv4_tcp_header(segs[i], l2_len, inner_id, sent_seq,
> +				i < tail_idx);
> +		inner_id += ipid_delta;
> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> +	}
> +}
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> index 0b0d8ed..433e952 100644
> --- a/lib/librte_gso/gso_common.h
> +++ b/lib/librte_gso/gso_common.h
> @@ -53,6 +53,11 @@
>   		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
>   		 PKT_TX_TUNNEL_VXLAN))
>   
> +#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
> +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> +		 PKT_TX_TUNNEL_GRE))
> +
>   /**
>    * Internal function which updates relevant packet headers for TCP/IPv4
>    * packets, following segmentation. This is required to update, for
> @@ -94,6 +99,26 @@ void update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt,
>   		uint16_t nb_segs);
>   
>   /**
> + * Internal function which updates relevant packet headers for GRE
> + * packets, following segmentation. This is required to update, for
> + * example, the IPv4 'total_length' field, to reflect the reduced
> + * length of the now-segmented packet.
> + *
> + * @param pkt
> + *  The original packet.
> + * @param ipid_delta
> + *  The increasing uint of IP ids.

uint -> unit?

> + * @param segs
> + *  Pointer array used for storing mbuf addresses for GSO segments.
> + * @param nb_segs
> + *  The number of GSO segments placed in segs.
> + */
> +void update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt,
> +		uint8_t ipid_delta,
> +		struct rte_mbuf **segs,
> +		uint16_t nb_segs);
> +
> +/**
>    * Internal function which divides the input packet into small segments.
>    * Each of the newly-created segments is organized as a two-segment MBUF,
>    * where the first segment is a standard mbuf, which stores a copy of
> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
> index cc017bd..5d5930a 100644
> --- a/lib/librte_gso/gso_tunnel_tcp4.c
> +++ b/lib/librte_gso/gso_tunnel_tcp4.c
> @@ -82,6 +82,8 @@ gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
>   
>   	if (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN)
>   		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, pkts_out, ret);
> +	else if (pkt->ol_flags & PKT_TX_TUNNEL_GRE)
> +		update_ipv4_gre_tcp4_header(pkt, ipid_delta, pkts_out, ret);
>   
>   	return ret;
>   }
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index d1a723b..5464831 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -60,8 +60,9 @@ rte_gso_segment(struct rte_mbuf *pkt,
>   
>   	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>   				(DEV_TX_OFFLOAD_TCP_TSO |
> -				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
> -				gso_ctx->gso_types) {
> +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> +				 DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
> +				 gso_ctx->gso_types) {
>   		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>   		pkts_out[0] = pkt;
>   		return ret;
> @@ -73,7 +74,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
>   	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>   	ol_flags = pkt->ol_flags;
>   
> -	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
> +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) ||
> +			IS_IPV4_GRE_TCP4(pkt->ol_flags)) {
>   		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>   		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
>   				direct_pool, indirect_pool,
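
For readers following the rte_gso.c hunk above, the two bit tests it
relies on can be sketched in isolation. This is only an illustration:
the bit values and the names OFFLOAD_*, FLAG_*, gso_types_supported(),
is_gre_tcp4() and clear_tcp_seg() are invented for the example, and the
real DEV_TX_OFFLOAD_* and PKT_TX_* constants come from the DPDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define OFFLOAD_TCP_TSO       (1u << 0)
#define OFFLOAD_VXLAN_TNL_TSO (1u << 1)
#define OFFLOAD_GRE_TNL_TSO   (1u << 2)
#define OFFLOAD_UNSUPPORTED   (1u << 3)

#define SUPPORTED_GSO_TYPES \
	(OFFLOAD_TCP_TSO | OFFLOAD_VXLAN_TNL_TSO | OFFLOAD_GRE_TNL_TSO)

#define FLAG_TCP_SEG    (1ull << 0)
#define FLAG_IPV4       (1ull << 1)
#define FLAG_OUTER_IPV4 (1ull << 2)
#define FLAG_TUNNEL_GRE (1ull << 3)

#define GRE_TCP4_FLAGS \
	(FLAG_TCP_SEG | FLAG_IPV4 | FLAG_OUTER_IPV4 | FLAG_TUNNEL_GRE)

/*
 * Subset test: returns 1 when every requested type is in the
 * supported mask, 0 as soon as one unsupported bit is requested.
 */
static int
gso_types_supported(uint32_t gso_types)
{
	return (gso_types & SUPPORTED_GSO_TYPES) == gso_types;
}

/* All-bits test: returns 1 only when every required flag is set. */
static int
is_gre_tcp4(uint64_t flags)
{
	return (flags & GRE_TCP4_FLAGS) == GRE_TCP4_FLAGS;
}

/* Clears the segmentation request and leaves the other flags alone. */
static uint64_t
clear_tcp_seg(uint64_t flags)
{
	return flags & ~FLAG_TCP_SEG;
}
```

In this sketch a request containing OFFLOAD_UNSUPPORTED fails the subset
test, which corresponds to the early-return branch in the quoted hunk,
and a packet missing any one of the four flags is not treated as
GRE-encapsulated TCP/IPv4.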

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v4 3/5] gso: add VxLAN GSO support
  2017-09-19  7:32       ` [PATCH v4 3/5] gso: add VxLAN " Jiayu Hu
@ 2017-09-20  3:11         ` Tan, Jianfeng
  2017-09-20  3:17           ` Hu, Jiayu
  0 siblings, 1 reply; 157+ messages in thread
From: Tan, Jianfeng @ 2017-09-20  3:11 UTC (permalink / raw)
  To: Jiayu Hu, dev; +Cc: konstantin.ananyev, mark.b.kavanagh, ferruh.yigit, thomas


On 9/19/2017 3:32 PM, Jiayu Hu wrote:
> From: Mark Kavanagh <mark.b.kavanagh@intel.com>
>
> This patch adds GSO support for VxLAN-encapsulated packets. Supported
> VxLAN packets must have an outer IPv4 header (prepended by an optional
> VLAN tag), and contain an inner TCP/IPv4 packet (with an optional inner
> VLAN tag).

This patch adds not only VxLAN support but also the tunnel framework.
It would be better to say so up front in the commit message.

> VxLAN GSO doesn't check if all input packets have correct checksums and
> doesn't update checksums for output packets. Additionally, it doesn't
> process IP fragmented packets.
>
> As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
> output packet, which mandates support for multi-segment mbufs in the TX
> functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
> reduces its MBUF refcnt by 1. As a result, when all of its GSOed segments
> are freed, the packet is freed automatically.
>
> VxLAN GSO clears the PKT_TX_TCP_SEG flag for the input packet and GSO
> segments on the event of success.

This flag is not cleared here; it is cleared in the GSO interface. So
remove the above sentence?

>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> ---
>   doc/guides/rel_notes/release_17_11.rst |  3 ++
>   lib/librte_gso/Makefile                |  1 +
>   lib/librte_gso/gso_common.c            | 58 +++++++++++++++++++++++
>   lib/librte_gso/gso_common.h            | 25 ++++++++++
>   lib/librte_gso/gso_tunnel_tcp4.c       | 87 ++++++++++++++++++++++++++++++++++
>   lib/librte_gso/gso_tunnel_tcp4.h       | 76 +++++++++++++++++++++++++++++
>   lib/librte_gso/rte_gso.c               | 13 +++--
>   7 files changed, 260 insertions(+), 3 deletions(-)
>   create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>   create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index 7453bb0..2dc6b89 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -48,6 +48,9 @@ New Features
>     ones (e.g. MTU is 1500B). Supported packet types are:
>   
>     * TCP/IPv4 packets, which may include a single VLAN tag.
> +  * VxLAN packets, which must have an outer IPv4 header (prepended by
> +    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
> +    an optional VLAN tag).
>   
>     The GSO library doesn't check if the input packets have correct
>     checksums, and doesn't update checksums for output packets.
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index 2be64d1..e6d41df 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -44,6 +44,7 @@ LIBABIVER := 1
>   SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>   SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>   SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
>   
>   # install this header file
>   SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> index b2c84f6..90fcb2a 100644
> --- a/lib/librte_gso/gso_common.c
> +++ b/lib/librte_gso/gso_common.c
> @@ -39,6 +39,7 @@
>   #include <rte_ether.h>
>   #include <rte_ip.h>
>   #include <rte_tcp.h>
> +#include <rte_udp.h>
>   
>   #include "gso_common.h"
>   
> @@ -200,3 +201,60 @@ update_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
>   		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
>   	}
>   }
> +
> +static inline void
> +__update_outer_ipv4_header(struct rte_mbuf *pkt, uint16_t id)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->outer_l2_len);
> +	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len -
> +			pkt->outer_l2_len);
> +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> +}
> +
> +static inline void
> +__update_outer_udp_header(struct rte_mbuf *pkt)
> +{
> +	struct udp_hdr *udp_hdr;
> +	uint16_t length;
> +
> +	length = pkt->outer_l2_len + pkt->outer_l3_len;
> +	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			length);
> +	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - length);
> +}
> +
> +void
> +update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs)

This function is specific to tunnels; better to move it to gso_tunnel_tcp4.c.
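
As a side illustration of what the function quoted below computes per
segment, here is a standalone sketch of the arithmetic only. The structs
seg_info and seg_fields and the helper fill_segment_fields() are
hypothetical and stand in for the mbuf and header accesses; values are
kept in host byte order, whereas the patch converts them with
rte_cpu_to_be_16(). The payload_len field plays the role of
(pkt_len - data_len) in the patch, which equals the payload size only
under the two-segment mbuf layout the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of one output segment. */
struct seg_info {
	uint16_t pkt_len;     /* all bytes of the segment, headers included */
	uint16_t payload_len; /* TCP payload bytes carried by the segment */
};

/* Header fields that the update loop rewrites for each segment. */
struct seg_fields {
	uint16_t outer_ip_total_len;
	uint16_t outer_udp_len;
	uint16_t outer_ip_id;
	uint16_t inner_ip_id;
	uint32_t tcp_seq;
};

static void
fill_segment_fields(const struct seg_info *segs, struct seg_fields *out,
		uint16_t nb_segs, uint16_t outer_l2_len,
		uint16_t outer_l3_len, uint16_t outer_id,
		uint16_t inner_id, uint32_t seq, uint8_t ipid_delta)
{
	uint16_t i;

	for (i = 0; i < nb_segs; i++) {
		/* Outer IPv4 length covers everything after outer L2. */
		out[i].outer_ip_total_len =
			(uint16_t)(segs[i].pkt_len - outer_l2_len);
		/* Outer UDP length covers everything after outer L3. */
		out[i].outer_udp_len = (uint16_t)(segs[i].pkt_len -
				outer_l2_len - outer_l3_len);
		out[i].outer_ip_id = outer_id;
		out[i].inner_ip_id = inner_id;
		out[i].tcp_seq = seq;

		/* ipid_delta is 0 for fixed IP IDs, 1 for increasing. */
		outer_id = (uint16_t)(outer_id + ipid_delta);
		inner_id = (uint16_t)(inner_id + ipid_delta);
		seq += segs[i].payload_len;
	}
}
```

With an ipid_delta of 0 every segment keeps the IDs of the original
packet, while the sequence number still advances by the payload carried
in each preceding segment.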

> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	struct tcp_hdr *tcp_hdr;
> +	uint32_t sent_seq;
> +	uint16_t l2_len, outer_id, inner_id, tail_idx, i;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->outer_l2_len);
> +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +
> +	l2_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			l2_len);
> +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +	tail_idx = nb_segs - 1;
> +
> +	for (i = 0; i < nb_segs; i++) {
> +		__update_outer_ipv4_header(segs[i], outer_id);
> +		outer_id += ipid_delta;
> +		__update_outer_udp_header(segs[i]);
> +
> +		__update_ipv4_tcp_header(segs[i], l2_len, inner_id, sent_seq,
> +				i < tail_idx);
> +		inner_id += ipid_delta;
> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> +	}
> +}
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> index 2a01cd0..0b0d8ed 100644
> --- a/lib/librte_gso/gso_common.h
> +++ b/lib/librte_gso/gso_common.h
> @@ -48,6 +48,11 @@
>   #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
>   		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
>   
> +#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
> +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> +		 PKT_TX_TUNNEL_VXLAN))
> +
>   /**
>    * Internal function which updates relevant packet headers for TCP/IPv4
>    * packets, following segmentation. This is required to update, for
> @@ -69,6 +74,26 @@ void update_tcp4_header(struct rte_mbuf *pkt,
>   		uint16_t nb_segs);
>   
>   /**
> + * Internal function which updates relevant packet headers for VxLAN
> + * packets, following segmentation. This is required to update, for
> + * example, the IPv4 'total_length' field, to reflect the reduced length
> + * of the now-segmented packet.
> + *
> + * @param pkt
> + *  The original packet.
> + * @param ipid_delta
> + *  The increasing uint of IP ids.
> + * @param segs
> + *  Pointer array used for storing mbuf addresses for GSO segments.
> + * @param nb_segs
> + *  The number of GSO segments placed in segs.
> + */
> +void update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt,
> +		uint8_t ipid_delta,
> +		struct rte_mbuf **segs,
> +		uint16_t nb_segs);
> +
> +/**
>    * Internal function which divides the input packet into small segments.
>    * Each of the newly-created segments is organized as a two-segment MBUF,
>    * where the first segment is a standard mbuf, which stores a copy of
> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
> new file mode 100644
> index 0000000..cc017bd
> --- /dev/null
> +++ b/lib/librte_gso/gso_tunnel_tcp4.c
> @@ -0,0 +1,87 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <rte_ether.h>
> +#include <rte_ip.h>
> +
> +#include "gso_common.h"
> +#include "gso_tunnel_tcp4.h"
> +
> +int
> +gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct ipv4_hdr *inner_ipv4_hdr;
> +	uint16_t pyld_unit_size, hdr_offset;
> +	uint16_t tcp_dl, frag_off;
> +	int ret = 1;
> +
> +	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> +	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			hdr_offset);
> +	/*
> +	 * Don't process the packet whose MF bit and offset in the inner
> +	 * IPv4 header are non-zero.
> +	 */
> +	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> +		pkts_out[0] = pkt;
> +		return ret;

Please use "return 1;" for readability.

> +	}
> +
> +	/* Don't process the packet without data */
> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
> +	if (unlikely(tcp_dl == 0)) {
> +		pkts_out[0] = pkt;
> +		return ret;

Ditto.
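
To make the sizing in this function easier to follow, here is a
standalone sketch of how the payload unit and the resulting segment
count relate to gso_size. The struct hdr_lens and the helpers
total_hdr_len() and count_segments() are hypothetical. The sketch
assumes that l2_len covers the outer UDP, VxLAN and inner Ethernet
headers, and it subtracts all header bytes (outer ones included) when
deciding whether there is payload; both points, and the guard against
gso_size not exceeding the header length, are choices of this example
rather than statements about the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the header lengths an mbuf would carry. */
struct hdr_lens {
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
	uint16_t l2_len; /* assumed: outer UDP + VxLAN + inner Ethernet */
	uint16_t l3_len;
	uint16_t l4_len;
};

/* Bytes of headers that are replicated in front of every segment. */
static uint16_t
total_hdr_len(const struct hdr_lens *h)
{
	return (uint16_t)(h->outer_l2_len + h->outer_l3_len +
			h->l2_len + h->l3_len + h->l4_len);
}

/*
 * Number of output packets for one input packet. A result of 1 means
 * the packet is passed through unsegmented.
 */
static uint32_t
count_segments(uint32_t pkt_len, uint16_t gso_size,
		const struct hdr_lens *h)
{
	uint16_t hdr = total_hdr_len(h);
	uint32_t payload, unit;

	/* No payload, or no room for payload after the headers. */
	if (pkt_len <= hdr || gso_size <= hdr)
		return 1;
	/* Already small enough to send as is. */
	if (pkt_len <= gso_size)
		return 1;

	payload = pkt_len - hdr;
	unit = (uint32_t)gso_size - hdr; /* payload bytes per segment */
	return (payload + unit - 1) / unit; /* round up */
}
```

For example, with 104 bytes of headers and a gso_size of 1504, each
segment carries up to 1400 payload bytes, so a packet with 4000 payload
bytes is split into three segments.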

> +	}
> +
> +	hdr_offset += pkt->l3_len + pkt->l4_len;
> +	pyld_unit_size = gso_size - hdr_offset;
> +
> +	/* Segment the payload */
> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> +			indirect_pool, pkts_out, nb_pkts_out);
> +	if (ret <= 1)
> +		return ret;
> +
> +	if (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN)
> +		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, pkts_out, ret);
> +
> +	return ret;
> +}
> diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
> new file mode 100644
> index 0000000..a848a2e
> --- /dev/null
> +++ b/lib/librte_gso/gso_tunnel_tcp4.h
> @@ -0,0 +1,76 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_TUNNEL_TCP4_H_
> +#define _GSO_TUNNEL_TCP4_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * Segment an tunneling packet with inner TCP/IPv4 headers. This function
> + * doesn't check if the input packet has correct checksums, and doesn't
> + * update checksums for output GSO segments. Furthermore, it doesn't
> + * process IP fragment packets.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param gso_size
> + *  The max length of a GSO segment, measured in bytes.
> + * @param ipid_delta
> + *  The increasing uint of IP ids.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when gso_tunnel_tcp4_segment() successes. If the memory
> + *  space in pkts_out is insufficient, gso_tcp4_segment() fails and

"gso_tcp4_segment()" -> "it".

Thanks,
Jianfeng

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v4 3/5] gso: add VxLAN GSO support
  2017-09-20  3:11         ` Tan, Jianfeng
@ 2017-09-20  3:17           ` Hu, Jiayu
  0 siblings, 0 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-20  3:17 UTC (permalink / raw)
  To: Tan, Jianfeng, dev
  Cc: Ananyev, Konstantin, Kavanagh, Mark B, Yigit, Ferruh, thomas

Hi Jianfeng,

> -----Original Message-----
> From: Tan, Jianfeng
> Sent: Wednesday, September 20, 2017 11:11 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark
> B <mark.b.kavanagh@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> thomas@monjalon.net
> Subject: Re: [PATCH v4 3/5] gso: add VxLAN GSO support
> 
> 
> On 9/19/2017 3:32 PM, Jiayu Hu wrote:
> > From: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >
> > This patch adds GSO support for VxLAN-encapsulated packets. Supported
> > VxLAN packets must have an outer IPv4 header (prepended by an optional
> > VLAN tag), and contain an inner TCP/IPv4 packet (with an optional inner
> > VLAN tag).
> 
> This patch not only adds support for VxLAN, but also support for tunnel
> framework. Better to mention it in the first place.
> 
> > VxLAN GSO doesn't check if all input packets have correct checksums and
> > doesn't update checksums for output packets. Additionally, it doesn't
> > process IP fragmented packets.
> >
> > As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
> > output packet, which mandates support for multi-segment mbufs in the TX
> > functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
> > reduces its MBUF refcnt by 1. As a result, when all of its GSOed segments
> > are freed, the packet is freed automatically.
> >
> > VxLAN GSO clears the PKT_TX_TCP_SEG flag for the input packet and GSO
> > segments on the event of success.
> 
> This flag is not cleared here, it's cleared in the gso interface. So
> remove above sentence?

Makes sense. I will remove the above sentence.

> 
> >
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > ---
> >   doc/guides/rel_notes/release_17_11.rst |  3 ++
> >   lib/librte_gso/Makefile                |  1 +
> >   lib/librte_gso/gso_common.c            | 58 +++++++++++++++++++++++
> >   lib/librte_gso/gso_common.h            | 25 ++++++++++
> >   lib/librte_gso/gso_tunnel_tcp4.c       | 87 ++++++++++++++++++++++++++++++++++
> >   lib/librte_gso/gso_tunnel_tcp4.h       | 76 +++++++++++++++++++++++++++++
> >   lib/librte_gso/rte_gso.c               | 13 +++--
> >   7 files changed, 260 insertions(+), 3 deletions(-)
> >   create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
> >   create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
> >
> > diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> > index 7453bb0..2dc6b89 100644
> > --- a/doc/guides/rel_notes/release_17_11.rst
> > +++ b/doc/guides/rel_notes/release_17_11.rst
> > @@ -48,6 +48,9 @@ New Features
> >     ones (e.g. MTU is 1500B). Supported packet types are:
> >
> >     * TCP/IPv4 packets, which may include a single VLAN tag.
> > +  * VxLAN packets, which must have an outer IPv4 header (prepended by
> > +    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
> > +    an optional VLAN tag).
> >
> >     The GSO library doesn't check if the input packets have correct
> >     checksums, and doesn't update checksums for output packets.
> > diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> > index 2be64d1..e6d41df 100644
> > --- a/lib/librte_gso/Makefile
> > +++ b/lib/librte_gso/Makefile
> > @@ -44,6 +44,7 @@ LIBABIVER := 1
> >   SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> >   SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> >   SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
> >
> >   # install this header file
> >   SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> > diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> > index b2c84f6..90fcb2a 100644
> > --- a/lib/librte_gso/gso_common.c
> > +++ b/lib/librte_gso/gso_common.c
> > @@ -39,6 +39,7 @@
> >   #include <rte_ether.h>
> >   #include <rte_ip.h>
> >   #include <rte_tcp.h>
> > +#include <rte_udp.h>
> >
> >   #include "gso_common.h"
> >
> > @@ -200,3 +201,60 @@ update_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> >   		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> >   	}
> >   }
> > +
> > +static inline void
> > +__update_outer_ipv4_header(struct rte_mbuf *pkt, uint16_t id)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			pkt->outer_l2_len);
> > +	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len -
> > +			pkt->outer_l2_len);
> > +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> > +}
> > +
> > +static inline void
> > +__update_outer_udp_header(struct rte_mbuf *pkt)
> > +{
> > +	struct udp_hdr *udp_hdr;
> > +	uint16_t length;
> > +
> > +	length = pkt->outer_l2_len + pkt->outer_l3_len;
> > +	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			length);
> > +	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - length);
> > +}
> > +
> > +void
> > +update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs)
> 
> This function is specific to tunnel, better move to gso_tunnel_tcp4.c

Makes sense. I will move the GRE header update function too.

> 
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct tcp_hdr *tcp_hdr;
> > +	uint32_t sent_seq;
> > +	uint16_t l2_len, outer_id, inner_id, tail_idx, i;
> > +
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			pkt->outer_l2_len);
> > +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +
> > +	l2_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			l2_len);
> > +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > +	tail_idx = nb_segs - 1;
> > +
> > +	for (i = 0; i < nb_segs; i++) {
> > +		__update_outer_ipv4_header(segs[i], outer_id);
> > +		outer_id += ipid_delta;
> > +		__update_outer_udp_header(segs[i]);
> > +
> > +		__update_ipv4_tcp_header(segs[i], l2_len, inner_id, sent_seq,
> > +				i < tail_idx);
> > +		inner_id += ipid_delta;
> > +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> > +	}
> > +}
> > diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> > index 2a01cd0..0b0d8ed 100644
> > --- a/lib/librte_gso/gso_common.h
> > +++ b/lib/librte_gso/gso_common.h
> > @@ -48,6 +48,11 @@
> >   #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
> >   		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
> >
> > +#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
> > +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
> > +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> > +		 PKT_TX_TUNNEL_VXLAN))
> > +
> >   /**
> >    * Internal function which updates relevant packet headers for TCP/IPv4
> >    * packets, following segmentation. This is required to update, for
> > @@ -69,6 +74,26 @@ void update_tcp4_header(struct rte_mbuf *pkt,
> >   		uint16_t nb_segs);
> >
> >   /**
> > + * Internal function which updates relevant packet headers for VxLAN
> > + * packets, following segmentation. This is required to update, for
> > + * example, the IPv4 'total_length' field, to reflect the reduced length
> > + * of the now-segmented packet.
> > + *
> > + * @param pkt
> > + *  The original packet.
> > + * @param ipid_delta
> > + *  The increasing uint of IP ids.
> > + * @param segs
> > + *  Pointer array used for storing mbuf addresses for GSO segments.
> > + * @param nb_segs
> > + *  The number of GSO segments placed in segs.
> > + */
> > +void update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt,
> > +		uint8_t ipid_delta,
> > +		struct rte_mbuf **segs,
> > +		uint16_t nb_segs);
> > +
> > +/**
> >    * Internal function which divides the input packet into small segments.
> >    * Each of the newly-created segments is organized as a two-segment MBUF,
> >    * where the first segment is a standard mbuf, which stores a copy of
> > diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
> > new file mode 100644
> > index 0000000..cc017bd
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tunnel_tcp4.c
> > @@ -0,0 +1,87 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include <rte_ether.h>
> > +#include <rte_ip.h>
> > +
> > +#include "gso_common.h"
> > +#include "gso_tunnel_tcp4.h"
> > +
> > +int
> > +gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
> > +		uint16_t gso_size,
> > +		uint8_t ipid_delta,
> > +		struct rte_mempool *direct_pool,
> > +		struct rte_mempool *indirect_pool,
> > +		struct rte_mbuf **pkts_out,
> > +		uint16_t nb_pkts_out)
> > +{
> > +	struct ipv4_hdr *inner_ipv4_hdr;
> > +	uint16_t pyld_unit_size, hdr_offset;
> > +	uint16_t tcp_dl, frag_off;
> > +	int ret = 1;
> > +
> > +	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> > +	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			hdr_offset);
> > +	/*
> > +	 * Don't process the packet whose MF bit and offset in the inner
> > +	 * IPv4 header are non-zero.
> > +	 */
> > +	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
> > +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> 
> Please use "return 1;" for readability.

OK.

> 
> > +	}
> > +
> > +	/* Don't process the packet without data */
> > +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
> > +	if (unlikely(tcp_dl == 0)) {
> > +		pkts_out[0] = pkt;
> > +		return ret;
> 
> Ditto.
> 
> > +	}
> > +
> > +	hdr_offset += pkt->l3_len + pkt->l4_len;
> > +	pyld_unit_size = gso_size - hdr_offset;
> > +
> > +	/* Segment the payload */
> > +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> > +			indirect_pool, pkts_out, nb_pkts_out);
> > +	if (ret <= 1)
> > +		return ret;
> > +
> > +	if (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN)
> > +		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, pkts_out, ret);
> > +
> > +	return ret;
> > +}
> > diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
> > new file mode 100644
> > index 0000000..a848a2e
> > --- /dev/null
> > +++ b/lib/librte_gso/gso_tunnel_tcp4.h
> > @@ -0,0 +1,76 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _GSO_TUNNEL_TCP4_H_
> > +#define _GSO_TUNNEL_TCP4_H_
> > +
> > +#include <stdint.h>
> > +#include <rte_mbuf.h>
> > +
> > +/**
> > + * Segment an tunneling packet with inner TCP/IPv4 headers. This function
> > + * doesn't check if the input packet has correct checksums, and doesn't
> > + * update checksums for output GSO segments. Furthermore, it doesn't
> > + * process IP fragment packets.
> > + *
> > + * @param pkt
> > + *  The packet mbuf to segment.
> > + * @param gso_size
> > + *  The max length of a GSO segment, measured in bytes.
> > + * @param ipid_delta
> > + *  The increasing uint of IP ids.
> > + * @param direct_pool
> > + *  MBUF pool used for allocating direct buffers for output segments.
> > + * @param indirect_pool
> > + *  MBUF pool used for allocating indirect buffers for output segments.
> > + * @param pkts_out
> > + *  Pointer array used to store the MBUF addresses of output GSO
> > + *  segments, when gso_tunnel_tcp4_segment() successes. If the memory
> > + *  space in pkts_out is insufficient, gso_tcp4_segment() fails and
> 
> "gso_tcp4_segment()" -> "it".

Yes, a typo here. Thanks.

Thanks,
Jiayu

> 
> Thanks,
> Jianfeng

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v4 4/5] gso: add GRE GSO support
  2017-09-20  2:53         ` Tan, Jianfeng
@ 2017-09-20  6:01           ` Hu, Jiayu
  0 siblings, 0 replies; 157+ messages in thread
From: Hu, Jiayu @ 2017-09-20  6:01 UTC (permalink / raw)
  To: Tan, Jianfeng, dev
  Cc: Ananyev, Konstantin, Kavanagh, Mark B, Yigit, Ferruh, thomas

Hi Jianfeng,

> -----Original Message-----
> From: Tan, Jianfeng
> Sent: Wednesday, September 20, 2017 10:54 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark
> B <mark.b.kavanagh@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> thomas@monjalon.net
> Subject: Re: [PATCH v4 4/5] gso: add GRE GSO support
> 
> Hi,
> 
> 
> On 9/19/2017 3:32 PM, Jiayu Hu wrote:
> > From: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >
> > This patch adds GSO support for GRE-tunneled packets. Supported GRE
> > packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
> > They may also contain a single VLAN tag. GRE GSO doesn't check if all
> > input packets have correct checksums and doesn't update checksums for
> > output packets. Additionally, it doesn't process IP fragmented packets.
> >
> > As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
> > output packet, which requires multi-segment mbuf support in the TX
> > functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
> > its MBUF refcnt by 1. As a result, when all of its GSOed segments are
> > freed, the packet is freed automatically.
> >
> > GRE GSO clears the PKT_TX_TCP_SEG flag for the input packet and GSO
> > segments on the event of success.
> >
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > ---
> >   doc/guides/rel_notes/release_17_11.rst |  3 +++
> >   lib/librte_gso/gso_common.c            | 31 +++++++++++++++++++++++++++++++
> >   lib/librte_gso/gso_common.h            | 25 +++++++++++++++++++++++++
> >   lib/librte_gso/gso_tunnel_tcp4.c       |  2 ++
> >   lib/librte_gso/rte_gso.c               |  8 +++++---
> >   5 files changed, 66 insertions(+), 3 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> > index 2dc6b89..119f662 100644
> > --- a/doc/guides/rel_notes/release_17_11.rst
> > +++ b/doc/guides/rel_notes/release_17_11.rst
> > @@ -51,6 +51,9 @@ New Features
> >     * VxLAN packets, which must have an outer IPv4 header (prepended by
> >       an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
> >       an optional VLAN tag).
> > +  * GRE packets, which must contain an outer IPv4 header (prepended by
> > +    an optional VLAN tag), and inner TCP/IPv4 headers (with an optional
> > +    VLAN tag).
> >
> >     The GSO library doesn't check if the input packets have correct
> >     checksums, and doesn't update checksums for output packets.
> > diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> > index 90fcb2a..3629625 100644
> > --- a/lib/librte_gso/gso_common.c
> > +++ b/lib/librte_gso/gso_common.c
> > @@ -37,6 +37,7 @@
> >   #include <rte_memcpy.h>
> >   #include <rte_mempool.h>
> >   #include <rte_ether.h>
> > +#include <rte_gre.h>
> >   #include <rte_ip.h>
> >   #include <rte_tcp.h>
> >   #include <rte_udp.h>
> > @@ -258,3 +259,33 @@ update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> >   		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> >   	}
> >   }
> > +
> > +void
> > +update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> > +		struct rte_mbuf **segs, uint16_t nb_segs)
> > +{
> 
> This function seems to duplicate much of the code in the function
> above; can we merge them?

Yes, they can be merged into one.

> 
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct tcp_hdr *tcp_hdr;
> > +	uint32_t sent_seq;
> > +	uint16_t l2_len, outer_id, inner_id, tail_idx, i;
> > +
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> > +			pkt->outer_l2_len);
> > +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +
> > +	l2_len = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) + l2_len);
> > +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > +	tail_idx = nb_segs - 1;
> > +	for (i = 0; i < nb_segs; i++) {
> > +		__update_outer_ipv4_header(segs[i], outer_id);
> > +		outer_id += ipid_delta;
> 
> Should we update both the outer and inner IP IDs? Could you add a spec
> reference here?

I checked the VxLAN RFC, and it places no strict constraints on the value of
the outer IP ID. In the Linux GSO implementation, the outer IP ID is always
incremented; SKB_GSO_TCP_FIXEDID only affects the inner IP ID.

If there are no objections, I will adopt the same design as Linux. I'd welcome
your opinions, @Jianfeng @Konstantin @Mark.
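
To make the proposal concrete, here is a minimal sketch of that ID policy:
the outer ID always increments, while the inner ID either stays fixed or
increments. The struct and function names are hypothetical and only model
the behaviour; they are not taken from the patch or from Linux:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-segment record; not a DPDK structure. */
struct seg_ids {
	uint16_t outer_id;
	uint16_t inner_id;
};

/*
 * Assign IP IDs to nb_segs segments: the outer ID always increases by one,
 * the inner ID increases by inner_delta (0 = fixed, 1 = incremental).
 * uint16_t arithmetic wraps at 65535, as the 16-bit IP ID field does.
 */
static void
assign_ip_ids(struct seg_ids *segs, size_t nb_segs,
		uint16_t outer_id, uint16_t inner_id, uint8_t inner_delta)
{
	size_t i;

	assert(segs != NULL || nb_segs == 0);
	for (i = 0; i < nb_segs; i++) {
		segs[i].outer_id = outer_id;
		segs[i].inner_id = inner_id;
		outer_id = (uint16_t)(outer_id + 1);
		inner_id = (uint16_t)(inner_id + inner_delta);
	}
}
```

With inner_delta set to 0, every segment keeps the original inner ID while
the outer IDs still advance, which is the combination described above.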

Thanks,
Jiayu

> 
> > +
> > +		__update_ipv4_tcp_header(segs[i], l2_len, inner_id, sent_seq,
> > +				i < tail_idx);
> > +		inner_id += ipid_delta;
> > +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> > +	}
> > +}
> > diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> > index 0b0d8ed..433e952 100644
> > --- a/lib/librte_gso/gso_common.h
> > +++ b/lib/librte_gso/gso_common.h
> > @@ -53,6 +53,11 @@
> >   		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> >   		 PKT_TX_TUNNEL_VXLAN))
> >
> > +#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
> > +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
> > +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> > +		 PKT_TX_TUNNEL_GRE))
> > +
> >   /**
> >    * Internal function which updates relevant packet headers for TCP/IPv4
> >    * packets, following segmentation. This is required to update, for
> > @@ -94,6 +99,26 @@ void update_ipv4_vxlan_tcp4_header(struct rte_mbuf *pkt,
> >   		uint16_t nb_segs);
> >
> >   /**
> > + * Internal function which updates relevant packet headers for GRE
> > + * packets, following segmentation. This is required to update, for
> > + * example, the IPv4 'total_length' field, to reflect the reduced
> > + * length of the now-segmented packet.
> > + *
> > + * @param pkt
> > + *  The original packet.
> > + * @param ipid_delta
> > + *  The increasing uint of IP ids.
> 
> uint -> unit?
> 
> > + * @param segs
> > + *  Pointer array used for storing mbuf addresses for GSO segments.
> > + * @param nb_segs
> > + *  The number of GSO segments placed in segs.
> > + */
> > +void update_ipv4_gre_tcp4_header(struct rte_mbuf *pkt,
> > +		uint8_t ipid_delta,
> > +		struct rte_mbuf **segs,
> > +		uint16_t nb_segs);
> > +
> > +/**
> >    * Internal function which divides the input packet into small segments.
> >    * Each of the newly-created segments is organized as a two-segment MBUF,
> >    * where the first segment is a standard mbuf, which stores a copy of
> > diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
> > index cc017bd..5d5930a 100644
> > --- a/lib/librte_gso/gso_tunnel_tcp4.c
> > +++ b/lib/librte_gso/gso_tunnel_tcp4.c
> > @@ -82,6 +82,8 @@ gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
> >
> >   	if (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN)
> >   		update_ipv4_vxlan_tcp4_header(pkt, ipid_delta, pkts_out, ret);
> > +	else if (pkt->ol_flags & PKT_TX_TUNNEL_GRE)
> > +		update_ipv4_gre_tcp4_header(pkt, ipid_delta, pkts_out, ret);
> >
> >   	return ret;
> >   }
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > index d1a723b..5464831 100644
> > --- a/lib/librte_gso/rte_gso.c
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -60,8 +60,9 @@ rte_gso_segment(struct rte_mbuf *pkt,
> >
> >   	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
> >   				(DEV_TX_OFFLOAD_TCP_TSO |
> > -				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
> > -				gso_ctx->gso_types) {
> > +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> > +				 DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
> > +				 gso_ctx->gso_types) {
> >   		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >   		pkts_out[0] = pkt;
> >   		return ret;
> > @@ -73,7 +74,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
> >   	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
> >   	ol_flags = pkt->ol_flags;
> >
> > -	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
> > +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) ||
> > +			IS_IPV4_GRE_TCP4(pkt->ol_flags)) {
> >   		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >   		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> >   				direct_pool, indirect_pool,


* Re: [PATCH v4 2/5] gso: add TCP/IPv4 GSO support
  2017-09-19  7:32       ` [PATCH v4 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-09-20  7:03         ` Yao, Lei A
  0 siblings, 0 replies; 157+ messages in thread
From: Yao, Lei A @ 2017-09-20  7:03 UTC (permalink / raw)
  To: Hu, Jiayu, dev
  Cc: Ananyev, Konstantin, Kavanagh, Mark B, Tan, Jianfeng, Yigit,
	Ferruh, thomas, Hu, Jiayu



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jiayu Hu
> Sent: Tuesday, September 19, 2017 3:33 PM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kavanagh, Mark
> B <mark.b.kavanagh@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
> Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Hu, Jiayu
> <jiayu.hu@intel.com>
> Subject: [dpdk-dev] [PATCH v4 2/5] gso: add TCP/IPv4 GSO support
> 
> This patch adds GSO support for TCP/IPv4 packets. Supported packets
> may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
> packets have correct checksums, and doesn't update checksums for
> output packets (the responsibility for this lies with the application).
> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> 
> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> MBUF, to organize an output packet. Note that we refer to these two
> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> header, while the indirect mbuf simply points to a location within the
> original packet's payload. Consequently, use of the GSO library requires
> multi-segment MBUF support in the TX functions of the NIC driver.
> 
> If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> result, when all of its GSOed segments are freed, the packet is freed
> automatically.
> 
> TCP/IPv4 GSO clears the PKT_TX_TCP_SEG flag for the input packet and
> GSO segments on the event of success.
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
This patch was tested on my bench. The iperf results are as follows:
TSO : 18 Gbps
DPDK GSO: 10 Gbps
No TSO/GSO: 4 Gbps
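
For readers following the commit message above, the way GSO divides a packet
can be sketched with a little arithmetic: each output segment carries a copy
of the headers plus at most gso_size minus the header length bytes of
payload. The helper names below are hypothetical, and this is only a model
of the calculation, not the library code:

```c
#include <assert.h>
#include <stdint.h>

/* Payload bytes carried by each full output segment. */
static inline uint16_t
pyld_unit_size(uint16_t gso_size, uint16_t hdr_len)
{
	assert(gso_size > hdr_len);
	return (uint16_t)(gso_size - hdr_len);
}

/* Number of output segments for a packet of pkt_len bytes (headers included). */
static inline uint32_t
nb_gso_segs(uint32_t pkt_len, uint16_t gso_size, uint16_t hdr_len)
{
	uint32_t unit, pyld;

	assert(pkt_len > hdr_len);
	unit = pyld_unit_size(gso_size, hdr_len);
	pyld = pkt_len - hdr_len;
	return (pyld + unit - 1) / unit; /* round up: the tail may be short */
}
```

Assuming 54 bytes of Ethernet/IPv4/TCP headers and a gso_size of 1514, each
segment carries up to 1460 payload bytes, so a packet with 65000 bytes of
payload would be split into 45 segments.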

> ---
>  doc/guides/rel_notes/release_17_11.rst  |  12 ++
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_gso/Makefile                 |   2 +
>  lib/librte_gso/gso_common.c             | 202 ++++++++++++++++++++++++++++++++
>  lib/librte_gso/gso_common.h             | 107 +++++++++++++++++
>  lib/librte_gso/gso_tcp4.c               |  82 +++++++++++++
>  lib/librte_gso/gso_tcp4.h               |  76 ++++++++++++
>  lib/librte_gso/rte_gso.c                |  52 +++++++-
>  8 files changed, 531 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tcp4.h
> 
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index 7508be7..7453bb0 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -41,6 +41,18 @@ New Features
>       Also, make sure to start the actual text at the margin.
> 
> =========================================================
> 
> +* **Added the Generic Segmentation Offload Library.**
> +
> +  Added the Generic Segmentation Offload (GSO) library to enable
> +  applications to split large packets (e.g. MSS is 64KB) into small
> +  ones (e.g. MTU is 1500B). Supported packet types are:
> +
> +  * TCP/IPv4 packets, which may include a single VLAN tag.
> +
> +  The GSO library doesn't check if the input packets have correct
> +  checksums, and doesn't update checksums for output packets.
> +  Additionally, the GSO library doesn't process IP fragmented packets.
> +
> 
>  Resolved Issues
>  ---------------
> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> index ec8dba7..2fa1199 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
> 
>  /* these log types can be used in an application */
>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index aeaacbc..2be64d1 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -42,6 +42,8 @@ LIBABIVER := 1
> 
>  #source files
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> 
>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> new file mode 100644
> index 0000000..b2c84f6
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.c
> @@ -0,0 +1,202 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdbool.h>
> +#include <errno.h>
> +
> +#include <rte_memcpy.h>
> +#include <rte_mempool.h>
> +#include <rte_ether.h>
> +#include <rte_ip.h>
> +#include <rte_tcp.h>
> +
> +#include "gso_common.h"
> +
> +static inline void
> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset)
> +{
> +	/* Copy MBUF metadata */
> +	hdr_segment->nb_segs = 1;
> +	hdr_segment->port = pkt->port;
> +	hdr_segment->ol_flags = pkt->ol_flags;
> +	hdr_segment->packet_type = pkt->packet_type;
> +	hdr_segment->pkt_len = pkt_hdr_offset;
> +	hdr_segment->data_len = pkt_hdr_offset;
> +	hdr_segment->tx_offload = pkt->tx_offload;
> +
> +	/* Copy the packet header */
> +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> +			rte_pktmbuf_mtod(pkt, char *),
> +			pkt_hdr_offset);
> +}
> +
> +static inline void
> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < nb_pkts; i++)
> +		rte_pktmbuf_free(pkts[i]);
> +}
> +
> +int
> +gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct rte_mbuf *pkt_in;
> +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
> +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
> +	uint16_t pyld_len, nb_segs;
> +	bool more_in_pkt, more_out_segs;
> +
> +	pkt_in = pkt;
> +	nb_segs = 0;
> +	more_in_pkt = 1;
> +	pkt_in_data_pos = pkt_hdr_offset;
> +
> +	while (more_in_pkt) {
> +		if (unlikely(nb_segs >= nb_pkts_out)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -EINVAL;
> +		}
> +
> +		/* Allocate a direct MBUF */
> +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
> +		if (unlikely(hdr_segment == NULL)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -ENOMEM;
> +		}
> +		/* Fill the packet header */
> +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> +
> +		prev_segment = hdr_segment;
> +		segment_bytes_remaining = pyld_unit_size;
> +		more_out_segs = 1;
> +
> +		while (more_out_segs && more_in_pkt) {
> +			/* Allocate an indirect MBUF */
> +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> +			if (unlikely(pyld_segment == NULL)) {
> +				rte_pktmbuf_free(hdr_segment);
> +				free_gso_segment(pkts_out, nb_segs);
> +				return -ENOMEM;
> +			}
> +			/* Attach to current MBUF segment of pkt */
> +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> +
> +			prev_segment->next = pyld_segment;
> +			prev_segment = pyld_segment;
> +
> +			pyld_len = segment_bytes_remaining;
> +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
> +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
> +
> +			pyld_segment->data_off = pkt_in_data_pos +
> +				pkt_in->data_off;
> +			pyld_segment->data_len = pyld_len;
> +
> +			/* Update header segment */
> +			hdr_segment->pkt_len += pyld_len;
> +			hdr_segment->nb_segs++;
> +
> +			pkt_in_data_pos += pyld_len;
> +			segment_bytes_remaining -= pyld_len;
> +
> +			/* Finish processing a MBUF segment of pkt */
> +			if (pkt_in_data_pos == pkt_in->data_len) {
> +				pkt_in = pkt_in->next;
> +				pkt_in_data_pos = 0;
> +				if (pkt_in == NULL)
> +					more_in_pkt = 0;
> +			}
> +
> +			/* Finish generating a GSO segment */
> +			if (segment_bytes_remaining == 0)
> +				more_out_segs = 0;
> +		}
> +		pkts_out[nb_segs++] = hdr_segment;
> +	}
> +	return nb_segs;
> +}
> +
> +static inline void
> +__update_ipv4_tcp_header(struct rte_mbuf *pkt, uint16_t l2_len, uint16_t id,
> +		uint32_t sent_seq, uint8_t non_tail)
> +{
> +	struct tcp_hdr *tcp_hdr;
> +	struct ipv4_hdr *ipv4_hdr;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			l2_len);
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +
> +	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len -
> +			l2_len);
> +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> +
> +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> +	if (likely(non_tail))
> +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
> +					TCP_HDR_FIN_MASK));
> +}
> +
> +void
> +update_tcp4_header(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs)
> +{
> +	struct tcp_hdr *tcp_hdr;
> +	struct ipv4_hdr *ipv4_hdr;
> +	uint32_t sent_seq;
> +	uint16_t id, l2_len, tail_idx, i;
> +
> +	l2_len = pkt->l2_len;
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			l2_len);
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +	tail_idx = nb_segs - 1;
> +
> +	for (i = 0; i < nb_segs; i++) {
> +		__update_ipv4_tcp_header(segs[i], l2_len, id, sent_seq,
> +				i < tail_idx);
> +		id += ipid_delta;
> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> +	}
> +}
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> new file mode 100644
> index 0000000..2a01cd0
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.h
> @@ -0,0 +1,107 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_COMMON_H_
> +#define _GSO_COMMON_H_
> +
> +#include <stdint.h>
> +
> +#include <rte_mbuf.h>
> +#include <rte_ip.h>
> +
> +#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
> +		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
> +
> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> +
> +#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
> +
> +/**
> + * Internal function which updates relevant packet headers for TCP/IPv4
> + * packets, following segmentation. This is required to update, for
> + * example, the IPv4 'total_length' field, to reflect the reduced length
> + * of the now-segmented packet.
> + *
> + * @param pkt
> + *  The original packet.
> + * @param ipid_delta
> + *  The increasing uint of IP ids.
> + * @param segs
> + *  Pointer array used for storing mbuf addresses for GSO segments.
> + * @param nb_segs
> + *  The number of GSO segments placed in segs.
> + */
> +void update_tcp4_header(struct rte_mbuf *pkt,
> +		uint8_t ipid_delta,
> +		struct rte_mbuf **segs,
> +		uint16_t nb_segs);
> +
> +/**
> + * Internal function which divides the input packet into small segments.
> + * Each of the newly-created segments is organized as a two-segment MBUF,
> + * where the first segment is a standard mbuf, which stores a copy of
> + * packet header, and the second is an indirect mbuf which points to a
> + * section of data in the input packet.
> + *
> + * @param pkt
> + *  Packet to segment.
> + * @param pkt_hdr_offset
> + *  Packet header offset, measured in bytes.
> + * @param pyld_unit_size
> + *  The max payload length of a GSO segment.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to keep the mbuf addresses of output segments. If
> + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
> + *  and returns -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of segments created in the event of success.
> + *  - Return -ENOMEM if run out of memory in MBUF pools.
> + *  - Return -EINVAL for invalid parameters.
> + */
> +int gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
> new file mode 100644
> index 0000000..e3b9dab
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.c
> @@ -0,0 +1,82 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +
> +#include <rte_ether.h>
> +#include <rte_ip.h>
> +
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
> +
> +int
> +gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	uint16_t tcp_dl;
> +	uint16_t pyld_unit_size, hdr_offset;
> +	uint16_t frag_off;
> +	int ret = 1;
> +
> +	/* Don't process the fragmented packet */
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->l2_len);
> +	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> +		pkts_out[0] = pkt;
> +		return ret;
> +	}
> +
> +	/* Don't process the packet without data */
> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
> +	if (unlikely(tcp_dl == 0)) {
> +		pkts_out[0] = pkt;
> +		return ret;
> +	}
> +
> +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> +	pyld_unit_size = gso_size - hdr_offset;
> +
> +	/* Segment the payload */
> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> +			indirect_pool, pkts_out, nb_pkts_out);
> +	if (ret > 1)
> +		update_tcp4_header(pkt, ipid_delta, pkts_out, ret);
> +
> +	return ret;
> +}
> diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
> new file mode 100644
> index 0000000..84fa72f
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.h
> @@ -0,0 +1,76 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_TCP4_H_
> +#define _GSO_TCP4_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * Segment an IPv4/TCP packet. This function doesn't check if the input
> + * packet has correct checksums, and doesn't update checksums for output
> + * GSO segments. Furthermore, it doesn't process IP fragment packets.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param gso_size
> + *  The max length of a GSO segment, measured in bytes.
> + * @param ipid_delta
> + *  The increasing uint of IP ids.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when gso_tcp4_segment() successes. If the memory space in
> + *  pkts_out is insufficient, gso_tcp4_segment() fails and returns
> + *  -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that 'pkts_out' can keep.
> + *
> + * @return
> + *   - The number of GSO segments filled in pkts_out on success.
> + *   - Return -ENOMEM if run out of memory in MBUF pools.
> + *   - Return -EINVAL for invalid parameters.
> + */
> +int gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ip_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +
> +#endif
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index b773636..c65b9ae 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -33,7 +33,12 @@
> 
>  #include <errno.h>
> 
> +#include <rte_log.h>
> +#include <rte_ethdev.h>
> +
>  #include "rte_gso.h"
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
> 
>  int
>  rte_gso_segment(struct rte_mbuf *pkt,
> @@ -41,12 +46,53 @@ rte_gso_segment(struct rte_mbuf *pkt,
>  		struct rte_mbuf **pkts_out,
>  		uint16_t nb_pkts_out)
>  {
> +	struct rte_mempool *direct_pool, *indirect_pool;
> +	struct rte_mbuf *pkt_seg;
> +	uint64_t ol_flags;
> +	uint16_t gso_size;
> +	uint8_t ipid_delta;
> +	int ret = 1;
> +
>  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>  			nb_pkts_out < 1)
>  		return -EINVAL;
> 
> -	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> -	pkts_out[0] = pkt;
> +	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
> +				DEV_TX_OFFLOAD_TCP_TSO) !=
> +			gso_ctx->gso_types) {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkts_out[0] = pkt;
> +		return ret;
> +	}
> +
> +	direct_pool = gso_ctx->direct_pool;
> +	indirect_pool = gso_ctx->indirect_pool;
> +	gso_size = gso_ctx->gso_size;
> +	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
> +	ol_flags = pkt->ol_flags;
> +
> +	if (IS_IPV4_TCP(pkt->ol_flags)) {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> +				direct_pool, indirect_pool,
> +				pkts_out, nb_pkts_out);
> +	} else {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkts_out[0] = pkt;
> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> +		return ret;
> +	}
> +
> +	if (ret > 1) {
> +		pkt_seg = pkt;
> +		while (pkt_seg) {
> +			rte_mbuf_refcnt_update(pkt_seg, -1);
> +			pkt_seg = pkt_seg->next;
> +		}
> +	} else if (ret < 0) {
> +		/* Revert the ol_flags in the event of failure. */
> +		pkt->ol_flags = ol_flags;
> +	}
> 
> -	return 1;
> +	return ret;
>  }
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 157+ messages in thread

* [PATCH v5 0/6] Support TCP/IPv4, VxLAN and GRE GSO in DPDK
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (4 preceding siblings ...)
  2017-09-19  7:32       ` [PATCH v4 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
@ 2017-09-28 22:13       ` Mark Kavanagh
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
  2017-09-28 22:13       ` [PATCH v5 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
                         ` (6 subsequent siblings)
  12 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:13 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch series adds GSO support to DPDK for
specific packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional
outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine. The final patch
in the series adds GSO documentation to the programmer's guide.

The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
    to 1518B. Run two iperf-client in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
    two iperf-client in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.

Throughput of the above three tests:

test-1: 9.4Gbps
test-2: 9.5Gbps
test-3: 3Mbps

Unlike plain TCP packets, VxLAN and GRE packets sent from VMs cannot be
large: the max length of tunnelled packets from VMs is 1514B. So the
current experiment method can't be used to measure VxLAN and GRE GSO
performance; instead, we simply test the functionality by setting a
small GSO segment length (e.g. 500B).

To test VxLAN GSO functionality, we use the following setup:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
   engine with "retry".
c. Testpmd commands:
    - csum parse-tunnel on "P0"
    - csum parse-tunnel on "vhost-user port"
    - csum set outer-ip hw "P0"
    - csum set ip hw "P0"
    - csum set tcp hw "P0"
    - csum set tcp hw "vhost-user port"
    - set port "P0" gso on
    - set gso segsz 500
d. Launch a VM with csum and tso offloading enabled.
e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
   on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
   max packet length is 1514B.
f. Assign P1 to the Linux kernel with kernel GRO disabled. Similarly,
   create a VxLAN port for P1, and run iperf-server on the VxLAN port.

In testpmd, we can see that the length of every packet sent from P0 is
smaller than or equal to 500B. Additionally, the packets arriving at P1
are encapsulated and are smaller than or equal to 500B.
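
As a rough cross-check of the 500B observation, the number of output
segments can be estimated from the packet length, the header length and
the GSO segment size. The sketch below is illustrative only:
gso_segment_count() is a hypothetical helper, not part of the library,
and it assumes that every segment carries a full copy of the headers and
that the segment size includes those headers (as gso_size is documented
in rte_gso.h).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of librte_gso): estimate how many GSO
 * segments a packet is split into. gso_size is the max length of an
 * output segment, headers included, so each segment carries at most
 * (gso_size - hdr_len) bytes of payload.
 */
static uint32_t
gso_segment_count(uint32_t pkt_len, uint32_t hdr_len, uint32_t gso_size)
{
	uint32_t payload_len, payload_per_seg;

	/* Each segment must hold its headers plus at least 1B of payload. */
	assert(hdr_len < gso_size);
	assert(hdr_len <= pkt_len);

	/* Packets that already fit are passed through unsegmented. */
	if (pkt_len <= gso_size)
		return 1;

	payload_len = pkt_len - hdr_len;
	payload_per_seg = gso_size - hdr_len;

	/* Round up: the last segment may be shorter than the others. */
	return (payload_len + payload_per_seg - 1) / payload_per_seg;
}
```

For example, assuming 104B of headers (untagged outer Ethernet, outer
IPv4, UDP, VxLAN, inner Ethernet, inner IPv4 and a 20B TCP header, none
with options), gso_segment_count(1514, 104, 500) yields 4: three 500B
segments and one 326B segment.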

Change log
==========
v5:
- add GSO section to the programmer's guide.
- use MF or (previously 'and') offset to check if a packet is IP
  fragmented.
- move 'update_header' helper functions to gso_common.h.
- move tcp/ipv4 'update_header' function to gso_tcp4.c.
- move tunnel 'update_header' function to gso_tunnel_tcp4.c.
- add offset parameter to 'update_header' functions.
- combine GRE and VxLAN tunnel header update functions into a single
  function.
- correct typos and errors in comments/commit messages.

v4:
- use ol_flags instead of packet_type to decide which segmentation
  function to use.
- use MF and offset to check if a packet is IP fragmented, instead of
  using DF.
- remove ETHER_CRC_LEN from gso segment payload length calculation.
- refactor internal header update and other functions.
- remove RTE_GSO_IPID_INCREASE.
- add some GSO documentation.
- set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
  packets sent from GSO-enabled ports in testpmd.
v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
  UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.
- change the definition of gso_types in struct rte_gso_ctx.
- replace rte_pktmbuf_detach() with rte_pktmbuf_free().
- refactor gso_update_pkt_headers().
- change the return value of rte_gso_segment().
- remove parameter checks in rte_gso_segment().
- use rte_net_get_ptype() in app/test-pmd/csumonly.c to fill
  mbuf->packet_type.
- add a new GSO command in testpmd to show GSO configuration for ports.
- misc: fix typo and optimize function description.

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (3):
  gso: add VxLAN GSO support
  gso: add GRE GSO support
  doc: add GSO programmer's guide

 MAINTAINERS                                        |   6 +
 app/test-pmd/cmdline.c                             | 178 ++++++++
 app/test-pmd/config.c                              |  24 ++
 app/test-pmd/csumonly.c                            |  69 ++-
 app/test-pmd/testpmd.c                             |  13 +
 app/test-pmd/testpmd.h                             |  10 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/rel_notes/release_17_11.rst             |  19 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
 lib/Makefile                                       |   2 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_gso/Makefile                            |  52 +++
 lib/librte_gso/gso_common.c                        | 153 +++++++
 lib/librte_gso/gso_common.h                        | 171 ++++++++
 lib/librte_gso/gso_tcp4.c                          | 106 +++++
 lib/librte_gso/gso_tcp4.h                          |  74 ++++
 lib/librte_gso/gso_tunnel_tcp4.c                   | 129 ++++++
 lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
 lib/librte_gso/rte_gso.c                           | 107 +++++
 lib/librte_gso/rte_gso.h                           | 145 +++++++
 lib/librte_gso/rte_gso_version.map                 |   7 +
 mk/rte.app.mk                                      |   1 +
 28 files changed, 2437 insertions(+), 5 deletions(-)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* [PATCH v5 1/6] gso: add Generic Segmentation Offload API framework
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (5 preceding siblings ...)
  2017-09-28 22:13       ` [PATCH v5 0/6] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Mark Kavanagh
@ 2017-09-28 22:13       ` Mark Kavanagh
  2017-09-28 22:13       ` [PATCH v5 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
                         ` (5 subsequent siblings)
  12 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:13 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set the proper flags in mbuf->ol_flags, where the flags are the
same as those used for TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmission of multi-segment packets.
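
The refcnt behaviour described above can be illustrated with a toy
model. This is only a sketch under simplifying assumptions: struct
ex_pkt, ex_pkt_put() and ex_gso_model() are hypothetical names, not DPDK
types or functions, and the input packet is assumed to be a single MBUF
whose refcnt is incremented once for each indirect MBUF that points into
it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an mbuf's reference counter; not a DPDK type. */
struct ex_pkt {
	uint16_t refcnt;
	int freed;
};

/* Drop one reference; the buffer is released when the count hits 0. */
static void
ex_pkt_put(struct ex_pkt *p)
{
	assert(p != NULL && p->refcnt > 0);
	if (--p->refcnt == 0)
		p->freed = 1;
}

/*
 * Model of segmenting 'in' into nb_segs GSO segments: the indirect
 * buffer of each segment takes one reference on the input packet, then
 * the reference held by the caller is dropped (the "reduce by 1" step).
 */
static void
ex_gso_model(struct ex_pkt *in, uint16_t nb_segs)
{
	assert(in != NULL && nb_segs > 1);
	in->refcnt += nb_segs;
	ex_pkt_put(in);
}
```

Starting from a refcnt of 1, modelling three segments leaves the input
packet with a refcnt of 3; it is marked as freed only once ex_pkt_put()
has been called for all three segments.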

The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                     |   5 ++
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf                  |   1 +
 doc/guides/rel_notes/release_17_11.rst |   1 +
 lib/Makefile                           |   2 +
 lib/librte_gso/Makefile                |  49 +++++++++++
 lib/librte_gso/rte_gso.c               |  52 ++++++++++++
 lib/librte_gso/rte_gso.h               | 145 +++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map     |   7 ++
 mk/rte.app.mk                          |   1 +
 10 files changed, 264 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 5e97a08..603e340 100644
--- a/config/common_base
+++ b/config/common_base
@@ -652,6 +652,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..6512918 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -101,6 +101,7 @@ The public API headers are grouped by topics:
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
   [GRO]                (@ref rte_gro.h),
+  [GSO]                (@ref rte_gso.h),
   [frag/reass]         (@ref rte_ip_frag.h),
   [LPM IPv4 route]     (@ref rte_lpm.h),
   [LPM IPv6 route]     (@ref rte_lpm6.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..408f2e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_ether \
                           lib/librte_eventdev \
                           lib/librte_gro \
+                          lib/librte_gso \
                           lib/librte_hash \
                           lib/librte_ip_frag \
                           lib/librte_jobstats \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 8bf91bd..7508be7 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_ethdev.so.7
      librte_eventdev.so.2
      librte_gro.so.1
+   + librte_gso.so.1
      librte_hash.so.2
      librte_ip_frag.so.1
      librte_jobstats.so.1
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b773636
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
+			nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..53725e6
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,145 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO IP id flags for the IPv4 header */
+#define RTE_GSO_IPID_FIXED (1ULL << 0)
+/**< Use fixed IP ids for output GSO segments. Setting
+ * !RTE_GSO_IPID_FIXED indicates using incremental IP ids.
+ */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t ipid_flag;
+	/**< flag to indicate GSO uses fixed or incremental IP ids for
+	 * IPv4 headers of output GSO segments. If applications want
+	 * fixed IP ids, set RTE_GSO_IPID_FIXED to ipid_flag. Conversely,
+	 * if applications want incremental IP ids, set !RTE_GSO_IPID_FIXED.
+	 */
+	uint32_t gso_types;
+	/**< the bit mask of required GSO types. The GSO library
+	 * uses the same macros as those that describe device TX
+	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
+	 * gso_types.
+	 *
+	 * For example, if applications want to segment TCP/IPv4
+	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi- MBUF packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
+ * input packet has correct checksums, and doesn't update checksums for
+ * output GSO segments. Additionally, it doesn't process IP fragment
+ * packets.
+ *
+ * Before calling rte_gso_segment(), applications must set proper ol_flags
+ * for the packet. The GSO library uses the same macros as those of TSO.
+ * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
+ * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
+ * flag is removed for all GSO segments and the input packet.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
+ * the GSO segments are sent to should support transmission of multi-segment
+ * packets.
+ *
+ * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object pointer.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 157+ messages in thread

* [PATCH v5 2/6] gso: add TCP/IPv4 GSO support
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (6 preceding siblings ...)
  2017-09-28 22:13       ` [PATCH v5 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
@ 2017-09-28 22:13       ` Mark Kavanagh
  2017-09-29  3:12         ` Jiayu Hu
  2017-09-28 22:13       ` [PATCH v5 3/6] gso: add VxLAN " Mark Kavanagh
                         ` (4 subsequent siblings)
  12 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:13 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
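
For illustration, the IP fragment test mentioned above (the v5 change
log describes it as "MF or offset") might look like the sketch below.
The constant and function names are hypothetical and are not taken from
the patch; the bit positions follow the IPv4 header layout, where bytes
6-7 hold the flags and the 13-bit fragment offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants for the IPv4 flags/fragment-offset field. */
#define EX_IPV4_MF_FLAG     0x2000u /* "more fragments" bit */
#define EX_IPV4_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/*
 * ipv4_hdr points to the first byte of an IPv4 header.
 * Returns 1 if the packet is an IP fragment (MF set, or a non-zero
 * fragment offset), 0 otherwise. The DF bit alone does not count.
 */
static int
ex_is_ipv4_fragment(const uint8_t *ipv4_hdr)
{
	uint16_t frag_field;

	assert(ipv4_hdr != NULL);

	/* Bytes 6-7 hold flags + fragment offset in network byte order. */
	frag_field = (uint16_t)((ipv4_hdr[6] << 8) | ipv4_hdr[7]);

	return (frag_field & (EX_IPV4_MF_FLAG | EX_IPV4_OFFSET_MASK)) != 0;
}
```

Checking both conditions matters because the last fragment of a packet
has MF cleared but a non-zero offset, while the first fragment has MF
set and an offset of zero.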

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect MBUF simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSO'd segments are freed, the packet is freed
automatically.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 +++
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               | 106 ++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
 lib/librte_gso/rte_gso.c                |  52 ++++++++++-
 8 files changed, 538 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 7508be7..c414f73 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,18 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. MTU is 64KB) into small
+  ones (e.g. MTU is 1500B). Supported packet types are:
+
+  * TCP/IPv4 packets, which may include a single VLAN tag.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ struct rte_logs {
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..ee75d4c
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,153 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..8d9b94e
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates the TCP header of a packet, following
+ * segmentation. This is required to update the header's 'sent' sequence
+ * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
+ *
+ * @param pkt
+ *  The packet containing the TCP header.
+ * @param l4_offset
+ *  The offset of the TCP header from the start of the packet.
+ * @param sent_seq
+ *  The sent sequence number.
+ * @param non_tail
+ *  Indicates whether or not this is a non-tail segment.
+ */
+static inline void
+update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
+		uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l4_offset);
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+/**
+ * Internal function which updates the IPv4 header of a packet, following
+ * segmentation. This is required to update the header's 'total_length' field,
+ * to reflect the reduced length of the now-segmented packet. Furthermore, the
+ * header's 'packet_id' field must be updated to reflect the new ID of the
+ * now-segmented packet.
+ *
+ * @param pkt
+ *  The packet containing the IPv4 header.
+ * @param l3_offset
+ *  The offset of the IPv4 header from the start of the packet.
+ * @param id
+ *  The new ID of the packet.
+ */
+static inline void
+update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l3_offset);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..584a77d
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,106 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+static void
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint64_t l3_offset,
+		uint8_t ipid_delta, struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t id, tail_idx, i;
+	uint16_t l4_offset;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
+			l3_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+	l4_offset = l3_offset + pkt->l3_len;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], l3_offset, id);
+		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret;
+
+	/* Don't process the fragmented packet */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1) {
+		update_ipv4_tcp_headers(pkt, pkt->l2_len, ipid_delta,
+				pkts_out, ret);
+	}
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..1c57441
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increasing unit of IP ids.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when the function succeeds. If the memory space in
+ *  pkts_out is insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if run out of memory in MBUF pools.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..a4fce50 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,53 @@
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
+				DEV_TX_OFFLOAD_TCP_TSO) !=
+			gso_ctx->gso_types) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
-- 
1.9.3
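
For readers following the arithmetic in gso_tcp4_segment() and
update_ipv4_tcp_headers() above, the sketch below models it in plain C
without any DPDK types. It is an illustration under stated assumptions,
not the library's code: struct seg_plan and plan_tcp4_segments() are
hypothetical names, the input is treated as one contiguous buffer, and
the guard against gso_size <= hdr_len is an addition that the patch
itself does not contain.

```c
#include <stdint.h>

/* Hypothetical per-segment summary; not a DPDK structure. */
struct seg_plan {
	uint16_t payload_len; /* TCP payload bytes carried by this segment */
	uint16_t ip_id;       /* IPv4 packet_id written into this segment */
	uint32_t tcp_seq;     /* TCP sequence number written into this segment */
	int clear_psh_fin;    /* 1 for non-tail segments, 0 for the tail */
};

/*
 * Model of the segmentation arithmetic only.
 * Returns the number of segments planned, or -1 if 'max_out' is too
 * small or the sizes leave no room for payload (assumed guard).
 */
static int
plan_tcp4_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size,
		uint16_t first_id, uint32_t first_seq, uint8_t ipid_delta,
		struct seg_plan *out, int max_out)
{
	uint32_t remaining;
	uint16_t unit;
	int n = 0;

	if (gso_size <= hdr_len || pkt_len <= hdr_len)
		return -1;

	unit = (uint16_t)(gso_size - hdr_len); /* pyld_unit_size in the patch */
	remaining = pkt_len - hdr_len;         /* total TCP payload */

	while (remaining > 0) {
		uint16_t len = remaining < unit ? (uint16_t)remaining : unit;

		if (n >= max_out)
			return -1;
		out[n].payload_len = len;
		out[n].ip_id = first_id;
		out[n].tcp_seq = first_seq;
		/* PSH and FIN survive only on the last segment. */
		out[n].clear_psh_fin = remaining > len;
		first_id = (uint16_t)(first_id + ipid_delta); /* wraps mod 2^16 */
		first_seq += len;                             /* wraps mod 2^32 */
		remaining -= len;
		n++;
	}
	return n;
}
```

With a 54-byte header, gso_size of 1514 and 4000 bytes of payload, the
payload unit is 1460 bytes, so the model plans three segments carrying
1460, 1460 and 1080 bytes. An ipid_delta of 0 corresponds to the
RTE_GSO_IPID_FIXED case, where every segment keeps the original IP ID.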


* [PATCH v5 3/6] gso: add VxLAN GSO support
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (7 preceding siblings ...)
  2017-09-28 22:13       ` [PATCH v5 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
@ 2017-09-28 22:13       ` Mark Kavanagh
  2017-09-28 22:13       ` [PATCH v5 4/6] gso: add GRE " Mark Kavanagh
                         ` (3 subsequent siblings)
  12 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:13 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds a framework that allows GSO on tunneled packets.
Furthermore, it leverages that framework to provide GSO support for
VxLAN-encapsulated packets.

Supported VxLAN packets must have an outer IPv4 header (prepended by an
optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
inner VLAN tag).

VxLAN GSO doesn't check if input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which requires the TX functions of the NIC driver to
support multi-segment mbufs. Also, if a packet is GSO'd, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
are freed, the packet is freed automatically.
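
The refcnt rule described above can be modelled with a toy reference
counter. This is a sketch only: struct toy_buf and the toy_*() helpers
are hypothetical and stand in for a direct mbuf and the attach/free
behaviour of indirect mbufs; they are not DPDK APIs, and the model uses
a single input buffer rather than a chain.

```c
#include <stdint.h>

/* Hypothetical stand-in for a direct mbuf's data buffer; not a DPDK type. */
struct toy_buf {
	uint16_t refcnt; /* number of owners still referencing the data */
	int freed;       /* set to 1 once the last owner lets go */
};

/* An indirect segment attaches to the buffer: one more owner. */
static void
toy_attach(struct toy_buf *b)
{
	b->refcnt++;
}

/* Any owner drops its reference; the buffer is freed at zero. */
static void
toy_release(struct toy_buf *b)
{
	if (--b->refcnt == 0)
		b->freed = 1;
}

/*
 * Model of GSO on one input buffer: each output segment attaches to the
 * input, then the input's own reference is dropped if it was segmented.
 */
static void
toy_gso(struct toy_buf *pkt, int nb_segs)
{
	int i;

	for (i = 0; i < nb_segs; i++)
		toy_attach(pkt);
	if (nb_segs > 1)
		toy_release(pkt);
}
```

Starting from a refcnt of 1, segmenting into three outputs leaves the
input at 3; it is released only when the third output segment is freed,
so the caller no longer needs to free the input packet itself.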

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |   3 +
 lib/librte_gso/Makefile                |   1 +
 lib/librte_gso/gso_common.h            |  25 +++++++
 lib/librte_gso/gso_tcp4.c              |   2 +-
 lib/librte_gso/gso_tunnel_tcp4.c       | 123 +++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h       |  75 ++++++++++++++++++++
 lib/librte_gso/rte_gso.c               |  13 +++-
 7 files changed, 238 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index c414f73..25b8a78 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -48,6 +48,9 @@ New Features
   ones (e.g. MTU is 1500B). Supported packet types are:
 
   * TCP/IPv4 packets, which may include a single VLAN tag.
+  * VxLAN packets, which must have an outer IPv4 header (prepended by
+    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
+    an optional VLAN tag).
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 8d9b94e..c051295 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -39,6 +39,7 @@
 #include <rte_mbuf.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
 		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
@@ -49,6 +50,30 @@
 #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
 
+#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_VXLAN))
+
+/**
+ * Internal function which updates the UDP header of a packet, following
+ * segmentation. This is required to update the header's datagram length field.
+ *
+ * @param pkt
+ *  The packet containing the UDP header.
+ * @param udp_offset
+ *  The offset of the UDP header from the start of the packet.
+ */
+static inline void
+update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
+{
+	struct udp_hdr *udp_hdr;
+
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			udp_offset);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
+}
+
 /**
  * Internal function which updates the TCP header of a packet, following
  * segmentation. This is required to update the header's 'sent' sequence
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
index 584a77d..fe0b7dd 100644
--- a/lib/librte_gso/gso_tcp4.c
+++ b/lib/librte_gso/gso_tcp4.c
@@ -35,7 +35,7 @@
 #include "gso_tcp4.h"
 
 static void
-update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint64_t l3_offset,
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint16_t l3_offset,
 		uint8_t ipid_delta, struct rte_mbuf **segs, uint16_t nb_segs)
 {
 	struct ipv4_hdr *ipv4_hdr;
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..34bbbd7
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,123 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+static void
+update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t outer_id, inner_id, tail_idx, i;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+
+	outer_ipv4_offset = pkt->outer_l2_len;
+	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	tcp_offset = inner_ipv4_offset + pkt->l3_len;
+
+	/* Outer IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			outer_ipv4_offset);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	/* Inner IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_ipv4_offset);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
+		update_udp_header(segs[i], udp_offset);
+		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
+		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
+		outer_id++;
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t tcp_dl, frag_off;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process a packet whose MF bit or fragment offset in the
+	 * inner IPv4 header is non-zero.
+	 */
+	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - hdr_offset - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret <= 1)
+		return ret;
+
+	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..3c67f0c
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,75 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneling packet with inner TCP/IPv4 headers. This function
+ * doesn't check if the input packet has correct checksums, and doesn't
+ * update checksums for output GSO segments. Furthermore, it doesn't
+ * process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increasing unit of IP ids.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when it succeeds. If the memory space in pkts_out is
+ *  insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if run out of memory in MBUF pools.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index a4fce50..6095689 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -39,6 +39,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -58,8 +59,9 @@
 		return -EINVAL;
 
 	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
-				DEV_TX_OFFLOAD_TCP_TSO) !=
-			gso_ctx->gso_types) {
+				(DEV_TX_OFFLOAD_TCP_TSO |
+				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
+				gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
 		return 1;
@@ -71,7 +73,12 @@
 	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_TCP(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (IS_IPV4_TCP(pkt->ol_flags)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
1.9.3


* [PATCH v5 4/6] gso: add GRE GSO support
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (8 preceding siblings ...)
  2017-09-28 22:13       ` [PATCH v5 3/6] gso: add VxLAN " Mark Kavanagh
@ 2017-09-28 22:13       ` Mark Kavanagh
  2017-09-28 22:13       ` [PATCH v5 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
                         ` (2 subsequent siblings)
  12 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:13 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.
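
The packet-type test added in gso_common.h follows the usual "all required
bits set" pattern: mask the offload flags, then compare against the same
mask. A minimal standalone sketch of that pattern is shown below. The DEMO_*
names and bit values are stand-ins invented for illustration; they are not
the real PKT_TX_* constants from rte_mbuf.h.

```c
#include <stdint.h>

/* Stand-in bit values for illustration only; NOT the real DPDK constants. */
#define DEMO_TX_TCP_SEG     (1ULL << 0)
#define DEMO_TX_IPV4        (1ULL << 1)
#define DEMO_TX_OUTER_IPV4  (1ULL << 2)
#define DEMO_TX_TUNNEL_GRE  (1ULL << 3)
#define DEMO_TX_VLAN        (1ULL << 4)	/* unrelated flag, ignored by the test */

#define DEMO_GRE_TCP4_MASK (DEMO_TX_TCP_SEG | DEMO_TX_IPV4 | \
		DEMO_TX_OUTER_IPV4 | DEMO_TX_TUNNEL_GRE)

/*
 * Return non-zero only when every bit of the mask is present in 'flags'.
 * Extra, unrelated bits in 'flags' do not affect the result.
 */
static int
demo_is_ipv4_gre_tcp4(uint64_t flags)
{
	return (flags & DEMO_GRE_TCP4_MASK) == DEMO_GRE_TCP4_MASK;
}
```

With these stand-in values, a packet whose flags lack the tunnel bit fails
the test, while adding an unrelated flag to a matching packet changes nothing.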

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.
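
For reference, the header offsets that gso_tunnel_tcp4.c derives from the
mbuf length fields can be sketched in isolation as below. The two structs
and the function name are hypothetical helpers for this illustration, not
part of the library; the arithmetic mirrors the offset assignments in the
patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the four mbuf length fields the patch reads. */
struct demo_hdr_lens {
	uint16_t outer_l2_len;	/* outer Ethernet (plus VLAN, if any) */
	uint16_t outer_l3_len;	/* outer IPv4 header */
	uint16_t l2_len;	/* bytes between outer L3 and inner L3 */
	uint16_t l3_len;	/* inner IPv4 header */
};

struct demo_hdr_offsets {
	uint16_t outer_ipv4;
	uint16_t udp_gre;
	uint16_t inner_ipv4;
	uint16_t tcp;
};

/* Each offset is the previous offset plus the length of the header there. */
static struct demo_hdr_offsets
demo_compute_offsets(const struct demo_hdr_lens *l)
{
	struct demo_hdr_offsets o;

	o.outer_ipv4 = l->outer_l2_len;
	o.udp_gre = o.outer_ipv4 + l->outer_l3_len;
	o.inner_ipv4 = o.udp_gre + l->l2_len;
	o.tcp = o.inner_ipv4 + l->l3_len;
	return o;
}
```

As an assumed example, a GRE packet with a 14-byte outer Ethernet header,
20-byte outer and inner IPv4 headers, and a 4-byte GRE header with no
optional fields and no inner Ethernet header (so l2_len = 4) gives offsets
of 14, 34, 38 and 58 bytes respectively.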

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |  3 +++
 lib/librte_gso/gso_common.h            |  5 +++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 14 ++++++++++----
 lib/librte_gso/rte_gso.c               |  8 +++++---
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 25b8a78..808f537 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -51,6 +51,9 @@ New Features
   * VxLAN packets, which must have an outer IPv4 header (prepended by
     an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
     an optional VLAN tag).
+  * GRE packets, which must contain an outer IPv4 header (prepended by
+    an optional VLAN tag), and inner TCP/IPv4 headers (with an optional
+    VLAN tag).
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index c051295..1e99cc0 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -55,6 +55,11 @@
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
 		 PKT_TX_TUNNEL_VXLAN))
 
+#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_GRE))
+
 /**
  * Internal function which updates the UDP header of a packet, following
  * segmentation. This is required to update the header's datagram length field.
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
index 34bbbd7..d79fc6b 100644
--- a/lib/librte_gso/gso_tunnel_tcp4.c
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -42,11 +42,13 @@
 	struct tcp_hdr *tcp_hdr;
 	uint32_t sent_seq;
 	uint16_t outer_id, inner_id, tail_idx, i;
-	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset;
+	uint16_t udp_gre_offset, tcp_offset;
+	uint8_t update_udp_hdr;
 
 	outer_ipv4_offset = pkt->outer_l2_len;
-	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
-	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	udp_gre_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_gre_offset + pkt->l2_len;
 	tcp_offset = inner_ipv4_offset + pkt->l3_len;
 
 	/* Outer IPv4 header. */
@@ -63,9 +65,13 @@
 	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
 	tail_idx = nb_segs - 1;
 
+	/* Only update UDP header for VxLAN packets. */
+	update_udp_hdr = (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN) ? 1 : 0;
+
 	for (i = 0; i < nb_segs; i++) {
 		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
-		update_udp_header(segs[i], udp_offset);
+		if (update_udp_hdr)
+			update_udp_header(segs[i], udp_gre_offset);
 		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
 		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
 		outer_id++;
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 6095689..b748ab1 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -60,8 +60,9 @@
 
 	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
 				(DEV_TX_OFFLOAD_TCP_TSO |
-				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
-				gso_ctx->gso_types) {
+				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+				 DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
+				 gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
 		return 1;
@@ -73,7 +74,8 @@
 	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) ||
+			IS_IPV4_GRE_TCP4(pkt->ol_flags)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
1.9.3


* [PATCH v5 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (9 preceding siblings ...)
  2017-09-28 22:13       ` [PATCH v5 4/6] gso: add GRE " Mark Kavanagh
@ 2017-09-28 22:13       ` Mark Kavanagh
  2017-09-28 22:13       ` [PATCH v5 6/6] doc: add GSO programmer's guide Mark Kavanagh
  2017-09-28 22:18       ` [PATCH v5 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
  12 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:13 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.
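
As a rough illustration of what segmentation means for an oversized packet,
the sketch below estimates how many output segments to expect. It assumes
that the segment size counts headers plus payload (as described for the
segsz command in this message), that every output segment carries a full
copy of the headers, and that a packet no longer than the segment size is
passed through unsegmented. The function name is hypothetical and is not
part of testpmd or the GSO library.

```c
#include <stdint.h>

/*
 * Estimate the number of GSO output segments for one packet.
 * Returns 0 for inputs that cannot be segmented (segment size not larger
 * than the headers, or a packet with no payload).
 */
static uint32_t
demo_gso_segment_count(uint32_t pkt_len, uint16_t hdr_len, uint16_t segsz)
{
	uint32_t payload, per_seg;

	if (segsz <= hdr_len || pkt_len <= hdr_len)
		return 0;
	if (pkt_len <= segsz)
		return 1;	/* fits already; forwarded as a single packet */

	payload = pkt_len - hdr_len;
	per_seg = (uint32_t)segsz - hdr_len;
	/* round up: the last segment may carry less than per_seg bytes */
	return (payload + per_seg - 1) / per_seg;
}
```

For example, with 54 bytes of Ethernet, IPv4 and TCP headers and a segment
size of 1514 bytes, each segment carries at most 1460 payload bytes, so a
packet with 14600 bytes of payload would be split into 10 segments.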

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c                      | 178 ++++++++++++++++++++++++++++
 app/test-pmd/config.c                       |  24 ++++
 app/test-pmd/csumonly.c                     |  69 ++++++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
 6 files changed, 335 insertions(+), 5 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ccdf239..05b0ce8 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3967,6 +3978,170 @@ struct cmd_gro_set_result {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please first stop forwarding\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size == 0) {
+			printf("gso_size should be larger than 0."
+					" Please input a legal value\n");
+		} else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO'd packet size: %uB\n"
+					"Supported GSO types: TCP/IPv4, "
+					"VxLAN with inner TCP/IPv4 packet, "
+					"GRE with inner TCP/IPv4 packet\n",
+					gso_max_segment_size);
+		} else
+			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14255,6 +14430,9 @@ struct cmd_cmdfile_result {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..88d09d0 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ struct igb_ring_desc_16_bytes {
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..bd1a287 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,8 @@
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
+
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -91,6 +93,7 @@
 /* structure that caches offload info for the current packet */
 struct testpmd_offload_info {
 	uint16_t ethertype;
+	uint8_t gso_enable;
 	uint16_t l2_len;
 	uint16_t l3_len;
 	uint16_t l4_len;
@@ -381,6 +384,8 @@ struct simple_gre_hdr {
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
 		}
+		if (info->gso_enable)
+			ol_flags |= PKT_TX_TCP_SEG;
 	} else if (info->l4_proto == IPPROTO_SCTP) {
 		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
 		sctp_hdr->cksum = 0;
@@ -627,6 +632,9 @@ struct simple_gre_hdr {
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -634,13 +642,15 @@ struct simple_gre_hdr {
 	uint16_t nb_rx;
 	uint16_t nb_tx;
 	uint16_t nb_prep;
-	uint16_t i;
+	uint16_t i, j;
 	uint64_t rx_ol_flags, tx_ol_flags;
 	uint16_t testpmd_ol_flags;
 	uint32_t retry;
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -674,6 +684,8 @@ struct simple_gre_hdr {
 	memset(&info, 0, sizeof(info));
 	info.tso_segsz = txp->tso_segsz;
 	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
+	if (gso_ports[fs->tx_port].enable)
+		info.gso_enable = 1;
 
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
@@ -851,13 +863,59 @@ struct simple_gre_hdr {
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			if (unlikely(nb_rx - i >= GSO_MAX_PKT_BURST -
+						nb_segments)) {
+				/*
+				 * insufficient space in gso_segments,
+				 * stop GSO.
+				 */
+				for (j = i; j < GSO_MAX_PKT_BURST -
+						nb_segments; j++) {
+					pkts_burst[j]->ol_flags &=
+						(~PKT_TX_TCP_SEG);
+					gso_segments[nb_segments++] =
+						pkts_burst[j];
+				}
+				for (; j < nb_rx; j++)
+					rte_pktmbuf_free(pkts_burst[j]);
+				break;
+			}
+			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
+					&gso_segments[nb_segments],
+					GSO_MAX_PKT_BURST - nb_segments);
+			if (ret >= 1)
+				nb_segments += ret;
+			else if (ret < 0) {
+				/*
+				 * insufficient MBUFs or space in
+				 * gso_segments, stop GSO.
+				 */
+				for (j = i; j < nb_rx; j++) {
+					pkts_burst[j]->ol_flags &=
+						(~PKT_TX_TCP_SEG);
+					gso_segments[nb_segments++] =
+						pkts_burst[j];
+				}
+				break;
+			}
+		}
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +926,7 @@ struct simple_gre_hdr {
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +939,10 @@ struct simple_gre_hdr {
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e097ee0..97e349d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ static int eth_event_callback(uint8_t port_id,
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,8 @@ static int eth_event_callback(uint8_t port_id,
 
 	init_port_config();
 
+	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+		DEV_TX_OFFLOAD_GRE_TNL_TSO;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +670,13 @@ static int eth_event_callback(uint8_t port_id,
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
+			ETHER_CRC_LEN;
+		fwd_lcores[lc_id]->gso_ctx.ipid_flag = !RTE_GSO_IPID_FIXED;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..ff842a1 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -642,6 +651,7 @@ void port_rss_hash_key_update(portid_t port_id, char rss_type[],
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 2ed62f5..f9b5bda 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -932,6 +932,52 @@ number of packets a GRO table can store.
 If current packet number is greater than or equal to the max value, GRO
 will stop processing incoming packets.
 
+set port - gso
+~~~~~~~~~~~~~~
+
+Toggle per-port GSO support in ``csum`` forwarding engine::
+
+   testpmd> set port <port_id> gso on|off
+
+If enabled, the csum forwarding engine will perform GSO on supported IPv4
+packets, transmitted on the given port.
+
+If disabled, packets transmitted on the given port will not undergo GSO.
+By default, GSO is disabled for all ports.
+
+.. note::
+
+   When GSO is enabled on a port, supported IPv4 packets transmitted on that
+   port undergo GSO. Afterwards, the segmented packets are represented by
+   multi-segment mbufs; however, the csum forwarding engine doesn't calculate
+   checksums for GSO'd segments in SW. As a result, if users want correct
+   checksums in GSO segments, they should enable HW checksum calculation for
+   GSO-enabled ports.
+
+   For example, HW checksum calculation for VxLAN GSO'd packets may be enabled
+   by setting the following options in the csum forwarding engine:
+
+   testpmd> csum set outer_ip hw <port_id>
+
+   testpmd> csum set ip hw <port_id>
+
+   testpmd> csum set tcp hw <port_id>
+
+set gso segsz
+~~~~~~~~~~~~~
+
+Set the maximum GSO segment size (measured in bytes), which includes the
+packet header and the packet payload for GSO-enabled ports (global)::
+
+   testpmd> set gso segsz <length>
+
+show port - gso
+~~~~~~~~~~~~~~~
+
+Display the status of Generic Segmentation Offload for a given port::
+
+   testpmd> show port <port_id> gso
+
 mac_addr add
 ~~~~~~~~~~~~
 
-- 
1.9.3


* [PATCH v5 6/6] doc: add GSO programmer's guide
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (10 preceding siblings ...)
  2017-09-28 22:13       ` [PATCH v5 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
@ 2017-09-28 22:13       ` Mark Kavanagh
  2017-09-28 22:18       ` [PATCH v5 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
  12 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:13 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Add programmer's guide doc to explain the design and use of the
GSO library.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 MAINTAINERS                                        |   6 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 5 files changed, 1053 insertions(+)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg

diff --git a/MAINTAINERS b/MAINTAINERS
index a0cd75e..f18e463 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -641,6 +641,12 @@ M: Jiayu Hu <jiayu.hu@intel.com>
 F: lib/librte_gro/
 F: doc/guides/prog_guide/generic_receive_offload_lib.rst
 
+Generic Segmentation Offload
+M: Jiayu Hu <jiayu.hu@intel.com>
+M: Mark Kavanagh <mark.b.kavanagh@intel.com>
+F: lib/librte_gso/
+F: doc/guides/prog_guide/generic_segmentation_offload_lib.rst
+
 Distributor
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: David Hunt <david.hunt@intel.com>
diff --git a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
new file mode 100644
index 0000000..5e78f16
--- /dev/null
+++ b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
@@ -0,0 +1,256 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Generic Segmentation Offload Library
+====================================
+
+Overview
+--------
+Generic Segmentation Offload (GSO) is a widely used software implementation of
+TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
+Much like TSO, GSO gains performance by enabling upper layer applications to
+process a smaller number of large packets (e.g. MTU size of 64KB), instead of
+processing higher numbers of small packets (e.g. MTU size of 1500B), thus
+reducing per-packet overhead.
+
+For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
+that far exceed the kernel interface's MTU; this eliminates the need to segment
+packets within the guest, and improves the data-to-overhead ratio of both the
+guest-host link, and PCI bus. The expectation of the guest network stack in this
+scenario is that segmentation of egress frames will take place either in the NIC
+HW or, where that hardware capability is unavailable, in the host application
+or network stack.
+
+Bearing that in mind, the GSO library enables DPDK applications to segment
+packets in software. Note however, that GSO is implemented as a standalone
+library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
+in the underlying hardware); that is, applications must explicitly invoke the
+GSO library to segment packets. The size of GSO segments ``(segsz)`` is
+configurable by the application.
+
+Limitations
+-----------
+
+#. The GSO library doesn't check if input packets have correct checksums.
+
+#. In addition, the GSO library doesn't re-calculate checksums for segmented
+   packets (that task is left to the application).
+
+#. IP fragments are unsupported by the GSO library.
+
+#. The egress interface's driver must support multi-segment packets.
+
+#. Currently, the GSO library supports the following IPv4 packet types:
+
+ - TCP
+ - VxLAN
+ - GRE
+
+  See `Supported GSO Packet Types`_ for further details.
+
+Packet Segmentation
+-------------------
+
+The ``rte_gso_segment()`` function is the GSO library's primary
+segmentation API.
+
+Before performing segmentation, an application must create a GSO context object
+``(struct rte_gso_ctx)``, which provides the library with some of the
+information required to understand how the packet should be segmented. Refer to
+`How to Segment a Packet`_ for additional details. Once the GSO context
+has been created and populated, the application can then use the
+``rte_gso_segment()`` function to segment packets.
+
+The GSO library typically stores each segment that it creates in two parts: the
+first part contains a copy of the original packet's headers, while the second
+part contains a pointer to an offset within the original packet. This mechanism
+is explained in more detail in `GSO Output Segment Format`_.
+
+The GSO library supports both single- and multi-segment input mbufs.
+
+GSO Output Segment Format
+~~~~~~~~~~~~~~~~~~~~~~~~~
+To reduce the number of expensive memcpy operations required when segmenting a
+packet, the GSO library typically stores each segment that it creates as a
+two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
+the elements produced by the API are also called 'segments', for clarity the
+term 'part' is used here instead).
+
+The first part of each output segment is a direct mbuf, which contains a copy
+of the original packet's headers; these headers must be prepended to every
+output segment.
+
+The second part of each output segment represents a section of data from the
+original packet, i.e. a data segment. Rather than copy the data directly from
+the original packet into the output segment (which would impact performance
+considerably), the second part of each output segment is an indirect mbuf,
+which contains no actual data, but simply points to an offset within the
+original packet.
+
+The combination of the 'header' part and the 'data' part constitutes a
+single logical output GSO segment of the original packet. This is illustrated
+in :numref:`figure_gso-output-segment-format`.
+
+.. _figure_gso-output-segment-format:
+
+.. figure:: img/gso-output-segment-format.svg
+   :align: center
+
+   Two-part GSO output segment
+
+In one situation, the output segment may contain additional 'data' parts.
+This only occurs when both of the following conditions hold:
+
+- the input packet on which GSO is to be performed is represented by a
+  multi-segment mbuf.
+
+- the output segment is required to contain data that spans the boundaries
+  between segments of the input multi-segment mbuf.
+
+The GSO library traverses each segment of the input packet, and produces
+numerous output segments; for optimal performance, the number of output
+segments is kept to a minimum. Consequently, the GSO library maximizes the
+amount of data contained within each output segment; i.e. each output segment
+contains ``segsz`` bytes of data. The only exception to this is the very final
+output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the final segment
+is smaller than the rest.
+
+In order for an output segment to hold ``segsz`` bytes of data, it may need to
+include data from multiple input segments. Due to the nature of indirect mbufs
+(each indirect mbuf can point to only one direct mbuf), the solution here is to
+add another indirect mbuf to the output segment; this additional part then
+points to the next input segment. If necessary, this chaining process is
+repeated, until the sum of all of the data 'contained' in the output segment
+reaches ``segsz``. This ensures that the amount of data contained within each
+output segment is uniform, with the possible exception of the last segment, as
+previously described.
+
+:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
+output segment. In this example, the output segment needs to include data from
+the end of one input segment, and the beginning of another. To achieve this,
+an additional indirect mbuf is chained to the second part of the output segment,
+and is attached to the next input segment (i.e. it points to the data in the
+next input segment).
+
+.. _figure_gso-three-seg-mbuf:
+
+.. figure:: img/gso-three-seg-mbuf.svg
+   :align: center
+
+   Three-part GSO output segment
+
+Supported GSO Packet Types
+--------------------------
+
+TCP/IPv4 GSO
+~~~~~~~~~~~~
+TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
+may also contain an optional VLAN tag.
+
+VxLAN GSO
+~~~~~~~~~
+VxLAN GSO supports segmentation of suitably large VxLAN packets,
+which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
+inner and/or outer VLAN tag(s).
+
+GRE GSO
+~~~~~~~
+GRE GSO supports segmentation of suitably large GRE packets, which contain
+an outer IPv4 header, inner TCP/IPv4 headers, and an optional outer VLAN tag.
+
+How to Segment a Packet
+-----------------------
+
+To segment an outgoing packet, an application must:
+
+#. First create a GSO context (``struct rte_gso_ctx``); this contains:
+
+   - a pointer to the mbuf pool for allocating the direct buffers, which are
+     used to store the GSO segments' packet headers.
+
+   - a pointer to the mbuf pool for allocating indirect buffers, which are
+     used to locate GSO segments' packet payloads.
+
+     .. note::
+
+        An application may use the same pool for both direct and indirect
+        buffers. However, since each indirect mbuf simply stores a pointer,
+        the application may reduce its memory consumption by creating a
+        separate memory pool, containing smaller elements, for the indirect
+        pool.
+
+   - the size of each output segment, including packet headers and payload,
+     measured in bytes.
+
+   - the bit mask of required GSO types. The GSO library uses the same macros as
+     those that describe a physical device's TX offloading capabilities (i.e.
+     ``DEV_TX_OFFLOAD_*_TSO``) for ``gso_types``. For example, if an
+     application wants to segment TCP/IPv4 packets, it should set
+     ``gso_types`` to ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values
+     currently supported for ``gso_types`` are
+     ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO`` and ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a
+     combination of these macros is also allowed.
+
+   - a flag, that indicates whether the IPv4 headers of output segments should
+     contain fixed or incremental ID values.
+
+#. Set the appropriate ``ol_flags`` in the mbuf.
+
+   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute to
+     determine how a packet should be segmented. It is the application's
+     responsibility to ensure that these flags are set.
+
+   - For example, in order to segment TCP/IPv4 packets, the application should
+     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
+     ``ol_flags``.
+
+   - If checksum calculation in hardware is required, the application should
+     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.
+
+#. Check if the packet should be processed. Packets with one of the
+   following properties are not processed and are returned immediately:
+
+   - Packet length is less than ``segsz`` (i.e. GSO is not required).
+
+   - Packet type is not supported by GSO library (see
+     `Supported GSO Packet Types`_).
+
+   - Application has not enabled GSO support for the packet type.
+
+   - Packet's ol_flags have been incorrectly set.
+
+#. Allocate space in which to store the output GSO segments. If the amount of
+   space allocated by the application is insufficient, segmentation will fail.
+
+#. Invoke the GSO segmentation API, ``rte_gso_segment()``.
+
+#. If required, update the L3 and L4 checksums of the newly-created segments.
+   For tunneled packets, the outer IPv4 headers' checksums should also be
+   updated. Alternatively, the application may offload checksum calculation
+   to HW.
+
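The sizing and chaining rules described under "GSO Output Segment Format" can be sketched in plain C, with no DPDK dependency. This is a minimal model under stated assumptions, not the library's implementation: the helper names (``gso_num_segments``, ``gso_last_segment_len``, ``gso_plan_parts``) are hypothetical, ``segsz`` is treated as payload bytes per output segment, and the input packet is described only by the number of payload bytes held in each input mbuf segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of output segments needed for a payload. */
static uint32_t gso_num_segments(uint32_t payload_len, uint32_t segsz)
{
    if (segsz == 0)
        return 0;
    return (payload_len + segsz - 1) / segsz; /* round up */
}

/* Hypothetical helper: payload bytes carried by the final output segment. */
static uint32_t gso_last_segment_len(uint32_t payload_len, uint32_t segsz)
{
    uint32_t rem;

    if (segsz == 0 || payload_len == 0)
        return 0;
    rem = payload_len % segsz;
    return rem ? rem : segsz; /* smaller only when the remainder is non-zero */
}

/*
 * Hypothetical helper: for each output segment, count its parts, i.e. one
 * direct mbuf (copied headers) plus one indirect mbuf per input segment that
 * contributes data to it.
 *
 * in_len[i]  - payload bytes held by input mbuf segment i (assumed model)
 * parts_out  - receives the part count of each output segment
 * nb_out     - capacity of parts_out
 *
 * Returns the number of output segments, or -1 if segsz is zero or
 * parts_out is too small.
 */
static int gso_plan_parts(const uint32_t *in_len, uint32_t nb_in,
                          uint32_t segsz, uint32_t *parts_out, uint32_t nb_out)
{
    uint32_t in_idx = 0, in_off = 0, n = 0;

    assert(in_len != NULL && parts_out != NULL);
    if (segsz == 0)
        return -1;

    for (;;) {
        /* Skip input segments that are empty or fully consumed. */
        while (in_idx < nb_in && in_off == in_len[in_idx]) {
            in_idx++;
            in_off = 0;
        }
        if (in_idx == nb_in)
            break; /* all payload has been assigned */
        if (n == nb_out)
            return -1; /* insufficient space for the output segments */

        uint32_t need = segsz;
        uint32_t parts = 1; /* the direct mbuf holding the headers */

        while (need > 0 && in_idx < nb_in) {
            uint32_t avail = in_len[in_idx] - in_off;
            uint32_t take = avail < need ? avail : need;

            if (take > 0) {
                parts++; /* one indirect mbuf per input segment touched */
                need -= take;
                in_off += take;
            }
            if (in_off == in_len[in_idx]) {
                in_idx++; /* crossed an input segment boundary */
                in_off = 0;
            }
        }
        parts_out[n++] = parts;
    }
    return (int)n;
}
```

With input segments holding 1000, 1000 and 500 payload bytes and a ``segsz`` of 800, this model yields four output segments made of 2, 3, 3 and 2 parts; the final segment carries the remaining 100 bytes. The three-part segments are the ones whose data spans an input segment boundary.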
diff --git a/doc/guides/prog_guide/img/gso-output-segment-format.svg b/doc/guides/prog_guide/img/gso-output-segment-format.svg
new file mode 100644
index 0000000..bdb5ec3
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-output-segment-format.svg
@@ -0,0 +1,313 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-output-segment-format.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="19.3975in" height="8.21796in"
+		viewBox="0 0 1396.62 591.693" xml:space="preserve" color-interpolation-filters="sRGB" class="st21">
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st2 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st3 {stroke:#c3d600;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.68828}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Intel Clear;font-size:1.99999em;font-weight:bold}
+		.st10 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st11 {fill:#ffffff;font-family:Intel Clear;font-size:2.44732em;font-weight:bold}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:5.52552}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.15291em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.8401em}
+		.st15 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st16 {fill:#c3d600;font-family:Intel Clear;font-size:2.44732em}
+		.st17 {fill:#ffc000;font-family:Intel Clear;font-size:2.44732em}
+		.st18 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.8401em}
+		.st20 {fill:#006fc5;font-family:Intel Clear;font-size:1.61927em}
+		.st21 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<g id="shape3-1" v:mID="3" v:groupContext="shape" transform="translate(577.244,-560.42)">
+			<title>Sheet.3</title>
+			<path d="M9.24 585.29 L16.32 585.29 L16.32 587.06 L9.24 587.06 L9.24 585.29 L9.24 585.29 ZM21.63 585.29 L23.4 585.29
+						 L23.4 587.06 L21.63 587.06 L21.63 585.29 L21.63 585.29 ZM28.7 585.29 L35.78 585.29 L35.78 587.06 L28.7 587.06
+						 L28.7 585.29 L28.7 585.29 ZM41.09 585.29 L42.86 585.29 L42.86 587.06 L41.09 587.06 L41.09 585.29 L41.09
+						 585.29 ZM48.17 585.29 L55.25 585.29 L55.25 587.06 L48.17 587.06 L48.17 585.29 L48.17 585.29 ZM60.56 585.29
+						 L62.33 585.29 L62.33 587.06 L60.56 587.06 L60.56 585.29 L60.56 585.29 ZM67.64 585.29 L74.72 585.29 L74.72
+						 587.06 L67.64 587.06 L67.64 585.29 L67.64 585.29 ZM80.03 585.29 L81.8 585.29 L81.8 587.06 L80.03 587.06
+						 L80.03 585.29 L80.03 585.29 ZM87.11 585.29 L94.19 585.29 L94.19 587.06 L87.11 587.06 L87.11 585.29 L87.11
+						 585.29 ZM99.5 585.29 L101.27 585.29 L101.27 587.06 L99.5 587.06 L99.5 585.29 L99.5 585.29 ZM106.58 585.29
+						 L113.66 585.29 L113.66 587.06 L106.58 587.06 L106.58 585.29 L106.58 585.29 ZM118.97 585.29 L120.74 585.29
+						 L120.74 587.06 L118.97 587.06 L118.97 585.29 L118.97 585.29 ZM126.05 585.29 L133.13 585.29 L133.13 587.06
+						 L126.05 587.06 L126.05 585.29 L126.05 585.29 ZM138.43 585.29 L140.2 585.29 L140.2 587.06 L138.43 587.06
+						 L138.43 585.29 L138.43 585.29 ZM145.51 585.29 L152.59 585.29 L152.59 587.06 L145.51 587.06 L145.51 585.29
+						 L145.51 585.29 ZM157.9 585.29 L159.67 585.29 L159.67 587.06 L157.9 587.06 L157.9 585.29 L157.9 585.29 ZM164.98
+						 585.29 L172.06 585.29 L172.06 587.06 L164.98 587.06 L164.98 585.29 L164.98 585.29 ZM177.37 585.29 L179.14
+						 585.29 L179.14 587.06 L177.37 587.06 L177.37 585.29 L177.37 585.29 ZM184.45 585.29 L191.53 585.29 L191.53
+						 587.06 L184.45 587.06 L184.45 585.29 L184.45 585.29 ZM196.84 585.29 L198.61 585.29 L198.61 587.06 L196.84
+						 587.06 L196.84 585.29 L196.84 585.29 ZM203.92 585.29 L211 585.29 L211 587.06 L203.92 587.06 L203.92 585.29
+						 L203.92 585.29 ZM216.31 585.29 L218.08 585.29 L218.08 587.06 L216.31 587.06 L216.31 585.29 L216.31 585.29
+						 ZM223.39 585.29 L230.47 585.29 L230.47 587.06 L223.39 587.06 L223.39 585.29 L223.39 585.29 ZM235.78 585.29
+						 L237.55 585.29 L237.55 587.06 L235.78 587.06 L235.78 585.29 L235.78 585.29 ZM242.86 585.29 L249.93 585.29
+						 L249.93 587.06 L242.86 587.06 L242.86 585.29 L242.86 585.29 ZM255.24 585.29 L257.01 585.29 L257.01 587.06
+						 L255.24 587.06 L255.24 585.29 L255.24 585.29 ZM262.32 585.29 L269.4 585.29 L269.4 587.06 L262.32 587.06
+						 L262.32 585.29 L262.32 585.29 ZM274.71 585.29 L276.48 585.29 L276.48 587.06 L274.71 587.06 L274.71 585.29
+						 L274.71 585.29 ZM281.79 585.29 L288.87 585.29 L288.87 587.06 L281.79 587.06 L281.79 585.29 L281.79 585.29
+						 ZM294.18 585.29 L295.95 585.29 L295.95 587.06 L294.18 587.06 L294.18 585.29 L294.18 585.29 ZM301.26 585.29
+						 L308.34 585.29 L308.34 587.06 L301.26 587.06 L301.26 585.29 L301.26 585.29 ZM313.65 585.29 L315.42 585.29
+						 L315.42 587.06 L313.65 587.06 L313.65 585.29 L313.65 585.29 ZM320.73 585.29 L324.99 585.29 L324.99 587.06
+						 L320.73 587.06 L320.73 585.29 L320.73 585.29 ZM11.06 591.69 L0 586.17 L11.06 580.65 L11.06 591.69 L11.06
+						 591.69 ZM323.16 580.65 L334.22 586.17 L323.16 591.69 L323.16 580.65 L323.16 580.65 Z" class="st1"/>
+		</g>
+		<g id="shape4-3" v:mID="4" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.4</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43
+						 Z" class="st2"/>
+		</g>
+		<g id="shape5-5" v:mID="5" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.5</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43"
+					class="st3"/>
+		</g>
+		<g id="shape6-8" v:mID="6" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.6</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape7-10" v:mID="7" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.7</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape10-13" v:mID="10" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.10</title>
+			<path d="M0 510.21 L0 591.69 L822.53 591.69 L822.53 510.21 L0 510.21 L0 510.21 Z" class="st6"/>
+		</g>
+		<g id="shape11-15" v:mID="11" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.11</title>
+			<path d="M0 510.21 L822.53 510.21 L822.53 591.69 L0 591.69 L0 510.21" class="st7"/>
+		</g>
+		<g id="shape12-18" v:mID="12" v:groupContext="shape" transform="translate(255.478,-470.123)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="157.315" cy="574.07" width="314.63" height="35.245"/>
+			<path d="M314.63 556.45 L0 556.45 L0 591.69 L314.63 591.69 L314.63 556.45" class="st8"/>
+			<text x="102.08" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-22" v:mID="13" v:groupContext="shape" transform="translate(577.354,-470.123)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="167.112" cy="574.07" width="334.23" height="35.245"/>
+			<path d="M334.22 556.45 L0 556.45 L0 591.69 L334.22 591.69 L334.22 556.45" class="st8"/>
+			<text x="111.88" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape14-26" v:mID="14" v:groupContext="shape" transform="translate(910.635,-470.956)">
+			<title>Sheet.14</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="26.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape15-30" v:mID="15" v:groupContext="shape" transform="translate(909.144,-453.824)">
+			<title>Sheet.15</title>
+			<path d="M1.16 453.85 L1.05 465.33 L3.93 465.39 L4.04 453.91 L1.16 453.85 L1.16 453.85 ZM1 473.95 L0.94 476.82 L3.82
+						 476.87 L3.87 474 L1 473.95 L1 473.95 ZM0.88 485.43 L0.77 496.91 L3.65 496.96 L3.76 485.48 L0.88 485.43 L0.88
+						 485.43 ZM0.72 505.52 L0.72 508.39 L3.59 508.45 L3.59 505.58 L0.72 505.52 L0.72 505.52 ZM0.61 517 L0.55 528.49
+						 L3.43 528.54 L3.48 517.06 L0.61 517 L0.61 517 ZM0.44 537.1 L0.44 539.97 L3.32 540.02 L3.32 537.15 L0.44
+						 537.1 L0.44 537.1 ZM0.39 548.58 L0.28 560.06 L3.15 560.12 L3.26 548.63 L0.39 548.58 L0.39 548.58 ZM0.22
+						 568.67 L0.17 571.54 L3.04 571.6 L3.1 568.73 L0.22 568.67 L0.22 568.67 ZM0.11 580.16 L0 591.64 L2.88 591.69
+						 L2.99 580.21 L0.11 580.16 L0.11 580.16 Z" class="st10"/>
+		</g>
+		<g id="shape16-32" v:mID="16" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.16</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape17-34" v:mID="17" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.17</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape18-37" v:mID="18" v:groupContext="shape" transform="translate(121.944,-471.034)">
+			<title>Sheet.18</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="20.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape19-41" v:mID="19" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.19</title>
+			<path d="M0 510.43 L0 591.69 L289.81 591.69 L289.81 510.43 L0 510.43 L0 510.43 Z" class="st4"/>
+		</g>
+		<g id="shape20-43" v:mID="20" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.20</title>
+			<path d="M0 510.43 L289.81 510.43 L289.81 591.69 L0 591.69 L0 510.43" class="st5"/>
+		</g>
+		<g id="shape21-46" v:mID="21" v:groupContext="shape" transform="translate(424.908,-21.567)">
+			<title>Sheet.21</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="11.55" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape22-50" v:mID="22" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.22</title>
+			<path d="M0 510.43 L0 591.69 L453.74 591.69 L453.74 510.43 L0 510.43 L0 510.43 Z" class="st6"/>
+		</g>
+		<g id="shape23-52" v:mID="23" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.23</title>
+			<path d="M0 510.43 L453.74 510.43 L453.74 591.69 L0 591.69 L0 510.43" class="st7"/>
+		</g>
+		<g id="shape24-55" v:mID="24" v:groupContext="shape" transform="translate(778.624,-21.5672)">
+			<title>Sheet.24</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="14.26" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape25-59" v:mID="25" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.25</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st6"/>
+		</g>
+		<g id="shape26-61" v:mID="26" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.26</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape27-63" v:mID="27" v:groupContext="shape" transform="translate(813.057,-150.108)">
+			<title>Sheet.27</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="94.1386" cy="576.19" width="188.28" height="31.0055"/>
+			<path d="M188.28 560.69 L0 560.69 L0 591.69 L188.28 591.69 L188.28 560.69" class="st8"/>
+			<text x="15.43" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape28-67" v:mID="28" v:groupContext="shape" transform="translate(810.845,-123.854)">
+			<title>Sheet.28</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="95.5065" cy="578.442" width="191.02" height="26.501"/>
+			<path d="M191.01 565.19 L0 565.19 L0 591.69 L191.01 591.69 L191.01 565.19" class="st8"/>
+			<text x="15.15" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape29-71" v:mID="29" v:groupContext="shape" transform="translate(573.151,-149.601)">
+			<title>Sheet.29</title>
+			<path d="M0 584.74 L127.76 584.74 L127.76 587.61 L0 587.61 L0 584.74 L0 584.74 ZM125.91 580.65 L136.97 586.17 L125.91
+						 591.69 L125.91 580.65 L125.91 580.65 Z" class="st15"/>
+		</g>
+		<g id="shape30-73" v:mID="30" v:groupContext="shape" transform="translate(0,-309.671)">
+			<title>Sheet.30</title>
+			<desc>Memory copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="108.076" cy="574.07" width="216.16" height="35.245"/>
+			<path d="M216.15 556.45 L0 556.45 L0 591.69 L216.15 591.69 L216.15 556.45" class="st8"/>
+			<text x="17.68" y="582.88" class="st16" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Memory copy</text>		</g>
+		<g id="shape31-77" v:mID="31" v:groupContext="shape" transform="translate(680.77,-305.707)">
+			<title>Sheet.31</title>
+			<desc>No Memory Copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="136.547" cy="574.07" width="273.1" height="35.245"/>
+			<path d="M273.09 556.45 L0 556.45 L0 591.69 L273.09 591.69 L273.09 556.45" class="st8"/>
+			<text x="21.4" y="582.88" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>No Memory Copy</text>		</g>
+		<g id="shape32-81" v:mID="32" v:groupContext="shape" transform="translate(1102.72,-26.7532)">
+			<title>Sheet.32</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="138.243" cy="578.442" width="276.49" height="26.501"/>
+			<path d="M276.49 565.19 L0 565.19 L0 591.69 L276.49 591.69 L276.49 565.19" class="st8"/>
+			<text x="20.73" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape36-85" v:mID="36" v:groupContext="shape" transform="translate(1106.81,-138.647)">
+			<title>Sheet.36</title>
+			<desc>Two-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="144.906" cy="578.442" width="289.82" height="26.501"/>
+			<path d="M289.81 565.19 L0 565.19 L0 591.69 L289.81 591.69 L289.81 565.19" class="st8"/>
+			<text x="16.56" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Two-part output segment</text>		</g>
+		<g id="shape37-89" v:mID="37" v:groupContext="shape" transform="translate(575.916,-453.879)">
+			<title>Sheet.37</title>
+			<path d="M2.88 453.91 L2.9 465.39 L0.03 465.39 L0 453.91 L2.88 453.91 L2.88 453.91 ZM2.9 474 L2.9 476.87 L0.03 476.87
+						 L0.03 474 L2.9 474 L2.9 474 ZM2.9 485.48 L2.9 496.96 L0.03 496.96 L0.03 485.48 L2.9 485.48 L2.9 485.48 ZM2.9
+						 505.58 L2.9 508.45 L0.03 508.45 L0.03 505.58 L2.9 505.58 L2.9 505.58 ZM2.9 517.06 L2.9 528.54 L0.03 528.54
+						 L0.03 517.06 L2.9 517.06 L2.9 517.06 ZM2.9 537.15 L2.9 540.02 L0.03 540.02 L0.03 537.15 L2.9 537.15 L2.9
+						 537.15 ZM2.9 548.63 L2.9 560.12 L0.03 560.12 L0.03 548.63 L2.9 548.63 L2.9 548.63 ZM2.9 568.73 L2.9 571.6
+						 L0.03 571.6 L0.03 568.73 L2.9 568.73 L2.9 568.73 ZM2.9 580.21 L2.9 591.69 L0.03 591.69 L0.03 580.21 L2.9
+						 580.21 L2.9 580.21 Z" class="st18"/>
+		</g>
+		<g id="shape38-91" v:mID="38" v:groupContext="shape" transform="translate(577.354,-193.764)">
+			<title>Sheet.38</title>
+			<path d="M5.59 347.01 L10.92 357.16 L8.38 358.52 L3.04 348.36 L5.59 347.01 L5.59 347.01 ZM14.96 364.78 L16.29 367.32
+						 L13.74 368.67 L12.42 366.13 L14.96 364.78 L14.96 364.78 ZM20.33 374.97 L25.66 385.12 L23.12 386.45 L17.78
+						 376.29 L20.33 374.97 L20.33 374.97 ZM29.7 392.74 L31.03 395.28 L28.48 396.61 L27.16 394.07 L29.7 392.74
+						 L29.7 392.74 ZM35.04 402.9 L40.4 413.06 L37.86 414.38 L32.49 404.22 L35.04 402.9 L35.04 402.9 ZM44.41 420.67
+						 L45.77 423.21 L43.22 424.57 L41.87 422.03 L44.41 420.67 L44.41 420.67 ZM49.78 430.83 L55.14 440.99 L52.6
+						 442.34 L47.23 432.18 L49.78 430.83 L49.78 430.83 ZM59.15 448.61 L60.51 451.15 L57.96 452.5 L56.61 449.96
+						 L59.15 448.61 L59.15 448.61 ZM64.52 458.79 L69.88 468.95 L67.34 470.27 L61.97 460.12 L64.52 458.79 L64.52
+						 458.79 ZM73.89 476.57 L75.25 479.11 L72.7 480.43 L71.35 477.89 L73.89 476.57 L73.89 476.57 ZM79.26 486.72
+						 L84.62 496.88 L82.08 498.21 L76.71 488.05 L79.26 486.72 L79.26 486.72 ZM88.63 504.5 L89.96 507.04 L87.41
+						 508.39 L86.09 505.85 L88.63 504.5 L88.63 504.5 ZM94 514.66 L99.33 524.81 L96.79 526.17 L91.45 516.01 L94
+						 514.66 L94 514.66 ZM103.37 532.43 L104.7 534.97 L102.15 536.32 L100.83 533.79 L103.37 532.43 L103.37 532.43
+						 ZM108.73 542.62 L114.07 552.77 L111.53 554.1 L106.19 543.94 L108.73 542.62 L108.73 542.62 ZM118.11 560.39
+						 L119.44 562.93 L116.89 564.26 L115.57 561.72 L118.11 560.39 L118.11 560.39 ZM123.45 570.55 L128.81 580.71
+						 L126.27 582.03 L120.9 571.87 L123.45 570.55 L123.45 570.55 ZM132.82 588.33 L133.9 590.37 L131.36 591.69
+						 L130.28 589.68 L132.82 588.33 L132.82 588.33 ZM0.28 351.89 L0 339.53 L10.07 346.73 L0.28 351.89 L0.28 351.89
+						 Z" class="st18"/>
+		</g>
+		<g id="shape39-93" v:mID="39" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.39</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st4"/>
+		</g>
+		<g id="shape40-95" v:mID="40" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.40</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape41-97" v:mID="41" v:groupContext="shape" transform="translate(368.774,-150.453)">
+			<title>Sheet.41</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="82.7002" cy="576.19" width="165.41" height="31.0055"/>
+			<path d="M165.4 560.69 L0 560.69 L0 591.69 L165.4 591.69 L165.4 560.69" class="st8"/>
+			<text x="13.94" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape42-101" v:mID="42" v:groupContext="shape" transform="translate(351.856,-123.854)">
+			<title>Sheet.42</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="102.121" cy="578.442" width="204.25" height="26.501"/>
+			<path d="M204.24 565.19 L0 565.19 L0 591.69 L204.24 591.69 L204.24 565.19" class="st8"/>
+			<text x="16.02" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape43-105" v:mID="43" v:groupContext="shape" transform="translate(619.797,-155.563)">
+			<title>Sheet.43</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="28.011" cy="578.442" width="56.03" height="26.501"/>
+			<path d="M56.02 565.19 L0 565.19 L0 591.69 L56.02 591.69 L56.02 565.19" class="st8"/>
+			<text x="6.35" y="585.07" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape44-109" v:mID="44" v:groupContext="shape" transform="translate(700.911,-551.367)">
+			<title>Sheet.44</title>
+			<path d="M0 559.23 L0 591.69 L84.29 591.69 L84.29 559.23 L0 559.23 L0 559.23 Z" class="st2"/>
+		</g>
+		<g id="shape45-111" v:mID="45" v:groupContext="shape" transform="translate(709.883,-555.163)">
+			<title>Sheet.45</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="30.7501" cy="580.032" width="61.51" height="23.3211"/>
+			<path d="M61.5 568.37 L0 568.37 L0 591.69 L61.5 591.69 L61.5 568.37" class="st8"/>
+			<text x="6.38" y="585.86" class="st20" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape46-115" v:mID="46" v:groupContext="shape" transform="translate(1111.54,-477.36)">
+			<title>Sheet.46</title>
+			<desc>Input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="74.9" cy="578.442" width="149.8" height="26.501"/>
+			<path d="M149.8 565.19 L0 565.19 L0 591.69 L149.8 591.69 L149.8 565.19" class="st8"/>
+			<text x="12.47" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Input packet</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
new file mode 100644
index 0000000..f18a327
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
@@ -0,0 +1,477 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-three-seg-mbuf.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="21.8589in" height="9.63966in"
+		viewBox="0 0 1573.84 694.055" xml:space="preserve" color-interpolation-filters="sRGB" class="st23">
+	<title>GSO three-part output segment</title>
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st2 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st3 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.42236}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Calibri;font-size:2.08333em;font-weight:bold}
+		.st10 {fill:#ffffff;font-family:Intel Clear;font-size:2.91502em;font-weight:bold}
+		.st11 {fill:#000000;font-family:Intel Clear;font-size:2.19175em}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:6.58146}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.50001em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.99999em}
+		.st15 {fill:#0070c0;font-family:Intel Clear;font-size:2.19175em}
+		.st16 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st17 {fill:#006fc5;font-family:Intel Clear;font-size:1.92874em}
+		.st18 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.5em}
+		.st20 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0658146}
+		.st21 {fill:#000000;font-family:Intel Clear;font-size:1.81915em}
+		.st22 {fill:#000000;font-family:Intel Clear;font-size:1.49785em}
+		.st23 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<v:layer v:name="top" v:index="0"/>
+		<v:layer v:name="middle" v:index="1"/>
+		<g id="shape111-1" v:mID="111" v:groupContext="shape" v:layerMember="0" transform="translate(787.208,-220.973)">
+			<title>Sheet.111</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape110-3" v:mID="110" v:groupContext="shape" v:layerMember="0" transform="translate(685.078,-560.166)">
+			<title>Sheet.110</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape4-5" v:mID="4" v:groupContext="shape" transform="translate(718.715,-469.955)">
+			<title>Sheet.4</title>
+			<path d="M0 655.13 L0 678.22 C0 686.97 11.69 694.06 26.05 694.06 C40.45 694.06 52.11 686.97 52.11 678.22 L52.11 673.91
+						 L59.55 673.91 L44.66 664.86 L29.78 673.91 L37.22 673.91 L37.22 678.22 C37.22 681.98 32.25 685 26.05 685
+						 C19.89 685 14.89 681.98 14.89 678.22 L14.89 655.13 L0 655.13 Z" class="st3"/>
+		</g>
+		<g id="shape5-7" v:mID="5" v:groupContext="shape" transform="translate(547.831,-656.823)">
+			<title>Sheet.5</title>
+			<path d="M11 686.43 L19.43 686.43 L19.43 688.53 L11 688.53 L11 686.43 L11 686.43 ZM25.76 686.43 L27.87 686.43 L27.87
+						 688.53 L25.76 688.53 L25.76 686.43 L25.76 686.43 ZM34.19 686.43 L42.62 686.43 L42.62 688.53 L34.19 688.53
+						 L34.19 686.43 L34.19 686.43 ZM48.95 686.43 L51.05 686.43 L51.05 688.53 L48.95 688.53 L48.95 686.43 L48.95
+						 686.43 ZM57.38 686.43 L65.81 686.43 L65.81 688.53 L57.38 688.53 L57.38 686.43 L57.38 686.43 ZM72.14 686.43
+						 L74.24 686.43 L74.24 688.53 L72.14 688.53 L72.14 686.43 L72.14 686.43 ZM80.57 686.43 L89 686.43 L89 688.53
+						 L80.57 688.53 L80.57 686.43 L80.57 686.43 ZM95.32 686.43 L97.43 686.43 L97.43 688.53 L95.32 688.53 L95.32
+						 686.43 L95.32 686.43 ZM103.76 686.43 L112.19 686.43 L112.19 688.53 L103.76 688.53 L103.76 686.43 L103.76
+						 686.43 ZM118.51 686.43 L120.62 686.43 L120.62 688.53 L118.51 688.53 L118.51 686.43 L118.51 686.43 ZM126.94
+						 686.43 L135.38 686.43 L135.38 688.53 L126.94 688.53 L126.94 686.43 L126.94 686.43 ZM141.7 686.43 L143.81
+						 686.43 L143.81 688.53 L141.7 688.53 L141.7 686.43 L141.7 686.43 ZM150.13 686.43 L158.57 686.43 L158.57 688.53
+						 L150.13 688.53 L150.13 686.43 L150.13 686.43 ZM164.89 686.43 L167 686.43 L167 688.53 L164.89 688.53 L164.89
+						 686.43 L164.89 686.43 ZM173.32 686.43 L181.75 686.43 L181.75 688.53 L173.32 688.53 L173.32 686.43 L173.32
+						 686.43 ZM188.08 686.43 L190.19 686.43 L190.19 688.53 L188.08 688.53 L188.08 686.43 L188.08 686.43 ZM196.51
+						 686.43 L204.94 686.43 L204.94 688.53 L196.51 688.53 L196.51 686.43 L196.51 686.43 ZM211.27 686.43 L213.38
+						 686.43 L213.38 688.53 L211.27 688.53 L211.27 686.43 L211.27 686.43 ZM219.7 686.43 L228.13 686.43 L228.13
+						 688.53 L219.7 688.53 L219.7 686.43 L219.7 686.43 ZM234.46 686.43 L236.56 686.43 L236.56 688.53 L234.46 688.53
+						 L234.46 686.43 L234.46 686.43 ZM242.89 686.43 L251.32 686.43 L251.32 688.53 L242.89 688.53 L242.89 686.43
+						 L242.89 686.43 ZM257.64 686.43 L259.75 686.43 L259.75 688.53 L257.64 688.53 L257.64 686.43 L257.64 686.43
+						 ZM266.08 686.43 L274.51 686.43 L274.51 688.53 L266.08 688.53 L266.08 686.43 L266.08 686.43 ZM280.83 686.43
+						 L282.94 686.43 L282.94 688.53 L280.83 688.53 L280.83 686.43 L280.83 686.43 ZM289.27 686.43 L297.7 686.43
+						 L297.7 688.53 L289.27 688.53 L289.27 686.43 L289.27 686.43 ZM304.02 686.43 L306.13 686.43 L306.13 688.53
+						 L304.02 688.53 L304.02 686.43 L304.02 686.43 ZM312.45 686.43 L320.89 686.43 L320.89 688.53 L312.45 688.53
+						 L312.45 686.43 L312.45 686.43 ZM327.21 686.43 L329.32 686.43 L329.32 688.53 L327.21 688.53 L327.21 686.43
+						 L327.21 686.43 ZM335.64 686.43 L344.08 686.43 L344.08 688.53 L335.64 688.53 L335.64 686.43 L335.64 686.43
+						 ZM350.4 686.43 L352.51 686.43 L352.51 688.53 L350.4 688.53 L350.4 686.43 L350.4 686.43 ZM358.83 686.43 L367.26
+						 686.43 L367.26 688.53 L358.83 688.53 L358.83 686.43 L358.83 686.43 ZM373.59 686.43 L375.7 686.43 L375.7
+						 688.53 L373.59 688.53 L373.59 686.43 L373.59 686.43 ZM382.02 686.43 L387.06 686.43 L387.06 688.53 L382.02
+						 688.53 L382.02 686.43 L382.02 686.43 ZM13.18 694.06 L0 687.48 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM384.89
+						 680.9 L398.06 687.48 L384.89 694.06 L384.89 680.9 L384.89 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape6-9" v:mID="6" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.6</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape7-11" v:mID="7" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.7</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape10-14" v:mID="10" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.10</title>
+			<path d="M0 597.01 L0 694.06 L563.73 694.06 L563.73 597.01 L0 597.01 L0 597.01 Z" class="st6"/>
+		</g>
+		<g id="shape11-16" v:mID="11" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.11</title>
+			<path d="M0 597.01 L563.73 597.01 L563.73 694.06 L0 694.06 L0 597.01" class="st7"/>
+		</g>
+		<g id="shape12-19" v:mID="12" v:groupContext="shape" transform="translate(262.039,-549.269)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-23" v:mID="13" v:groupContext="shape" transform="translate(547.615,-549.269)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="87.5716" cy="673.065" width="175.15" height="41.9798"/>
+			<path d="M175.14 652.08 L0 652.08 L0 694.06 L175.14 694.06 L175.14 652.08" class="st8"/>
+			<text x="37" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape15-27" v:mID="15" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.15</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape16-29" v:mID="16" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.16</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape17-32" v:mID="17" v:groupContext="shape" transform="translate(6.52106,-546.331)">
+			<title>Sheet.17</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="34.98" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape23-36" v:mID="23" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.23</title>
+			<path d="M0 597.27 L0 694.06 L345.2 694.06 L345.2 597.27 L0 597.27 L0 597.27 Z" class="st4"/>
+		</g>
+		<g id="shape24-38" v:mID="24" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.24</title>
+			<path d="M0 597.27 L345.2 597.27 L345.2 694.06 L0 694.06 L0 597.27" class="st5"/>
+		</g>
+		<g id="shape25-41" v:mID="25" v:groupContext="shape" transform="translate(399.834,-25.6887)">
+			<title>Sheet.25</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="13.76" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape31-45" v:mID="31" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.31</title>
+			<path d="M0 597.27 L0 694.06 L516.21 694.06 L516.21 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape32-47" v:mID="32" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.32</title>
+			<path d="M0 597.27 L516.21 597.27 L516.21 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape33-50" v:mID="33" v:groupContext="shape" transform="translate(809.035,-25.6889)">
+			<title>Sheet.33</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="16.99" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape35-54" v:mID="35" v:groupContext="shape" transform="translate(1199.29,-21.1708)">
+			<title>Sheet.35</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="164.662" cy="678.273" width="329.33" height="31.5648"/>
+			<path d="M329.32 662.49 L0 662.49 L0 694.06 L329.32 694.06 L329.32 662.49" class="st8"/>
+			<text x="24.69" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape38-58" v:mID="38" v:groupContext="shape" transform="translate(1204.65,-254.446)">
+			<title>Sheet.38</title>
+			<desc>Three-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="19.51" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Three-part output segment</text>		</g>
+		<g id="shape39-62" v:mID="39" v:groupContext="shape" transform="translate(546.25,-529.921)">
+			<title>Sheet.39</title>
+			<path d="M3.43 529.94 L3.46 543.61 L0.03 543.61 L0 529.94 L3.43 529.94 L3.43 529.94 ZM3.46 553.87 L3.46 557.29 L0.03
+						 557.29 L0.03 553.87 L3.46 553.87 L3.46 553.87 ZM3.46 567.55 L3.46 581.22 L0.03 581.22 L0.03 567.55 L3.46
+						 567.55 L3.46 567.55 ZM3.46 591.48 L3.46 594.9 L0.03 594.9 L0.03 591.48 L3.46 591.48 L3.46 591.48 ZM3.46
+						 605.16 L3.46 618.83 L0.03 618.83 L0.03 605.16 L3.46 605.16 L3.46 605.16 ZM3.46 629.09 L3.46 632.51 L0.03
+						 632.51 L0.03 629.09 L3.46 629.09 L3.46 629.09 ZM3.46 642.77 L3.46 656.45 L0.03 656.45 L0.03 642.77 L3.46
+						 642.77 L3.46 642.77 ZM3.46 666.7 L3.46 670.12 L0.03 670.12 L0.03 666.7 L3.46 666.7 L3.46 666.7 ZM3.46 680.38
+						 L3.46 694.06 L0.03 694.06 L0.03 680.38 L3.46 680.38 L3.46 680.38 Z" class="st1"/>
+		</g>
+		<g id="shape40-64" v:mID="40" v:groupContext="shape" transform="translate(549.097,-223.749)">
+			<title>Sheet.40</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape46-66" v:mID="46" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.46</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st4"/>
+		</g>
+		<g id="shape47-68" v:mID="47" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.47</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape48-70" v:mID="48" v:groupContext="shape" transform="translate(113.27,-263.667)">
+			<title>Sheet.48</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="98.5041" cy="675.59" width="197.01" height="36.9302"/>
+			<path d="M197.01 657.13 L0 657.13 L0 694.06 L197.01 694.06 L197.01 657.13" class="st8"/>
+			<text x="18.66" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape51-74" v:mID="51" v:groupContext="shape" transform="translate(85.817,-233.439)">
+			<title>Sheet.51</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="127.916" cy="678.273" width="255.84" height="31.5648"/>
+			<path d="M255.83 662.49 L0 662.49 L0 694.06 L255.83 694.06 L255.83 662.49" class="st8"/>
+			<text x="34.33" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape53-78" v:mID="53" v:groupContext="shape" transform="translate(371.944,-275.998)">
+			<title>Sheet.53</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape54-82" v:mID="54" v:groupContext="shape" transform="translate(695.132,-646.04)">
+			<title>Sheet.54</title>
+			<path d="M0 655.39 L0 694.06 L100.4 694.06 L100.4 655.39 L0 655.39 L0 655.39 Z" class="st16"/>
+		</g>
+		<g id="shape55-84" v:mID="55" v:groupContext="shape" transform="translate(709.033,-648.946)">
+			<title>Sheet.55</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="36.6265" cy="680.167" width="73.26" height="27.7775"/>
+			<path d="M73.25 666.28 L0 666.28 L0 694.06 L73.25 694.06 L73.25 666.28" class="st8"/>
+			<text x="7.6" y="687.11" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape56-88" v:mID="56" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.56</title>
+			<path d="M0 597.27 L0 694.06 L363.41 694.06 L363.41 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape57-90" v:mID="57" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.57</title>
+			<path d="M0 597.27 L363.41 597.27 L363.41 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape58-93" v:mID="58" v:groupContext="shape" v:layerMember="0" transform="translate(943.158,-529.889)">
+			<title>Sheet.58</title>
+			<path d="M1.35 529.91 L1.25 543.58 L4.68 543.61 L4.78 529.94 L1.35 529.91 L1.35 529.91 ZM1.15 553.84 L1.12 557.26 L4.55
+						 557.29 L4.58 553.87 L1.15 553.84 L1.15 553.84 ZM1.05 567.52 L0.92 581.19 L4.35 581.22 L4.48 567.55 L1.05
+						 567.52 L1.05 567.52 ZM0.86 591.45 L0.82 594.87 L4.25 594.9 L4.28 591.48 L0.86 591.45 L0.86 591.45 ZM0.72
+						 605.13 L0.63 618.8 L4.05 618.83 L4.15 605.16 L0.72 605.13 L0.72 605.13 ZM0.53 629.06 L0.53 632.48 L3.95
+						 632.51 L3.95 629.09 L0.53 629.06 L0.53 629.06 ZM0.43 642.74 L0.33 656.41 L3.75 656.45 L3.85 642.77 L0.43
+						 642.74 L0.43 642.74 ZM0.23 666.67 L0.2 670.09 L3.62 670.12 L3.66 666.7 L0.23 666.67 L0.23 666.67 ZM0.13
+						 680.35 L0 694.02 L3.43 694.06 L3.56 680.38 L0.13 680.35 L0.13 680.35 Z" class="st18"/>
+		</g>
+		<g id="shape59-95" v:mID="59" v:groupContext="shape" transform="translate(785.874,-549.473)">
+			<title>Sheet.59</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="77.3395" cy="673.065" width="154.68" height="41.9798"/>
+			<path d="M154.68 652.08 L0 652.08 L0 694.06 L154.68 694.06 L154.68 652.08" class="st8"/>
+			<text x="26.77" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape60-99" v:mID="60" v:groupContext="shape" transform="translate(952.97,-548.822)">
+			<title>Sheet.60</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape63-103" v:mID="63" v:groupContext="shape" transform="translate(1210.43,-551.684)">
+			<title>Sheet.63</title>
+			<desc>Multi-segment input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="17.75" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Multi-segment input packet</text>		</g>
+		<g id="shape70-107" v:mID="70" v:groupContext="shape" v:layerMember="1" transform="translate(455.049,-221.499)">
+			<title>Sheet.70</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape71-109" v:mID="71" v:groupContext="shape" transform="translate(455.049,-221.499)">
+			<title>Sheet.71</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape72-111" v:mID="72" v:groupContext="shape" transform="translate(489.065,-263.434)">
+			<title>Sheet.72</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape75-115" v:mID="75" v:groupContext="shape" transform="translate(849.065,-281.435)">
+			<title>Sheet.75</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="4.49" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape77-119" v:mID="77" v:groupContext="shape" transform="translate(717.742,-563.523)">
+			<title>Sheet.77</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="15.71" y="683.67" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape78-123" v:mID="78" v:groupContext="shape" transform="translate(1148.17,-529.067)">
+			<title>Sheet.78</title>
+			<path d="M1.38 529.87 L1.25 543.55 L4.68 543.61 L4.81 529.94 L1.38 529.87 L1.38 529.87 ZM1.19 553.81 L1.12 557.23 L4.55
+						 557.29 L4.61 553.87 L1.19 553.81 L1.19 553.81 ZM1.05 567.48 L0.92 581.16 L4.35 581.22 L4.48 567.55 L1.05
+						 567.48 L1.05 567.48 ZM0.86 591.42 L0.86 594.84 L4.28 594.9 L4.28 591.48 L0.86 591.42 L0.86 591.42 ZM0.72
+						 605.09 L0.66 618.77 L4.08 618.83 L4.15 605.16 L0.72 605.09 L0.72 605.09 ZM0.53 629.03 L0.53 632.45 L3.95
+						 632.51 L3.95 629.09 L0.53 629.03 L0.53 629.03 ZM0.46 642.7 L0.33 656.38 L3.75 656.45 L3.89 642.77 L0.46
+						 642.7 L0.46 642.7 ZM0.26 666.64 L0.2 670.06 L3.62 670.12 L3.69 666.7 L0.26 666.64 L0.26 666.64 ZM0.13 680.31
+						 L0 693.99 L3.43 694.06 L3.56 680.38 L0.13 680.31 L0.13 680.31 Z" class="st20"/>
+		</g>
+		<g id="shape79-125" v:mID="79" v:groupContext="shape" transform="translate(946.254,-657.81)">
+			<title>Sheet.79</title>
+			<path d="M11 686.69 L17.33 686.69 L17.33 688.27 L11 688.27 L11 686.69 L11 686.69 ZM22.07 686.69 L23.65 686.69 L23.65
+						 688.27 L22.07 688.27 L22.07 686.69 L22.07 686.69 ZM28.39 686.69 L34.72 686.69 L34.72 688.27 L28.39 688.27
+						 L28.39 686.69 L28.39 686.69 ZM39.46 686.69 L41.04 686.69 L41.04 688.27 L39.46 688.27 L39.46 686.69 L39.46
+						 686.69 ZM45.78 686.69 L52.11 686.69 L52.11 688.27 L45.78 688.27 L45.78 686.69 L45.78 686.69 ZM56.85 686.69
+						 L58.43 686.69 L58.43 688.27 L56.85 688.27 L56.85 686.69 L56.85 686.69 ZM63.18 686.69 L69.5 686.69 L69.5
+						 688.27 L63.18 688.27 L63.18 686.69 L63.18 686.69 ZM74.24 686.69 L75.82 686.69 L75.82 688.27 L74.24 688.27
+						 L74.24 686.69 L74.24 686.69 ZM80.57 686.69 L86.89 686.69 L86.89 688.27 L80.57 688.27 L80.57 686.69 L80.57
+						 686.69 ZM91.63 686.69 L93.22 686.69 L93.22 688.27 L91.63 688.27 L91.63 686.69 L91.63 686.69 ZM97.96 686.69
+						 L104.28 686.69 L104.28 688.27 L97.96 688.27 L97.96 686.69 L97.96 686.69 ZM109.03 686.69 L110.61 686.69 L110.61
+						 688.27 L109.03 688.27 L109.03 686.69 L109.03 686.69 ZM115.35 686.69 L121.67 686.69 L121.67 688.27 L115.35
+						 688.27 L115.35 686.69 L115.35 686.69 ZM126.42 686.69 L128 686.69 L128 688.27 L126.42 688.27 L126.42 686.69
+						 L126.42 686.69 ZM132.74 686.69 L139.07 686.69 L139.07 688.27 L132.74 688.27 L132.74 686.69 L132.74 686.69
+						 ZM143.81 686.69 L145.39 686.69 L145.39 688.27 L143.81 688.27 L143.81 686.69 L143.81 686.69 ZM150.13 686.69
+						 L156.46 686.69 L156.46 688.27 L150.13 688.27 L150.13 686.69 L150.13 686.69 ZM161.2 686.69 L162.78 686.69
+						 L162.78 688.27 L161.2 688.27 L161.2 686.69 L161.2 686.69 ZM167.53 686.69 L173.85 686.69 L173.85 688.27 L167.53
+						 688.27 L167.53 686.69 L167.53 686.69 ZM178.59 686.69 L180.17 686.69 L180.17 688.27 L178.59 688.27 L178.59
+						 686.69 L178.59 686.69 ZM184.92 686.69 L189.4 686.69 L189.4 688.27 L184.92 688.27 L184.92 686.69 L184.92
+						 686.69 ZM13.18 694.06 L0 687.41 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM187.22 680.9 L200.4 687.48 L187.22
+						 694.06 L187.22 680.9 L187.22 680.9 Z" class="st20"/>
+		</g>
+		<g id="shape80-127" v:mID="80" v:groupContext="shape" transform="translate(982.882,-643.673)">
+			<title>Sheet.80</title>
+			<path d="M0 655.13 L0 694.06 L127.01 694.06 L127.01 655.13 L0 655.13 L0 655.13 Z" class="st16"/>
+		</g>
+		<g id="shape81-129" v:mID="81" v:groupContext="shape" transform="translate(1003.39,-660.621)">
+			<title>Sheet.81</title>
+			<desc>pkt_len</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="48.6041" cy="680.956" width="97.21" height="26.1994"/>
+			<path d="M97.21 667.86 L0 667.86 L0 694.06 L97.21 694.06 L97.21 667.86" class="st8"/>
+			<text x="11.67" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>pkt_len  </text>		</g>
+		<g id="shape82-133" v:mID="82" v:groupContext="shape" transform="translate(1001.18,-634.321)">
+			<title>Sheet.82</title>
+			<desc>% segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="49.2945" cy="680.956" width="98.59" height="26.1994"/>
+			<path d="M98.59 667.86 L0 667.86 L0 694.06 L98.59 694.06 L98.59 667.86" class="st8"/>
+			<text x="9.09" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>% segsz</text>		</g>
+		<g id="shape34-137" v:mID="34" v:groupContext="shape" v:layerMember="0" transform="translate(356.703,-264.106)">
+			<title>Sheet.34</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape85-139" v:mID="85" v:groupContext="shape" v:layerMember="0" transform="translate(78.5359,-282.66)">
+			<title>Sheet.85</title>
+			<path d="M0 680.87 C-0 673.59 6.88 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.88 694.06 0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape87-141" v:mID="87" v:groupContext="shape" v:layerMember="0" transform="translate(85.4791,-284.062)">
+			<title>Sheet.87</title>
+			<desc>1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>1</text>		</g>
+		<g id="shape88-145" v:mID="88" v:groupContext="shape" v:layerMember="0" transform="translate(468.906,-282.66)">
+			<title>Sheet.88</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape90-147" v:mID="90" v:groupContext="shape" v:layerMember="0" transform="translate(474.575,-284.062)">
+			<title>Sheet.90</title>
+			<desc>2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>2</text>		</g>
+		<g id="shape95-151" v:mID="95" v:groupContext="shape" v:layerMember="0" transform="translate(764.026,-275.998)">
+			<title>Sheet.95</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape97-155" v:mID="97" v:groupContext="shape" v:layerMember="0" transform="translate(889.755,-220.915)">
+			<title>Sheet.97</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape100-157" v:mID="100" v:groupContext="shape" v:layerMember="0" transform="translate(751.857,-262.528)">
+			<title>Sheet.100</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape104-159" v:mID="104" v:groupContext="shape" v:layerMember="1" transform="translate(851.429,-218.08)">
+			<title>Sheet.104</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape105-161" v:mID="105" v:groupContext="shape" v:layerMember="0" transform="translate(885.444,-260.015)">
+			<title>Sheet.105</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape106-165" v:mID="106" v:groupContext="shape" v:layerMember="0" transform="translate(895.672,-229.419)">
+			<title>Sheet.106</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape107-169" v:mID="107" v:groupContext="shape" v:layerMember="0" transform="translate(863.297,-280.442)">
+			<title>Sheet.107</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape108-171" v:mID="108" v:groupContext="shape" v:layerMember="0" transform="translate(870.001,-281.547)">
+			<title>Sheet.108</title>
+			<desc>3</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>3</text>		</g>
+		<g id="shape109-175" v:mID="109" v:groupContext="shape" v:layerMember="0" transform="translate(500.959,-231.87)">
+			<title>Sheet.109</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 40f04a1..c7c8b17 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -56,6 +56,7 @@ Programmer's Guide
     reorder_lib
     ip_fragment_reassembly_lib
     generic_receive_offload_lib
+    generic_segmentation_offload_lib
     pdump_lib
     multi_proc_support
     kernel_nic_interface
-- 
1.9.3

* [PATCH v5 2/6] gso: add TCP/IPv4 GSO support
  2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
                         ` (11 preceding siblings ...)
  2017-09-28 22:13       ` [PATCH v5 6/6] doc: add GSO programmer's guide Mark Kavanagh
@ 2017-09-28 22:18       ` Mark Kavanagh
  12 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-09-28 22:18 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect MBUF simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.
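
As an illustration of the split described above, here is a minimal,
DPDK-independent sketch of the arithmetic for a single-segment input
packet. The struct seg_desc type and the model_gso_split() function are
hypothetical names used only for this example; they are not part of the
GSO library, and the -1 return merely stands in for the library's error
code. It assumes, as in this patch, that gso_size counts the packet
headers, so each output segment carries at most gso_size - hdr_len bytes
of payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one output GSO segment (not a DPDK type). */
struct seg_desc {
	uint16_t payload_off; /* offset of the payload slice in the input packet */
	uint16_t payload_len; /* bytes referenced by the indirect part */
	uint32_t tcp_seq;     /* sequence number for the copied TCP header */
};

/*
 * Simplified model of the split for a single-segment input packet.
 * Returns the number of segments, or -1 if out[] is too small.
 */
static int
model_gso_split(uint16_t pkt_len, uint16_t hdr_len, uint16_t gso_size,
		uint32_t first_seq, struct seg_desc *out, size_t out_cap)
{
	uint16_t unit, pos;
	uint32_t seq = first_seq;
	int n = 0;

	/* Assumption of this sketch: there is room for payload. */
	assert(gso_size > hdr_len && pkt_len > hdr_len);

	unit = (uint16_t)(gso_size - hdr_len); /* max payload per segment */
	pos = hdr_len;                         /* payload starts after headers */

	while (pos < pkt_len) {
		uint16_t len = (uint16_t)(pkt_len - pos);

		if (len > unit)
			len = unit;
		if ((size_t)n >= out_cap)
			return -1;

		out[n].payload_off = pos;
		out[n].payload_len = len;
		out[n].tcp_seq = seq;

		pos = (uint16_t)(pos + len);
		seq += len; /* next segment continues the byte stream */
		n++;
	}
	return n;
}
```

For example, a 3054-byte packet with 54 bytes of headers and a gso_size
of 1514 yields three segments carrying 1460, 1460 and 80 bytes of
payload.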

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSO'd segments are freed, the packet is freed
automatically.
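
The refcnt bookkeeping can be modelled as below. This is a simplified
sketch, assuming a single-segment input packet whose refcnt starts at 1
and exactly one indirect MBUF per output segment; model_orig_refcnt() is
a hypothetical helper, not a library function. It relies on the fact that
attaching an indirect MBUF takes one reference on the direct MBUF it
points to, and freeing the indirect MBUF gives that reference back.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model the refcnt of the original packet after it has been split into
 * nb_segs output segments, of which nb_freed have already been freed.
 * A result of 0 means the original packet has been released.
 */
static uint16_t
model_orig_refcnt(uint16_t nb_segs, uint16_t nb_freed)
{
	uint16_t refcnt = 1; /* reference held by the caller before GSO */

	assert(nb_freed <= nb_segs);

	refcnt = (uint16_t)(refcnt + nb_segs);  /* one per attached indirect MBUF */
	refcnt = (uint16_t)(refcnt - 1);        /* GSO drops the caller's reference */
	refcnt = (uint16_t)(refcnt - nb_freed); /* each freed segment detaches */

	return refcnt;
}
```

With three output segments the original packet stays alive while any
segment is outstanding and is released when the last one is freed.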

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 +++
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               | 106 ++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
 lib/librte_gso/rte_gso.c                |  52 ++++++++++-
 8 files changed, 538 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 7508be7..c414f73 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,18 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. MTU is 64KB) into small
+  ones (e.g. MTU is 1500B). Supported packet types are:
+
+  * TCP/IPv4 packets, which may include a single VLAN tag.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ struct rte_logs {
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..ee75d4c
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,153 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..8d9b94e
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates the TCP header of a packet, following
+ * segmentation. This is required to update the header's 'sent' sequence
+ * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
+ *
+ * @param pkt
+ *  The packet containing the TCP header.
+ * @param l4_offset
+ *  The offset of the TCP header from the start of the packet.
+ * @param sent_seq
+ *  The sent sequence number.
+ * @param non_tail
+ *  Indicates whether or not this is a non-tail segment.
+ */
+static inline void
+update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
+		uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l4_offset);
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+/**
+ * Internal function which updates the IPv4 header of a packet, following
+ * segmentation. This is required to update the header's 'total_length' field,
+ * to reflect the reduced length of the now-segmented packet. Furthermore, the
+ * header's 'packet_id' field must be updated to reflect the new ID of the
+ * now-segmented packet.
+ *
+ * @param pkt
+ *  The packet containing the IPv4 header.
+ * @param l3_offset
+ *  The offset of the IPv4 header from the start of the packet.
+ * @param id
+ *  The new ID of the packet.
+ */
+static inline void
+update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l3_offset);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if the MBUF pools run out of memory.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..584a77d
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,106 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+static void
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint64_t l3_offset,
+		uint8_t ipid_delta, struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t id, tail_idx, i;
+	uint16_t l4_offset;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
+			l3_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+	l4_offset = l3_offset + pkt->l3_len;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], l3_offset, id);
+		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret;
+
+	/* Don't process the fragmented packet */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1) {
+		update_ipv4_tcp_headers(pkt, pkt->l2_len, ipid_delta,
+				pkts_out, ret);
+	}
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..1c57441
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragmented packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increment applied to the IP ID of each successive segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when the function succeeds. If the memory space in
+ *  pkts_out is insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..a4fce50 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,53 @@
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
+				DEV_TX_OFFLOAD_TCP_TSO) !=
+			gso_ctx->gso_types) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
-- 
1.9.3

* Re: [PATCH v5 2/6] gso: add TCP/IPv4 GSO support
  2017-09-28 22:13       ` [PATCH v5 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
@ 2017-09-29  3:12         ` Jiayu Hu
  2017-09-29  9:05           ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Jiayu Hu @ 2017-09-29  3:12 UTC (permalink / raw)
  To: Mark Kavanagh; +Cc: dev, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas

Hi Mark,

One comment is inline.

Thanks,
Jiayu

On Thu, Sep 28, 2017 at 11:13:49PM +0100, Mark Kavanagh wrote:
> From: Jiayu Hu <jiayu.hu@intel.com>
> 
> This patch adds GSO support for TCP/IPv4 packets. Supported packets
> may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
> packets have correct checksums, and doesn't update checksums for
> output packets (the responsibility for this lies with the application).
> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> 
> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> MBUF, to organize an output packet. Note that we refer to these two
> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> header, while the indirect MBUF simply points to a location within the
> original packet's payload. Consequently, use of the GSO library requires
> multi-segment MBUF support in the TX functions of the NIC driver.
> 
> If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> result, when all of its GSO'd segments are freed, the packet is freed
> automatically.
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Tested-by: Lei Yao <lei.a.yao@intel.com>
> ---
>  doc/guides/rel_notes/release_17_11.rst  |  12 +++
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_gso/Makefile                 |   2 +
>  lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
>  lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
>  lib/librte_gso/gso_tcp4.c               | 106 ++++++++++++++++++++++
>  lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
>  lib/librte_gso/rte_gso.c                |  52 ++++++++++-
>  8 files changed, 538 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tcp4.h
> 
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index 7508be7..c414f73 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -41,6 +41,18 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =========================================================
>  
> +* **Added the Generic Segmentation Offload Library.**
> +
> +  Added the Generic Segmentation Offload (GSO) library to enable
> +  applications to split large packets (e.g. MTU is 64KB) into small
> +  ones (e.g. MTU is 1500B). Supported packet types are:
> +
> +  * TCP/IPv4 packets, which may include a single VLAN tag.
> +
> +  The GSO library doesn't check if the input packets have correct
> +  checksums, and doesn't update checksums for output packets.
> +  Additionally, the GSO library doesn't process IP fragmented packets.
> +
>  
>  Resolved Issues
>  ---------------
> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> index ec8dba7..2fa1199 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -87,6 +87,7 @@ struct rte_logs {
>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
>  
>  /* these log types can be used in an application */
>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index aeaacbc..2be64d1 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -42,6 +42,8 @@ LIBABIVER := 1
>  
>  #source files
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
>  
>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> new file mode 100644
> index 0000000..ee75d4c
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.c
> @@ -0,0 +1,153 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdbool.h>
> +#include <errno.h>
> +
> +#include <rte_memcpy.h>
> +#include <rte_mempool.h>
> +
> +#include "gso_common.h"
> +
> +static inline void
> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset)
> +{
> +	/* Copy MBUF metadata */
> +	hdr_segment->nb_segs = 1;
> +	hdr_segment->port = pkt->port;
> +	hdr_segment->ol_flags = pkt->ol_flags;
> +	hdr_segment->packet_type = pkt->packet_type;
> +	hdr_segment->pkt_len = pkt_hdr_offset;
> +	hdr_segment->data_len = pkt_hdr_offset;
> +	hdr_segment->tx_offload = pkt->tx_offload;
> +
> +	/* Copy the packet header */
> +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> +			rte_pktmbuf_mtod(pkt, char *),
> +			pkt_hdr_offset);
> +}
> +
> +static inline void
> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < nb_pkts; i++)
> +		rte_pktmbuf_free(pkts[i]);
> +}
> +
> +int
> +gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct rte_mbuf *pkt_in;
> +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
> +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
> +	uint16_t pyld_len, nb_segs;
> +	bool more_in_pkt, more_out_segs;
> +
> +	pkt_in = pkt;
> +	nb_segs = 0;
> +	more_in_pkt = 1;
> +	pkt_in_data_pos = pkt_hdr_offset;
> +
> +	while (more_in_pkt) {
> +		if (unlikely(nb_segs >= nb_pkts_out)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -EINVAL;
> +		}
> +
> +		/* Allocate a direct MBUF */
> +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
> +		if (unlikely(hdr_segment == NULL)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -ENOMEM;
> +		}
> +		/* Fill the packet header */
> +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> +
> +		prev_segment = hdr_segment;
> +		segment_bytes_remaining = pyld_unit_size;
> +		more_out_segs = 1;
> +
> +		while (more_out_segs && more_in_pkt) {
> +			/* Allocate an indirect MBUF */
> +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> +			if (unlikely(pyld_segment == NULL)) {
> +				rte_pktmbuf_free(hdr_segment);
> +				free_gso_segment(pkts_out, nb_segs);
> +				return -ENOMEM;
> +			}
> +			/* Attach to current MBUF segment of pkt */
> +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> +
> +			prev_segment->next = pyld_segment;
> +			prev_segment = pyld_segment;
> +
> +			pyld_len = segment_bytes_remaining;
> +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
> +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
> +
> +			pyld_segment->data_off = pkt_in_data_pos +
> +				pkt_in->data_off;
> +			pyld_segment->data_len = pyld_len;
> +
> +			/* Update header segment */
> +			hdr_segment->pkt_len += pyld_len;
> +			hdr_segment->nb_segs++;
> +
> +			pkt_in_data_pos += pyld_len;
> +			segment_bytes_remaining -= pyld_len;
> +
> +			/* Finish processing a MBUF segment of pkt */
> +			if (pkt_in_data_pos == pkt_in->data_len) {
> +				pkt_in = pkt_in->next;
> +				pkt_in_data_pos = 0;
> +				if (pkt_in == NULL)
> +					more_in_pkt = 0;
> +			}
> +
> +			/* Finish generating a GSO segment */
> +			if (segment_bytes_remaining == 0)
> +				more_out_segs = 0;
> +		}
> +		pkts_out[nb_segs++] = hdr_segment;
> +	}
> +	return nb_segs;
> +}
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> new file mode 100644
> index 0000000..8d9b94e
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.h
> @@ -0,0 +1,141 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_COMMON_H_
> +#define _GSO_COMMON_H_
> +
> +#include <stdint.h>
> +
> +#include <rte_mbuf.h>
> +#include <rte_ip.h>
> +#include <rte_tcp.h>
> +
> +#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
> +		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
> +
> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> +
> +#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
> +
> +/**
> + * Internal function which updates the TCP header of a packet, following
> + * segmentation. This is required to update the header's 'sent' sequence
> + * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
> + *
> + * @param pkt
> + *  The packet containing the TCP header.
> + * @param l4_offset
> + *  The offset of the TCP header from the start of the packet.
> + * @param sent_seq
> + *  The sent sequence number.
> + * @param non-tail
> + *  Indicates whether or not this is a tail segment.
> + */
> +static inline void
> +update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
> +		uint8_t non_tail)
> +{
> +	struct tcp_hdr *tcp_hdr;
> +
> +	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			l4_offset);
> +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> +	if (likely(non_tail))
> +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
> +					TCP_HDR_FIN_MASK));
> +}
> +
> +/**
> + * Internal function which updates the IPv4 header of a packet, following
> + * segmentation. This is required to update the header's 'total_length' field,
> + * to reflect the reduced length of the now-segmented packet. Furthermore, the
> + * header's 'packet_id' field must be updated to reflect the new ID of the
> + * now-segmented packet.
> + *
> + * @param pkt
> + *  The packet containing the IPv4 header.
> + * @param l3_offset
> + *  The offset of the IPv4 header from the start of the packet.
> + * @param id
> + *  The new ID of the packet.
> +  */
> +static inline void
> +update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			l3_offset);
> +	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
> +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> +}
> +
> +/**
> + * Internal function which divides the input packet into small segments.
> + * Each of the newly-created segments is organized as a two-segment MBUF,
> + * where the first segment is a standard mbuf, which stores a copy of
> + * packet header, and the second is an indirect mbuf which points to a
> + * section of data in the input packet.
> + *
> + * @param pkt
> + *  Packet to segment.
> + * @param pkt_hdr_offset
> + *  Packet header offset, measured in bytes.
> + * @param pyld_unit_size
> + *  The max payload length of a GSO segment.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to keep the mbuf addresses of output segments. If
> + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
> + *  and returns -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of segments created in the event of success.
> + *  - Return -ENOMEM if run out of memory in MBUF pools.
> + *  - Return -EINVAL for invalid parameters.
> + */
> +int gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
> new file mode 100644
> index 0000000..584a77d
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.c
> @@ -0,0 +1,106 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
> +
> +static void
> +update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint64_t l3_offset,
> +		uint8_t ipid_delta, struct rte_mbuf **segs, uint16_t nb_segs)
> +{

There is no need to pass "uint64_t l3_offset" as a parameter. We can obtain
l3_offset directly inside update_ipv4_tcp_headers(), from pkt->l2_len.

Additionally, the type of l3_offset should be "uint16_t" rather than
"uint64_t". If we change the prototype of update_ipv4_tcp_headers() as
suggested, though, this issue no longer needs a separate fix.
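
A rough sketch of the suggested prototype, only to illustrate the idea. It
assumes a hypothetical stand-in (struct mock_mbuf) for the few rte_mbuf fields
involved, and hypothetical byte-order helpers (rd_be16() and friends) in place
of the rte_*_to_* macros, so that it builds without DPDK headers. None of these
names are part of the patch. The PSH and FIN values are the ones defined in
gso_common.h.

```c
#include <stdint.h>

/* Hypothetical stand-in for the rte_mbuf fields used by the helper. */
struct mock_mbuf {
	uint8_t *data;     /* start of packet data (rte_pktmbuf_mtod) */
	uint16_t l2_len;   /* L2 header length */
	uint16_t l3_len;   /* L3 header length */
	uint32_t pkt_len;  /* total length of the (chained) packet */
	uint16_t data_len; /* length of the first (header) segment */
};

/* Hypothetical big-endian accessors, replacing rte_be_to_cpu_*(). */
static uint16_t rd_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static void wr_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static uint32_t rd_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void wr_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/*
 * Same job as update_ipv4_tcp_headers() in the patch, but without the
 * l3_offset parameter: the offset is derived from pkt->l2_len and kept
 * in a uint16_t local.
 */
static void
update_ipv4_tcp_headers_sketch(const struct mock_mbuf *pkt,
		uint8_t ipid_delta, struct mock_mbuf **segs, uint16_t nb_segs)
{
	uint16_t l3_offset = pkt->l2_len;
	uint16_t l4_offset = (uint16_t)(l3_offset + pkt->l3_len);
	/* IPv4 packet_id is at byte 4; TCP sequence number is at byte 4. */
	uint16_t id = rd_be16(pkt->data + l3_offset + 4);
	uint32_t sent_seq = rd_be32(pkt->data + l4_offset + 4);
	uint16_t i;

	for (i = 0; i < nb_segs; i++) {
		uint8_t *ip = segs[i]->data + l3_offset;
		uint8_t *tcp = segs[i]->data + l4_offset;

		/* IPv4 total_length (byte 2) and packet_id (byte 4). */
		wr_be16(ip + 2, (uint16_t)(segs[i]->pkt_len - l3_offset));
		wr_be16(ip + 4, id);
		wr_be32(tcp + 4, sent_seq);
		/* Clear PSH (0x08) and FIN (0x01) on non-tail segments. */
		if (i + 1 < nb_segs)
			tcp[13] &= (uint8_t)~(0x08 | 0x01);

		id = (uint16_t)(id + ipid_delta);
		sent_seq += segs[i]->pkt_len - segs[i]->data_len;
	}
}
```

With this shape, the call in gso_tcp4_segment() would drop its second argument
and pass only pkt, ipid_delta, pkts_out and ret.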

> +	struct ipv4_hdr *ipv4_hdr;
> +	struct tcp_hdr *tcp_hdr;
> +	uint32_t sent_seq;
> +	uint16_t id, tail_idx, i;
> +	uint16_t l4_offset;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
> +			l3_offset);
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +	tail_idx = nb_segs - 1;
> +	l4_offset = l3_offset + pkt->l3_len;
> +
> +	for (i = 0; i < nb_segs; i++) {
> +		update_ipv4_header(segs[i], l3_offset, id);
> +		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
> +		id += ipid_delta;
> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> +	}
> +}
> +
> +int
> +gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	uint16_t tcp_dl;
> +	uint16_t pyld_unit_size, hdr_offset;
> +	uint16_t frag_off;
> +	int ret;
> +
> +	/* Don't process the fragmented packet */
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->l2_len);
> +	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	/* Don't process the packet without data */
> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
> +	if (unlikely(tcp_dl == 0)) {
> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> +	pyld_unit_size = gso_size - hdr_offset;
> +
> +	/* Segment the payload */
> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> +			indirect_pool, pkts_out, nb_pkts_out);
> +	if (ret > 1) {
> +		update_ipv4_tcp_headers(pkt, pkt->l2_len, ipid_delta,
> +				pkts_out, ret);
> +	}
> +
> +	return ret;
> +}
> diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
> new file mode 100644
> index 0000000..1c57441
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.h
> @@ -0,0 +1,74 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_TCP4_H_
> +#define _GSO_TCP4_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * Segment an IPv4/TCP packet. This function doesn't check if the input
> + * packet has correct checksums, and doesn't update checksums for output
> + * GSO segments. Furthermore, it doesn't process IP fragment packets.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param gso_size
> + *  The max length of a GSO segment, measured in bytes.
> + * @param ipid_delta
> + *  The increasing unit of IP ids.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when the function succeeds. If the memory space in
> + *  pkts_out is insufficient, it fails and returns -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that 'pkts_out' can keep.
> + *
> + * @return
> + *   - The number of GSO segments filled in pkts_out on success.
> + *   - Return -ENOMEM if run out of memory in MBUF pools.
> + *   - Return -EINVAL for invalid parameters.
> + */
> +int gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ip_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index b773636..a4fce50 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -33,7 +33,12 @@
>  
>  #include <errno.h>
>  
> +#include <rte_log.h>
> +#include <rte_ethdev.h>
> +
>  #include "rte_gso.h"
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
>  
>  int
>  rte_gso_segment(struct rte_mbuf *pkt,
> @@ -41,12 +46,53 @@
>  		struct rte_mbuf **pkts_out,
>  		uint16_t nb_pkts_out)
>  {
> +	struct rte_mempool *direct_pool, *indirect_pool;
> +	struct rte_mbuf *pkt_seg;
> +	uint64_t ol_flags;
> +	uint16_t gso_size;
> +	uint8_t ipid_delta;
> +	int ret = 1;
> +
>  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>  			nb_pkts_out < 1)
>  		return -EINVAL;
>  
> -	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> -	pkts_out[0] = pkt;
> +	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
> +				DEV_TX_OFFLOAD_TCP_TSO) !=
> +			gso_ctx->gso_types) {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	direct_pool = gso_ctx->direct_pool;
> +	indirect_pool = gso_ctx->indirect_pool;
> +	gso_size = gso_ctx->gso_size;
> +	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
> +	ol_flags = pkt->ol_flags;
> +
> +	if (IS_IPV4_TCP(pkt->ol_flags)) {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> +				direct_pool, indirect_pool,
> +				pkts_out, nb_pkts_out);
> +	} else {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkts_out[0] = pkt;
> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> +		return 1;
> +	}
> +
> +	if (ret > 1) {
> +		pkt_seg = pkt;
> +		while (pkt_seg) {
> +			rte_mbuf_refcnt_update(pkt_seg, -1);
> +			pkt_seg = pkt_seg->next;
> +		}
> +	} else if (ret < 0) {
> +		/* Revert the ol_flags in the event of failure. */
> +		pkt->ol_flags = ol_flags;
> +	}
>  
> -	return 1;
> +	return ret;
>  }
> -- 
> 1.9.3


* Re: [PATCH v5 2/6] gso: add TCP/IPv4 GSO support
  2017-09-29  3:12         ` Jiayu Hu
@ 2017-09-29  9:05           ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-09-29  9:05 UTC (permalink / raw)
  To: Hu, Jiayu; +Cc: dev, Tan, Jianfeng, Ananyev, Konstantin, Yigit, Ferruh, thomas

Thanks for your comments Jiayu - please find responses inline.

Thanks,
Mark 

From: Hu, Jiayu
>Sent: Friday, September 29, 2017 4:13 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>
>Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin
><konstantin.ananyev@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
>thomas@monjalon.net
>Subject: Re: [PATCH v5 2/6] gso: add TCP/IPv4 GSO support
>
>Hi Mark,
>
>One comment is inline.
>
>Thanks,
>Jiayu
>
>On Thu, Sep 28, 2017 at 11:13:49PM +0100, Mark Kavanagh wrote:
>> From: Jiayu Hu <jiayu.hu@intel.com>
>>
>> This patch adds GSO support for TCP/IPv4 packets. Supported packets
>> may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
>> packets have correct checksums, and doesn't update checksums for
>> output packets (the responsibility for this lies with the application).
>> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
>>
>> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
>> MBUF, to organize an output packet. Note that we refer to these two
>> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
>> header, while the indirect mbuf simply points to a location within the
>> original packet's payload. Consequently, use of the GSO library requires
>> multi-segment MBUF support in the TX functions of the NIC driver.
>>
>> If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
>> result, when all of its GSOed segments are freed, the packet is freed
>> automatically.
>>
>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> Tested-by: Lei Yao <lei.a.yao@intel.com>
>> ---
>>  doc/guides/rel_notes/release_17_11.rst  |  12 +++
>>  lib/librte_eal/common/include/rte_log.h |   1 +
>>  lib/librte_gso/Makefile                 |   2 +
>>  lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_tcp4.c               | 106 ++++++++++++++++++++++
>>  lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
>>  lib/librte_gso/rte_gso.c                |  52 ++++++++++-
>>  8 files changed, 538 insertions(+), 3 deletions(-)
>>  create mode 100644 lib/librte_gso/gso_common.c
>>  create mode 100644 lib/librte_gso/gso_common.h
>>  create mode 100644 lib/librte_gso/gso_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tcp4.h
>>
>> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
>> index 7508be7..c414f73 100644
>> --- a/doc/guides/rel_notes/release_17_11.rst
>> +++ b/doc/guides/rel_notes/release_17_11.rst
>> @@ -41,6 +41,18 @@ New Features
>>       Also, make sure to start the actual text at the margin.
>>       =========================================================
>>
>> +* **Added the Generic Segmentation Offload Library.**
>> +
>> +  Added the Generic Segmentation Offload (GSO) library to enable
>> +  applications to split large packets (e.g. MTU is 64KB) into small
>> +  ones (e.g. MTU is 1500B). Supported packet types are:
>> +
>> +  * TCP/IPv4 packets, which may include a single VLAN tag.
>> +
>> +  The GSO library doesn't check if the input packets have correct
>> +  checksums, and doesn't update checksums for output packets.
>> +  Additionally, the GSO library doesn't process IP fragmented packets.
>> +
>>
>>  Resolved Issues
>>  ---------------
>> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
>> index ec8dba7..2fa1199 100644
>> --- a/lib/librte_eal/common/include/rte_log.h
>> +++ b/lib/librte_eal/common/include/rte_log.h
>> @@ -87,6 +87,7 @@ struct rte_logs {
>>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
>> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
>>
>>  /* these log types can be used in an application */
>>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
>> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
>> index aeaacbc..2be64d1 100644
>> --- a/lib/librte_gso/Makefile
>> +++ b/lib/librte_gso/Makefile
>> @@ -42,6 +42,8 @@ LIBABIVER := 1
>>
>>  #source files
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
>>
>>  # install this header file
>>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
>> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
>> new file mode 100644
>> index 0000000..ee75d4c
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_common.c
>> @@ -0,0 +1,153 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include <stdbool.h>
>> +#include <errno.h>
>> +
>> +#include <rte_memcpy.h>
>> +#include <rte_mempool.h>
>> +
>> +#include "gso_common.h"
>> +
>> +static inline void
>> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
>> +		uint16_t pkt_hdr_offset)
>> +{
>> +	/* Copy MBUF metadata */
>> +	hdr_segment->nb_segs = 1;
>> +	hdr_segment->port = pkt->port;
>> +	hdr_segment->ol_flags = pkt->ol_flags;
>> +	hdr_segment->packet_type = pkt->packet_type;
>> +	hdr_segment->pkt_len = pkt_hdr_offset;
>> +	hdr_segment->data_len = pkt_hdr_offset;
>> +	hdr_segment->tx_offload = pkt->tx_offload;
>> +
>> +	/* Copy the packet header */
>> +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
>> +			rte_pktmbuf_mtod(pkt, char *),
>> +			pkt_hdr_offset);
>> +}
>> +
>> +static inline void
>> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
>> +{
>> +	uint16_t i;
>> +
>> +	for (i = 0; i < nb_pkts; i++)
>> +		rte_pktmbuf_free(pkts[i]);
>> +}
>> +
>> +int
>> +gso_do_segment(struct rte_mbuf *pkt,
>> +		uint16_t pkt_hdr_offset,
>> +		uint16_t pyld_unit_size,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	struct rte_mbuf *pkt_in;
>> +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
>> +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
>> +	uint16_t pyld_len, nb_segs;
>> +	bool more_in_pkt, more_out_segs;
>> +
>> +	pkt_in = pkt;
>> +	nb_segs = 0;
>> +	more_in_pkt = 1;
>> +	pkt_in_data_pos = pkt_hdr_offset;
>> +
>> +	while (more_in_pkt) {
>> +		if (unlikely(nb_segs >= nb_pkts_out)) {
>> +			free_gso_segment(pkts_out, nb_segs);
>> +			return -EINVAL;
>> +		}
>> +
>> +		/* Allocate a direct MBUF */
>> +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
>> +		if (unlikely(hdr_segment == NULL)) {
>> +			free_gso_segment(pkts_out, nb_segs);
>> +			return -ENOMEM;
>> +		}
>> +		/* Fill the packet header */
>> +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
>> +
>> +		prev_segment = hdr_segment;
>> +		segment_bytes_remaining = pyld_unit_size;
>> +		more_out_segs = 1;
>> +
>> +		while (more_out_segs && more_in_pkt) {
>> +			/* Allocate an indirect MBUF */
>> +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
>> +			if (unlikely(pyld_segment == NULL)) {
>> +				rte_pktmbuf_free(hdr_segment);
>> +				free_gso_segment(pkts_out, nb_segs);
>> +				return -ENOMEM;
>> +			}
>> +			/* Attach to current MBUF segment of pkt */
>> +			rte_pktmbuf_attach(pyld_segment, pkt_in);
>> +
>> +			prev_segment->next = pyld_segment;
>> +			prev_segment = pyld_segment;
>> +
>> +			pyld_len = segment_bytes_remaining;
>> +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
>> +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
>> +
>> +			pyld_segment->data_off = pkt_in_data_pos +
>> +				pkt_in->data_off;
>> +			pyld_segment->data_len = pyld_len;
>> +
>> +			/* Update header segment */
>> +			hdr_segment->pkt_len += pyld_len;
>> +			hdr_segment->nb_segs++;
>> +
>> +			pkt_in_data_pos += pyld_len;
>> +			segment_bytes_remaining -= pyld_len;
>> +
>> +			/* Finish processing a MBUF segment of pkt */
>> +			if (pkt_in_data_pos == pkt_in->data_len) {
>> +				pkt_in = pkt_in->next;
>> +				pkt_in_data_pos = 0;
>> +				if (pkt_in == NULL)
>> +					more_in_pkt = 0;
>> +			}
>> +
>> +			/* Finish generating a GSO segment */
>> +			if (segment_bytes_remaining == 0)
>> +				more_out_segs = 0;
>> +		}
>> +		pkts_out[nb_segs++] = hdr_segment;
>> +	}
>> +	return nb_segs;
>> +}
>> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
>> new file mode 100644
>> index 0000000..8d9b94e
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_common.h
>> @@ -0,0 +1,141 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _GSO_COMMON_H_
>> +#define _GSO_COMMON_H_
>> +
>> +#include <stdint.h>
>> +
>> +#include <rte_mbuf.h>
>> +#include <rte_ip.h>
>> +#include <rte_tcp.h>
>> +
>> +#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
>> +		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
>> +
>> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
>> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
>> +
>> +#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
>> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
>> +
>> +/**
>> + * Internal function which updates the TCP header of a packet, following
>> + * segmentation. This is required to update the header's 'sent' sequence
>> + * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
>> + *
>> + * @param pkt
>> + *  The packet containing the TCP header.
>> + * @param l4_offset
>> + *  The offset of the TCP header from the start of the packet.
>> + * @param sent_seq
>> + *  The sent sequence number.
>> + * @param non-tail
>> + *  Indicates whether or not this is a tail segment.
>> + */
>> +static inline void
>> +update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
>> +		uint8_t non_tail)
>> +{
>> +	struct tcp_hdr *tcp_hdr;
>> +
>> +	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			l4_offset);
>> +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
>> +	if (likely(non_tail))
>> +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
>> +					TCP_HDR_FIN_MASK));
>> +}
>> +
>> +/**
>> + * Internal function which updates the IPv4 header of a packet, following
>> + * segmentation. This is required to update the header's 'total_length' field,
>> + * to reflect the reduced length of the now-segmented packet. Furthermore, the
>> + * header's 'packet_id' field must be updated to reflect the new ID of the
>> + * now-segmented packet.
>> + *
>> + * @param pkt
>> + *  The packet containing the IPv4 header.
>> + * @param l3_offset
>> + *  The offset of the IPv4 header from the start of the packet.
>> + * @param id
>> + *  The new ID of the packet.
>> +  */
>> +static inline void
>> +update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
>> +{
>> +	struct ipv4_hdr *ipv4_hdr;
>> +
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			l3_offset);
>> +	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
>> +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
>> +}
>> +
>> +/**
>> + * Internal function which divides the input packet into small segments.
>> + * Each of the newly-created segments is organized as a two-segment MBUF,
>> + * where the first segment is a standard mbuf, which stores a copy of
>> + * packet header, and the second is an indirect mbuf which points to a
>> + * section of data in the input packet.
>> + *
>> + * @param pkt
>> + *  Packet to segment.
>> + * @param pkt_hdr_offset
>> + *  Packet header offset, measured in bytes.
>> + * @param pyld_unit_size
>> + *  The max payload length of a GSO segment.
>> + * @param direct_pool
>> + *  MBUF pool used for allocating direct buffers for output segments.
>> + * @param indirect_pool
>> + *  MBUF pool used for allocating indirect buffers for output segments.
>> + * @param pkts_out
>> + *  Pointer array used to keep the mbuf addresses of output segments. If
>> + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
>> + *  and returns -EINVAL.
>> + * @param nb_pkts_out
>> + *  The max number of items that pkts_out can keep.
>> + *
>> + * @return
>> + *  - The number of segments created in the event of success.
>> + *  - Return -ENOMEM if run out of memory in MBUF pools.
>> + *  - Return -EINVAL for invalid parameters.
>> + */
>> +int gso_do_segment(struct rte_mbuf *pkt,
>> +		uint16_t pkt_hdr_offset,
>> +		uint16_t pyld_unit_size,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#endif
>> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
>> new file mode 100644
>> index 0000000..584a77d
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tcp4.c
>> @@ -0,0 +1,106 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include "gso_common.h"
>> +#include "gso_tcp4.h"
>> +
>> +static void
>> +update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint64_t l3_offset,
>> +		uint8_t ipid_delta, struct rte_mbuf **segs, uint16_t nb_segs)
>> +{
>
>No need to add "uint64_t l3_offset" as one parameter. We can directly get
>l3_offset inside update_ipv4_tcp_headers().

Yes, of course - thanks Jiayu.

>
>Additionally, the type of l3_offset should be "uint16_t" instead of
>"uint64_t".

That is a rebase artifact - please disregard.

>But if we change the prototype of update_ipv4_tcp_headers(), this issue
>doesn't need to be fixed.

Agreed.
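
For reference, a minimal sketch of how the helper might look once the
l3_offset parameter is dropped and the offset is derived from the packet
itself. This is not the code from the patch: 'struct seg' and the
put_be*/get_be* helpers are hypothetical stand-ins for struct rte_mbuf and
the rte_*_be_* byte-order functions, and the header field positions assume
an IPv4 header followed directly by a TCP header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TCP_PSH_FLAG 0x08
#define TCP_FIN_FLAG 0x01

/* Hypothetical stand-in for struct rte_mbuf: header bytes plus lengths. */
struct seg {
	uint8_t hdr[64];    /* copy of the L2/L3/L4 headers */
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t l4_len;
	uint16_t pyld_len;  /* payload bytes carried by this segment */
};

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		((uint32_t)p[2] << 8) | p[3];
}

static void
update_ipv4_tcp_headers(const struct seg *pkt, uint8_t ipid_delta,
		struct seg *segs, uint16_t nb_segs)
{
	/* Derived from the packet instead of being passed in. */
	uint16_t l3_offset = pkt->l2_len;
	uint16_t l4_offset = (uint16_t)(l3_offset + pkt->l3_len);
	uint16_t id = get_be16(pkt->hdr + l3_offset + 4);
	uint32_t sent_seq = get_be32(pkt->hdr + l4_offset + 4);
	uint16_t i;

	for (i = 0; i < nb_segs; i++) {
		uint8_t *ip = segs[i].hdr + l3_offset;
		uint8_t *tcp = segs[i].hdr + l4_offset;

		/* IPv4 total_length (bytes 2-3) and packet_id (bytes 4-5). */
		put_be16(ip + 2, (uint16_t)(segs[i].l3_len +
				segs[i].l4_len + segs[i].pyld_len));
		put_be16(ip + 4, id);
		/* TCP sequence number (bytes 4-7) and flags (byte 13). */
		put_be32(tcp + 4, sent_seq);
		if (i + 1 < nb_segs)
			tcp[13] &= (uint8_t)~(TCP_PSH_FLAG | TCP_FIN_FLAG);

		id = (uint16_t)(id + ipid_delta);
		sent_seq += segs[i].pyld_len;
	}
}
```

The caller then only passes the original packet, the IP ID increment, and
the output segments; no offset needs to be computed at the call site.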

>
>> +	struct ipv4_hdr *ipv4_hdr;
>> +	struct tcp_hdr *tcp_hdr;
>> +	uint32_t sent_seq;
>> +	uint16_t id, tail_idx, i;
>> +	uint16_t l4_offset;
>> +
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
>> +			l3_offset);
>> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
>> +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
>> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
>> +	tail_idx = nb_segs - 1;
>> +	l4_offset = l3_offset + pkt->l3_len;
>> +
>> +	for (i = 0; i < nb_segs; i++) {
>> +		update_ipv4_header(segs[i], l3_offset, id);
>> +		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
>> +		id += ipid_delta;
>> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
>> +	}
>> +}
>> +
>> +int
>> +gso_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ipid_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	struct ipv4_hdr *ipv4_hdr;
>> +	uint16_t tcp_dl;
>> +	uint16_t pyld_unit_size, hdr_offset;
>> +	uint16_t frag_off;
>> +	int ret;
>> +
>> +	/* Don't process the fragmented packet */
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			pkt->l2_len);
>> +	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
>> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	/* Don't process the packet without data */
>> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
>> +	if (unlikely(tcp_dl == 0)) {
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> +	pyld_unit_size = gso_size - hdr_offset;
>> +
>> +	/* Segment the payload */
>> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
>> +			indirect_pool, pkts_out, nb_pkts_out);
>> +	if (ret > 1) {
>> +		update_ipv4_tcp_headers(pkt, pkt->l2_len, ipid_delta,
>> +				pkts_out, ret);
>> +	}
>> +
>> +	return ret;
>> +}
>> diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
>> new file mode 100644
>> index 0000000..1c57441
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tcp4.h
>> @@ -0,0 +1,74 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _GSO_TCP4_H_
>> +#define _GSO_TCP4_H_
>> +
>> +#include <stdint.h>
>> +#include <rte_mbuf.h>
>> +
>> +/**
>> + * Segment an IPv4/TCP packet. This function doesn't check if the input
>> + * packet has correct checksums, and doesn't update checksums for output
>> + * GSO segments. Furthermore, it doesn't process IP fragment packets.
>> + *
>> + * @param pkt
>> + *  The packet mbuf to segment.
>> + * @param gso_size
>> + *  The max length of a GSO segment, measured in bytes.
>> + * @param ipid_delta
>> + *  The increasing unit of IP ids.
>> + * @param direct_pool
>> + *  MBUF pool used for allocating direct buffers for output segments.
>> + * @param indirect_pool
>> + *  MBUF pool used for allocating indirect buffers for output segments.
>> + * @param pkts_out
>> + *  Pointer array used to store the MBUF addresses of output GSO
>> + *  segments, when the function succeeds. If the memory space in
>> + *  pkts_out is insufficient, it fails and returns -EINVAL.
>> + * @param nb_pkts_out
>> + *  The max number of items that 'pkts_out' can keep.
>> + *
>> + * @return
>> + *   - The number of GSO segments filled in pkts_out on success.
>> + *   - Return -ENOMEM if run out of memory in MBUF pools.
>> + *   - Return -EINVAL for invalid parameters.
>> + */
>> +int gso_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ip_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#endif
>> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> index b773636..a4fce50 100644
>> --- a/lib/librte_gso/rte_gso.c
>> +++ b/lib/librte_gso/rte_gso.c
>> @@ -33,7 +33,12 @@
>>
>>  #include <errno.h>
>>
>> +#include <rte_log.h>
>> +#include <rte_ethdev.h>
>> +
>>  #include "rte_gso.h"
>> +#include "gso_common.h"
>> +#include "gso_tcp4.h"
>>
>>  int
>>  rte_gso_segment(struct rte_mbuf *pkt,
>> @@ -41,12 +46,53 @@
>>  		struct rte_mbuf **pkts_out,
>>  		uint16_t nb_pkts_out)
>>  {
>> +	struct rte_mempool *direct_pool, *indirect_pool;
>> +	struct rte_mbuf *pkt_seg;
>> +	uint64_t ol_flags;
>> +	uint16_t gso_size;
>> +	uint8_t ipid_delta;
>> +	int ret = 1;
>> +
>>  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>>  			nb_pkts_out < 1)
>>  		return -EINVAL;
>>
>> -	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> -	pkts_out[0] = pkt;
>> +	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>> +				DEV_TX_OFFLOAD_TCP_TSO) !=
>> +			gso_ctx->gso_types) {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	direct_pool = gso_ctx->direct_pool;
>> +	indirect_pool = gso_ctx->indirect_pool;
>> +	gso_size = gso_ctx->gso_size;
>> +	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>> +	ol_flags = pkt->ol_flags;
>> +
>> +	if (IS_IPV4_TCP(pkt->ol_flags)) {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>> +				direct_pool, indirect_pool,
>> +				pkts_out, nb_pkts_out);
>> +	} else {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +		pkts_out[0] = pkt;
>> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
>> +		return 1;
>> +	}
>> +
>> +	if (ret > 1) {
>> +		pkt_seg = pkt;
>> +		while (pkt_seg) {
>> +			rte_mbuf_refcnt_update(pkt_seg, -1);
>> +			pkt_seg = pkt_seg->next;
>> +		}
>> +	} else if (ret < 0) {
>> +		/* Revert the ol_flags in the event of failure. */
>> +		pkt->ol_flags = ol_flags;
>> +	}
>>
>> -	return 1;
>> +	return ret;
>>  }
>> --
>> 1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* [PATCH v6 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-09-28 22:13       ` [PATCH v5 0/6] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Mark Kavanagh
@ 2017-10-02 16:45         ` Mark Kavanagh
  2017-10-02 16:45           ` [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
                             ` (6 more replies)
  0 siblings, 7 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-02 16:45 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch series adds GSO support to DPDK for
specific packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine. The final patch
in the series adds GSO documentation to the programmer's guide.

Performance Testing
===================
The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. Assign P1 to the Linux kernel, with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
    to 1518B. Run two iperf-client in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
    two iperf-client in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.

Throughput of the above three tests:

test-1: 9.4Gbps
test-2: 9.5Gbps
test-3: 3Mbps
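
To give a sense of how much segmentation work test-1 implies, the sketch
below estimates how many GSO segments one large packet produces. It assumes
the per-segment payload is the GSO segment length minus the header length
(as in the TCP/IPv4 path of this series), and that the headers are a plain
14B Ethernet + 20B IPv4 + 20B TCP with no VLAN tag or options. The function
names are illustrative only and are not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Payload bytes per output segment: segment length minus header bytes. */
static uint16_t
pyld_unit_size(uint16_t gso_size, uint16_t hdr_len)
{
	return (uint16_t)(gso_size - hdr_len);
}

/*
 * Number of output segments needed to carry payload_len bytes.
 * Returns 0 if gso_size leaves no room for payload.
 */
static uint32_t
nb_gso_segs(uint32_t payload_len, uint16_t gso_size, uint16_t hdr_len)
{
	uint32_t unit;

	if (gso_size <= hdr_len)
		return 0;
	unit = pyld_unit_size(gso_size, hdr_len);
	/* Round up: the last segment may be shorter than the others. */
	return (payload_len + unit - 1) / unit;
}
```

With a 1518B segment length and 54B of headers, each segment carries 1464B
of payload, so a packet with 64000B of TCP payload is split into 44
segments.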

Functional Testing
==================
Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
length of tunneled packets from VMs is 1514B. So the current experimental
method can't be used to measure VxLAN and GRE GSO performance; instead, we
simply test functionality by setting a small GSO segment length (e.g. 500B).

VxLAN
-----
To test VxLAN GSO functionality, we use the following setup:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
   engine with "retry".
c. Testpmd commands:
    - csum parse_tunnel on "P0"
    - csum parse_tunnel on "vhost-user port"
    - csum set outer-ip hw "P0"
    - csum set ip hw "P0"
    - csum set tcp hw "P0"
    - csum set tcp hw "vhost-user port"
    - set port "P0" gso on
    - set gso segsz 500
d. Launch a VM with csum and tso offloading enabled.
e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
   on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
   max packet length is 1514B.
f. P1 is assigned to the Linux kernel, with kernel GRO disabled. Similarly,
   create a VxLAN port for P1, and run iperf-server on the VxLAN port.

In testpmd, we can see that the length of all packets sent from P0 is
smaller than or equal to 500B. Additionally, the packets arriving at P1 are
encapsulated and are smaller than or equal to 500B.

GRE
---
The same process may be used to test GRE functionality, with the exception that
the tunnel type created for both the guest's virtio-net, and the host's kernel
interfaces is GRE:
   `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`

As in the VxLAN testcase, the length of packets sent from P0, and received on
P1, is less than or equal to 500B.

Change log
==========
v6:
- rebase to HEAD of master (i5dce9fcA)
- remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'

v5:
- add GSO section to the programmer's guide.
- use MF or (previously 'and') offset to check if a packet is IP
  fragmented.
- move 'update_header' helper functions to gso_common.h.
- move tcp/ipv4 'update_header' function to gso_tcp4.c.
- move tunnel 'update_header' function to gso_tunnel_tcp4.c.
- add offset parameter to 'update_header' functions.
- combine GRE and VxLAN tunnel header update functions into a single
  function.
- correct typos and errors in comments/commit messages.

v4:
- use ol_flags instead of packet_type to decide which segmentation
  function to use.
- use MF and offset to check if a packet is IP fragmented, instead of
  using DF.
- remove ETHER_CRC_LEN from gso segment payload length calculation.
- refactor internal header update and other functions.
- remove RTE_GSO_IPID_INCREASE.
- add some GSO documentation.
- set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
  packets sent from GSO-enabled ports in testpmd.
v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
  UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.
- change the definition of gso_types in struct rte_gso_ctx.
- replace rte_pktmbuf_detach() with rte_pktmbuf_free().
- refactor gso_update_pkt_headers().
- change the return value of rte_gso_segment().
- remove parameter checks in rte_gso_segment().
- use rte_net_get_ptype() in app/test-pmd/csumonly.c to fill
  mbuf->packet_type.
- add a new GSO command in testpmd to show GSO configuration for ports.
- misc: fix typo and optimize function description.


Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (3):
  gso: add VxLAN GSO support
  gso: add GRE GSO support
  doc: add GSO programmer's guide

 MAINTAINERS                                        |   6 +
 app/test-pmd/cmdline.c                             | 178 ++++++++
 app/test-pmd/config.c                              |  24 ++
 app/test-pmd/csumonly.c                            |  69 ++-
 app/test-pmd/testpmd.c                             |  13 +
 app/test-pmd/testpmd.h                             |  10 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/rel_notes/release_17_11.rst             |  19 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
 lib/Makefile                                       |   2 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_gso/Makefile                            |  52 +++
 lib/librte_gso/gso_common.c                        | 153 +++++++
 lib/librte_gso/gso_common.h                        | 171 ++++++++
 lib/librte_gso/gso_tcp4.c                          | 104 +++++
 lib/librte_gso/gso_tcp4.h                          |  74 ++++
 lib/librte_gso/gso_tunnel_tcp4.c                   | 129 ++++++
 lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
 lib/librte_gso/rte_gso.c                           | 107 +++++
 lib/librte_gso/rte_gso.h                           | 145 +++++++
 lib/librte_gso/rte_gso_version.map                 |   7 +
 mk/rte.app.mk                                      |   1 +
 28 files changed, 2435 insertions(+), 5 deletions(-)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
@ 2017-10-02 16:45           ` Mark Kavanagh
  2017-10-04 13:11             ` Ananyev, Konstantin
  2017-10-02 16:45           ` [PATCH v6 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
                             ` (5 subsequent siblings)
  6 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-02 16:45 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set the proper flags in mbuf->ol_flags; these are the same flags
as those used for TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of the packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmitting multi-segment packets.
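
The layout and reference counting described above can be modelled with a
small standalone sketch. It is a simplification, not the library code: the
structs below are hypothetical stand-ins for rte_mbuf, the input is assumed
to be a single contiguous buffer, and the header length is assumed to fit
in HDR_MAX bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_MAX 128

/* Hypothetical model of the input packet (not the rte_mbuf layout). */
struct in_pkt {
	const uint8_t *data;  /* headers followed by payload */
	uint32_t len;         /* total length in bytes */
	uint16_t hdr_len;     /* header bytes at the start of data */
	uint16_t refcnt;      /* 1 while only the caller holds it */
};

/* Hypothetical model of one GSO segment. */
struct gso_seg {
	uint8_t hdr[HDR_MAX]; /* "direct" part: private copy of the headers */
	uint16_t hdr_len;
	const uint8_t *pyld;  /* "indirect" part: points into the input */
	uint16_t pyld_len;
};

/* Returns the number of segments written to out, or -1 on error. */
static int
gso_segment_model(struct in_pkt *pkt, uint16_t pyld_unit,
		struct gso_seg *out, uint16_t nb_out)
{
	uint32_t pos, pyld_total, needed;
	int n = 0;

	if (pyld_unit == 0 || pkt->hdr_len > HDR_MAX ||
			pkt->hdr_len >= pkt->len)
		return -1;

	pyld_total = pkt->len - pkt->hdr_len;
	needed = (pyld_total + pyld_unit - 1) / pyld_unit;
	if (needed > nb_out)
		return -1;  /* output array too small; nothing is created */

	for (pos = pkt->hdr_len; pos < pkt->len; n++) {
		uint32_t left = pkt->len - pos;

		memcpy(out[n].hdr, pkt->data, pkt->hdr_len);
		out[n].hdr_len = pkt->hdr_len;
		out[n].pyld = pkt->data + pos;
		out[n].pyld_len = (uint16_t)(left < pyld_unit ?
				left : pyld_unit);
		pkt->refcnt++;  /* each indirect part holds a reference */
		pos += out[n].pyld_len;
	}

	pkt->refcnt--;  /* drop the caller's reference, as described above */
	return n;
}

/* Freeing a segment releases its reference on the input packet. */
static void
gso_seg_free(struct in_pkt *pkt, struct gso_seg *seg)
{
	seg->pyld = NULL;
	pkt->refcnt--;
}
```

In this model the input's reference count reaches zero only after the last
segment is freed, which mirrors the statement that the input packet is
freed automatically once all GSO segments have been freed.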

The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                     |   5 ++
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf                  |   1 +
 doc/guides/rel_notes/release_17_11.rst |   1 +
 lib/Makefile                           |   2 +
 lib/librte_gso/Makefile                |  49 +++++++++++
 lib/librte_gso/rte_gso.c               |  52 ++++++++++++
 lib/librte_gso/rte_gso.h               | 145 +++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map     |   7 ++
 mk/rte.app.mk                          |   1 +
 10 files changed, 264 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 12f6be9..58ca5c0 100644
--- a/config/common_base
+++ b/config/common_base
@@ -653,6 +653,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..6512918 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -101,6 +101,7 @@ The public API headers are grouped by topics:
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
   [GRO]                (@ref rte_gro.h),
+  [GSO]                (@ref rte_gso.h),
   [frag/reass]         (@ref rte_ip_frag.h),
   [LPM IPv4 route]     (@ref rte_lpm.h),
   [LPM IPv6 route]     (@ref rte_lpm6.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..408f2e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_ether \
                           lib/librte_eventdev \
                           lib/librte_gro \
+                          lib/librte_gso \
                           lib/librte_hash \
                           lib/librte_ip_frag \
                           lib/librte_jobstats \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 8bf91bd..7508be7 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_ethdev.so.7
      librte_eventdev.so.2
      librte_gro.so.1
+   + librte_gso.so.1
      librte_hash.so.2
      librte_ip_frag.so.1
      librte_jobstats.so.1
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b773636
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
+			nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..53725e6
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,145 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO IP id flags for the IPv4 header */
+#define RTE_GSO_IPID_FIXED (1ULL << 0)
+/**< Use fixed IP ids for output GSO segments. Setting
+ * !RTE_GSO_IPID_FIXED indicates using incremental IP ids.
+ */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t ipid_flag;
+	/**< flag to indicate GSO uses fixed or incremental IP ids for
+	 * IPv4 headers of output GSO segments. If applications want
+	 * fixed IP ids, set RTE_GSO_IPID_FIXED to ipid_flag. Conversely,
+	 * if applications want incremental IP ids, set !RTE_GSO_IPID_FIXED.
+	 */
+	uint32_t gso_types;
+	/**< the bit mask of required GSO types. The GSO library
+	 * uses the same macros as that of describing device TX
+	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
+	 * gso_types.
+	 *
+	 * For example, if applications want to segment TCP/IPv4
+	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi-MBUF packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
+ * input packet has correct checksums, and doesn't update checksums for
+ * output GSO segments. Additionally, it doesn't process IP fragment
+ * packets.
+ *
+ * Before calling rte_gso_segment(), applications must set proper ol_flags
+ * for the packet. The GSO library uses the same macros as TSO.
+ * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
+ * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
+ * flag is removed for all GSO segments and the input packet.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
+ * the GSO segments are sent to should support transmission of multi-segment
+ * packets.
+ *
+ * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object pointer.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
1.9.3


* [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
  2017-10-02 16:45           ` [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
@ 2017-10-02 16:45           ` Mark Kavanagh
  2017-10-04 13:32             ` Ananyev, Konstantin
  2017-10-04 13:35             ` Ananyev, Konstantin
  2017-10-02 16:45           ` [PATCH v6 3/6] gso: add VxLAN " Mark Kavanagh
                             ` (4 subsequent siblings)
  6 siblings, 2 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-02 16:45 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
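
For illustration only (this is not part of the patch): a minimal,
self-contained sketch of how an IPv4 fragment can be recognised from the
16-bit flags/fragment-offset field. The SKETCH_* constants and the helper
name are hypothetical; the bit layout follows RFC 791, and the input is
assumed to be already converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* IPv4 flags/fragment-offset field layout (RFC 791), host byte order. */
#define SKETCH_IPV4_MF_FLAG     0x2000u /* "more fragments" bit */
#define SKETCH_IPV4_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/*
 * Hypothetical helper: returns true if the header belongs to any fragment
 * of a fragmented datagram, i.e. either a non-zero fragment offset (middle
 * or last fragment) or the MF bit set (first or middle fragment).
 */
static bool
sketch_ipv4_is_fragment(uint16_t frag_off_host)
{
	return (frag_off_host & SKETCH_IPV4_OFFSET_MASK) != 0 ||
		(frag_off_host & SKETCH_IPV4_MF_FLAG) != 0;
}
```

A packet with only the DF bit set (0x4000) is not a fragment under this
check, so it remains eligible for segmentation.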

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect mbuf simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.
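
As a rough sketch of the arithmetic involved (not the library code
itself), the following plans how the payload of a single-MBUF packet
would be divided. It assumes the per-segment payload budget is gso_size
minus the total header length, and that each segment's TCP sequence
number advances by the payload bytes that precede it. The struct and
function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical description of one planned output segment. */
struct sketch_seg {
	uint32_t tcp_seq;  /* TCP sequence number of the first payload byte */
	uint16_t pyld_off; /* offset into the input packet's payload */
	uint16_t pyld_len; /* payload bytes carried by this segment */
};

/*
 * Plan the split of a packet of 'pkt_len' bytes whose headers occupy
 * 'hdr_len' bytes, so that no output segment exceeds 'gso_size' bytes.
 * Returns the number of segments planned (0 if there is no payload), or
 * -1 if the sizes are inconsistent or 'max_segs' is too small.
 */
static int
sketch_plan_segments(uint16_t pkt_len, uint16_t hdr_len, uint16_t gso_size,
		uint32_t first_seq, struct sketch_seg *segs, uint16_t max_segs)
{
	uint16_t unit, remaining, off = 0;
	int n = 0;

	if (gso_size <= hdr_len || pkt_len < hdr_len)
		return -1;
	unit = gso_size - hdr_len;      /* payload budget per segment */
	remaining = pkt_len - hdr_len;  /* payload still to be assigned */

	while (remaining > 0) {
		uint16_t len = remaining < unit ? remaining : unit;

		if (n >= max_segs)
			return -1;
		segs[n].tcp_seq = first_seq + off;
		segs[n].pyld_off = off;
		segs[n].pyld_len = len;
		off += len;
		remaining -= len;
		n++;
	}
	return n;
}
```

For example, a 3054-byte packet with 54 bytes of headers and a gso_size
of 1514 yields three segments carrying 1460, 1460 and 80 payload bytes.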

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSO segments are freed, the packet is freed
automatically.
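
The reference counting described above can be modelled with a toy
example. This is a sketch under stated assumptions, not DPDK code: it
assumes a single-MBUF input packet, that each indirect MBUF attached to
the packet takes one reference, and that freeing a GSO segment releases
that reference. All names are hypothetical.

```c
#include <stdint.h>

/* Toy stand-in for a direct mbuf; only the reference count is modelled. */
struct sketch_buf {
	uint16_t refcnt;
	int freed; /* set to 1 once the last reference is dropped */
};

/* Models attaching an indirect mbuf: takes one reference on the buffer. */
static void
sketch_attach(struct sketch_buf *b)
{
	b->refcnt++;
}

/* Drops one reference; the buffer is freed when the count reaches zero. */
static void
sketch_put(struct sketch_buf *b)
{
	if (--b->refcnt == 0)
		b->freed = 1;
}

/*
 * Models segmenting 'pkt' into 'nb_segs' GSO segments: one attach per
 * segment, followed by the single refcnt reduction performed on the
 * input packet once segmentation has succeeded.
 */
static void
sketch_gso(struct sketch_buf *pkt, uint16_t nb_segs)
{
	uint16_t i;

	for (i = 0; i < nb_segs; i++)
		sketch_attach(pkt);
	sketch_put(pkt);
}
```

Starting from a refcnt of 1, segmenting into three segments leaves the
count at 3, and the input packet is released only when the third segment
is freed.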

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 +++
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
 lib/librte_gso/rte_gso.c                |  52 ++++++++++-
 8 files changed, 536 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 7508be7..c414f73 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,18 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. MTU is 64KB) into small
+  ones (e.g. MTU is 1500B). Supported packet types are:
+
+  * TCP/IPv4 packets, which may include a single VLAN tag.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ struct rte_logs {
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..ee75d4c
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,153 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..8d9b94e
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates the TCP header of a packet, following
+ * segmentation. This is required to update the header's 'sent' sequence
+ * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
+ *
+ * @param pkt
+ *  The packet containing the TCP header.
+ * @param l4_offset
+ *  The offset of the TCP header from the start of the packet.
+ * @param sent_seq
+ *  The sent sequence number.
+ * @param non_tail
+ *  Non-zero if this is not the tail segment, zero if it is.
+ */
+static inline void
+update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
+		uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l4_offset);
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+/**
+ * Internal function which updates the IPv4 header of a packet, following
+ * segmentation. This is required to update the header's 'total_length' field,
+ * to reflect the reduced length of the now-segmented packet. Furthermore, the
+ * header's 'packet_id' field must be updated to reflect the new ID of the
+ * now-segmented packet.
+ *
+ * @param pkt
+ *  The packet containing the IPv4 header.
+ * @param l3_offset
+ *  The offset of the IPv4 header from the start of the packet.
+ * @param id
+ *  The new ID of the packet.
+ */
+static inline void
+update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l3_offset);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..d83e610
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,104 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+static void
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t id, tail_idx, i;
+	uint16_t l3_offset = pkt->l2_len;
+	uint16_t l4_offset = l3_offset + pkt->l3_len;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
+			l3_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], l3_offset, id);
+		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret;
+
+	/* Don't process the fragmented packet */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..1c57441
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount by which the IP ID is incremented for each output segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when the function succeeds. If the memory space in
+ *  pkts_out is insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if run out of memory in MBUF pools.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..a4fce50 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,53 @@
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
+				DEV_TX_OFFLOAD_TCP_TSO) !=
+			gso_ctx->gso_types) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
-- 
1.9.3

* [PATCH v6 3/6] gso: add VxLAN GSO support
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
  2017-10-02 16:45           ` [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
  2017-10-02 16:45           ` [PATCH v6 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
@ 2017-10-02 16:45           ` Mark Kavanagh
  2017-10-04 14:12             ` Ananyev, Konstantin
  2017-10-02 16:45           ` [PATCH v6 4/6] gso: add GRE " Mark Kavanagh
                             ` (3 subsequent siblings)
  6 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-02 16:45 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds a framework that allows GSO on tunneled packets.
Furthermore, it leverages that framework to provide GSO support for
VxLAN-encapsulated packets.

Supported VxLAN packets must have an outer IPv4 header (prepended by an
optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
inner VLAN tag).
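
The header layout described above can be sketched as a chain of offsets.
This is only an illustration: 'struct hdr_lens' and 'struct hdr_offsets'
are hypothetical stand-ins for the mbuf length fields, and the sketch
assumes (as the patch does) that l2_len of a tunneled packet covers the
UDP header, the VxLAN header and the inner Ethernet header.

```c
#include <stdint.h>

/* Hypothetical stand-in for the length fields of struct rte_mbuf. */
struct hdr_lens {
	uint16_t outer_l2_len; /* outer Ethernet (+ optional VLAN tag) */
	uint16_t outer_l3_len; /* outer IPv4 */
	uint16_t l2_len;       /* UDP + VxLAN + inner Ethernet (+ optional VLAN tag) */
	uint16_t l3_len;       /* inner IPv4 */
	uint16_t l4_len;       /* inner TCP */
};

/* Byte offsets of each header from the start of the packet. */
struct hdr_offsets {
	uint16_t outer_ipv4;
	uint16_t udp;
	uint16_t inner_ipv4;
	uint16_t tcp;
	uint16_t payload;
};

static struct hdr_offsets
vxlan_hdr_offsets(const struct hdr_lens *l)
{
	struct hdr_offsets o;

	o.outer_ipv4 = l->outer_l2_len;
	o.udp = (uint16_t)(o.outer_ipv4 + l->outer_l3_len);
	o.inner_ipv4 = (uint16_t)(o.udp + l->l2_len);
	o.tcp = (uint16_t)(o.inner_ipv4 + l->l3_len);
	o.payload = (uint16_t)(o.tcp + l->l4_len);
	return o;
}
```

For an untagged packet with 20-byte IPv4 and TCP headers, l2_len is
8 (UDP) + 8 (VxLAN) + 14 (Ethernet) = 30, so the TCP payload starts at
byte 104.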

VxLAN GSO doesn't check if input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.
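
As a rough sketch of the fragment test: a packet counts as fragmented if
either the 13-bit fragment offset is non-zero or the "more fragments" bit
is set. The macro and function names below are illustrative, not the
library's; the input is assumed to be the IPv4 flags/offset field already
converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRAG_OFFSET_MASK 0x1fff /* low 13 bits: fragment offset */
#define FRAG_MF_FLAG     0x2000 /* "more fragments" bit */

/* Illustrative helper; frag_off_host is in host byte order. */
static bool
ipv4_is_fragmented(uint16_t frag_off_host)
{
	return (frag_off_host & FRAG_OFFSET_MASK) != 0 ||
		(frag_off_host & FRAG_MF_FLAG) != 0;
}
```

Note that the "don't fragment" bit (0x4000) alone does not make a packet
a fragment, so such packets are still eligible for segmentation.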

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is segmented, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSO segments
are freed, the packet is freed automatically.
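
The reference-count behaviour can be illustrated with a toy model. This
is not the rte_mbuf API: 'struct toy_buf' and its helpers are
hypothetical, and the sketch assumes that each output segment holds
exactly one reference to the input packet's data.

```c
/* Toy model of a reference-counted buffer; not the real rte_mbuf. */
struct toy_buf {
	int refcnt;
	int freed;
};

static void
toy_ref(struct toy_buf *b)
{
	b->refcnt++;
}

static void
toy_unref(struct toy_buf *b)
{
	if (--b->refcnt == 0)
		b->freed = 1;
}

/* Returns 1 if the input is released exactly when the last segment is freed. */
static int
toy_gso_lifecycle(int nb_segs)
{
	struct toy_buf input = { .refcnt = 1, .freed = 0 };
	int i;

	for (i = 0; i < nb_segs; i++)
		toy_ref(&input);   /* each segment references the input payload */
	toy_unref(&input);         /* GSO drops the caller's reference */
	if (input.freed)
		return 0;          /* released too early */
	for (i = 0; i < nb_segs - 1; i++)
		toy_unref(&input); /* all but the last segment are freed */
	if (input.freed)
		return 0;          /* released before the last segment */
	toy_unref(&input);         /* last segment is freed */
	return input.freed;
}
```

Under this model the caller never frees the input packet itself after a
successful segmentation; freeing the output segments is enough.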

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |   3 +
 lib/librte_gso/Makefile                |   1 +
 lib/librte_gso/gso_common.h            |  25 +++++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 123 +++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h       |  75 ++++++++++++++++++++
 lib/librte_gso/rte_gso.c               |  13 +++-
 6 files changed, 237 insertions(+), 3 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index c414f73..25b8a78 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -48,6 +48,9 @@ New Features
   ones (e.g. MTU is 1500B). Supported packet types are:
 
   * TCP/IPv4 packets, which may include a single VLAN tag.
+  * VxLAN packets, which must have an outer IPv4 header (prepended by
+    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
+    an optional VLAN tag).
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 8d9b94e..c051295 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -39,6 +39,7 @@
 #include <rte_mbuf.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
 		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
@@ -49,6 +50,30 @@
 #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
 
+#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_VXLAN))
+
+/**
+ * Internal function which updates the UDP header of a packet, following
+ * segmentation. This is required to update the header's datagram length field.
+ *
+ * @param pkt
+ *  The packet containing the UDP header.
+ * @param udp_offset
+ *  The offset of the UDP header from the start of the packet.
+ */
+static inline void
+update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
+{
+	struct udp_hdr *udp_hdr;
+
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			udp_offset);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
+}
+
 /**
  * Internal function which updates the TCP header of a packet, following
  * segmentation. This is required to update the header's 'sent' sequence
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..34bbbd7
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,123 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+static void
+update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t outer_id, inner_id, tail_idx, i;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+
+	outer_ipv4_offset = pkt->outer_l2_len;
+	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	tcp_offset = inner_ipv4_offset + pkt->l3_len;
+
+	/* Outer IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			outer_ipv4_offset);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	/* Inner IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_ipv4_offset);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
+		update_udp_header(segs[i], udp_offset);
+		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
+		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
+		outer_id++;
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t tcp_dl, frag_off;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process packets whose MF bit or fragment offset in the
+	 * inner IPv4 header is non-zero.
+	 */
+	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - hdr_offset - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret <= 1)
+		return ret;
+
+	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..3c67f0c
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,75 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneling packet with inner TCP/IPv4 headers. This function
+ * doesn't check if the input packet has correct checksums, and doesn't
+ * update checksums for output GSO segments. Furthermore, it doesn't
+ * process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount by which the IP ID is incremented for each output segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when it succeeds. If the memory space in pkts_out is
+ *  insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if run out of memory in MBUF pools.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index a4fce50..6095689 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -39,6 +39,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -58,8 +59,9 @@
 		return -EINVAL;
 
 	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
-				DEV_TX_OFFLOAD_TCP_TSO) !=
-			gso_ctx->gso_types) {
+				(DEV_TX_OFFLOAD_TCP_TSO |
+				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
+				gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
 		return 1;
@@ -71,7 +73,12 @@
 	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_TCP(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (IS_IPV4_TCP(pkt->ol_flags)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
1.9.3

* [PATCH v6 4/6] gso: add GRE GSO support
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
                             ` (2 preceding siblings ...)
  2017-10-02 16:45           ` [PATCH v6 3/6] gso: add VxLAN " Mark Kavanagh
@ 2017-10-02 16:45           ` Mark Kavanagh
  2017-10-04 14:15             ` Ananyev, Konstantin
  2017-10-02 16:45           ` [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
                             ` (2 subsequent siblings)
  6 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-02 16:45 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.
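
A minimal sketch of how a packet might be classified by its offload
flags follows. The F_* bit values and the 'classify' helper are
hypothetical (the real PKT_TX_* flags are defined in rte_mbuf.h and are
not single bits in every case); the point is only that every required
flag must be set, and that tunnel types must be tested before plain
TCP/IPv4.

```c
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define F_TCP_SEG      (1ULL << 0)
#define F_IPV4         (1ULL << 1)
#define F_OUTER_IPV4   (1ULL << 2)
#define F_TUNNEL_VXLAN (1ULL << 3)
#define F_TUNNEL_GRE   (1ULL << 4)

enum gso_kind { GSO_NONE, GSO_TCP4, GSO_VXLAN_TCP4, GSO_GRE_TCP4 };

/* True only if every bit in mask is set in flags. */
static int
has_all(uint64_t flags, uint64_t mask)
{
	return (flags & mask) == mask;
}

static enum gso_kind
classify(uint64_t flags)
{
	const uint64_t tcp4 = F_TCP_SEG | F_IPV4;
	const uint64_t tunnel = tcp4 | F_OUTER_IPV4;

	/*
	 * Tunnel checks come first: a tunneled packet also satisfies
	 * the plain TCP/IPv4 mask.
	 */
	if (has_all(flags, tunnel | F_TUNNEL_VXLAN))
		return GSO_VXLAN_TCP4;
	if (has_all(flags, tunnel | F_TUNNEL_GRE))
		return GSO_GRE_TCP4;
	if (has_all(flags, tcp4))
		return GSO_TCP4;
	return GSO_NONE;
}
```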

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is segmented, GRE GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSO segments
are freed, the packet is freed automatically.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |  3 +++
 lib/librte_gso/gso_common.h            |  5 +++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 14 ++++++++++----
 lib/librte_gso/rte_gso.c               |  8 +++++---
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 25b8a78..808f537 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -51,6 +51,9 @@ New Features
   * VxLAN packets, which must have an outer IPv4 header (prepended by
     an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
     an optional VLAN tag).
+  * GRE packets, which must contain an outer IPv4 header (prepended by
+    an optional VLAN tag), and inner TCP/IPv4 headers (with an optional
+    VLAN tag).
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index c051295..1e99cc0 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -55,6 +55,11 @@
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
 		 PKT_TX_TUNNEL_VXLAN))
 
+#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_GRE))
+
 /**
  * Internal function which updates the UDP header of a packet, following
  * segmentation. This is required to update the header's datagram length field.
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
index 34bbbd7..d79fc6b 100644
--- a/lib/librte_gso/gso_tunnel_tcp4.c
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -42,11 +42,13 @@
 	struct tcp_hdr *tcp_hdr;
 	uint32_t sent_seq;
 	uint16_t outer_id, inner_id, tail_idx, i;
-	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset;
+	uint16_t udp_gre_offset, tcp_offset;
+	uint8_t update_udp_hdr;
 
 	outer_ipv4_offset = pkt->outer_l2_len;
-	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
-	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	udp_gre_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_gre_offset + pkt->l2_len;
 	tcp_offset = inner_ipv4_offset + pkt->l3_len;
 
 	/* Outer IPv4 header. */
@@ -63,9 +65,13 @@
 	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
 	tail_idx = nb_segs - 1;
 
+	/* Only update UDP header for VxLAN packets. */
+	update_udp_hdr = (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN) ? 1 : 0;
+
 	for (i = 0; i < nb_segs; i++) {
 		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
-		update_udp_header(segs[i], udp_offset);
+		if (update_udp_hdr)
+			update_udp_header(segs[i], udp_gre_offset);
 		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
 		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
 		outer_id++;
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 6095689..b748ab1 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -60,8 +60,9 @@
 
 	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
 				(DEV_TX_OFFLOAD_TCP_TSO |
-				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
-				gso_ctx->gso_types) {
+				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+				 DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
+				 gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
 		return 1;
@@ -73,7 +74,8 @@
 	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) ||
+			IS_IPV4_GRE_TCP4(pkt->ol_flags)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
1.9.3

* [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
                             ` (3 preceding siblings ...)
  2017-10-02 16:45           ` [PATCH v6 4/6] gso: add GRE " Mark Kavanagh
@ 2017-10-02 16:45           ` Mark Kavanagh
  2017-10-04 15:08             ` Ananyev, Konstantin
  2017-10-02 16:45           ` [PATCH v6 6/6] doc: add GSO programmer's guide Mark Kavanagh
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
  6 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-02 16:45 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

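Because the segment size counts headers as well as payload, the number
of output segments depends on the header length. The helper below is a
hypothetical sketch of that arithmetic, not code from testpmd or the GSO
library; it assumes every output segment carries a full copy of the
headers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of output segments for a packet of
 * pkt_len bytes whose headers occupy hdr_len bytes, when each segment
 * may be at most segsz bytes including headers. Returns 0 if segsz
 * leaves no room for payload.
 */
static uint32_t
gso_segment_count(uint32_t pkt_len, uint16_t hdr_len, uint16_t segsz)
{
	uint32_t payload, unit;

	if (pkt_len <= segsz)
		return 1; /* already fits; the packet is passed through */
	if (segsz <= hdr_len)
		return 0; /* no room for payload in a segment */

	payload = pkt_len - hdr_len;
	unit = (uint32_t)segsz - hdr_len;
	return (payload + unit - 1) / unit; /* round up */
}
```

For example, with 54 bytes of headers and a segment size of 1054, a
3054-byte packet (3000 bytes of payload) yields three segments, and one
extra payload byte yields a fourth.
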
Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c                      | 178 ++++++++++++++++++++++++++++
 app/test-pmd/config.c                       |  24 ++++
 app/test-pmd/csumonly.c                     |  69 ++++++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
 6 files changed, 335 insertions(+), 5 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ccdf239..05b0ce8 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3967,6 +3978,170 @@ struct cmd_gro_set_result {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please first stop forwarding\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size == 0) {
+			printf("gso_size should be larger than 0."
+					" Please input a legal value\n");
+		} else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO'd packet size: %uB\n"
+					"Supported GSO types: TCP/IPv4, "
+					"VxLAN with inner TCP/IPv4 packet, "
+					"GRE with inner TCP/IPv4 packet\n",
+					gso_max_segment_size);
+		} else
+			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14255,6 +14430,9 @@ struct cmd_cmdfile_result {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..88d09d0 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ struct igb_ring_desc_16_bytes {
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..bd1a287 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,8 @@
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
+
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -91,6 +93,7 @@
 /* structure that caches offload info for the current packet */
 struct testpmd_offload_info {
 	uint16_t ethertype;
+	uint8_t gso_enable;
 	uint16_t l2_len;
 	uint16_t l3_len;
 	uint16_t l4_len;
@@ -381,6 +384,8 @@ struct simple_gre_hdr {
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
 		}
+		if (info->gso_enable)
+			ol_flags |= PKT_TX_TCP_SEG;
 	} else if (info->l4_proto == IPPROTO_SCTP) {
 		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
 		sctp_hdr->cksum = 0;
@@ -627,6 +632,9 @@ struct simple_gre_hdr {
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -634,13 +642,15 @@ struct simple_gre_hdr {
 	uint16_t nb_rx;
 	uint16_t nb_tx;
 	uint16_t nb_prep;
-	uint16_t i;
+	uint16_t i, j;
 	uint64_t rx_ol_flags, tx_ol_flags;
 	uint16_t testpmd_ol_flags;
 	uint32_t retry;
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -674,6 +684,8 @@ struct simple_gre_hdr {
 	memset(&info, 0, sizeof(info));
 	info.tso_segsz = txp->tso_segsz;
 	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
+	if (gso_ports[fs->tx_port].enable)
+		info.gso_enable = 1;
 
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
@@ -851,13 +863,59 @@ struct simple_gre_hdr {
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			if (unlikely(nb_rx - i >= GSO_MAX_PKT_BURST -
+						nb_segments)) {
+				/*
+				 * insufficient space in gso_segments,
+				 * stop GSO.
+				 */
+				for (j = i; j < GSO_MAX_PKT_BURST -
+						nb_segments; j++) {
+					pkts_burst[j]->ol_flags &=
+						(~PKT_TX_TCP_SEG);
+					gso_segments[nb_segments++] =
+						pkts_burst[j];
+				}
+				for (; j < nb_rx; j++)
+					rte_pktmbuf_free(pkts_burst[j]);
+				break;
+			}
+			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
+					&gso_segments[nb_segments],
+					GSO_MAX_PKT_BURST - nb_segments);
+			if (ret >= 1)
+				nb_segments += ret;
+			else if (ret < 0) {
+				/*
+				 * insufficient MBUFs or space in
+				 * gso_segments, stop GSO.
+				 */
+				for (j = i; j < nb_rx; j++) {
+					pkts_burst[j]->ol_flags &=
+						(~PKT_TX_TCP_SEG);
+					gso_segments[nb_segments++] =
+						pkts_burst[j];
+				}
+				break;
+			}
+		}
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +926,7 @@ struct simple_gre_hdr {
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +939,10 @@ struct simple_gre_hdr {
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e097ee0..97e349d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ static int eth_event_callback(uint8_t port_id,
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,8 @@ static int eth_event_callback(uint8_t port_id,
 
 	init_port_config();
 
+	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+		DEV_TX_OFFLOAD_GRE_TNL_TSO;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +670,13 @@ static int eth_event_callback(uint8_t port_id,
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
+			ETHER_CRC_LEN;
+		fwd_lcores[lc_id]->gso_ctx.ipid_flag = !RTE_GSO_IPID_FIXED;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..ff842a1 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -642,6 +651,7 @@ void port_rss_hash_key_update(portid_t port_id, char rss_type[],
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 2ed62f5..f9b5bda 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -932,6 +932,52 @@ number of packets a GRO table can store.
 If current packet number is greater than or equal to the max value, GRO
 will stop processing incoming packets.
 
+set port - gso
+~~~~~~~~~~~~~~
+
+Toggle per-port GSO support in ``csum`` forwarding engine::
+
+   testpmd> set port <port_id> gso on|off
+
+If enabled, the csum forwarding engine will perform GSO on supported IPv4
+packets, transmitted on the given port.
+
+If disabled, packets transmitted on the given port will not undergo GSO.
+By default, GSO is disabled for all ports.
+
+.. note::
+
+   When GSO is enabled on a port, supported IPv4 packets transmitted on that
+   port undergo GSO. Afterwards, the segmented packets are represented by
+   multi-segment mbufs; however, the csum forwarding engine doesn't calculate
+   checksums for GSO'd segments in SW. As a result, if users want correct
+   checksums in GSO segments, they should enable HW checksum calculation for
+   GSO-enabled ports.
+
+   For example, HW checksum calculation for VxLAN GSO'd packets may be enabled
+   by setting the following options in the csum forwarding engine::
+
+      testpmd> csum set outer_ip hw <port_id>
+
+      testpmd> csum set ip hw <port_id>
+
+      testpmd> csum set tcp hw <port_id>
+
+set gso segsz
+~~~~~~~~~~~~~
+
+Set the maximum GSO segment size (measured in bytes), which includes the
+packet header and the packet payload for GSO-enabled ports (global)::
+
+   testpmd> set gso segsz <length>
+
+show port - gso
+~~~~~~~~~~~~~~~
+
+Display the status of Generic Segmentation Offload for a given port::
+
+   testpmd> show port <port_id> gso
+
 mac_addr add
 ~~~~~~~~~~~~
 
-- 
1.9.3
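
As a rough illustration of how the ``set gso segsz`` value relates to the
number of segments produced, the sketch below assumes that the segment size
includes the packet headers (as the testpmd documentation above states), so
each segment carries at most segsz - hdr_len payload bytes. The helper names
gso_segment_count() and gso_last_payload() are hypothetical; they are not part
of this patch or of librte_gso.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of output segments for a packet of pkt_len
 * bytes whose headers occupy hdr_len bytes, when every output segment may
 * be at most segsz bytes long, headers included.
 * Returns 0 if the inputs make segmentation impossible.
 */
static uint32_t
gso_segment_count(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
	uint32_t payload, per_seg;

	if (segsz <= hdr_len || pkt_len < hdr_len)
		return 0;
	if (pkt_len <= segsz)
		return 1;	/* already fits; nothing to split */

	payload = pkt_len - hdr_len;
	per_seg = segsz - hdr_len;
	return (payload + per_seg - 1) / per_seg;	/* round up */
}

/*
 * Hypothetical helper: payload bytes carried by the final segment.
 * Only meaningful when gso_segment_count() returns a non-zero value.
 */
static uint32_t
gso_last_payload(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
	uint32_t n = gso_segment_count(pkt_len, hdr_len, segsz);
	uint32_t payload, per_seg, last;

	assert(n > 0);
	payload = pkt_len - hdr_len;
	if (n == 1)
		return payload;

	per_seg = segsz - hdr_len;
	last = payload - (n - 1) * per_seg;
	assert(last > 0 && last <= per_seg);
	return last;
}
```

With the default size of ETHER_MAX_LEN - ETHER_CRC_LEN (1514 bytes) and 54
bytes of Ethernet, IPv4 and TCP headers, each segment carries up to 1460
payload bytes; only the final segment may carry less.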


* [PATCH v6 6/6] doc: add GSO programmer's guide
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
                             ` (4 preceding siblings ...)
  2017-10-02 16:45           ` [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
@ 2017-10-02 16:45           ` Mark Kavanagh
  2017-10-04 13:51             ` Mcnamara, John
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
  6 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-02 16:45 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Add programmer's guide doc to explain the design and use of the
GSO library.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
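
The output segment format described in the guide below (a copied header
followed by one or more references into the input packet) can be modelled
without DPDK. This is only a sketch under stated assumptions: struct in_part
stands in for one segment of a multi-segment input mbuf, struct view stands in
for an indirect mbuf, and split_payload(), HDR_MAX and VIEWS_MAX are
hypothetical names that do not exist in librte_gso. The parameter unit is the
payload budget per output segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_MAX   128	/* assumed upper bound on header bytes */
#define VIEWS_MAX 4	/* assumed cap on payload views per output segment */

/* One contiguous buffer of the input payload (models one mbuf segment). */
struct in_part { const uint8_t *data; size_t len; };

/* Models an indirect mbuf: points at bytes owned by an input part. */
struct view { const uint8_t *ptr; size_t len; };

/* One output segment: a private header copy plus zero-copy payload views. */
struct out_seg {
	uint8_t hdr[HDR_MAX];
	size_t hdr_len;
	struct view views[VIEWS_MAX];
	unsigned int nb_views;
	size_t payload_len;
};

/*
 * Split the payload held in parts[] into output segments that carry at most
 * unit payload bytes each. Headers are copied; payload is only referenced.
 * Returns the number of segments written, or -1 if out[] or the per-segment
 * view array is too small.
 */
static int
split_payload(const uint8_t *hdr, size_t hdr_len,
	      const struct in_part *parts, unsigned int nb_parts,
	      size_t unit, struct out_seg *out, unsigned int nb_out)
{
	unsigned int p = 0, n = 0;
	size_t off = 0;	/* read offset within parts[p] */

	if (unit == 0 || hdr_len > HDR_MAX)
		return -1;

	while (p < nb_parts) {
		struct out_seg *seg;

		if (off == parts[p].len) {	/* skip an empty part */
			p++;
			off = 0;
			continue;
		}
		if (n == nb_out)
			return -1;
		seg = &out[n++];
		if (hdr_len > 0)
			memcpy(seg->hdr, hdr, hdr_len);
		seg->hdr_len = hdr_len;
		seg->nb_views = 0;
		seg->payload_len = 0;

		/* Chain views until the segment is full or input runs out. */
		while (p < nb_parts && seg->payload_len < unit) {
			size_t avail = parts[p].len - off;
			size_t want = unit - seg->payload_len;
			size_t take = avail < want ? avail : want;

			if (take > 0) {
				if (seg->nb_views == VIEWS_MAX)
					return -1;
				seg->views[seg->nb_views].ptr = parts[p].data + off;
				seg->views[seg->nb_views].len = take;
				seg->nb_views++;
				seg->payload_len += take;
				off += take;
			}
			if (off == parts[p].len) {	/* move to next input part */
				p++;
				off = 0;
			}
		}
		assert(seg->payload_len > 0 && seg->payload_len <= unit);
	}
	return (int)n;
}
```

With two 10-byte input parts and unit set to 8, the second output segment ends
up with two views (the tail of the first part and the head of the second),
which corresponds to the three-part output segment shown in the guide.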
 MAINTAINERS                                        |   6 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 5 files changed, 1053 insertions(+)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df2a7f..8f0a4bd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -644,6 +644,12 @@ M: Jiayu Hu <jiayu.hu@intel.com>
 F: lib/librte_gro/
 F: doc/guides/prog_guide/generic_receive_offload_lib.rst
 
+Generic Segmentation Offload
+M: Jiayu Hu <jiayu.hu@intel.com>
+M: Mark Kavanagh <mark.b.kavanagh@intel.com>
+F: lib/librte_gso/
+F: doc/guides/prog_guide/generic_segmentation_offload_lib.rst
+
 Distributor
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: David Hunt <david.hunt@intel.com>
diff --git a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
new file mode 100644
index 0000000..5e78f16
--- /dev/null
+++ b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
@@ -0,0 +1,256 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Generic Segmentation Offload Library
+====================================
+
+Overview
+--------
+Generic Segmentation Offload (GSO) is a widely used software implementation of
+TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
+Much like TSO, GSO gains performance by enabling upper layer applications to
+process a smaller number of large packets (e.g. MTU size of 64KB), instead of
+processing higher numbers of small packets (e.g. MTU size of 1500B), thus
+reducing per-packet overhead.
+
+For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
+that far exceed the kernel interface's MTU; this eliminates the need to segment
+packets within the guest, and improves the data-to-overhead ratio of both the
+guest-host link and the PCI bus. The expectation of the guest network stack in
+this scenario is that segmentation of egress frames will take place either in
+the NIC HW or, where that hardware capability is unavailable, in the host
+application or network stack.
+
+Bearing that in mind, the GSO library enables DPDK applications to segment
+packets in software. Note however, that GSO is implemented as a standalone
+library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
+in the underlying hardware); that is, applications must explicitly invoke the
+GSO library to segment packets. The size of GSO segments ``(segsz)`` is
+configurable by the application.
+
+Limitations
+-----------
+
+#. The GSO library doesn't check if input packets have correct checksums.
+
+#. In addition, the GSO library doesn't re-calculate checksums for segmented
+   packets (that task is left to the application).
+
+#. IP fragments are unsupported by the GSO library.
+
+#. The egress interface's driver must support multi-segment packets.
+
+#. Currently, the GSO library supports the following IPv4 packet types:
+
+   - TCP
+   - VxLAN
+   - GRE
+
+   See `Supported GSO Packet Types`_ for further details.
+
+Packet Segmentation
+-------------------
+
+The ``rte_gso_segment()`` function is the GSO library's primary
+segmentation API.
+
+Before performing segmentation, an application must create a GSO context object
+``(struct rte_gso_ctx)``, which provides the library with some of the
+information required to understand how the packet should be segmented. Refer to
+`How to Segment a Packet`_ for additional details. Once the GSO context has
+been created and populated, the application can then use the
+``rte_gso_segment()`` function to segment packets.
+
+The GSO library typically stores each segment that it creates in two parts: the
+first part contains a copy of the original packet's headers, while the second
+part contains a pointer to an offset within the original packet. This mechanism
+is explained in more detail in `GSO Output Segment Format`_.
+
+The GSO library supports both single- and multi-segment input mbufs.
+
+GSO Output Segment Format
+~~~~~~~~~~~~~~~~~~~~~~~~~
+To reduce the number of expensive memcpy operations required when segmenting a
+packet, the GSO library typically stores each segment that it creates as a
+two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
+the elements produced by the API are also called 'segments', for clarity the
+term 'part' is used here instead).
+
+The first part of each output segment is a direct mbuf and contains a copy of
+the original packet's headers, which must be prepended to each output segment.
+These headers are copied from the original packet into each output segment.
+
+The second part of each output segment represents a section of data from the
+original packet, i.e. a data segment. Rather than copy the data directly from
+the original packet into the output segment (which would impact performance
+considerably), the second part of each output segment is an indirect mbuf,
+which contains no actual data, but simply points to an offset within the
+original packet.
+
+The combination of the 'header' segment and the 'data' segment constitutes a
+single logical output GSO segment of the original packet. This is illustrated
+in :numref:`figure_gso-output-segment-format`.
+
+.. _figure_gso-output-segment-format:
+
+.. figure:: img/gso-output-segment-format.svg
+   :align: center
+
+   Two-part GSO output segment
+
+In one situation, the output segment may contain additional 'data' segments.
+This only occurs when:
+
+- the input packet on which GSO is to be performed is represented by a
+  multi-segment mbuf.
+
+- the output segment is required to contain data that spans the boundaries
+  between segments of the input multi-segment mbuf.
+
+The GSO library traverses each segment of the input packet, and produces
+numerous output segments; for optimal performance, the number of output
+segments is kept to a minimum. Consequently, the GSO library maximizes the
+amount of data contained within each output segment; i.e. each output segment
+contains ``segsz`` bytes of data. The only exception to this is the very final
+output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the final segment
+is smaller than the rest.
+
+In order for an output segment to meet its MSS, it may need to include data from
+multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
+can point to only one direct mbuf), the solution here is to add another indirect
+mbuf to the output segment; this additional segment then points to the next
+input segment. If necessary, this chaining process is repeated, until the sum of
+all of the data 'contained' in the output segment reaches ``segsz``. This
+ensures that the amount of data contained within each output segment is uniform,
+with the possible exception of the last segment, as previously described.
+
+:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
+output segment. In this example, the output segment needs to include data from
+the end of one input segment, and the beginning of another. To achieve this,
+an additional indirect mbuf is chained to the second part of the output segment,
+and is attached to the next input segment (i.e. it points to the data in the
+next input segment).
+
+.. _figure_gso-three-seg-mbuf:
+
+.. figure:: img/gso-three-seg-mbuf.svg
+   :align: center
+
+   Three-part GSO output segment
+
+Supported GSO Packet Types
+--------------------------
+
+TCP/IPv4 GSO
+~~~~~~~~~~~~
+TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
+may also contain an optional VLAN tag.
+
+VxLAN GSO
+~~~~~~~~~
+VxLAN GSO supports segmentation of suitably large VxLAN packets,
+which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
+inner and/or outer VLAN tag(s).
+
+GRE GSO
+~~~~~~~
+GRE GSO supports segmentation of suitably large GRE packets, which contain
+an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.
+
+How to Segment a Packet
+-----------------------
+
+To segment an outgoing packet, an application must:
+
+#. First create a GSO context ``(struct rte_gso_ctx)``; this contains:
+
+   - a pointer to the mbuf pool for allocating the direct buffers, which are
+     used to store the GSO segments' packet headers.
+
+   - a pointer to the mbuf pool for allocating indirect buffers, which are
+     used to locate GSO segments' packet payloads.
+
+   .. note::
+
+      An application may use the same pool for both direct and indirect
+      buffers. However, since each indirect mbuf simply stores a pointer, the
+      application may reduce its memory consumption by creating a separate
+      memory pool, containing smaller elements, for the indirect pool.
+
+   - the size of each output segment, including packet headers and payload,
+     measured in bytes.
+
+   - the bit mask of required GSO types. The GSO library uses the same macros as
+     those that describe a physical device's TX offloading capabilities (i.e.
+     ``DEV_TX_OFFLOAD_*_TSO``) for gso_types. For example, if an application
+     wants to segment TCP/IPv4 packets, it should set gso_types to
+     ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values currently supported
+     for gso_types are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO``, and
+     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
+     allowed.
+
+   - a flag, that indicates whether the IPv4 headers of output segments should
+     contain fixed or incremental ID values.
+
+#. Set the appropriate ol_flags in the mbuf.
+
+   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute to
+     determine how a packet should be segmented. It is the application's
+     responsibility to ensure that these flags are set.
+
+   - For example, in order to segment TCP/IPv4 packets, the application should
+     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
+     ol_flags.
+
+   - If checksum calculation in hardware is required, the application should
+     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.
+
+#. Check if the packet should be processed. Packets with one of the
+   following properties are not processed and are returned immediately:
+
+   - Packet length is less than ``segsz`` (i.e. GSO is not required).
+
+   - Packet type is not supported by GSO library (see
+     `Supported GSO Packet Types`_).
+
+   - Application has not enabled GSO support for the packet type.
+
+   - Packet's ol_flags have been incorrectly set.
+
+#. Allocate space in which to store the output GSO segments. If the amount of
+   space allocated by the application is insufficient, segmentation will fail.
+
+#. Invoke the GSO segmentation API, ``rte_gso_segment()``.
+
+#. If required, update the L3 and L4 checksums of the newly-created segments.
+   For tunneled packets, the outer IPv4 headers' checksums should also be
+   updated. Alternatively, the application may offload checksum calculation
+   to HW.
+
diff --git a/doc/guides/prog_guide/img/gso-output-segment-format.svg b/doc/guides/prog_guide/img/gso-output-segment-format.svg
new file mode 100644
index 0000000..bdb5ec3
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-output-segment-format.svg
@@ -0,0 +1,313 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-output-segment-format.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="19.3975in" height="8.21796in"
+		viewBox="0 0 1396.62 591.693" xml:space="preserve" color-interpolation-filters="sRGB" class="st21">
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st2 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st3 {stroke:#c3d600;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.68828}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Intel Clear;font-size:1.99999em;font-weight:bold}
+		.st10 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st11 {fill:#ffffff;font-family:Intel Clear;font-size:2.44732em;font-weight:bold}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:5.52552}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.15291em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.8401em}
+		.st15 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st16 {fill:#c3d600;font-family:Intel Clear;font-size:2.44732em}
+		.st17 {fill:#ffc000;font-family:Intel Clear;font-size:2.44732em}
+		.st18 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.8401em}
+		.st20 {fill:#006fc5;font-family:Intel Clear;font-size:1.61927em}
+		.st21 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<g id="shape3-1" v:mID="3" v:groupContext="shape" transform="translate(577.244,-560.42)">
+			<title>Sheet.3</title>
+			<path d="M9.24 585.29 L16.32 585.29 L16.32 587.06 L9.24 587.06 L9.24 585.29 L9.24 585.29 ZM21.63 585.29 L23.4 585.29
+						 L23.4 587.06 L21.63 587.06 L21.63 585.29 L21.63 585.29 ZM28.7 585.29 L35.78 585.29 L35.78 587.06 L28.7 587.06
+						 L28.7 585.29 L28.7 585.29 ZM41.09 585.29 L42.86 585.29 L42.86 587.06 L41.09 587.06 L41.09 585.29 L41.09
+						 585.29 ZM48.17 585.29 L55.25 585.29 L55.25 587.06 L48.17 587.06 L48.17 585.29 L48.17 585.29 ZM60.56 585.29
+						 L62.33 585.29 L62.33 587.06 L60.56 587.06 L60.56 585.29 L60.56 585.29 ZM67.64 585.29 L74.72 585.29 L74.72
+						 587.06 L67.64 587.06 L67.64 585.29 L67.64 585.29 ZM80.03 585.29 L81.8 585.29 L81.8 587.06 L80.03 587.06
+						 L80.03 585.29 L80.03 585.29 ZM87.11 585.29 L94.19 585.29 L94.19 587.06 L87.11 587.06 L87.11 585.29 L87.11
+						 585.29 ZM99.5 585.29 L101.27 585.29 L101.27 587.06 L99.5 587.06 L99.5 585.29 L99.5 585.29 ZM106.58 585.29
+						 L113.66 585.29 L113.66 587.06 L106.58 587.06 L106.58 585.29 L106.58 585.29 ZM118.97 585.29 L120.74 585.29
+						 L120.74 587.06 L118.97 587.06 L118.97 585.29 L118.97 585.29 ZM126.05 585.29 L133.13 585.29 L133.13 587.06
+						 L126.05 587.06 L126.05 585.29 L126.05 585.29 ZM138.43 585.29 L140.2 585.29 L140.2 587.06 L138.43 587.06
+						 L138.43 585.29 L138.43 585.29 ZM145.51 585.29 L152.59 585.29 L152.59 587.06 L145.51 587.06 L145.51 585.29
+						 L145.51 585.29 ZM157.9 585.29 L159.67 585.29 L159.67 587.06 L157.9 587.06 L157.9 585.29 L157.9 585.29 ZM164.98
+						 585.29 L172.06 585.29 L172.06 587.06 L164.98 587.06 L164.98 585.29 L164.98 585.29 ZM177.37 585.29 L179.14
+						 585.29 L179.14 587.06 L177.37 587.06 L177.37 585.29 L177.37 585.29 ZM184.45 585.29 L191.53 585.29 L191.53
+						 587.06 L184.45 587.06 L184.45 585.29 L184.45 585.29 ZM196.84 585.29 L198.61 585.29 L198.61 587.06 L196.84
+						 587.06 L196.84 585.29 L196.84 585.29 ZM203.92 585.29 L211 585.29 L211 587.06 L203.92 587.06 L203.92 585.29
+						 L203.92 585.29 ZM216.31 585.29 L218.08 585.29 L218.08 587.06 L216.31 587.06 L216.31 585.29 L216.31 585.29
+						 ZM223.39 585.29 L230.47 585.29 L230.47 587.06 L223.39 587.06 L223.39 585.29 L223.39 585.29 ZM235.78 585.29
+						 L237.55 585.29 L237.55 587.06 L235.78 587.06 L235.78 585.29 L235.78 585.29 ZM242.86 585.29 L249.93 585.29
+						 L249.93 587.06 L242.86 587.06 L242.86 585.29 L242.86 585.29 ZM255.24 585.29 L257.01 585.29 L257.01 587.06
+						 L255.24 587.06 L255.24 585.29 L255.24 585.29 ZM262.32 585.29 L269.4 585.29 L269.4 587.06 L262.32 587.06
+						 L262.32 585.29 L262.32 585.29 ZM274.71 585.29 L276.48 585.29 L276.48 587.06 L274.71 587.06 L274.71 585.29
+						 L274.71 585.29 ZM281.79 585.29 L288.87 585.29 L288.87 587.06 L281.79 587.06 L281.79 585.29 L281.79 585.29
+						 ZM294.18 585.29 L295.95 585.29 L295.95 587.06 L294.18 587.06 L294.18 585.29 L294.18 585.29 ZM301.26 585.29
+						 L308.34 585.29 L308.34 587.06 L301.26 587.06 L301.26 585.29 L301.26 585.29 ZM313.65 585.29 L315.42 585.29
+						 L315.42 587.06 L313.65 587.06 L313.65 585.29 L313.65 585.29 ZM320.73 585.29 L324.99 585.29 L324.99 587.06
+						 L320.73 587.06 L320.73 585.29 L320.73 585.29 ZM11.06 591.69 L0 586.17 L11.06 580.65 L11.06 591.69 L11.06
+						 591.69 ZM323.16 580.65 L334.22 586.17 L323.16 591.69 L323.16 580.65 L323.16 580.65 Z" class="st1"/>
+		</g>
+		<g id="shape4-3" v:mID="4" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.4</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43
+						 Z" class="st2"/>
+		</g>
+		<g id="shape5-5" v:mID="5" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.5</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43"
+					class="st3"/>
+		</g>
+		<g id="shape6-8" v:mID="6" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.6</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape7-10" v:mID="7" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.7</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape10-13" v:mID="10" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.10</title>
+			<path d="M0 510.21 L0 591.69 L822.53 591.69 L822.53 510.21 L0 510.21 L0 510.21 Z" class="st6"/>
+		</g>
+		<g id="shape11-15" v:mID="11" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.11</title>
+			<path d="M0 510.21 L822.53 510.21 L822.53 591.69 L0 591.69 L0 510.21" class="st7"/>
+		</g>
+		<g id="shape12-18" v:mID="12" v:groupContext="shape" transform="translate(255.478,-470.123)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="157.315" cy="574.07" width="314.63" height="35.245"/>
+			<path d="M314.63 556.45 L0 556.45 L0 591.69 L314.63 591.69 L314.63 556.45" class="st8"/>
+			<text x="102.08" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-22" v:mID="13" v:groupContext="shape" transform="translate(577.354,-470.123)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="167.112" cy="574.07" width="334.23" height="35.245"/>
+			<path d="M334.22 556.45 L0 556.45 L0 591.69 L334.22 591.69 L334.22 556.45" class="st8"/>
+			<text x="111.88" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape14-26" v:mID="14" v:groupContext="shape" transform="translate(910.635,-470.956)">
+			<title>Sheet.14</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="26.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape15-30" v:mID="15" v:groupContext="shape" transform="translate(909.144,-453.824)">
+			<title>Sheet.15</title>
+			<path d="M1.16 453.85 L1.05 465.33 L3.93 465.39 L4.04 453.91 L1.16 453.85 L1.16 453.85 ZM1 473.95 L0.94 476.82 L3.82
+						 476.87 L3.87 474 L1 473.95 L1 473.95 ZM0.88 485.43 L0.77 496.91 L3.65 496.96 L3.76 485.48 L0.88 485.43 L0.88
+						 485.43 ZM0.72 505.52 L0.72 508.39 L3.59 508.45 L3.59 505.58 L0.72 505.52 L0.72 505.52 ZM0.61 517 L0.55 528.49
+						 L3.43 528.54 L3.48 517.06 L0.61 517 L0.61 517 ZM0.44 537.1 L0.44 539.97 L3.32 540.02 L3.32 537.15 L0.44
+						 537.1 L0.44 537.1 ZM0.39 548.58 L0.28 560.06 L3.15 560.12 L3.26 548.63 L0.39 548.58 L0.39 548.58 ZM0.22
+						 568.67 L0.17 571.54 L3.04 571.6 L3.1 568.73 L0.22 568.67 L0.22 568.67 ZM0.11 580.16 L0 591.64 L2.88 591.69
+						 L2.99 580.21 L0.11 580.16 L0.11 580.16 Z" class="st10"/>
+		</g>
+		<g id="shape16-32" v:mID="16" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.16</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape17-34" v:mID="17" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.17</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape18-37" v:mID="18" v:groupContext="shape" transform="translate(121.944,-471.034)">
+			<title>Sheet.18</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="20.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape19-41" v:mID="19" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.19</title>
+			<path d="M0 510.43 L0 591.69 L289.81 591.69 L289.81 510.43 L0 510.43 L0 510.43 Z" class="st4"/>
+		</g>
+		<g id="shape20-43" v:mID="20" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.20</title>
+			<path d="M0 510.43 L289.81 510.43 L289.81 591.69 L0 591.69 L0 510.43" class="st5"/>
+		</g>
+		<g id="shape21-46" v:mID="21" v:groupContext="shape" transform="translate(424.908,-21.567)">
+			<title>Sheet.21</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="11.55" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape22-50" v:mID="22" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.22</title>
+			<path d="M0 510.43 L0 591.69 L453.74 591.69 L453.74 510.43 L0 510.43 L0 510.43 Z" class="st6"/>
+		</g>
+		<g id="shape23-52" v:mID="23" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.23</title>
+			<path d="M0 510.43 L453.74 510.43 L453.74 591.69 L0 591.69 L0 510.43" class="st7"/>
+		</g>
+		<g id="shape24-55" v:mID="24" v:groupContext="shape" transform="translate(778.624,-21.5672)">
+			<title>Sheet.24</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="14.26" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape25-59" v:mID="25" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.25</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st6"/>
+		</g>
+		<g id="shape26-61" v:mID="26" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.26</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape27-63" v:mID="27" v:groupContext="shape" transform="translate(813.057,-150.108)">
+			<title>Sheet.27</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="94.1386" cy="576.19" width="188.28" height="31.0055"/>
+			<path d="M188.28 560.69 L0 560.69 L0 591.69 L188.28 591.69 L188.28 560.69" class="st8"/>
+			<text x="15.43" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape28-67" v:mID="28" v:groupContext="shape" transform="translate(810.845,-123.854)">
+			<title>Sheet.28</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="95.5065" cy="578.442" width="191.02" height="26.501"/>
+			<path d="M191.01 565.19 L0 565.19 L0 591.69 L191.01 591.69 L191.01 565.19" class="st8"/>
+			<text x="15.15" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape29-71" v:mID="29" v:groupContext="shape" transform="translate(573.151,-149.601)">
+			<title>Sheet.29</title>
+			<path d="M0 584.74 L127.76 584.74 L127.76 587.61 L0 587.61 L0 584.74 L0 584.74 ZM125.91 580.65 L136.97 586.17 L125.91
+						 591.69 L125.91 580.65 L125.91 580.65 Z" class="st15"/>
+		</g>
+		<g id="shape30-73" v:mID="30" v:groupContext="shape" transform="translate(0,-309.671)">
+			<title>Sheet.30</title>
+			<desc>Memory copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="108.076" cy="574.07" width="216.16" height="35.245"/>
+			<path d="M216.15 556.45 L0 556.45 L0 591.69 L216.15 591.69 L216.15 556.45" class="st8"/>
+			<text x="17.68" y="582.88" class="st16" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Memory copy</text>		</g>
+		<g id="shape31-77" v:mID="31" v:groupContext="shape" transform="translate(680.77,-305.707)">
+			<title>Sheet.31</title>
+			<desc>No Memory Copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="136.547" cy="574.07" width="273.1" height="35.245"/>
+			<path d="M273.09 556.45 L0 556.45 L0 591.69 L273.09 591.69 L273.09 556.45" class="st8"/>
+			<text x="21.4" y="582.88" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>No Memory Copy</text>		</g>
+		<g id="shape32-81" v:mID="32" v:groupContext="shape" transform="translate(1102.72,-26.7532)">
+			<title>Sheet.32</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="138.243" cy="578.442" width="276.49" height="26.501"/>
+			<path d="M276.49 565.19 L0 565.19 L0 591.69 L276.49 591.69 L276.49 565.19" class="st8"/>
+			<text x="20.73" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape36-85" v:mID="36" v:groupContext="shape" transform="translate(1106.81,-138.647)">
+			<title>Sheet.36</title>
+			<desc>Two-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="144.906" cy="578.442" width="289.82" height="26.501"/>
+			<path d="M289.81 565.19 L0 565.19 L0 591.69 L289.81 591.69 L289.81 565.19" class="st8"/>
+			<text x="16.56" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Two-part output segment</text>		</g>
+		<g id="shape37-89" v:mID="37" v:groupContext="shape" transform="translate(575.916,-453.879)">
+			<title>Sheet.37</title>
+			<path d="M2.88 453.91 L2.9 465.39 L0.03 465.39 L0 453.91 L2.88 453.91 L2.88 453.91 ZM2.9 474 L2.9 476.87 L0.03 476.87
+						 L0.03 474 L2.9 474 L2.9 474 ZM2.9 485.48 L2.9 496.96 L0.03 496.96 L0.03 485.48 L2.9 485.48 L2.9 485.48 ZM2.9
+						 505.58 L2.9 508.45 L0.03 508.45 L0.03 505.58 L2.9 505.58 L2.9 505.58 ZM2.9 517.06 L2.9 528.54 L0.03 528.54
+						 L0.03 517.06 L2.9 517.06 L2.9 517.06 ZM2.9 537.15 L2.9 540.02 L0.03 540.02 L0.03 537.15 L2.9 537.15 L2.9
+						 537.15 ZM2.9 548.63 L2.9 560.12 L0.03 560.12 L0.03 548.63 L2.9 548.63 L2.9 548.63 ZM2.9 568.73 L2.9 571.6
+						 L0.03 571.6 L0.03 568.73 L2.9 568.73 L2.9 568.73 ZM2.9 580.21 L2.9 591.69 L0.03 591.69 L0.03 580.21 L2.9
+						 580.21 L2.9 580.21 Z" class="st18"/>
+		</g>
+		<g id="shape38-91" v:mID="38" v:groupContext="shape" transform="translate(577.354,-193.764)">
+			<title>Sheet.38</title>
+			<path d="M5.59 347.01 L10.92 357.16 L8.38 358.52 L3.04 348.36 L5.59 347.01 L5.59 347.01 ZM14.96 364.78 L16.29 367.32
+						 L13.74 368.67 L12.42 366.13 L14.96 364.78 L14.96 364.78 ZM20.33 374.97 L25.66 385.12 L23.12 386.45 L17.78
+						 376.29 L20.33 374.97 L20.33 374.97 ZM29.7 392.74 L31.03 395.28 L28.48 396.61 L27.16 394.07 L29.7 392.74
+						 L29.7 392.74 ZM35.04 402.9 L40.4 413.06 L37.86 414.38 L32.49 404.22 L35.04 402.9 L35.04 402.9 ZM44.41 420.67
+						 L45.77 423.21 L43.22 424.57 L41.87 422.03 L44.41 420.67 L44.41 420.67 ZM49.78 430.83 L55.14 440.99 L52.6
+						 442.34 L47.23 432.18 L49.78 430.83 L49.78 430.83 ZM59.15 448.61 L60.51 451.15 L57.96 452.5 L56.61 449.96
+						 L59.15 448.61 L59.15 448.61 ZM64.52 458.79 L69.88 468.95 L67.34 470.27 L61.97 460.12 L64.52 458.79 L64.52
+						 458.79 ZM73.89 476.57 L75.25 479.11 L72.7 480.43 L71.35 477.89 L73.89 476.57 L73.89 476.57 ZM79.26 486.72
+						 L84.62 496.88 L82.08 498.21 L76.71 488.05 L79.26 486.72 L79.26 486.72 ZM88.63 504.5 L89.96 507.04 L87.41
+						 508.39 L86.09 505.85 L88.63 504.5 L88.63 504.5 ZM94 514.66 L99.33 524.81 L96.79 526.17 L91.45 516.01 L94
+						 514.66 L94 514.66 ZM103.37 532.43 L104.7 534.97 L102.15 536.32 L100.83 533.79 L103.37 532.43 L103.37 532.43
+						 ZM108.73 542.62 L114.07 552.77 L111.53 554.1 L106.19 543.94 L108.73 542.62 L108.73 542.62 ZM118.11 560.39
+						 L119.44 562.93 L116.89 564.26 L115.57 561.72 L118.11 560.39 L118.11 560.39 ZM123.45 570.55 L128.81 580.71
+						 L126.27 582.03 L120.9 571.87 L123.45 570.55 L123.45 570.55 ZM132.82 588.33 L133.9 590.37 L131.36 591.69
+						 L130.28 589.68 L132.82 588.33 L132.82 588.33 ZM0.28 351.89 L0 339.53 L10.07 346.73 L0.28 351.89 L0.28 351.89
+						 Z" class="st18"/>
+		</g>
+		<g id="shape39-93" v:mID="39" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.39</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st4"/>
+		</g>
+		<g id="shape40-95" v:mID="40" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.40</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape41-97" v:mID="41" v:groupContext="shape" transform="translate(368.774,-150.453)">
+			<title>Sheet.41</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="82.7002" cy="576.19" width="165.41" height="31.0055"/>
+			<path d="M165.4 560.69 L0 560.69 L0 591.69 L165.4 591.69 L165.4 560.69" class="st8"/>
+			<text x="13.94" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape42-101" v:mID="42" v:groupContext="shape" transform="translate(351.856,-123.854)">
+			<title>Sheet.42</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="102.121" cy="578.442" width="204.25" height="26.501"/>
+			<path d="M204.24 565.19 L0 565.19 L0 591.69 L204.24 591.69 L204.24 565.19" class="st8"/>
+			<text x="16.02" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape43-105" v:mID="43" v:groupContext="shape" transform="translate(619.797,-155.563)">
+			<title>Sheet.43</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="28.011" cy="578.442" width="56.03" height="26.501"/>
+			<path d="M56.02 565.19 L0 565.19 L0 591.69 L56.02 591.69 L56.02 565.19" class="st8"/>
+			<text x="6.35" y="585.07" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape44-109" v:mID="44" v:groupContext="shape" transform="translate(700.911,-551.367)">
+			<title>Sheet.44</title>
+			<path d="M0 559.23 L0 591.69 L84.29 591.69 L84.29 559.23 L0 559.23 L0 559.23 Z" class="st2"/>
+		</g>
+		<g id="shape45-111" v:mID="45" v:groupContext="shape" transform="translate(709.883,-555.163)">
+			<title>Sheet.45</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="30.7501" cy="580.032" width="61.51" height="23.3211"/>
+			<path d="M61.5 568.37 L0 568.37 L0 591.69 L61.5 591.69 L61.5 568.37" class="st8"/>
+			<text x="6.38" y="585.86" class="st20" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape46-115" v:mID="46" v:groupContext="shape" transform="translate(1111.54,-477.36)">
+			<title>Sheet.46</title>
+			<desc>Input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="74.9" cy="578.442" width="149.8" height="26.501"/>
+			<path d="M149.8 565.19 L0 565.19 L0 591.69 L149.8 591.69 L149.8 565.19" class="st8"/>
+			<text x="12.47" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Input packet</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
new file mode 100644
index 0000000..f18a327
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
@@ -0,0 +1,477 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-three-seg-mbuf.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="21.8589in" height="9.63966in"
+		viewBox="0 0 1573.84 694.055" xml:space="preserve" color-interpolation-filters="sRGB" class="st23">
+	<title>GSO three-part output segment</title>
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st2 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st3 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.42236}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Calibri;font-size:2.08333em;font-weight:bold}
+		.st10 {fill:#ffffff;font-family:Intel Clear;font-size:2.91502em;font-weight:bold}
+		.st11 {fill:#000000;font-family:Intel Clear;font-size:2.19175em}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:6.58146}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.50001em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.99999em}
+		.st15 {fill:#0070c0;font-family:Intel Clear;font-size:2.19175em}
+		.st16 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st17 {fill:#006fc5;font-family:Intel Clear;font-size:1.92874em}
+		.st18 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.5em}
+		.st20 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0658146}
+		.st21 {fill:#000000;font-family:Intel Clear;font-size:1.81915em}
+		.st22 {fill:#000000;font-family:Intel Clear;font-size:1.49785em}
+		.st23 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<v:layer v:name="top" v:index="0"/>
+		<v:layer v:name="middle" v:index="1"/>
+		<g id="shape111-1" v:mID="111" v:groupContext="shape" v:layerMember="0" transform="translate(787.208,-220.973)">
+			<title>Sheet.111</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape110-3" v:mID="110" v:groupContext="shape" v:layerMember="0" transform="translate(685.078,-560.166)">
+			<title>Sheet.110</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape4-5" v:mID="4" v:groupContext="shape" transform="translate(718.715,-469.955)">
+			<title>Sheet.4</title>
+			<path d="M0 655.13 L0 678.22 C0 686.97 11.69 694.06 26.05 694.06 C40.45 694.06 52.11 686.97 52.11 678.22 L52.11 673.91
+						 L59.55 673.91 L44.66 664.86 L29.78 673.91 L37.22 673.91 L37.22 678.22 C37.22 681.98 32.25 685 26.05 685
+						 C19.89 685 14.89 681.98 14.89 678.22 L14.89 655.13 L0 655.13 Z" class="st3"/>
+		</g>
+		<g id="shape5-7" v:mID="5" v:groupContext="shape" transform="translate(547.831,-656.823)">
+			<title>Sheet.5</title>
+			<path d="M11 686.43 L19.43 686.43 L19.43 688.53 L11 688.53 L11 686.43 L11 686.43 ZM25.76 686.43 L27.87 686.43 L27.87
+						 688.53 L25.76 688.53 L25.76 686.43 L25.76 686.43 ZM34.19 686.43 L42.62 686.43 L42.62 688.53 L34.19 688.53
+						 L34.19 686.43 L34.19 686.43 ZM48.95 686.43 L51.05 686.43 L51.05 688.53 L48.95 688.53 L48.95 686.43 L48.95
+						 686.43 ZM57.38 686.43 L65.81 686.43 L65.81 688.53 L57.38 688.53 L57.38 686.43 L57.38 686.43 ZM72.14 686.43
+						 L74.24 686.43 L74.24 688.53 L72.14 688.53 L72.14 686.43 L72.14 686.43 ZM80.57 686.43 L89 686.43 L89 688.53
+						 L80.57 688.53 L80.57 686.43 L80.57 686.43 ZM95.32 686.43 L97.43 686.43 L97.43 688.53 L95.32 688.53 L95.32
+						 686.43 L95.32 686.43 ZM103.76 686.43 L112.19 686.43 L112.19 688.53 L103.76 688.53 L103.76 686.43 L103.76
+						 686.43 ZM118.51 686.43 L120.62 686.43 L120.62 688.53 L118.51 688.53 L118.51 686.43 L118.51 686.43 ZM126.94
+						 686.43 L135.38 686.43 L135.38 688.53 L126.94 688.53 L126.94 686.43 L126.94 686.43 ZM141.7 686.43 L143.81
+						 686.43 L143.81 688.53 L141.7 688.53 L141.7 686.43 L141.7 686.43 ZM150.13 686.43 L158.57 686.43 L158.57 688.53
+						 L150.13 688.53 L150.13 686.43 L150.13 686.43 ZM164.89 686.43 L167 686.43 L167 688.53 L164.89 688.53 L164.89
+						 686.43 L164.89 686.43 ZM173.32 686.43 L181.75 686.43 L181.75 688.53 L173.32 688.53 L173.32 686.43 L173.32
+						 686.43 ZM188.08 686.43 L190.19 686.43 L190.19 688.53 L188.08 688.53 L188.08 686.43 L188.08 686.43 ZM196.51
+						 686.43 L204.94 686.43 L204.94 688.53 L196.51 688.53 L196.51 686.43 L196.51 686.43 ZM211.27 686.43 L213.38
+						 686.43 L213.38 688.53 L211.27 688.53 L211.27 686.43 L211.27 686.43 ZM219.7 686.43 L228.13 686.43 L228.13
+						 688.53 L219.7 688.53 L219.7 686.43 L219.7 686.43 ZM234.46 686.43 L236.56 686.43 L236.56 688.53 L234.46 688.53
+						 L234.46 686.43 L234.46 686.43 ZM242.89 686.43 L251.32 686.43 L251.32 688.53 L242.89 688.53 L242.89 686.43
+						 L242.89 686.43 ZM257.64 686.43 L259.75 686.43 L259.75 688.53 L257.64 688.53 L257.64 686.43 L257.64 686.43
+						 ZM266.08 686.43 L274.51 686.43 L274.51 688.53 L266.08 688.53 L266.08 686.43 L266.08 686.43 ZM280.83 686.43
+						 L282.94 686.43 L282.94 688.53 L280.83 688.53 L280.83 686.43 L280.83 686.43 ZM289.27 686.43 L297.7 686.43
+						 L297.7 688.53 L289.27 688.53 L289.27 686.43 L289.27 686.43 ZM304.02 686.43 L306.13 686.43 L306.13 688.53
+						 L304.02 688.53 L304.02 686.43 L304.02 686.43 ZM312.45 686.43 L320.89 686.43 L320.89 688.53 L312.45 688.53
+						 L312.45 686.43 L312.45 686.43 ZM327.21 686.43 L329.32 686.43 L329.32 688.53 L327.21 688.53 L327.21 686.43
+						 L327.21 686.43 ZM335.64 686.43 L344.08 686.43 L344.08 688.53 L335.64 688.53 L335.64 686.43 L335.64 686.43
+						 ZM350.4 686.43 L352.51 686.43 L352.51 688.53 L350.4 688.53 L350.4 686.43 L350.4 686.43 ZM358.83 686.43 L367.26
+						 686.43 L367.26 688.53 L358.83 688.53 L358.83 686.43 L358.83 686.43 ZM373.59 686.43 L375.7 686.43 L375.7
+						 688.53 L373.59 688.53 L373.59 686.43 L373.59 686.43 ZM382.02 686.43 L387.06 686.43 L387.06 688.53 L382.02
+						 688.53 L382.02 686.43 L382.02 686.43 ZM13.18 694.06 L0 687.48 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM384.89
+						 680.9 L398.06 687.48 L384.89 694.06 L384.89 680.9 L384.89 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape6-9" v:mID="6" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.6</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape7-11" v:mID="7" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.7</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape10-14" v:mID="10" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.10</title>
+			<path d="M0 597.01 L0 694.06 L563.73 694.06 L563.73 597.01 L0 597.01 L0 597.01 Z" class="st6"/>
+		</g>
+		<g id="shape11-16" v:mID="11" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.11</title>
+			<path d="M0 597.01 L563.73 597.01 L563.73 694.06 L0 694.06 L0 597.01" class="st7"/>
+		</g>
+		<g id="shape12-19" v:mID="12" v:groupContext="shape" transform="translate(262.039,-549.269)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-23" v:mID="13" v:groupContext="shape" transform="translate(547.615,-549.269)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="87.5716" cy="673.065" width="175.15" height="41.9798"/>
+			<path d="M175.14 652.08 L0 652.08 L0 694.06 L175.14 694.06 L175.14 652.08" class="st8"/>
+			<text x="37" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape15-27" v:mID="15" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.15</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape16-29" v:mID="16" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.16</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape17-32" v:mID="17" v:groupContext="shape" transform="translate(6.52106,-546.331)">
+			<title>Sheet.17</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="34.98" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape23-36" v:mID="23" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.23</title>
+			<path d="M0 597.27 L0 694.06 L345.2 694.06 L345.2 597.27 L0 597.27 L0 597.27 Z" class="st4"/>
+		</g>
+		<g id="shape24-38" v:mID="24" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.24</title>
+			<path d="M0 597.27 L345.2 597.27 L345.2 694.06 L0 694.06 L0 597.27" class="st5"/>
+		</g>
+		<g id="shape25-41" v:mID="25" v:groupContext="shape" transform="translate(399.834,-25.6887)">
+			<title>Sheet.25</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="13.76" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape31-45" v:mID="31" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.31</title>
+			<path d="M0 597.27 L0 694.06 L516.21 694.06 L516.21 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape32-47" v:mID="32" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.32</title>
+			<path d="M0 597.27 L516.21 597.27 L516.21 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape33-50" v:mID="33" v:groupContext="shape" transform="translate(809.035,-25.6889)">
+			<title>Sheet.33</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="16.99" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape35-54" v:mID="35" v:groupContext="shape" transform="translate(1199.29,-21.1708)">
+			<title>Sheet.35</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="164.662" cy="678.273" width="329.33" height="31.5648"/>
+			<path d="M329.32 662.49 L0 662.49 L0 694.06 L329.32 694.06 L329.32 662.49" class="st8"/>
+			<text x="24.69" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape38-58" v:mID="38" v:groupContext="shape" transform="translate(1204.65,-254.446)">
+			<title>Sheet.38</title>
+			<desc>Three-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="19.51" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Three-part output segment</text>		</g>
+		<g id="shape39-62" v:mID="39" v:groupContext="shape" transform="translate(546.25,-529.921)">
+			<title>Sheet.39</title>
+			<path d="M3.43 529.94 L3.46 543.61 L0.03 543.61 L0 529.94 L3.43 529.94 L3.43 529.94 ZM3.46 553.87 L3.46 557.29 L0.03
+						 557.29 L0.03 553.87 L3.46 553.87 L3.46 553.87 ZM3.46 567.55 L3.46 581.22 L0.03 581.22 L0.03 567.55 L3.46
+						 567.55 L3.46 567.55 ZM3.46 591.48 L3.46 594.9 L0.03 594.9 L0.03 591.48 L3.46 591.48 L3.46 591.48 ZM3.46
+						 605.16 L3.46 618.83 L0.03 618.83 L0.03 605.16 L3.46 605.16 L3.46 605.16 ZM3.46 629.09 L3.46 632.51 L0.03
+						 632.51 L0.03 629.09 L3.46 629.09 L3.46 629.09 ZM3.46 642.77 L3.46 656.45 L0.03 656.45 L0.03 642.77 L3.46
+						 642.77 L3.46 642.77 ZM3.46 666.7 L3.46 670.12 L0.03 670.12 L0.03 666.7 L3.46 666.7 L3.46 666.7 ZM3.46 680.38
+						 L3.46 694.06 L0.03 694.06 L0.03 680.38 L3.46 680.38 L3.46 680.38 Z" class="st1"/>
+		</g>
+		<g id="shape40-64" v:mID="40" v:groupContext="shape" transform="translate(549.097,-223.749)">
+			<title>Sheet.40</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape46-66" v:mID="46" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.46</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st4"/>
+		</g>
+		<g id="shape47-68" v:mID="47" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.47</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape48-70" v:mID="48" v:groupContext="shape" transform="translate(113.27,-263.667)">
+			<title>Sheet.48</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="98.5041" cy="675.59" width="197.01" height="36.9302"/>
+			<path d="M197.01 657.13 L0 657.13 L0 694.06 L197.01 694.06 L197.01 657.13" class="st8"/>
+			<text x="18.66" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape51-74" v:mID="51" v:groupContext="shape" transform="translate(85.817,-233.439)">
+			<title>Sheet.51</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="127.916" cy="678.273" width="255.84" height="31.5648"/>
+			<path d="M255.83 662.49 L0 662.49 L0 694.06 L255.83 694.06 L255.83 662.49" class="st8"/>
+			<text x="34.33" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape53-78" v:mID="53" v:groupContext="shape" transform="translate(371.944,-275.998)">
+			<title>Sheet.53</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape54-82" v:mID="54" v:groupContext="shape" transform="translate(695.132,-646.04)">
+			<title>Sheet.54</title>
+			<path d="M0 655.39 L0 694.06 L100.4 694.06 L100.4 655.39 L0 655.39 L0 655.39 Z" class="st16"/>
+		</g>
+		<g id="shape55-84" v:mID="55" v:groupContext="shape" transform="translate(709.033,-648.946)">
+			<title>Sheet.55</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="36.6265" cy="680.167" width="73.26" height="27.7775"/>
+			<path d="M73.25 666.28 L0 666.28 L0 694.06 L73.25 694.06 L73.25 666.28" class="st8"/>
+			<text x="7.6" y="687.11" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape56-88" v:mID="56" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.56</title>
+			<path d="M0 597.27 L0 694.06 L363.41 694.06 L363.41 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape57-90" v:mID="57" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.57</title>
+			<path d="M0 597.27 L363.41 597.27 L363.41 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape58-93" v:mID="58" v:groupContext="shape" v:layerMember="0" transform="translate(943.158,-529.889)">
+			<title>Sheet.58</title>
+			<path d="M1.35 529.91 L1.25 543.58 L4.68 543.61 L4.78 529.94 L1.35 529.91 L1.35 529.91 ZM1.15 553.84 L1.12 557.26 L4.55
+						 557.29 L4.58 553.87 L1.15 553.84 L1.15 553.84 ZM1.05 567.52 L0.92 581.19 L4.35 581.22 L4.48 567.55 L1.05
+						 567.52 L1.05 567.52 ZM0.86 591.45 L0.82 594.87 L4.25 594.9 L4.28 591.48 L0.86 591.45 L0.86 591.45 ZM0.72
+						 605.13 L0.63 618.8 L4.05 618.83 L4.15 605.16 L0.72 605.13 L0.72 605.13 ZM0.53 629.06 L0.53 632.48 L3.95
+						 632.51 L3.95 629.09 L0.53 629.06 L0.53 629.06 ZM0.43 642.74 L0.33 656.41 L3.75 656.45 L3.85 642.77 L0.43
+						 642.74 L0.43 642.74 ZM0.23 666.67 L0.2 670.09 L3.62 670.12 L3.66 666.7 L0.23 666.67 L0.23 666.67 ZM0.13
+						 680.35 L0 694.02 L3.43 694.06 L3.56 680.38 L0.13 680.35 L0.13 680.35 Z" class="st18"/>
+		</g>
+		<g id="shape59-95" v:mID="59" v:groupContext="shape" transform="translate(785.874,-549.473)">
+			<title>Sheet.59</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="77.3395" cy="673.065" width="154.68" height="41.9798"/>
+			<path d="M154.68 652.08 L0 652.08 L0 694.06 L154.68 694.06 L154.68 652.08" class="st8"/>
+			<text x="26.77" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape60-99" v:mID="60" v:groupContext="shape" transform="translate(952.97,-548.822)">
+			<title>Sheet.60</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape63-103" v:mID="63" v:groupContext="shape" transform="translate(1210.43,-551.684)">
+			<title>Sheet.63</title>
+			<desc>Multi-segment input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="17.75" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Multi-segment input packet</text>		</g>
+		<g id="shape70-107" v:mID="70" v:groupContext="shape" v:layerMember="1" transform="translate(455.049,-221.499)">
+			<title>Sheet.70</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape71-109" v:mID="71" v:groupContext="shape" transform="translate(455.049,-221.499)">
+			<title>Sheet.71</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape72-111" v:mID="72" v:groupContext="shape" transform="translate(489.065,-263.434)">
+			<title>Sheet.72</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape75-115" v:mID="75" v:groupContext="shape" transform="translate(849.065,-281.435)">
+			<title>Sheet.75</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="4.49" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape77-119" v:mID="77" v:groupContext="shape" transform="translate(717.742,-563.523)">
+			<title>Sheet.77</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="15.71" y="683.67" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape78-123" v:mID="78" v:groupContext="shape" transform="translate(1148.17,-529.067)">
+			<title>Sheet.78</title>
+			<path d="M1.38 529.87 L1.25 543.55 L4.68 543.61 L4.81 529.94 L1.38 529.87 L1.38 529.87 ZM1.19 553.81 L1.12 557.23 L4.55
+						 557.29 L4.61 553.87 L1.19 553.81 L1.19 553.81 ZM1.05 567.48 L0.92 581.16 L4.35 581.22 L4.48 567.55 L1.05
+						 567.48 L1.05 567.48 ZM0.86 591.42 L0.86 594.84 L4.28 594.9 L4.28 591.48 L0.86 591.42 L0.86 591.42 ZM0.72
+						 605.09 L0.66 618.77 L4.08 618.83 L4.15 605.16 L0.72 605.09 L0.72 605.09 ZM0.53 629.03 L0.53 632.45 L3.95
+						 632.51 L3.95 629.09 L0.53 629.03 L0.53 629.03 ZM0.46 642.7 L0.33 656.38 L3.75 656.45 L3.89 642.77 L0.46
+						 642.7 L0.46 642.7 ZM0.26 666.64 L0.2 670.06 L3.62 670.12 L3.69 666.7 L0.26 666.64 L0.26 666.64 ZM0.13 680.31
+						 L0 693.99 L3.43 694.06 L3.56 680.38 L0.13 680.31 L0.13 680.31 Z" class="st20"/>
+		</g>
+		<g id="shape79-125" v:mID="79" v:groupContext="shape" transform="translate(946.254,-657.81)">
+			<title>Sheet.79</title>
+			<path d="M11 686.69 L17.33 686.69 L17.33 688.27 L11 688.27 L11 686.69 L11 686.69 ZM22.07 686.69 L23.65 686.69 L23.65
+						 688.27 L22.07 688.27 L22.07 686.69 L22.07 686.69 ZM28.39 686.69 L34.72 686.69 L34.72 688.27 L28.39 688.27
+						 L28.39 686.69 L28.39 686.69 ZM39.46 686.69 L41.04 686.69 L41.04 688.27 L39.46 688.27 L39.46 686.69 L39.46
+						 686.69 ZM45.78 686.69 L52.11 686.69 L52.11 688.27 L45.78 688.27 L45.78 686.69 L45.78 686.69 ZM56.85 686.69
+						 L58.43 686.69 L58.43 688.27 L56.85 688.27 L56.85 686.69 L56.85 686.69 ZM63.18 686.69 L69.5 686.69 L69.5
+						 688.27 L63.18 688.27 L63.18 686.69 L63.18 686.69 ZM74.24 686.69 L75.82 686.69 L75.82 688.27 L74.24 688.27
+						 L74.24 686.69 L74.24 686.69 ZM80.57 686.69 L86.89 686.69 L86.89 688.27 L80.57 688.27 L80.57 686.69 L80.57
+						 686.69 ZM91.63 686.69 L93.22 686.69 L93.22 688.27 L91.63 688.27 L91.63 686.69 L91.63 686.69 ZM97.96 686.69
+						 L104.28 686.69 L104.28 688.27 L97.96 688.27 L97.96 686.69 L97.96 686.69 ZM109.03 686.69 L110.61 686.69 L110.61
+						 688.27 L109.03 688.27 L109.03 686.69 L109.03 686.69 ZM115.35 686.69 L121.67 686.69 L121.67 688.27 L115.35
+						 688.27 L115.35 686.69 L115.35 686.69 ZM126.42 686.69 L128 686.69 L128 688.27 L126.42 688.27 L126.42 686.69
+						 L126.42 686.69 ZM132.74 686.69 L139.07 686.69 L139.07 688.27 L132.74 688.27 L132.74 686.69 L132.74 686.69
+						 ZM143.81 686.69 L145.39 686.69 L145.39 688.27 L143.81 688.27 L143.81 686.69 L143.81 686.69 ZM150.13 686.69
+						 L156.46 686.69 L156.46 688.27 L150.13 688.27 L150.13 686.69 L150.13 686.69 ZM161.2 686.69 L162.78 686.69
+						 L162.78 688.27 L161.2 688.27 L161.2 686.69 L161.2 686.69 ZM167.53 686.69 L173.85 686.69 L173.85 688.27 L167.53
+						 688.27 L167.53 686.69 L167.53 686.69 ZM178.59 686.69 L180.17 686.69 L180.17 688.27 L178.59 688.27 L178.59
+						 686.69 L178.59 686.69 ZM184.92 686.69 L189.4 686.69 L189.4 688.27 L184.92 688.27 L184.92 686.69 L184.92
+						 686.69 ZM13.18 694.06 L0 687.41 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM187.22 680.9 L200.4 687.48 L187.22
+						 694.06 L187.22 680.9 L187.22 680.9 Z" class="st20"/>
+		</g>
+		<g id="shape80-127" v:mID="80" v:groupContext="shape" transform="translate(982.882,-643.673)">
+			<title>Sheet.80</title>
+			<path d="M0 655.13 L0 694.06 L127.01 694.06 L127.01 655.13 L0 655.13 L0 655.13 Z" class="st16"/>
+		</g>
+		<g id="shape81-129" v:mID="81" v:groupContext="shape" transform="translate(1003.39,-660.621)">
+			<title>Sheet.81</title>
+			<desc>pkt_len</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="48.6041" cy="680.956" width="97.21" height="26.1994"/>
+			<path d="M97.21 667.86 L0 667.86 L0 694.06 L97.21 694.06 L97.21 667.86" class="st8"/>
+			<text x="11.67" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>pkt_len  </text>		</g>
+		<g id="shape82-133" v:mID="82" v:groupContext="shape" transform="translate(1001.18,-634.321)">
+			<title>Sheet.82</title>
+			<desc>% segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="49.2945" cy="680.956" width="98.59" height="26.1994"/>
+			<path d="M98.59 667.86 L0 667.86 L0 694.06 L98.59 694.06 L98.59 667.86" class="st8"/>
+			<text x="9.09" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>% segsz</text>		</g>
+		<g id="shape34-137" v:mID="34" v:groupContext="shape" v:layerMember="0" transform="translate(356.703,-264.106)">
+			<title>Sheet.34</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape85-139" v:mID="85" v:groupContext="shape" v:layerMember="0" transform="translate(78.5359,-282.66)">
+			<title>Sheet.85</title>
+			<path d="M0 680.87 C-0 673.59 6.88 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.88 694.06 0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape87-141" v:mID="87" v:groupContext="shape" v:layerMember="0" transform="translate(85.4791,-284.062)">
+			<title>Sheet.87</title>
+			<desc>1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>1</text>		</g>
+		<g id="shape88-145" v:mID="88" v:groupContext="shape" v:layerMember="0" transform="translate(468.906,-282.66)">
+			<title>Sheet.88</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape90-147" v:mID="90" v:groupContext="shape" v:layerMember="0" transform="translate(474.575,-284.062)">
+			<title>Sheet.90</title>
+			<desc>2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>2</text>		</g>
+		<g id="shape95-151" v:mID="95" v:groupContext="shape" v:layerMember="0" transform="translate(764.026,-275.998)">
+			<title>Sheet.95</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape97-155" v:mID="97" v:groupContext="shape" v:layerMember="0" transform="translate(889.755,-220.915)">
+			<title>Sheet.97</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape100-157" v:mID="100" v:groupContext="shape" v:layerMember="0" transform="translate(751.857,-262.528)">
+			<title>Sheet.100</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape104-159" v:mID="104" v:groupContext="shape" v:layerMember="1" transform="translate(851.429,-218.08)">
+			<title>Sheet.104</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape105-161" v:mID="105" v:groupContext="shape" v:layerMember="0" transform="translate(885.444,-260.015)">
+			<title>Sheet.105</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape106-165" v:mID="106" v:groupContext="shape" v:layerMember="0" transform="translate(895.672,-229.419)">
+			<title>Sheet.106</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape107-169" v:mID="107" v:groupContext="shape" v:layerMember="0" transform="translate(863.297,-280.442)">
+			<title>Sheet.107</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape108-171" v:mID="108" v:groupContext="shape" v:layerMember="0" transform="translate(870.001,-281.547)">
+			<title>Sheet.108</title>
+			<desc>3</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>3</text>		</g>
+		<g id="shape109-175" v:mID="109" v:groupContext="shape" v:layerMember="0" transform="translate(500.959,-231.87)">
+			<title>Sheet.109</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 40f04a1..c7c8b17 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -56,6 +56,7 @@ Programmer's Guide
     reorder_lib
     ip_fragment_reassembly_lib
     generic_receive_offload_lib
+    generic_segmentation_offload_lib
     pdump_lib
     multi_proc_support
     kernel_nic_interface
-- 
1.9.3


* Re: [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework
  2017-10-02 16:45           ` [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
@ 2017-10-04 13:11             ` Ananyev, Konstantin
  2017-10-04 13:21               ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 13:11 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Monday, October 2, 2017 5:46 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework
> 
> From: Jiayu Hu <jiayu.hu@intel.com>
> 
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To enable more flexibility to applications, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. Segmenting a packet requires two steps. The first
> is to set the proper flags in mbuf->ol_flags; these are the same flags
> as those used for TSO. The second is to call the segmentation API,
> rte_gso_segment(). This patch introduces the GSO API framework to DPDK.
> 
> rte_gso_segment() splits an input packet into small ones in each
> invocation. The GSO library refers to these small packets generated
> by rte_gso_segment() as GSO segments. Each of the newly-created GSO
> segments is organized as a two-segment MBUF, where the first segment is a
> standard MBUF, which stores a copy of packet header, and the second is an
> indirect MBUF which points to a section of data in the input packet.
> rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
> when all GSO segments are freed, the input packet is freed automatically.
> Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
> the driver of the interface to which the GSO segments are sent must
> support transmitting multi-segment packets.
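
The segment layout described above can be sketched as simple arithmetic.
This is only an illustration, not code from the patch: toy_gso_plan and
toy_gso_make_plan are hypothetical names, segsz is assumed to count payload
bytes per output segment (as the "segsz" and "pkt_len % segsz" labels in the
documentation figure suggest), and the input is assumed to be a
single-segment packet, so each GSO segment costs one direct and one
indirect mbuf.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of how a payload is cut up; not part of librte_gso. */
struct toy_gso_plan {
	uint32_t nb_segs;   /* number of GSO segments produced */
	uint32_t last_len;  /* payload bytes carried by the final segment */
	uint32_t nb_mbufs;  /* mbufs used: 1 direct + 1 indirect per segment */
};

/* Hypothetical helper: plan the split of payload_len bytes into pieces of
 * at most segsz bytes each. */
static struct toy_gso_plan
toy_gso_make_plan(uint32_t payload_len, uint16_t segsz)
{
	struct toy_gso_plan plan = {0, 0, 0};

	assert(segsz > 0); /* a zero segment size would divide by zero */
	if (payload_len == 0)
		return plan;

	/* Round up: a partial tail still needs its own segment. */
	plan.nb_segs = payload_len / segsz + (payload_len % segsz != 0);

	/* The tail carries the remainder, or a full segsz on an exact fit. */
	plan.last_len = payload_len % segsz;
	if (plan.last_len == 0)
		plan.last_len = segsz;

	/* Header copy (direct mbuf) plus payload pointer (indirect mbuf). */
	plan.nb_mbufs = 2 * plan.nb_segs;
	return plan;
}
```

For instance, under these assumptions a 4000-byte payload with segsz = 1460
gives three segments carrying 1460, 1460 and 1080 bytes, i.e. six mbufs in
total. A multi-segment input packet can need more than two mbufs per output
segment, which this sketch does not model.
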
> 
> The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
> packet, and all produced GSO segments in the event of success, since
> segmentation in hardware is no longer required at that point.
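
A minimal, self-contained model of the two-step flow and the flag handling
described above, mirroring the framework stub in this patch. Everything
named toy_* or TOY_* is a hypothetical stand-in so the sketch compiles
without DPDK: toy_mbuf replaces struct rte_mbuf, TOY_TX_TCP_SEG replaces
PKT_TX_TCP_SEG (its bit value here is arbitrary), and the gso_ctx argument
is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TX_TCP_SEG (1ULL << 50) /* stand-in flag; value is arbitrary */

struct toy_mbuf {
	uint64_t ol_flags; /* offload flags, as in mbuf->ol_flags */
};

/* Same argument checks and flag handling as the stub: reject bad input,
 * otherwise clear the segmentation flag and hand the packet back. */
static int
toy_gso_segment(struct toy_mbuf *pkt, struct toy_mbuf **pkts_out,
		uint16_t nb_pkts_out)
{
	if (pkt == NULL || pkts_out == NULL || nb_pkts_out < 1)
		return -EINVAL;

	pkt->ol_flags &= ~TOY_TX_TCP_SEG;
	pkts_out[0] = pkt;

	return 1;
}

/* Caller side: step 1 sets the flag, step 2 calls the segmentation API. */
static int
toy_tx_prepare(struct toy_mbuf *pkt, struct toy_mbuf **pkts_out,
		uint16_t nb_pkts_out)
{
	int ret;

	if (pkt == NULL)
		return -EINVAL;

	pkt->ol_flags |= TOY_TX_TCP_SEG;                   /* step 1 */
	ret = toy_gso_segment(pkt, pkts_out, nb_pkts_out); /* step 2 */

	/* On success the flag must be gone: no hardware TSO is needed. */
	if (ret > 0)
		assert((pkt->ol_flags & TOY_TX_TCP_SEG) == 0);

	return ret;
}
```

In this model the flag stays set when the call fails with -EINVAL, which
matches the stub returning before it touches ol_flags.
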
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> ---
>  config/common_base                     |   5 ++
>  doc/api/doxy-api-index.md              |   1 +
>  doc/api/doxy-api.conf                  |   1 +
>  doc/guides/rel_notes/release_17_11.rst |   1 +
>  lib/Makefile                           |   2 +
>  lib/librte_gso/Makefile                |  49 +++++++++++
>  lib/librte_gso/rte_gso.c               |  52 ++++++++++++
>  lib/librte_gso/rte_gso.h               | 145 +++++++++++++++++++++++++++++++++
>  lib/librte_gso/rte_gso_version.map     |   7 ++
>  mk/rte.app.mk                          |   1 +
>  10 files changed, 264 insertions(+)
>  create mode 100644 lib/librte_gso/Makefile
>  create mode 100644 lib/librte_gso/rte_gso.c
>  create mode 100644 lib/librte_gso/rte_gso.h
>  create mode 100644 lib/librte_gso/rte_gso_version.map
> 
> diff --git a/config/common_base b/config/common_base
> index 12f6be9..58ca5c0 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -653,6 +653,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
>  CONFIG_RTE_LIBRTE_GRO=y
> 
>  #
> +# Compile GSO library
> +#
> +CONFIG_RTE_LIBRTE_GSO=y
> +
> +#
>  # Compile librte_meter
>  #
>  CONFIG_RTE_LIBRTE_METER=y
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 19e0d4f..6512918 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -101,6 +101,7 @@ The public API headers are grouped by topics:
>    [TCP]                (@ref rte_tcp.h),
>    [UDP]                (@ref rte_udp.h),
>    [GRO]                (@ref rte_gro.h),
> +  [GSO]                (@ref rte_gso.h),
>    [frag/reass]         (@ref rte_ip_frag.h),
>    [LPM IPv4 route]     (@ref rte_lpm.h),
>    [LPM IPv6 route]     (@ref rte_lpm6.h),
> diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
> index 823554f..408f2e6 100644
> --- a/doc/api/doxy-api.conf
> +++ b/doc/api/doxy-api.conf
> @@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
>                            lib/librte_ether \
>                            lib/librte_eventdev \
>                            lib/librte_gro \
> +                          lib/librte_gso \
>                            lib/librte_hash \
>                            lib/librte_ip_frag \
>                            lib/librte_jobstats \
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index 8bf91bd..7508be7 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
>       librte_ethdev.so.7
>       librte_eventdev.so.2
>       librte_gro.so.1
> +   + librte_gso.so.1
>       librte_hash.so.2
>       librte_ip_frag.so.1
>       librte_jobstats.so.1
> diff --git a/lib/Makefile b/lib/Makefile
> index 86caba1..3d123f4 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>  DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
>  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>  DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> +DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
> +DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
> 
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> new file mode 100644
> index 0000000..aeaacbc
> --- /dev/null
> +++ b/lib/librte_gso/Makefile
> @@ -0,0 +1,49 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_gso.a
> +
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
> +
> +EXPORT_MAP := rte_gso_version.map
> +
> +LIBABIVER := 1
> +
> +#source files
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> new file mode 100644
> index 0000000..b773636
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso.c
> @@ -0,0 +1,52 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <errno.h>
> +
> +#include "rte_gso.h"
> +
> +int
> +rte_gso_segment(struct rte_mbuf *pkt,
> +		const struct rte_gso_ctx *gso_ctx,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
> +			nb_pkts_out < 1)
> +		return -EINVAL;
> +
> +	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +	pkts_out[0] = pkt;
> +
> +	return 1;
> +}
> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> new file mode 100644
> index 0000000..53725e6
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso.h
> @@ -0,0 +1,145 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_GSO_H_
> +#define _RTE_GSO_H_
> +
> +/**
> + * @file
> + * Interface to GSO library
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/* GSO IP id flags for the IPv4 header */
> +#define RTE_GSO_IPID_FIXED (1ULL << 0)
> +/**< Use fixed IP ids for output GSO segments. Setting
> + * !RTE_GSO_IPID_FIXED indicates using incremental IP ids.
> + */
> +
> +/**
> + * GSO context structure.
> + */
> +struct rte_gso_ctx {
> +	struct rte_mempool *direct_pool;
> +	/**< MBUF pool for allocating direct buffers, which are used
> +	 * to store packet headers for GSO segments.
> +	 */
> +	struct rte_mempool *indirect_pool;
> +	/**< MBUF pool for allocating indirect buffers, which are used
> +	 * to locate packet payloads for GSO segments. The indirect
> +	 * buffer doesn't contain any data, but simply points to an
> +	 * offset within the packet to segment.
> +	 */
> +	uint64_t ipid_flag;
> +	/**< flag to indicate GSO uses fixed or incremental IP ids for
> +	 * IPv4 headers of output GSO segments. If applications want
> +	 * fixed IP ids, set RTE_GSO_IPID_FIXED to ipid_flag. Conversely,
> +	 * if applications want incremental IP ids, set !RTE_GSO_IPID_FIXED.
> +	 */

Minor nit: I think it is better to just name it 'flag', since other,
non-IPID-related flags might be added in the future.
Also make sure that all flag values have an RTE_GSO_FLAG (or similar) prefix,
and move the explanation of each flag value to its definition.
Konstantin
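
A minimal sketch of what this suggestion might look like. The field name
'flag' and the macro RTE_GSO_FLAG_IPID_FIXED are assumptions drawn from
the comment above, not from the patch as posted, and gso_ctx_sketch is a
cut-down stand-in for struct rte_gso_ctx:

```c
#include <stdint.h>

/*
 * Hypothetical macro name using the suggested RTE_GSO_FLAG_ prefix.
 * The explanation lives with the definition: when this bit is set,
 * output GSO segments use a fixed IPv4 ID; when it is clear, they use
 * incremental IPv4 IDs.
 */
#define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)

/* Cut-down stand-in for struct rte_gso_ctx (illustration only). */
struct gso_ctx_sketch {
	uint64_t flag;     /* generic flag word, not tied to IP IDs */
	uint16_t gso_size;
};

/* Returns 1 if the context asks for fixed IP IDs, 0 otherwise. */
int
gso_ctx_ipid_is_fixed(const struct gso_ctx_sketch *ctx)
{
	return (ctx->flag & RTE_GSO_FLAG_IPID_FIXED) != 0;
}
```

With a generic flag word, incremental IDs are requested by leaving the
bit clear rather than by setting a negated macro.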

> +	uint32_t gso_types;
> +	/**< the bit mask of required GSO types. The GSO library
> +	 * uses the same macros as that of describing device TX
> +	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
> +	 * gso_types.
> +	 *
> +	 * For example, if applications want to segment TCP/IPv4
> +	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
> +	 */
> +	uint16_t gso_size;
> +	/**< maximum size of an output GSO segment, including packet
> +	 * header and payload, measured in bytes.
> +	 */
> +};
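
Because gso_size counts the header as well as the payload, the payload
carried by each output segment is gso_size minus the header length, and
the segment count is the payload length divided by that unit, rounded
up. A sketch of the arithmetic only; the helper name and the pkt_len and
hdr_len parameters are hypothetical and not part of the patch:

```c
#include <stdint.h>

/*
 * Number of output segments for a packet of pkt_len bytes whose headers
 * occupy hdr_len bytes, when each output segment may be at most
 * gso_size bytes including the header copy.
 *
 * Returns 0 if gso_size leaves no room for payload, or if the packet
 * carries no payload at all.
 */
uint32_t
gso_nb_segments_sketch(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t pyld_len, pyld_unit;

	if (gso_size <= hdr_len || pkt_len <= hdr_len)
		return 0;

	pyld_unit = (uint32_t)gso_size - hdr_len; /* payload bytes per segment */
	pyld_len = pkt_len - hdr_len;             /* total payload bytes */

	/* Ceiling division: the last segment may be shorter than the rest. */
	return (pyld_len + pyld_unit - 1) / pyld_unit;
}
```

For example, with 54 bytes of headers and gso_size set to 1514, each
segment carries 1460 bytes of payload, so 14600 bytes of payload yield
10 segments and one extra byte yields an 11th.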
> +
> +/**
> + * Segmentation function, which supports processing of both single- and
> + * multi- MBUF packets.
> + *
> + * Note that we refer to the packets that are segmented from the input
> + * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
> + * input packet has correct checksums, and doesn't update checksums for
> + * output GSO segments. Additionally, it doesn't process IP fragment
> + * packets.
> + *
> + * Before calling rte_gso_segment(), applications must set proper ol_flags
> + * for the packet. The GSO library uses the same macros as TSO.
> + * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
> + * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
> + * flag is removed for all GSO segments and the input packet.
> + *
> + * Each of the newly-created GSO segments is organized as a two-segment
> + * MBUF, where the first segment is a standard MBUF, which stores a copy
> + * of packet header, and the second is an indirect MBUF which points to
> + * a section of data in the input packet. Since each GSO segment has
> + * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
> + * the GSO segments are sent to should support transmission of multi-segment
> + * packets.
> + *
> + * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
> + * when all GSO segments are freed, the input packet is freed automatically.
> + *
> + * If the memory space in pkts_out or MBUF pools is insufficient, this
> + * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
> + * and this function returns the number of output GSO segments filled in
> + * pkts_out.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param ctx
> + *  GSO context object pointer.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when rte_gso_segment() succeeds.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of GSO segments filled in pkts_out on success.
> + *  - Return -ENOMEM if run out of memory in MBUF pools.
> + *  - Return -EINVAL for invalid parameters.
> + */
> +int rte_gso_segment(struct rte_mbuf *pkt,
> +		const struct rte_gso_ctx *ctx,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_GSO_H_ */
> diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
> new file mode 100644
> index 0000000..e1fd453
> --- /dev/null
> +++ b/lib/librte_gso/rte_gso_version.map
> @@ -0,0 +1,7 @@
> +DPDK_17.11 {
> +	global:
> +
> +	rte_gso_segment;
> +
> +	local: *;
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index c25fdd9..d4c9873 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
> --
> 1.9.3


* Re: [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework
  2017-10-04 13:11             ` Ananyev, Konstantin
@ 2017-10-04 13:21               ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 13:21 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 2:11 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 1/6] gso: add Generic Segmentation Offload API
>framework
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Monday, October 2, 2017 5:46 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
>> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>> Subject: [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework
>>
>> From: Jiayu Hu <jiayu.hu@intel.com>
>>
>> Generic Segmentation Offload (GSO) is a SW technique to split large
>> packets into small ones. Akin to TSO, GSO enables applications to
>> operate on large packets, thus reducing per-packet processing overhead.
>>
>> To enable more flexibility to applications, DPDK GSO is implemented
>> as a standalone library. Applications explicitly use the GSO library
>> to segment packets. Segmenting a packet requires two steps. The first
>> is to set the proper flags in mbuf->ol_flags; these are the same flags
>> used for TSO. The second is to call the segmentation API,
>> rte_gso_segment(). This patch introduces the GSO API framework to DPDK.
>>
>> rte_gso_segment() splits an input packet into small ones in each
>> invocation. The GSO library refers to these small packets generated
>> by rte_gso_segment() as GSO segments. Each of the newly-created GSO
>> segments is organized as a two-segment MBUF, where the first segment is a
>> standard MBUF, which stores a copy of packet header, and the second is an
>> indirect MBUF which points to a section of data in the input packet.
>> rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
>> when all GSO segments are freed, the input packet is freed automatically.
>> Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
>> the driver of the interface which the GSO segments are sent to should
>> support transmission of multi-segment packets.
>>
>> The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
>> packet, and all produced GSO segments in the event of success, since
>> segmentation in hardware is no longer required at that point.
>>
>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> ---
>>  config/common_base                     |   5 ++
>>  doc/api/doxy-api-index.md              |   1 +
>>  doc/api/doxy-api.conf                  |   1 +
>>  doc/guides/rel_notes/release_17_11.rst |   1 +
>>  lib/Makefile                           |   2 +
>>  lib/librte_gso/Makefile                |  49 +++++++++++
>>  lib/librte_gso/rte_gso.c               |  52 ++++++++++++
>>  lib/librte_gso/rte_gso.h               | 145 +++++++++++++++++++++++++++++++++
>>  lib/librte_gso/rte_gso_version.map     |   7 ++
>>  mk/rte.app.mk                          |   1 +
>>  10 files changed, 264 insertions(+)
>>  create mode 100644 lib/librte_gso/Makefile
>>  create mode 100644 lib/librte_gso/rte_gso.c
>>  create mode 100644 lib/librte_gso/rte_gso.h
>>  create mode 100644 lib/librte_gso/rte_gso_version.map
>>
>> diff --git a/config/common_base b/config/common_base
>> index 12f6be9..58ca5c0 100644
>> --- a/config/common_base
>> +++ b/config/common_base
>> @@ -653,6 +653,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
>>  CONFIG_RTE_LIBRTE_GRO=y
>>
>>  #
>> +# Compile GSO library
>> +#
>> +CONFIG_RTE_LIBRTE_GSO=y
>> +
>> +#
>>  # Compile librte_meter
>>  #
>>  CONFIG_RTE_LIBRTE_METER=y
>> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
>> index 19e0d4f..6512918 100644
>> --- a/doc/api/doxy-api-index.md
>> +++ b/doc/api/doxy-api-index.md
>> @@ -101,6 +101,7 @@ The public API headers are grouped by topics:
>>    [TCP]                (@ref rte_tcp.h),
>>    [UDP]                (@ref rte_udp.h),
>>    [GRO]                (@ref rte_gro.h),
>> +  [GSO]                (@ref rte_gso.h),
>>    [frag/reass]         (@ref rte_ip_frag.h),
>>    [LPM IPv4 route]     (@ref rte_lpm.h),
>>    [LPM IPv6 route]     (@ref rte_lpm6.h),
>> diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
>> index 823554f..408f2e6 100644
>> --- a/doc/api/doxy-api.conf
>> +++ b/doc/api/doxy-api.conf
>> @@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
>>                            lib/librte_ether \
>>                            lib/librte_eventdev \
>>                            lib/librte_gro \
>> +                          lib/librte_gso \
>>                            lib/librte_hash \
>>                            lib/librte_ip_frag \
>>                            lib/librte_jobstats \
>> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
>> index 8bf91bd..7508be7 100644
>> --- a/doc/guides/rel_notes/release_17_11.rst
>> +++ b/doc/guides/rel_notes/release_17_11.rst
>> @@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
>>       librte_ethdev.so.7
>>       librte_eventdev.so.2
>>       librte_gro.so.1
>> +   + librte_gso.so.1
>>       librte_hash.so.2
>>       librte_ip_frag.so.1
>>       librte_jobstats.so.1
>> diff --git a/lib/Makefile b/lib/Makefile
>> index 86caba1..3d123f4 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>>  DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
>>  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>>  DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
>> +DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
>> +DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
>>
>>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
>> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
>> new file mode 100644
>> index 0000000..aeaacbc
>> --- /dev/null
>> +++ b/lib/librte_gso/Makefile
>> @@ -0,0 +1,49 @@
>> +#   BSD LICENSE
>> +#
>> +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> +#   All rights reserved.
>> +#
>> +#   Redistribution and use in source and binary forms, with or without
>> +#   modification, are permitted provided that the following conditions
>> +#   are met:
>> +#
>> +#     * Redistributions of source code must retain the above copyright
>> +#       notice, this list of conditions and the following disclaimer.
>> +#     * Redistributions in binary form must reproduce the above copyright
>> +#       notice, this list of conditions and the following disclaimer in
>> +#       the documentation and/or other materials provided with the
>> +#       distribution.
>> +#     * Neither the name of Intel Corporation nor the names of its
>> +#       contributors may be used to endorse or promote products derived
>> +#       from this software without specific prior written permission.
>> +#
>> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> +
>> +include $(RTE_SDK)/mk/rte.vars.mk
>> +
>> +# library name
>> +LIB = librte_gso.a
>> +
>> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
>> +
>> +EXPORT_MAP := rte_gso_version.map
>> +
>> +LIBABIVER := 1
>> +
>> +#source files
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>> +
>> +# install this header file
>> +SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
>> +
>> +include $(RTE_SDK)/mk/rte.lib.mk
>> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> new file mode 100644
>> index 0000000..b773636
>> --- /dev/null
>> +++ b/lib/librte_gso/rte_gso.c
>> @@ -0,0 +1,52 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include <errno.h>
>> +
>> +#include "rte_gso.h"
>> +
>> +int
>> +rte_gso_segment(struct rte_mbuf *pkt,
>> +		const struct rte_gso_ctx *gso_ctx,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>> +			nb_pkts_out < 1)
>> +		return -EINVAL;
>> +
>> +	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +	pkts_out[0] = pkt;
>> +
>> +	return 1;
>> +}
>> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
>> new file mode 100644
>> index 0000000..53725e6
>> --- /dev/null
>> +++ b/lib/librte_gso/rte_gso.h
>> @@ -0,0 +1,145 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _RTE_GSO_H_
>> +#define _RTE_GSO_H_
>> +
>> +/**
>> + * @file
>> + * Interface to GSO library
>> + */
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <stdint.h>
>> +#include <rte_mbuf.h>
>> +
>> +/* GSO IP id flags for the IPv4 header */
>> +#define RTE_GSO_IPID_FIXED (1ULL << 0)
>> +/**< Use fixed IP ids for output GSO segments. Setting
>> + * !RTE_GSO_IPID_FIXED indicates using incremental IP ids.
>> + */
>> +
>> +/**
>> + * GSO context structure.
>> + */
>> +struct rte_gso_ctx {
>> +	struct rte_mempool *direct_pool;
>> +	/**< MBUF pool for allocating direct buffers, which are used
>> +	 * to store packet headers for GSO segments.
>> +	 */
>> +	struct rte_mempool *indirect_pool;
>> +	/**< MBUF pool for allocating indirect buffers, which are used
>> +	 * to locate packet payloads for GSO segments. The indirect
>> +	 * buffer doesn't contain any data, but simply points to an
>> +	 * offset within the packet to segment.
>> +	 */
>> +	uint64_t ipid_flag;
>> +	/**< flag to indicate GSO uses fixed or incremental IP ids for
>> +	 * IPv4 headers of output GSO segments. If applications want
>> +	 * fixed IP ids, set RTE_GSO_IPID_FIXED to ipid_flag. Conversely,
>> +	 * if applications want incremental IP ids, set !RTE_GSO_IPID_FIXED.
>> +	 */
>
>Minor nit: I think it is better to just name it 'flag', since other,
>non-IPID-related flags might be added in the future.
>Also make sure that all flag values have an RTE_GSO_FLAG (or similar) prefix,
>and move the explanation of each flag value to its definition.
>Konstantin

Will do Konstantin - thanks!


>
>> +	uint32_t gso_types;
>> +	/**< the bit mask of required GSO types. The GSO library
>> +	 * uses the same macros as that of describing device TX
>> +	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
>> +	 * gso_types.
>> +	 *
>> +	 * For example, if applications want to segment TCP/IPv4
>> +	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
>> +	 */
>> +	uint16_t gso_size;
>> +	/**< maximum size of an output GSO segment, including packet
>> +	 * header and payload, measured in bytes.
>> +	 */
>> +};
>> +
>> +/**
>> + * Segmentation function, which supports processing of both single- and
>> + * multi- MBUF packets.
>> + *
>> + * Note that we refer to the packets that are segmented from the input
>> + * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
>> + * input packet has correct checksums, and doesn't update checksums for
>> + * output GSO segments. Additionally, it doesn't process IP fragment
>> + * packets.
>> + *
>> + * Before calling rte_gso_segment(), applications must set proper ol_flags
>> + * for the packet. The GSO library uses the same macros as TSO.
>> + * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
>> + * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
>> + * flag is removed for all GSO segments and the input packet.
>> + *
>> + * Each of the newly-created GSO segments is organized as a two-segment
>> + * MBUF, where the first segment is a standard MBUF, which stores a copy
>> + * of packet header, and the second is an indirect MBUF which points to
>> + * a section of data in the input packet. Since each GSO segment has
>> + * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
>> + * the GSO segments are sent to should support transmission of multi-segment
>> + * packets.
>> + *
>> + * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
>> + * when all GSO segments are freed, the input packet is freed automatically.
>> + *
>> + * If the memory space in pkts_out or MBUF pools is insufficient, this
>> + * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
>> + * and this function returns the number of output GSO segments filled in
>> + * pkts_out.
>> + *
>> + * @param pkt
>> + *  The packet mbuf to segment.
>> + * @param ctx
>> + *  GSO context object pointer.
>> + * @param pkts_out
>> + *  Pointer array used to store the MBUF addresses of output GSO
>> + *  segments, when rte_gso_segment() succeeds.
>> + * @param nb_pkts_out
>> + *  The max number of items that pkts_out can keep.
>> + *
>> + * @return
>> + *  - The number of GSO segments filled in pkts_out on success.
>> + *  - Return -ENOMEM if run out of memory in MBUF pools.
>> + *  - Return -EINVAL for invalid parameters.
>> + */
>> +int rte_gso_segment(struct rte_mbuf *pkt,
>> +		const struct rte_gso_ctx *ctx,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_GSO_H_ */
>> diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
>> new file mode 100644
>> index 0000000..e1fd453
>> --- /dev/null
>> +++ b/lib/librte_gso/rte_gso_version.map
>> @@ -0,0 +1,7 @@
>> +DPDK_17.11 {
>> +	global:
>> +
>> +	rte_gso_segment;
>> +
>> +	local: *;
>> +};
>> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
>> index c25fdd9..d4c9873 100644
>> --- a/mk/rte.app.mk
>> +++ b/mk/rte.app.mk
>> @@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
>> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
>> --
>> 1.9.3


* Re: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
  2017-10-02 16:45           ` [PATCH v6 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
@ 2017-10-04 13:32             ` Ananyev, Konstantin
  2017-10-04 14:30               ` Kavanagh, Mark B
  2017-10-04 13:35             ` Ananyev, Konstantin
  1 sibling, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 13:32 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas

Hi Mark,

> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Monday, October 2, 2017 5:46 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
> 
> From: Jiayu Hu <jiayu.hu@intel.com>
> 
> This patch adds GSO support for TCP/IPv4 packets. Supported packets
> may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
> packets have correct checksums, and doesn't update checksums for
> output packets (the responsibility for this lies with the application).
> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> 
> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> MBUF, to organize an output packet. Note that we refer to these two
> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> header, while the indirect MBUF simply points to a location within the
> original packet's payload. Consequently, use of the GSO library requires
> multi-segment MBUF support in the TX functions of the NIC driver.
> 
> If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> result, when all of its GSO'd segments are freed, the packet is freed
> automatically.
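
The reference counting described here can be modelled with a toy
example. Everything below is a hypothetical stand-in: toy_pkt, toy_seg
and the helper names are not DPDK APIs. The model assumes that each GSO
segment's indirect buffer holds one reference on the input packet, and
that segmentation drops the caller's original reference on success.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy input packet: a reference count plus a flag recording its release. */
struct toy_pkt {
	uint16_t refcnt;
	int freed;
};

/* Toy GSO segment: only tracks which input packet its payload points into. */
struct toy_seg {
	struct toy_pkt *src;
};

/* Drop one reference; the packet is released when the count reaches zero. */
void
toy_pkt_put(struct toy_pkt *p)
{
	if (--p->refcnt == 0)
		p->freed = 1;
}

/*
 * Model of a successful segmentation: every segment takes a reference
 * on the input packet, then the original reference is dropped.
 */
void
toy_gso(struct toy_pkt *p, struct toy_seg *segs, uint16_t nb_segs)
{
	uint16_t i;

	for (i = 0; i < nb_segs; i++) {
		segs[i].src = p;
		p->refcnt++;
	}
	toy_pkt_put(p);
}

/* Freeing a segment releases its reference on the input packet. */
void
toy_seg_free(struct toy_seg *s)
{
	toy_pkt_put(s->src);
	s->src = NULL;
}
```

In this model the input packet survives until the last segment is
freed, which is why the application does not free it separately after a
successful segmentation.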
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Tested-by: Lei Yao <lei.a.yao@intel.com>
> ---
>  doc/guides/rel_notes/release_17_11.rst  |  12 +++
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_gso/Makefile                 |   2 +
>  lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
>  lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
>  lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
>  lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
>  lib/librte_gso/rte_gso.c                |  52 ++++++++++-
>  8 files changed, 536 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tcp4.h
> 
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index 7508be7..c414f73 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -41,6 +41,18 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =========================================================
> 
> +* **Added the Generic Segmentation Offload Library.**
> +
> +  Added the Generic Segmentation Offload (GSO) library to enable
> +  applications to split large packets (e.g. MTU is 64KB) into small
> +  ones (e.g. MTU is 1500B). Supported packet types are:
> +
> +  * TCP/IPv4 packets, which may include a single VLAN tag.
> +
> +  The GSO library doesn't check if the input packets have correct
> +  checksums, and doesn't update checksums for output packets.
> +  Additionally, the GSO library doesn't process IP fragmented packets.
> +
> 
>  Resolved Issues
>  ---------------
> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> index ec8dba7..2fa1199 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -87,6 +87,7 @@ struct rte_logs {
>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
> 
>  /* these log types can be used in an application */
>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index aeaacbc..2be64d1 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -42,6 +42,8 @@ LIBABIVER := 1
> 
>  #source files
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> 
>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
> new file mode 100644
> index 0000000..ee75d4c
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.c
> @@ -0,0 +1,153 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdbool.h>
> +#include <errno.h>
> +
> +#include <rte_memcpy.h>
> +#include <rte_mempool.h>
> +
> +#include "gso_common.h"
> +
> +static inline void
> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset)
> +{
> +	/* Copy MBUF metadata */
> +	hdr_segment->nb_segs = 1;
> +	hdr_segment->port = pkt->port;
> +	hdr_segment->ol_flags = pkt->ol_flags;
> +	hdr_segment->packet_type = pkt->packet_type;
> +	hdr_segment->pkt_len = pkt_hdr_offset;
> +	hdr_segment->data_len = pkt_hdr_offset;
> +	hdr_segment->tx_offload = pkt->tx_offload;
> +
> +	/* Copy the packet header */
> +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
> +			rte_pktmbuf_mtod(pkt, char *),
> +			pkt_hdr_offset);
> +}
> +
> +static inline void
> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < nb_pkts; i++)
> +		rte_pktmbuf_free(pkts[i]);
> +}
> +
> +int
> +gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct rte_mbuf *pkt_in;
> +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
> +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
> +	uint16_t pyld_len, nb_segs;
> +	bool more_in_pkt, more_out_segs;
> +
> +	pkt_in = pkt;
> +	nb_segs = 0;
> +	more_in_pkt = 1;
> +	pkt_in_data_pos = pkt_hdr_offset;
> +
> +	while (more_in_pkt) {
> +		if (unlikely(nb_segs >= nb_pkts_out)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -EINVAL;
> +		}
> +
> +		/* Allocate a direct MBUF */
> +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
> +		if (unlikely(hdr_segment == NULL)) {
> +			free_gso_segment(pkts_out, nb_segs);
> +			return -ENOMEM;
> +		}
> +		/* Fill the packet header */
> +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
> +
> +		prev_segment = hdr_segment;
> +		segment_bytes_remaining = pyld_unit_size;
> +		more_out_segs = 1;
> +
> +		while (more_out_segs && more_in_pkt) {
> +			/* Allocate an indirect MBUF */
> +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
> +			if (unlikely(pyld_segment == NULL)) {
> +				rte_pktmbuf_free(hdr_segment);
> +				free_gso_segment(pkts_out, nb_segs);
> +				return -ENOMEM;
> +			}
> +			/* Attach to current MBUF segment of pkt */
> +			rte_pktmbuf_attach(pyld_segment, pkt_in);
> +
> +			prev_segment->next = pyld_segment;
> +			prev_segment = pyld_segment;
> +
> +			pyld_len = segment_bytes_remaining;
> +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
> +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
> +
> +			pyld_segment->data_off = pkt_in_data_pos +
> +				pkt_in->data_off;
> +			pyld_segment->data_len = pyld_len;
> +
> +			/* Update header segment */
> +			hdr_segment->pkt_len += pyld_len;
> +			hdr_segment->nb_segs++;
> +
> +			pkt_in_data_pos += pyld_len;
> +			segment_bytes_remaining -= pyld_len;
> +
> +			/* Finish processing a MBUF segment of pkt */
> +			if (pkt_in_data_pos == pkt_in->data_len) {
> +				pkt_in = pkt_in->next;
> +				pkt_in_data_pos = 0;
> +				if (pkt_in == NULL)
> +					more_in_pkt = 0;
> +			}
> +
> +			/* Finish generating a GSO segment */
> +			if (segment_bytes_remaining == 0)
> +				more_out_segs = 0;
> +		}
> +		pkts_out[nb_segs++] = hdr_segment;
> +	}
> +	return nb_segs;
> +}
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> new file mode 100644
> index 0000000..8d9b94e
> --- /dev/null
> +++ b/lib/librte_gso/gso_common.h
> @@ -0,0 +1,141 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_COMMON_H_
> +#define _GSO_COMMON_H_
> +
> +#include <stdint.h>
> +
> +#include <rte_mbuf.h>
> +#include <rte_ip.h>
> +#include <rte_tcp.h>
> +
> +#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
> +		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
> +
> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
> +
> +#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
> +
> +/**
> + * Internal function which updates the TCP header of a packet, following
> + * segmentation. This is required to update the header's 'sent' sequence
> + * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
> + *
> + * @param pkt
> + *  The packet containing the TCP header.
> + * @param l4_offset
> + *  The offset of the TCP header from the start of the packet.
> + * @param sent_seq
> + *  The sent sequence number.
> + * @param non_tail
> + *  Non-zero if this is a non-tail segment; zero if it is the tail segment.
> + */
> +static inline void
> +update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
> +		uint8_t non_tail)
> +{
> +	struct tcp_hdr *tcp_hdr;
> +
> +	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			l4_offset);
> +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
> +	if (likely(non_tail))
> +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
> +					TCP_HDR_FIN_MASK));
> +}
> +
> +/**
> + * Internal function which updates the IPv4 header of a packet, following
> + * segmentation. This is required to update the header's 'total_length' field,
> + * to reflect the reduced length of the now-segmented packet. Furthermore, the
> + * header's 'packet_id' field must be updated to reflect the new ID of the
> + * now-segmented packet.
> + *
> + * @param pkt
> + *  The packet containing the IPv4 header.
> + * @param l3_offset
> + *  The offset of the IPv4 header from the start of the packet.
> + * @param id
> + *  The new ID of the packet.
> + */
> +static inline void
> +update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			l3_offset);
> +	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
> +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
> +}
> +
> +/**
> + * Internal function which divides the input packet into small segments.
> + * Each of the newly-created segments is organized as a two-segment MBUF,
> + * where the first segment is a standard mbuf, which stores a copy of
> + * packet header, and the second is an indirect mbuf which points to a
> + * section of data in the input packet.
> + *
> + * @param pkt
> + *  Packet to segment.
> + * @param pkt_hdr_offset
> + *  Packet header offset, measured in bytes.
> + * @param pyld_unit_size
> + *  The max payload length of a GSO segment.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to keep the mbuf addresses of output segments. If
> + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
> + *  and returns -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that pkts_out can keep.
> + *
> + * @return
> + *  - The number of segments created in the event of success.
> + *  - Return -ENOMEM if run out of memory in MBUF pools.
> + *  - Return -EINVAL for invalid parameters.
> + */
> +int gso_do_segment(struct rte_mbuf *pkt,
> +		uint16_t pkt_hdr_offset,
> +		uint16_t pyld_unit_size,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
> new file mode 100644
> index 0000000..d83e610
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.c
> @@ -0,0 +1,104 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
> +
> +static void
> +update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	struct tcp_hdr *tcp_hdr;
> +	uint32_t sent_seq;
> +	uint16_t id, tail_idx, i;
> +	uint16_t l3_offset = pkt->l2_len;
> +	uint16_t l4_offset = l3_offset + pkt->l3_len;
> +
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
> +			l3_offset);
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +	tail_idx = nb_segs - 1;
> +
> +	for (i = 0; i < nb_segs; i++) {
> +		update_ipv4_header(segs[i], l3_offset, id);
> +		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
> +		id += ipid_delta;
> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> +	}
> +}
> +
> +int
> +gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	uint16_t tcp_dl;
> +	uint16_t pyld_unit_size, hdr_offset;
> +	uint16_t frag_off;
> +	int ret;
> +
> +	/* Don't process the fragmented packet */
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			pkt->l2_len);
> +	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	/* Don't process the packet without data */
> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
> +	if (unlikely(tcp_dl == 0)) {
> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> +	pyld_unit_size = gso_size - hdr_offset;
> +
> +	/* Segment the payload */
> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> +			indirect_pool, pkts_out, nb_pkts_out);
> +	if (ret > 1)
> +		update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
> +
> +	return ret;
> +}
> diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
> new file mode 100644
> index 0000000..1c57441
> --- /dev/null
> +++ b/lib/librte_gso/gso_tcp4.h
> @@ -0,0 +1,74 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_TCP4_H_
> +#define _GSO_TCP4_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * Segment an IPv4/TCP packet. This function doesn't check if the input
> + * packet has correct checksums, and doesn't update checksums for output
> + * GSO segments. Furthermore, it doesn't process IP fragment packets.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param gso_size
> + *  The max length of a GSO segment, measured in bytes.
> + * @param ipid_delta
> + *  The increment applied to the IP ID of each successive segment.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when the function succeeds. If the memory space in
> + *  pkts_out is insufficient, it fails and returns -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that 'pkts_out' can keep.
> + *
> + * @return
> + *   - The number of GSO segments filled in pkts_out on success.
> + *   - Return -ENOMEM if run out of memory in MBUF pools.
> + *   - Return -EINVAL for invalid parameters.
> + */
> +int gso_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index b773636..a4fce50 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -33,7 +33,12 @@
> 
>  #include <errno.h>
> 
> +#include <rte_log.h>
> +#include <rte_ethdev.h>
> +
>  #include "rte_gso.h"
> +#include "gso_common.h"
> +#include "gso_tcp4.h"
> 
>  int
>  rte_gso_segment(struct rte_mbuf *pkt,
> @@ -41,12 +46,53 @@
>  		struct rte_mbuf **pkts_out,
>  		uint16_t nb_pkts_out)
>  {
> +	struct rte_mempool *direct_pool, *indirect_pool;
> +	struct rte_mbuf *pkt_seg;
> +	uint64_t ol_flags;
> +	uint16_t gso_size;
> +	uint8_t ipid_delta;
> +	int ret = 1;
> +
>  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>  			nb_pkts_out < 1)
>  		return -EINVAL;
> 
> -	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> -	pkts_out[0] = pkt;
> +	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
> +				DEV_TX_OFFLOAD_TCP_TSO) !=
> +			gso_ctx->gso_types) {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	direct_pool = gso_ctx->direct_pool;
> +	indirect_pool = gso_ctx->indirect_pool;
> +	gso_size = gso_ctx->gso_size;
> +	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
> +	ol_flags = pkt->ol_flags;
> +
> +	if (IS_IPV4_TCP(pkt->ol_flags)) {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> +				direct_pool, indirect_pool,
> +				pkts_out, nb_pkts_out);
> +	} else {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);

Not sure why you clear this flag if you don't support that packet type
and no action was performed.
Suppose you have a mix of ipv4 and ipv6 packets - gso lib would do ipv4 and someone else
(HW?) can do ipv6 segmentation.
BTW, did you notice that building of the shared target fails?
Konstantin


> +		pkts_out[0] = pkt;
> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> +		return 1;
> +	}
> +
> +	if (ret > 1) {
> +		pkt_seg = pkt;
> +		while (pkt_seg) {
> +			rte_mbuf_refcnt_update(pkt_seg, -1);
> +			pkt_seg = pkt_seg->next;
> +		}
> +	} else if (ret < 0) {
> +		/* Revert the ol_flags in the event of failure. */
> +		pkt->ol_flags = ol_flags;
> +	}
> 
> -	return 1;
> +	return ret;
>  }
> --
> 1.9.3
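
For reference, the length arithmetic used by gso_tcp4_segment() in the quoted
patch (pyld_unit_size = gso_size - hdr_offset, and sent_seq advancing by each
segment's payload length) can be sketched standalone as below. The sk_*
helpers are hypothetical names for illustration, not part of the patch. The
guard for gso_size <= header length is an assumption added here, since the
uint16_t subtraction would otherwise wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Payload bytes per output segment; 0 means gso_size leaves no room for payload. */
static uint16_t
sk_pyld_unit_size(uint16_t gso_size, uint16_t hdr_len)
{
	return (gso_size > hdr_len) ? (uint16_t)(gso_size - hdr_len) : 0;
}

/* Number of output segments needed to carry pyld_len payload bytes. */
static uint32_t
sk_nb_segs(uint32_t pyld_len, uint16_t unit)
{
	assert(unit != 0);
	return (pyld_len + unit - 1) / unit;
}

/*
 * TCP sequence number of segment idx (0-based). Every segment before idx
 * carries a full unit of payload, and unsigned arithmetic wraps modulo 2^32
 * exactly as TCP sequence numbers do.
 */
static uint32_t
sk_seg_seq(uint32_t first_seq, uint32_t idx, uint16_t unit)
{
	return first_seq + idx * (uint32_t)unit;
}
```

For example, with a 1514-byte gso_size and 54 bytes of Ethernet, IPv4 and TCP
headers, each segment carries up to 1460 payload bytes.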

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
  2017-10-02 16:45           ` [PATCH v6 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
  2017-10-04 13:32             ` Ananyev, Konstantin
@ 2017-10-04 13:35             ` Ananyev, Konstantin
  2017-10-04 14:22               ` Kavanagh, Mark B
  1 sibling, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 13:35 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Wednesday, October 4, 2017 2:32 PM
> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: RE: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
> 
> Hi Mark,
> 
> > -----Original Message-----
> > From: Kavanagh, Mark B
> > Sent: Monday, October 2, 2017 5:46 PM
> > To: dev@dpdk.org
> > Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> > Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> > Subject: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
> >
> > From: Jiayu Hu <jiayu.hu@intel.com>
> >
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
> > packets have correct checksums, and doesn't update checksums for
> > output packets (the responsibility for this lies with the application).
> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> >
> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> > MBUF, to organize an output packet. Note that we refer to these two
> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of the GSO library requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> >
> > If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSO'd segments are freed, the packet is freed
> > automatically.
> >
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > Tested-by: Lei Yao <lei.a.yao@intel.com>
> > ---
> >  doc/guides/rel_notes/release_17_11.rst  |  12 +++
> >  lib/librte_eal/common/include/rte_log.h |   1 +
> >  lib/librte_gso/Makefile                 |   2 +
> >  lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
> >  lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
> >  lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
> >  lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
> >  lib/librte_gso/rte_gso.c                |  52 ++++++++++-
> >  8 files changed, 536 insertions(+), 3 deletions(-)
> >  create mode 100644 lib/librte_gso/gso_common.c
> >  create mode 100644 lib/librte_gso/gso_common.h
> >  create mode 100644 lib/librte_gso/gso_tcp4.c
> >  create mode 100644 lib/librte_gso/gso_tcp4.h
> >
> > diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> > index 7508be7..c414f73 100644
> > --- a/doc/guides/rel_notes/release_17_11.rst
> > +++ b/doc/guides/rel_notes/release_17_11.rst
> > @@ -41,6 +41,18 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =========================================================
> >
> > +* **Added the Generic Segmentation Offload Library.**
> > +
> > +  Added the Generic Segmentation Offload (GSO) library to enable
> > +  applications to split large packets (e.g. MTU is 64KB) into small
> > +  ones (e.g. MTU is 1500B). Supported packet types are:
> > +
> > +  * TCP/IPv4 packets, which may include a single VLAN tag.

As a nit: I think it doesn't matter as you are relying on mbuf->l2_len.
Konstantin


* Re: [PATCH v6 6/6] doc: add GSO programmer's guide
  2017-10-02 16:45           ` [PATCH v6 6/6] doc: add GSO programmer's guide Mark Kavanagh
@ 2017-10-04 13:51             ` Mcnamara, John
  0 siblings, 0 replies; 157+ messages in thread
From: Mcnamara, John @ 2017-10-04 13:51 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev
  Cc: Hu, Jiayu, Tan, Jianfeng, Ananyev, Konstantin, Yigit, Ferruh,
	thomas, Kavanagh, Mark B



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Mark Kavanagh
> Sent: Monday, October 2, 2017 5:46 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [dpdk-dev] [PATCH v6 6/6] doc: add GSO programmer's guide
> 
> Add programmer's guide doc to explain the design and use of the
> GSO library.
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>

Acked-by: John McNamara <john.mcnamara@intel.com>


* Re: [PATCH v6 3/6] gso: add VxLAN GSO support
  2017-10-02 16:45           ` [PATCH v6 3/6] gso: add VxLAN " Mark Kavanagh
@ 2017-10-04 14:12             ` Ananyev, Konstantin
  2017-10-04 14:35               ` Kavanagh, Mark B
  2017-10-04 16:13               ` Kavanagh, Mark B
  0 siblings, 2 replies; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 14:12 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Monday, October 2, 2017 5:46 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [PATCH v6 3/6] gso: add VxLAN GSO support
> 
> This patch adds a framework that allows GSO on tunneled packets.
> Furthermore, it leverages that framework to provide GSO support for
> VxLAN-encapsulated packets.
> 
> Supported VxLAN packets must have an outer IPv4 header (prepended by an
> optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
> inner VLAN tag).
> 
> VxLAN GSO doesn't check if input packets have correct checksums and
> doesn't update checksums for output packets. Additionally, it doesn't
> process IP fragmented packets.
> 
> As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
> output packet, which mandates support for multi-segment mbufs in the TX
> functions of the NIC driver. Also, if a packet is GSO'd, VxLAN GSO
> reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
> are freed, the packet is freed automatically.
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> ---
>  doc/guides/rel_notes/release_17_11.rst |   3 +
>  lib/librte_gso/Makefile                |   1 +
>  lib/librte_gso/gso_common.h            |  25 +++++++
>  lib/librte_gso/gso_tunnel_tcp4.c       | 123 +++++++++++++++++++++++++++++++++
>  lib/librte_gso/gso_tunnel_tcp4.h       |  75 ++++++++++++++++++++
>  lib/librte_gso/rte_gso.c               |  13 +++-
>  6 files changed, 237 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
> 
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index c414f73..25b8a78 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -48,6 +48,9 @@ New Features
>    ones (e.g. MTU is 1500B). Supported packet types are:
> 
>    * TCP/IPv4 packets, which may include a single VLAN tag.
> +  * VxLAN packets, which must have an outer IPv4 header (prepended by
> +    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
> +    an optional VLAN tag).
> 
>    The GSO library doesn't check if the input packets have correct
>    checksums, and doesn't update checksums for output packets.
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index 2be64d1..e6d41df 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -44,6 +44,7 @@ LIBABIVER := 1
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
> 
>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> index 8d9b94e..c051295 100644
> --- a/lib/librte_gso/gso_common.h
> +++ b/lib/librte_gso/gso_common.h
> @@ -39,6 +39,7 @@
>  #include <rte_mbuf.h>
>  #include <rte_ip.h>
>  #include <rte_tcp.h>
> +#include <rte_udp.h>
> 
>  #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
>  		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
> @@ -49,6 +50,30 @@
>  #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
>  		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
> 
> +#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
> +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> +		 PKT_TX_TUNNEL_VXLAN))
> +
> +/**
> + * Internal function which updates the UDP header of a packet, following
> + * segmentation. This is required to update the header's datagram length field.
> + *
> + * @param pkt
> + *  The packet containing the UDP header.
> + * @param udp_offset
> + *  The offset of the UDP header from the start of the packet.
> + */
> +static inline void
> +update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
> +{
> +	struct udp_hdr *udp_hdr;
> +
> +	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			udp_offset);
> +	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
> +}
> +
>  /**
>   * Internal function which updates the TCP header of a packet, following
>   * segmentation. This is required to update the header's 'sent' sequence
> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
> new file mode 100644
> index 0000000..34bbbd7
> --- /dev/null
> +++ b/lib/librte_gso/gso_tunnel_tcp4.c
> @@ -0,0 +1,123 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include "gso_common.h"
> +#include "gso_tunnel_tcp4.h"
> +
> +static void
> +update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> +		struct rte_mbuf **segs, uint16_t nb_segs)
> +{
> +	struct ipv4_hdr *ipv4_hdr;
> +	struct tcp_hdr *tcp_hdr;
> +	uint32_t sent_seq;
> +	uint16_t outer_id, inner_id, tail_idx, i;
> +	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
> +
> +	outer_ipv4_offset = pkt->outer_l2_len;
> +	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
> +	inner_ipv4_offset = udp_offset + pkt->l2_len;
> +	tcp_offset = inner_ipv4_offset + pkt->l3_len;
> +
> +	/* Outer IPv4 header. */
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			outer_ipv4_offset);
> +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +
> +	/* Inner IPv4 header. */
> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			inner_ipv4_offset);
> +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> +
> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> +	tail_idx = nb_segs - 1;
> +
> +	for (i = 0; i < nb_segs; i++) {
> +		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
> +		update_udp_header(segs[i], udp_offset);
> +		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
> +		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
> +		outer_id++;
> +		inner_id += ipid_delta;
> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> +	}
> +}
> +
> +int
> +gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out)
> +{
> +	struct ipv4_hdr *inner_ipv4_hdr;
> +	uint16_t pyld_unit_size, hdr_offset;
> +	uint16_t tcp_dl, frag_off;
> +	int ret = 1;
> +
> +	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> +	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> +			hdr_offset);
> +	/*
> +	 * Don't process the packet whose MF bit or offset in the inner
> +	 * IPv4 header are non-zero.
> +	 */
> +	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	/* Don't process the packet without data */
> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
> +	if (unlikely(tcp_dl == 0)) {

You probably need to take outer_l2_len and outer_l3_len into account too.
It is probably better to move that check after the final hdr_offset calculation:

...
hdr_offset += pkt->l3_len + pkt->l4_len;
if (hdr_offset >= pkt->pkt_len) { ...; return 1; }
...
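
For illustration only, the suggested check can be sketched outside DPDK as follows. The struct pkt_lens type and both helper names are hypothetical stand-ins for the rte_mbuf length fields used in the patch. The field semantics (for tunneled packets, l2_len covers the outer L4 header, the tunnel header and the inner L2 header) are assumed from the offset calculations in update_tunnel_ipv4_tcp_headers().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the rte_mbuf length fields used by the patch. */
struct pkt_lens {
	uint32_t pkt_len;      /* total packet length in bytes */
	uint16_t outer_l2_len; /* outer Ethernet (+ VLAN) header */
	uint16_t outer_l3_len; /* outer IPv4 header */
	uint16_t l2_len;       /* assumed: outer L4 + tunnel + inner L2 headers */
	uint16_t l3_len;       /* inner IPv4 header */
	uint16_t l4_len;       /* inner TCP header */
};

/* Combined length of all outer and inner headers of a tunneled packet. */
static uint32_t
tunnel_hdr_offset(const struct pkt_lens *p)
{
	return (uint32_t)p->outer_l2_len + p->outer_l3_len +
		p->l2_len + p->l3_len + p->l4_len;
}

/* True when no TCP payload follows the headers, so GSO should skip it. */
static bool
tunnel_has_no_payload(const struct pkt_lens *p)
{
	return tunnel_hdr_offset(p) >= p->pkt_len;
}
```

For a header-only VxLAN packet with outer_l2_len = 14, outer_l3_len = 20, l2_len = 30, l3_len = 20, l4_len = 20 and pkt_len = 104, the original expression pkt_len - l2_len - l3_len - l4_len evaluates to 34 rather than 0, so the empty packet would not be skipped. The combined offset check catches it.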

> +		pkts_out[0] = pkt;
> +		return 1;
> +	}
> +
> +	hdr_offset += pkt->l3_len + pkt->l4_len;
> +	pyld_unit_size = gso_size - hdr_offset;
> +
> +	/* Segment the payload */
> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> +			indirect_pool, pkts_out, nb_pkts_out);
> +	if (ret <= 1)
> +		return ret;
> +
> +	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
> +
> +	return ret;
> +}
> diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
> new file mode 100644
> index 0000000..3c67f0c
> --- /dev/null
> +++ b/lib/librte_gso/gso_tunnel_tcp4.h
> @@ -0,0 +1,75 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _GSO_TUNNEL_TCP4_H_
> +#define _GSO_TUNNEL_TCP4_H_
> +
> +#include <stdint.h>
> +#include <rte_mbuf.h>
> +
> +/**
> + * Segment a tunneling packet with inner TCP/IPv4 headers. This function
> + * doesn't check if the input packet has correct checksums, and doesn't
> + * update checksums for output GSO segments. Furthermore, it doesn't
> + * process IP fragment packets.
> + *
> + * @param pkt
> + *  The packet mbuf to segment.
> + * @param gso_size
> + *  The max length of a GSO segment, measured in bytes.
> + * @param ipid_delta
> + *  The increasing unit of IP ids.
> + * @param direct_pool
> + *  MBUF pool used for allocating direct buffers for output segments.
> + * @param indirect_pool
> + *  MBUF pool used for allocating indirect buffers for output segments.
> + * @param pkts_out
> + *  Pointer array used to store the MBUF addresses of output GSO
> + *  segments, when it succeeds. If the memory space in pkts_out is
> + *  insufficient, it fails and returns -EINVAL.
> + * @param nb_pkts_out
> + *  The max number of items that 'pkts_out' can keep.
> + *
> + * @return
> + *   - The number of GSO segments filled in pkts_out on success.
> + *   - Return -ENOMEM if run out of memory in MBUF pools.
> + *   - Return -EINVAL for invalid parameters.
> + */
> +int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
> +		uint16_t gso_size,
> +		uint8_t ipid_delta,
> +		struct rte_mempool *direct_pool,
> +		struct rte_mempool *indirect_pool,
> +		struct rte_mbuf **pkts_out,
> +		uint16_t nb_pkts_out);
> +#endif
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index a4fce50..6095689 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -39,6 +39,7 @@
>  #include "rte_gso.h"
>  #include "gso_common.h"
>  #include "gso_tcp4.h"
> +#include "gso_tunnel_tcp4.h"
> 
>  int
>  rte_gso_segment(struct rte_mbuf *pkt,
> @@ -58,8 +59,9 @@
>  		return -EINVAL;
> 
>  	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
> -				DEV_TX_OFFLOAD_TCP_TSO) !=
> -			gso_ctx->gso_types) {
> +				(DEV_TX_OFFLOAD_TCP_TSO |
> +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
> +				gso_ctx->gso_types) {
>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>  		pkts_out[0] = pkt;
>  		return 1;
> @@ -71,7 +73,12 @@
>  	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>  	ol_flags = pkt->ol_flags;
> 
> -	if (IS_IPV4_TCP(pkt->ol_flags)) {
> +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> +				direct_pool, indirect_pool,
> +				pkts_out, nb_pkts_out);
> +	} else if (IS_IPV4_TCP(pkt->ol_flags)) {

Hmm, this doesn't look quite right.
Imagine the user doesn't want libgso to segment plain TCP packets with that ctx, just VXLAN+TCP.

I think you need to merge that 'if' and the one above into something like this:

if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
    (gso_ctx->gso_types & (DEV_TX_OFFLOAD_VXLAN_TNL_TSO | DEV_TX_OFFLOAD_TCP_TSO)) ==
    (DEV_TX_OFFLOAD_VXLAN_TNL_TSO | DEV_TX_OFFLOAD_TCP_TSO)) {
   ...
} else if (IS_IPV4_TCP(pkt->ol_flags) && (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO)) {
   ...
} else {
     /* unsupported packet, skip */
}

Konstantin
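
As a rough standalone sketch of the dispatch suggested above (not the library's code): every SKETCH_* name, both enums and sketch_select_action() are hypothetical. The packet classes are assumed to be mutually exclusive, which stands in for the IS_IPV4_VXLAN_TCP4() and IS_IPV4_TCP() tests on ol_flags.

```c
#include <stdint.h>

/* Hypothetical context bits, for illustration only. The real library uses
 * the DEV_TX_OFFLOAD_* constants from DPDK. */
#define SKETCH_GSO_TCP4   (1u << 0)
#define SKETCH_GSO_VXLAN  (1u << 1)

/* Hypothetical, mutually exclusive packet classification. */
enum sketch_pkt_class {
	SKETCH_PKT_OTHER,
	SKETCH_PKT_TCP4,
	SKETCH_PKT_VXLAN_TCP4
};

enum sketch_gso_action {
	SKETCH_SKIP,
	SKETCH_SEG_TCP4,
	SKETCH_SEG_TUNNEL_TCP4
};

/* Pick the segmentation routine from the packet class AND the context
 * types, so a packet type the context did not enable is left untouched. */
static enum sketch_gso_action
sketch_select_action(uint32_t ctx_types, enum sketch_pkt_class cls)
{
	const uint32_t tunnel_bits = SKETCH_GSO_VXLAN | SKETCH_GSO_TCP4;

	if (cls == SKETCH_PKT_VXLAN_TCP4 &&
			(ctx_types & tunnel_bits) == tunnel_bits)
		return SKETCH_SEG_TUNNEL_TCP4;
	if (cls == SKETCH_PKT_TCP4 && (ctx_types & SKETCH_GSO_TCP4))
		return SKETCH_SEG_TCP4;
	return SKETCH_SKIP; /* unsupported or not enabled: pass through */
}
```

With this shape, a context that enables only SKETCH_GSO_TCP4 leaves VXLAN packets unsegmented, and a packet of any other class is always passed through.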

>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>  				direct_pool, indirect_pool,
> --
> 1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v6 4/6] gso: add GRE GSO support
  2017-10-02 16:45           ` [PATCH v6 4/6] gso: add GRE " Mark Kavanagh
@ 2017-10-04 14:15             ` Ananyev, Konstantin
  2017-10-04 14:36               ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 14:15 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas


> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index 6095689..b748ab1 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -60,8 +60,9 @@
> 
>  	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>  				(DEV_TX_OFFLOAD_TCP_TSO |
> -				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
> -				gso_ctx->gso_types) {
> +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> +				 DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
> +				 gso_ctx->gso_types) {
>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>  		pkts_out[0] = pkt;
>  		return 1;
> @@ -73,7 +74,8 @@
>  	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>  	ol_flags = pkt->ol_flags;
> 
> -	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
> +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) ||
> +			IS_IPV4_GRE_TCP4(pkt->ol_flags)) {

Same comment as for the previous patch: the user might want that ctx to
segment VXLAN packets but not GRE packets.
Konstantin

>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
>  				direct_pool, indirect_pool,
> --
> 1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
  2017-10-04 13:35             ` Ananyev, Konstantin
@ 2017-10-04 14:22               ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 14:22 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 2:36 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>
>
>
>> -----Original Message-----
>> From: Ananyev, Konstantin
>> Sent: Wednesday, October 4, 2017 2:32 PM
>> To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>> Subject: RE: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>>
>> Hi Mark,
>>
>> > -----Original Message-----
>> > From: Kavanagh, Mark B
>> > Sent: Monday, October 2, 2017 5:46 PM
>> > To: dev@dpdk.org
>> > Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng
><jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
>> Yigit,
>> > Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>> > Subject: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>> >
>> > From: Jiayu Hu <jiayu.hu@intel.com>
>> >
>> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
>> > may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
>> > packets have correct checksums, and doesn't update checksums for
>> > output packets (the responsibility for this lies with the application).
>> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
>> >
>> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
>> > MBUF, to organize an output packet. Note that we refer to these two
>> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
>> > header, while the indirect mbuf simply points to a location within the
>> > original packet's payload. Consequently, use of the GSO library requires
>> > multi-segment MBUF support in the TX functions of the NIC driver.
>> >
>> > If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
>> > result, when all of its GSOed segments are freed, the packet is freed
>> > automatically.
>> >
>> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> > Tested-by: Lei Yao <lei.a.yao@intel.com>
>> > ---
>> >  doc/guides/rel_notes/release_17_11.rst  |  12 +++
>> >  lib/librte_eal/common/include/rte_log.h |   1 +
>> >  lib/librte_gso/Makefile                 |   2 +
>> >  lib/librte_gso/gso_common.c             | 153
>++++++++++++++++++++++++++++++++
>> >  lib/librte_gso/gso_common.h             | 141
>+++++++++++++++++++++++++++++
>> >  lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
>> >  lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
>> >  lib/librte_gso/rte_gso.c                |  52 ++++++++++-
>> >  8 files changed, 536 insertions(+), 3 deletions(-)
>> >  create mode 100644 lib/librte_gso/gso_common.c
>> >  create mode 100644 lib/librte_gso/gso_common.h
>> >  create mode 100644 lib/librte_gso/gso_tcp4.c
>> >  create mode 100644 lib/librte_gso/gso_tcp4.h
>> >
>> > diff --git a/doc/guides/rel_notes/release_17_11.rst
>b/doc/guides/rel_notes/release_17_11.rst
>> > index 7508be7..c414f73 100644
>> > --- a/doc/guides/rel_notes/release_17_11.rst
>> > +++ b/doc/guides/rel_notes/release_17_11.rst
>> > @@ -41,6 +41,18 @@ New Features
>> >       Also, make sure to start the actual text at the margin.
>> >       =========================================================
>> >
>> > +* **Added the Generic Segmentation Offload Library.**
>> > +
>> > +  Added the Generic Segmentation Offload (GSO) library to enable
>> > +  applications to split large packets (e.g. MTU is 64KB) into small
>> > +  ones (e.g. MTU is 1500B). Supported packet types are:
>> > +
>> > +  * TCP/IPv4 packets, which may include a single VLAN tag.
>
>As a nit: I think the VLAN tag doesn't matter here, as you are relying on mbuf->l2_len.
>Konstantin
>

Okay, I'll remove any mention of VLAN tags in the description - thanks!
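
To spell out why the tag count is irrelevant to the library, here is a small standalone sketch. The SKETCH_* constants and both helpers are hypothetical, and it assumes the application sets l2_len to cover the Ethernet header plus any VLAN tags.

```c
#include <stdint.h>

/* Illustrative constants; DPDK has its own definitions for these sizes. */
#define SKETCH_ETHER_HDR_LEN 14u /* Ethernet header without a VLAN tag */
#define SKETCH_VLAN_TAG_LEN   4u /* one 802.1Q tag */

/* l2_len as the application would set it for 'nb_vlan_tags' VLAN tags. */
static uint16_t
sketch_l2_len(unsigned int nb_vlan_tags)
{
	return (uint16_t)(SKETCH_ETHER_HDR_LEN +
			nb_vlan_tags * SKETCH_VLAN_TAG_LEN);
}

/* Offset of the TCP header: depends only on l2_len and l3_len, so the
 * segmentation code never needs to know how many tags are present. */
static uint16_t
sketch_tcp_offset(uint16_t l2_len, uint16_t l3_len)
{
	return (uint16_t)(l2_len + l3_len);
}
```

With a 20-byte IPv4 header, the TCP header sits at offset 34 with no tag, 38 with one tag and 42 with two. In each case the offset follows from l2_len alone.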

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
  2017-10-04 13:32             ` Ananyev, Konstantin
@ 2017-10-04 14:30               ` Kavanagh, Mark B
  2017-10-04 14:49                 ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 14:30 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 2:32 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>
>Hi Mark,
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Monday, October 2, 2017 5:46 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
>> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>> Subject: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>>
>> From: Jiayu Hu <jiayu.hu@intel.com>
>>
>> This patch adds GSO support for TCP/IPv4 packets. Supported packets
>> may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
>> packets have correct checksums, and doesn't update checksums for
>> output packets (the responsibility for this lies with the application).
>> Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
>>
>> TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
>> MBUF, to organize an output packet. Note that we refer to these two
>> chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
>> header, while the indirect mbuf simply points to a location within the
>> original packet's payload. Consequently, use of the GSO library requires
>> multi-segment MBUF support in the TX functions of the NIC driver.
>>
>> If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
>> result, when all of its GSOed segments are freed, the packet is freed
>> automatically.
>>
>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> Tested-by: Lei Yao <lei.a.yao@intel.com>
>> ---
>>  doc/guides/rel_notes/release_17_11.rst  |  12 +++
>>  lib/librte_eal/common/include/rte_log.h |   1 +
>>  lib/librte_gso/Makefile                 |   2 +
>>  lib/librte_gso/gso_common.c             | 153
>++++++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
>>  lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
>>  lib/librte_gso/rte_gso.c                |  52 ++++++++++-
>>  8 files changed, 536 insertions(+), 3 deletions(-)
>>  create mode 100644 lib/librte_gso/gso_common.c
>>  create mode 100644 lib/librte_gso/gso_common.h
>>  create mode 100644 lib/librte_gso/gso_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tcp4.h
>>
>> diff --git a/doc/guides/rel_notes/release_17_11.rst
>b/doc/guides/rel_notes/release_17_11.rst
>> index 7508be7..c414f73 100644
>> --- a/doc/guides/rel_notes/release_17_11.rst
>> +++ b/doc/guides/rel_notes/release_17_11.rst
>> @@ -41,6 +41,18 @@ New Features
>>       Also, make sure to start the actual text at the margin.
>>       =========================================================
>>
>> +* **Added the Generic Segmentation Offload Library.**
>> +
>> +  Added the Generic Segmentation Offload (GSO) library to enable
>> +  applications to split large packets (e.g. MTU is 64KB) into small
>> +  ones (e.g. MTU is 1500B). Supported packet types are:
>> +
>> +  * TCP/IPv4 packets, which may include a single VLAN tag.
>> +
>> +  The GSO library doesn't check if the input packets have correct
>> +  checksums, and doesn't update checksums for output packets.
>> +  Additionally, the GSO library doesn't process IP fragmented packets.
>> +
>>
>>  Resolved Issues
>>  ---------------
>> diff --git a/lib/librte_eal/common/include/rte_log.h
>b/lib/librte_eal/common/include/rte_log.h
>> index ec8dba7..2fa1199 100644
>> --- a/lib/librte_eal/common/include/rte_log.h
>> +++ b/lib/librte_eal/common/include/rte_log.h
>> @@ -87,6 +87,7 @@ struct rte_logs {
>>  #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
>>  #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
>>  #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
>> +#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
>>
>>  /* these log types can be used in an application */
>>  #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
>> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
>> index aeaacbc..2be64d1 100644
>> --- a/lib/librte_gso/Makefile
>> +++ b/lib/librte_gso/Makefile
>> @@ -42,6 +42,8 @@ LIBABIVER := 1
>>
>>  #source files
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
>>
>>  # install this header file
>>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
>> diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
>> new file mode 100644
>> index 0000000..ee75d4c
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_common.c
>> @@ -0,0 +1,153 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include <stdbool.h>
>> +#include <errno.h>
>> +
>> +#include <rte_memcpy.h>
>> +#include <rte_mempool.h>
>> +
>> +#include "gso_common.h"
>> +
>> +static inline void
>> +hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
>> +		uint16_t pkt_hdr_offset)
>> +{
>> +	/* Copy MBUF metadata */
>> +	hdr_segment->nb_segs = 1;
>> +	hdr_segment->port = pkt->port;
>> +	hdr_segment->ol_flags = pkt->ol_flags;
>> +	hdr_segment->packet_type = pkt->packet_type;
>> +	hdr_segment->pkt_len = pkt_hdr_offset;
>> +	hdr_segment->data_len = pkt_hdr_offset;
>> +	hdr_segment->tx_offload = pkt->tx_offload;
>> +
>> +	/* Copy the packet header */
>> +	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
>> +			rte_pktmbuf_mtod(pkt, char *),
>> +			pkt_hdr_offset);
>> +}
>> +
>> +static inline void
>> +free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
>> +{
>> +	uint16_t i;
>> +
>> +	for (i = 0; i < nb_pkts; i++)
>> +		rte_pktmbuf_free(pkts[i]);
>> +}
>> +
>> +int
>> +gso_do_segment(struct rte_mbuf *pkt,
>> +		uint16_t pkt_hdr_offset,
>> +		uint16_t pyld_unit_size,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	struct rte_mbuf *pkt_in;
>> +	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
>> +	uint16_t pkt_in_data_pos, segment_bytes_remaining;
>> +	uint16_t pyld_len, nb_segs;
>> +	bool more_in_pkt, more_out_segs;
>> +
>> +	pkt_in = pkt;
>> +	nb_segs = 0;
>> +	more_in_pkt = 1;
>> +	pkt_in_data_pos = pkt_hdr_offset;
>> +
>> +	while (more_in_pkt) {
>> +		if (unlikely(nb_segs >= nb_pkts_out)) {
>> +			free_gso_segment(pkts_out, nb_segs);
>> +			return -EINVAL;
>> +		}
>> +
>> +		/* Allocate a direct MBUF */
>> +		hdr_segment = rte_pktmbuf_alloc(direct_pool);
>> +		if (unlikely(hdr_segment == NULL)) {
>> +			free_gso_segment(pkts_out, nb_segs);
>> +			return -ENOMEM;
>> +		}
>> +		/* Fill the packet header */
>> +		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
>> +
>> +		prev_segment = hdr_segment;
>> +		segment_bytes_remaining = pyld_unit_size;
>> +		more_out_segs = 1;
>> +
>> +		while (more_out_segs && more_in_pkt) {
>> +			/* Allocate an indirect MBUF */
>> +			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
>> +			if (unlikely(pyld_segment == NULL)) {
>> +				rte_pktmbuf_free(hdr_segment);
>> +				free_gso_segment(pkts_out, nb_segs);
>> +				return -ENOMEM;
>> +			}
>> +			/* Attach to current MBUF segment of pkt */
>> +			rte_pktmbuf_attach(pyld_segment, pkt_in);
>> +
>> +			prev_segment->next = pyld_segment;
>> +			prev_segment = pyld_segment;
>> +
>> +			pyld_len = segment_bytes_remaining;
>> +			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
>> +				pyld_len = pkt_in->data_len - pkt_in_data_pos;
>> +
>> +			pyld_segment->data_off = pkt_in_data_pos +
>> +				pkt_in->data_off;
>> +			pyld_segment->data_len = pyld_len;
>> +
>> +			/* Update header segment */
>> +			hdr_segment->pkt_len += pyld_len;
>> +			hdr_segment->nb_segs++;
>> +
>> +			pkt_in_data_pos += pyld_len;
>> +			segment_bytes_remaining -= pyld_len;
>> +
>> +			/* Finish processing a MBUF segment of pkt */
>> +			if (pkt_in_data_pos == pkt_in->data_len) {
>> +				pkt_in = pkt_in->next;
>> +				pkt_in_data_pos = 0;
>> +				if (pkt_in == NULL)
>> +					more_in_pkt = 0;
>> +			}
>> +
>> +			/* Finish generating a GSO segment */
>> +			if (segment_bytes_remaining == 0)
>> +				more_out_segs = 0;
>> +		}
>> +		pkts_out[nb_segs++] = hdr_segment;
>> +	}
>> +	return nb_segs;
>> +}
>> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
>> new file mode 100644
>> index 0000000..8d9b94e
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_common.h
>> @@ -0,0 +1,141 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _GSO_COMMON_H_
>> +#define _GSO_COMMON_H_
>> +
>> +#include <stdint.h>
>> +
>> +#include <rte_mbuf.h>
>> +#include <rte_ip.h>
>> +#include <rte_tcp.h>
>> +
>> +#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
>> +		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
>> +
>> +#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
>> +#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
>> +
>> +#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
>> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
>> +
>> +/**
>> + * Internal function which updates the TCP header of a packet, following
>> + * segmentation. This is required to update the header's 'sent' sequence
>> + * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
>> + *
>> + * @param pkt
>> + *  The packet containing the TCP header.
>> + * @param l4_offset
>> + *  The offset of the TCP header from the start of the packet.
>> + * @param sent_seq
>> + *  The sent sequence number.
>> + * @param non_tail
>> + *  Indicates whether or not this is a non-tail segment.
>> + */
>> +static inline void
>> +update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t
>sent_seq,
>> +		uint8_t non_tail)
>> +{
>> +	struct tcp_hdr *tcp_hdr;
>> +
>> +	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			l4_offset);
>> +	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
>> +	if (likely(non_tail))
>> +		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
>> +					TCP_HDR_FIN_MASK));
>> +}
>> +
>> +/**
>> + * Internal function which updates the IPv4 header of a packet, following
>> + * segmentation. This is required to update the header's 'total_length'
>field,
>> + * to reflect the reduced length of the now-segmented packet. Furthermore,
>the
>> + * header's 'packet_id' field must be updated to reflect the new ID of the
>> + * now-segmented packet.
>> + *
>> + * @param pkt
>> + *  The packet containing the IPv4 header.
>> + * @param l3_offset
>> + *  The offset of the IPv4 header from the start of the packet.
>> + * @param id
>> + *  The new ID of the packet.
>> +  */
>> +static inline void
>> +update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
>> +{
>> +	struct ipv4_hdr *ipv4_hdr;
>> +
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			l3_offset);
>> +	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
>> +	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
>> +}
>> +
>> +/**
>> + * Internal function which divides the input packet into small segments.
>> + * Each of the newly-created segments is organized as a two-segment MBUF,
>> + * where the first segment is a standard mbuf, which stores a copy of
>> + * packet header, and the second is an indirect mbuf which points to a
>> + * section of data in the input packet.
>> + *
>> + * @param pkt
>> + *  Packet to segment.
>> + * @param pkt_hdr_offset
>> + *  Packet header offset, measured in bytes.
>> + * @param pyld_unit_size
>> + *  The max payload length of a GSO segment.
>> + * @param direct_pool
>> + *  MBUF pool used for allocating direct buffers for output segments.
>> + * @param indirect_pool
>> + *  MBUF pool used for allocating indirect buffers for output segments.
>> + * @param pkts_out
>> + *  Pointer array used to keep the mbuf addresses of output segments. If
>> + *  the memory space in pkts_out is insufficient, gso_do_segment() fails
>> + *  and returns -EINVAL.
>> + * @param nb_pkts_out
>> + *  The max number of items that pkts_out can keep.
>> + *
>> + * @return
>> + *  - The number of segments created in the event of success.
>> + *  - Return -ENOMEM if run out of memory in MBUF pools.
>> + *  - Return -EINVAL for invalid parameters.
>> + */
>> +int gso_do_segment(struct rte_mbuf *pkt,
>> +		uint16_t pkt_hdr_offset,
>> +		uint16_t pyld_unit_size,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#endif
>> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
>> new file mode 100644
>> index 0000000..d83e610
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tcp4.c
>> @@ -0,0 +1,104 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include "gso_common.h"
>> +#include "gso_tcp4.h"
>> +
>> +static void
>> +update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
>> +		struct rte_mbuf **segs, uint16_t nb_segs)
>> +{
>> +	struct ipv4_hdr *ipv4_hdr;
>> +	struct tcp_hdr *tcp_hdr;
>> +	uint32_t sent_seq;
>> +	uint16_t id, tail_idx, i;
>> +	uint16_t l3_offset = pkt->l2_len;
>> +	uint16_t l4_offset = l3_offset + pkt->l3_len;
>> +
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
>> +			l3_offset);
>> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
>> +	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
>> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
>> +	tail_idx = nb_segs - 1;
>> +
>> +	for (i = 0; i < nb_segs; i++) {
>> +		update_ipv4_header(segs[i], l3_offset, id);
>> +		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
>> +		id += ipid_delta;
>> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
>> +	}
>> +}
>> +
>> +int
>> +gso_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ipid_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	struct ipv4_hdr *ipv4_hdr;
>> +	uint16_t tcp_dl;
>> +	uint16_t pyld_unit_size, hdr_offset;
>> +	uint16_t frag_off;
>> +	int ret;
>> +
>> +	/* Don't process the fragmented packet */
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			pkt->l2_len);
>> +	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
>> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	/* Don't process the packet without data */
>> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
>> +	if (unlikely(tcp_dl == 0)) {
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> +	pyld_unit_size = gso_size - hdr_offset;
>> +
>> +	/* Segment the payload */
>> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
>> +			indirect_pool, pkts_out, nb_pkts_out);
>> +	if (ret > 1)
>> +		update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
>> +
>> +	return ret;
>> +}
>> diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
>> new file mode 100644
>> index 0000000..1c57441
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tcp4.h
>> @@ -0,0 +1,74 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _GSO_TCP4_H_
>> +#define _GSO_TCP4_H_
>> +
>> +#include <stdint.h>
>> +#include <rte_mbuf.h>
>> +
>> +/**
>> + * Segment an IPv4/TCP packet. This function doesn't check if the input
>> + * packet has correct checksums, and doesn't update checksums for output
>> + * GSO segments. Furthermore, it doesn't process IP fragment packets.
>> + *
>> + * @param pkt
>> + *  The packet mbuf to segment.
>> + * @param gso_size
>> + *  The max length of a GSO segment, measured in bytes.
>> + * @param ipid_delta
>> + *  The increasing unit of IP ids.
>> + * @param direct_pool
>> + *  MBUF pool used for allocating direct buffers for output segments.
>> + * @param indirect_pool
>> + *  MBUF pool used for allocating indirect buffers for output segments.
>> + * @param pkts_out
>> + *  Pointer array used to store the MBUF addresses of output GSO
>> + *  segments, when the function succeeds. If the memory space in
>> + *  pkts_out is insufficient, it fails and returns -EINVAL.
>> + * @param nb_pkts_out
>> + *  The max number of items that 'pkts_out' can keep.
>> + *
>> + * @return
>> + *   - The number of GSO segments filled in pkts_out on success.
>> + *   - Return -ENOMEM if run out of memory in MBUF pools.
>> + *   - Return -EINVAL for invalid parameters.
>> + */
>> +int gso_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ip_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#endif
>> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> index b773636..a4fce50 100644
>> --- a/lib/librte_gso/rte_gso.c
>> +++ b/lib/librte_gso/rte_gso.c
>> @@ -33,7 +33,12 @@
>>
>>  #include <errno.h>
>>
>> +#include <rte_log.h>
>> +#include <rte_ethdev.h>
>> +
>>  #include "rte_gso.h"
>> +#include "gso_common.h"
>> +#include "gso_tcp4.h"
>>
>>  int
>>  rte_gso_segment(struct rte_mbuf *pkt,
>> @@ -41,12 +46,53 @@
>>  		struct rte_mbuf **pkts_out,
>>  		uint16_t nb_pkts_out)
>>  {
>> +	struct rte_mempool *direct_pool, *indirect_pool;
>> +	struct rte_mbuf *pkt_seg;
>> +	uint64_t ol_flags;
>> +	uint16_t gso_size;
>> +	uint8_t ipid_delta;
>> +	int ret = 1;
>> +
>>  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>>  			nb_pkts_out < 1)
>>  		return -EINVAL;
>>
>> -	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> -	pkts_out[0] = pkt;
>> +	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>> +				DEV_TX_OFFLOAD_TCP_TSO) !=
>> +			gso_ctx->gso_types) {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	direct_pool = gso_ctx->direct_pool;
>> +	indirect_pool = gso_ctx->indirect_pool;
>> +	gso_size = gso_ctx->gso_size;
>> +	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>> +	ol_flags = pkt->ol_flags;
>> +
>> +	if (IS_IPV4_TCP(pkt->ol_flags)) {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>> +				direct_pool, indirect_pool,
>> +				pkts_out, nb_pkts_out);
>> +	} else {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>
>Not sure why you clear this flag if you don't support that packet type
>and no action was performed.
>Suppose you have a mix of ipv4 and ipv6 packets - the gso lib would do ipv4 and
>someone else
>(HW?) can do ipv6 segmentation.

I can't say for certain, since I didn't implement this change. I presume the assumption is that, since segmentation is being done in S/W, the underlying H/W does not support TSO.
If the H/W can't segment the packet, we should clear the flag; otherwise, if an mbuf marked for TCP segmentation is passed to the driver of a NIC that does not support/understand that feature, the behavior is undefined.
Is this a fair assumption in your opinion, or would the packet simply be transmitted un-segmented in that case, so that we shouldn't clear the flag?

Thanks again,
Mark

>BTW, did you notice that building of shared target fails?
>Konstantin

I didn't, but I'll take a look right now - thanks for the catch!

>
>
>> +		pkts_out[0] = pkt;
>> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
>> +		return 1;
>> +	}
>> +
>> +	if (ret > 1) {
>> +		pkt_seg = pkt;
>> +		while (pkt_seg) {
>> +			rte_mbuf_refcnt_update(pkt_seg, -1);
>> +			pkt_seg = pkt_seg->next;
>> +		}
>> +	} else if (ret < 0) {
>> +		/* Revert the ol_flags in the event of failure. */
>> +		pkt->ol_flags = ol_flags;
>> +	}
>>
>> -	return 1;
>> +	return ret;
>>  }
>> --
>> 1.9.3


* Re: [PATCH v6 3/6] gso: add VxLAN GSO support
  2017-10-04 14:12             ` Ananyev, Konstantin
@ 2017-10-04 14:35               ` Kavanagh, Mark B
  2017-10-04 16:13               ` Kavanagh, Mark B
  1 sibling, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 14:35 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 3:12 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 3/6] gso: add VxLAN GSO support
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Monday, October 2, 2017 5:46 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
>> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>> Subject: [PATCH v6 3/6] gso: add VxLAN GSO support
>>
>> This patch adds a framework that allows GSO on tunneled packets.
>> Furthermore, it leverages that framework to provide GSO support for
>> VxLAN-encapsulated packets.
>>
>> Supported VxLAN packets must have an outer IPv4 header (prepended by an
>> optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
>> inner VLAN tag).
>>
>> VxLAN GSO doesn't check if input packets have correct checksums and
>> doesn't update checksums for output packets. Additionally, it doesn't
>> process IP fragmented packets.
>>
>> As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
>> output packet, which mandates support for multi-segment mbufs in the TX
>> functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
>> reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
>> are freed, the packet is freed automatically.
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> ---
>>  doc/guides/rel_notes/release_17_11.rst |   3 +
>>  lib/librte_gso/Makefile                |   1 +
>>  lib/librte_gso/gso_common.h            |  25 +++++++
>>  lib/librte_gso/gso_tunnel_tcp4.c       | 123
>+++++++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_tunnel_tcp4.h       |  75 ++++++++++++++++++++
>>  lib/librte_gso/rte_gso.c               |  13 +++-
>>  6 files changed, 237 insertions(+), 3 deletions(-)
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>>
>> diff --git a/doc/guides/rel_notes/release_17_11.rst
>b/doc/guides/rel_notes/release_17_11.rst
>> index c414f73..25b8a78 100644
>> --- a/doc/guides/rel_notes/release_17_11.rst
>> +++ b/doc/guides/rel_notes/release_17_11.rst
>> @@ -48,6 +48,9 @@ New Features
>>    ones (e.g. MTU is 1500B). Supported packet types are:
>>
>>    * TCP/IPv4 packets, which may include a single VLAN tag.
>> +  * VxLAN packets, which must have an outer IPv4 header (prepended by
>> +    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
>> +    an optional VLAN tag).
>>
>>    The GSO library doesn't check if the input packets have correct
>>    checksums, and doesn't update checksums for output packets.
>> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
>> index 2be64d1..e6d41df 100644
>> --- a/lib/librte_gso/Makefile
>> +++ b/lib/librte_gso/Makefile
>> @@ -44,6 +44,7 @@ LIBABIVER := 1
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
>>
>>  # install this header file
>>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
>> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
>> index 8d9b94e..c051295 100644
>> --- a/lib/librte_gso/gso_common.h
>> +++ b/lib/librte_gso/gso_common.h
>> @@ -39,6 +39,7 @@
>>  #include <rte_mbuf.h>
>>  #include <rte_ip.h>
>>  #include <rte_tcp.h>
>> +#include <rte_udp.h>
>>
>>  #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
>>  		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
>> @@ -49,6 +50,30 @@
>>  #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
>>  		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
>>
>> +#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 |
>\
>> +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
>> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
>> +		 PKT_TX_TUNNEL_VXLAN))
>> +
>> +/**
>> + * Internal function which updates the UDP header of a packet, following
>> + * segmentation. This is required to update the header's datagram length
>field.
>> + *
>> + * @param pkt
>> + *  The packet containing the UDP header.
>> + * @param udp_offset
>> + *  The offset of the UDP header from the start of the packet.
>> + */
>> +static inline void
>> +update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
>> +{
>> +	struct udp_hdr *udp_hdr;
>> +
>> +	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			udp_offset);
>> +	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
>> +}
>> +
>>  /**
>>   * Internal function which updates the TCP header of a packet, following
>>   * segmentation. This is required to update the header's 'sent' sequence
>> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c
>b/lib/librte_gso/gso_tunnel_tcp4.c
>> new file mode 100644
>> index 0000000..34bbbd7
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tunnel_tcp4.c
>> @@ -0,0 +1,123 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include "gso_common.h"
>> +#include "gso_tunnel_tcp4.h"
>> +
>> +static void
>> +update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
>> +		struct rte_mbuf **segs, uint16_t nb_segs)
>> +{
>> +	struct ipv4_hdr *ipv4_hdr;
>> +	struct tcp_hdr *tcp_hdr;
>> +	uint32_t sent_seq;
>> +	uint16_t outer_id, inner_id, tail_idx, i;
>> +	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
>> +
>> +	outer_ipv4_offset = pkt->outer_l2_len;
>> +	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
>> +	inner_ipv4_offset = udp_offset + pkt->l2_len;
>> +	tcp_offset = inner_ipv4_offset + pkt->l3_len;
>> +
>> +	/* Outer IPv4 header. */
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			outer_ipv4_offset);
>> +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
>> +
>> +	/* Inner IPv4 header. */
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			inner_ipv4_offset);
>> +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
>> +
>> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
>> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
>> +	tail_idx = nb_segs - 1;
>> +
>> +	for (i = 0; i < nb_segs; i++) {
>> +		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
>> +		update_udp_header(segs[i], udp_offset);
>> +		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
>> +		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
>> +		outer_id++;
>> +		inner_id += ipid_delta;
>> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
>> +	}
>> +}
>> +
>> +int
>> +gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ipid_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	struct ipv4_hdr *inner_ipv4_hdr;
>> +	uint16_t pyld_unit_size, hdr_offset;
>> +	uint16_t tcp_dl, frag_off;
>> +	int ret = 1;
>> +
>> +	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
>> +	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			hdr_offset);
>> +	/*
>> +	 * Don't process the packet whose MF bit or offset in the inner
>> +	 * IPv4 header are non-zero.
>> +	 */
>> +	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
>> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	/* Don't process the packet without data */
>> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
>> +	if (unlikely(tcp_dl == 0)) {
>
>You probably need to take the outer_l2_len and outer_l3_len fields into account too.
>Probably better to move that check after the final hdr_offset calculation:

Agreed - thanks.

>
>...
>hdr_offset += pkt->l3_len + pkt->l4_len;
>if (hdr_offset >= pkt->pkt_len) {...; return 1;}
>...
>
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	hdr_offset += pkt->l3_len + pkt->l4_len;
>> +	pyld_unit_size = gso_size - hdr_offset;
>> +
>> +	/* Segment the payload */
>> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
>> +			indirect_pool, pkts_out, nb_pkts_out);
>> +	if (ret <= 1)
>> +		return ret;
>> +
>> +	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
>> +
>> +	return ret;
>> +}
>> diff --git a/lib/librte_gso/gso_tunnel_tcp4.h
>b/lib/librte_gso/gso_tunnel_tcp4.h
>> new file mode 100644
>> index 0000000..3c67f0c
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tunnel_tcp4.h
>> @@ -0,0 +1,75 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _GSO_TUNNEL_TCP4_H_
>> +#define _GSO_TUNNEL_TCP4_H_
>> +
>> +#include <stdint.h>
>> +#include <rte_mbuf.h>
>> +
>> +/**
>> + * Segment a tunneling packet with inner TCP/IPv4 headers. This function
>> + * doesn't check if the input packet has correct checksums, and doesn't
>> + * update checksums for output GSO segments. Furthermore, it doesn't
>> + * process IP fragment packets.
>> + *
>> + * @param pkt
>> + *  The packet mbuf to segment.
>> + * @param gso_size
>> + *  The max length of a GSO segment, measured in bytes.
>> + * @param ipid_delta
>> + *  The increasing unit of IP ids.
>> + * @param direct_pool
>> + *  MBUF pool used for allocating direct buffers for output segments.
>> + * @param indirect_pool
>> + *  MBUF pool used for allocating indirect buffers for output segments.
>> + * @param pkts_out
>> + *  Pointer array used to store the MBUF addresses of output GSO
>> + *  segments, when it succeeds. If the memory space in pkts_out is
>> + *  insufficient, it fails and returns -EINVAL.
>> + * @param nb_pkts_out
>> + *  The max number of items that 'pkts_out' can keep.
>> + *
>> + * @return
>> + *   - The number of GSO segments filled in pkts_out on success.
>> + *   - Return -ENOMEM if run out of memory in MBUF pools.
>> + *   - Return -EINVAL for invalid parameters.
>> + */
>> +int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ipid_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#endif
>> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> index a4fce50..6095689 100644
>> --- a/lib/librte_gso/rte_gso.c
>> +++ b/lib/librte_gso/rte_gso.c
>> @@ -39,6 +39,7 @@
>>  #include "rte_gso.h"
>>  #include "gso_common.h"
>>  #include "gso_tcp4.h"
>> +#include "gso_tunnel_tcp4.h"
>>
>>  int
>>  rte_gso_segment(struct rte_mbuf *pkt,
>> @@ -58,8 +59,9 @@
>>  		return -EINVAL;
>>
>>  	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>> -				DEV_TX_OFFLOAD_TCP_TSO) !=
>> -			gso_ctx->gso_types) {
>> +				(DEV_TX_OFFLOAD_TCP_TSO |
>> +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
>> +				gso_ctx->gso_types) {
>>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>>  		pkts_out[0] = pkt;
>>  		return 1;
>> @@ -71,7 +73,12 @@
>>  	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>>  	ol_flags = pkt->ol_flags;
>>
>> -	if (IS_IPV4_TCP(pkt->ol_flags)) {
>> +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
>> +				direct_pool, indirect_pool,
>> +				pkts_out, nb_pkts_out);
>> +	} else if (IS_IPV4_TCP(pkt->ol_flags)) {
>
>Hmm it doesn't look quite right.
>Imagine the user doesn't want libgso to segment plain TCP packets with that ctx,
>just VXLAN+TCP.

That's a very good point - I'll update the code as per your suggestion.
Thanks!

>
>I think you need to merge that if and the one above into something like this:
>
>if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
>    (gso_ctx->gso_types & (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>                           DEV_TX_OFFLOAD_TCP_TSO)) ==
>    (DEV_TX_OFFLOAD_VXLAN_TNL_TSO | DEV_TX_OFFLOAD_TCP_TSO)) {
>   ...
>} else if (IS_IPV4_TCP(pkt->ol_flags) &&
>           (gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO)) {
>   ...
>} else {
>   /* unsupported packet, skip */
>}
>
>Konstantin
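
A rough sketch of the per-type gating suggested above, using hypothetical flag values (`F_*`, `T_*`) in place of the real PKT_TX_* and DEV_TX_OFFLOAD_* constants, and a made-up `select_gso_path()` helper. One assumption goes beyond the quoted suggestion: the plain TCP branch here also requires the outer-IPv4 flag to be absent. Without that, a VxLAN packet would fall through to the plain TCP branch when the context enables only TCP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet flags; the real PKT_TX_* values differ. */
#define F_TCP_SEG      (1ULL << 0)
#define F_IPV4         (1ULL << 1)
#define F_OUTER_IPV4   (1ULL << 2)
#define F_TUNNEL_VXLAN (1ULL << 3)

/* Hypothetical GSO type bits; the real DEV_TX_OFFLOAD_* values differ. */
#define T_TCP_TSO       (1ULL << 0)
#define T_VXLAN_TNL_TSO (1ULL << 1)

/* True if every bit of 'mask' is set in 'v'. */
#define HAS_ALL(v, mask) (((v) & (mask)) == (mask))

enum gso_path { GSO_SKIP, GSO_TCP4, GSO_VXLAN_TCP4 };

/* Pick a segmentation path from the packet flags and the enabled types. */
static inline enum gso_path
select_gso_path(uint64_t ol_flags, uint64_t gso_types)
{
	assert((gso_types & ~(T_TCP_TSO | T_VXLAN_TNL_TSO)) == 0);

	if (HAS_ALL(ol_flags, F_TCP_SEG | F_IPV4 | F_OUTER_IPV4 |
				F_TUNNEL_VXLAN) &&
			HAS_ALL(gso_types, T_VXLAN_TNL_TSO | T_TCP_TSO))
		return GSO_VXLAN_TCP4;

	/* Assumption: a plain TCP/IPv4 packet carries no outer-IPv4 flag. */
	if (HAS_ALL(ol_flags, F_TCP_SEG | F_IPV4) &&
			!(ol_flags & F_OUTER_IPV4) &&
			(gso_types & T_TCP_TSO))
		return GSO_TCP4;

	return GSO_SKIP; /* unsupported or disabled packet type */
}
```

With this shape, the decision of whether a packet type is enabled sits next to the decision of which packet type it is, rather than in a separate up-front check on gso_types.
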
>
>>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>>  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>>  				direct_pool, indirect_pool,
>> --
>> 1.9.3


* Re: [PATCH v6 4/6] gso: add GRE GSO support
  2017-10-04 14:15             ` Ananyev, Konstantin
@ 2017-10-04 14:36               ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 14:36 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 3:16 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 4/6] gso: add GRE GSO support
>
>
>> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> index 6095689..b748ab1 100644
>> --- a/lib/librte_gso/rte_gso.c
>> +++ b/lib/librte_gso/rte_gso.c
>> @@ -60,8 +60,9 @@
>>
>>  	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>>  				(DEV_TX_OFFLOAD_TCP_TSO |
>> -				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
>> -				gso_ctx->gso_types) {
>> +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>> +				 DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
>> +				 gso_ctx->gso_types) {
>>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>>  		pkts_out[0] = pkt;
>>  		return 1;
>> @@ -73,7 +74,8 @@
>>  	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>>  	ol_flags = pkt->ol_flags;
>>
>> -	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
>> +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) ||
>> +			IS_IPV4_GRE_TCP4(pkt->ol_flags)) {
>
>Same comment as for the previous patch: the user might want that ctx to
>segment VxLAN packets and not segment GRE packets.
>Konstantin

Thanks Konstantin - I'll update appropriately.

>
>>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>>  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
>>  				direct_pool, indirect_pool,
>> --
>> 1.9.3


* Re: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
  2017-10-04 14:30               ` Kavanagh, Mark B
@ 2017-10-04 14:49                 ` Ananyev, Konstantin
  2017-10-04 14:59                   ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 14:49 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas

> >>  int
> >>  rte_gso_segment(struct rte_mbuf *pkt,
> >> @@ -41,12 +46,53 @@
> >>  		struct rte_mbuf **pkts_out,
> >>  		uint16_t nb_pkts_out)
> >>  {
> >> +	struct rte_mempool *direct_pool, *indirect_pool;
> >> +	struct rte_mbuf *pkt_seg;
> >> +	uint64_t ol_flags;
> >> +	uint16_t gso_size;
> >> +	uint8_t ipid_delta;
> >> +	int ret = 1;
> >> +
> >>  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
> >>  			nb_pkts_out < 1)
> >>  		return -EINVAL;
> >>
> >> -	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >> -	pkts_out[0] = pkt;
> >> +	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
> >> +				DEV_TX_OFFLOAD_TCP_TSO) !=
> >> +			gso_ctx->gso_types) {
> >> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >> +		pkts_out[0] = pkt;
> >> +		return 1;
> >> +	}
> >> +
> >> +	direct_pool = gso_ctx->direct_pool;
> >> +	indirect_pool = gso_ctx->indirect_pool;
> >> +	gso_size = gso_ctx->gso_size;
> >> +	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
> >> +	ol_flags = pkt->ol_flags;
> >> +
> >> +	if (IS_IPV4_TCP(pkt->ol_flags)) {
> >> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> >> +				direct_pool, indirect_pool,
> >> +				pkts_out, nb_pkts_out);
> >> +	} else {
> >> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >
> >Not sure why you clear this flag if you don't support that packet type
> >and no action was performed.
> >Suppose you have a mix of ipv4 and ipv6 packets - the gso lib would do ipv4 and
> >someone else
> >(HW?) can do ipv6 segmentation.
> 
> I can't say for certain, since I didn't implement this change. I presume the assumption is that, since
> segmentation is being done in S/W, the underlying H/W does not support TSO.
> If the H/W can't segment the packet, we should clear the flag; otherwise, if an mbuf marked for TCP segmentation is
> passed to the driver of a NIC that does not support/understand that feature, the behavior is undefined.
> Is this a fair assumption in your opinion, or would the packet simply be transmitted un-segmented in that case, so that we
> shouldn't clear the flag?

Yes, I think we shouldn't clear the flag if we didn't do any segmentation (i.e. we just encountered a packet type that we don't support).
Konstantin

> 
> Thanks again,
> Mark
> 
> >BTW, did you notice that building of shared target fails?
> >Konstantin
> 
> I didn't, but I'll take a look right now - thanks for the catch!
> 
> >
> >
> >> +		pkts_out[0] = pkt;
> >> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
> >> +		return 1;
> >> +	}
> >> +
> >> +	if (ret > 1) {
> >> +		pkt_seg = pkt;
> >> +		while (pkt_seg) {
> >> +			rte_mbuf_refcnt_update(pkt_seg, -1);
> >> +			pkt_seg = pkt_seg->next;
> >> +		}
> >> +	} else if (ret < 0) {
> >> +		/* Revert the ol_flags in the event of failure. */
> >> +		pkt->ol_flags = ol_flags;
> >> +	}
> >>
> >> -	return 1;
> >> +	return ret;
> >>  }
> >> --
> >> 1.9.3


* Re: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
  2017-10-04 14:49                 ` Ananyev, Konstantin
@ 2017-10-04 14:59                   ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 14:59 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas

>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 3:49 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 2/6] gso: add TCP/IPv4 GSO support
>
>> >>  int
>> >>  rte_gso_segment(struct rte_mbuf *pkt,
>> >> @@ -41,12 +46,53 @@
>> >>  		struct rte_mbuf **pkts_out,
>> >>  		uint16_t nb_pkts_out)
>> >>  {
>> >> +	struct rte_mempool *direct_pool, *indirect_pool;
>> >> +	struct rte_mbuf *pkt_seg;
>> >> +	uint64_t ol_flags;
>> >> +	uint16_t gso_size;
>> >> +	uint8_t ipid_delta;
>> >> +	int ret = 1;
>> >> +
>> >>  	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
>> >>  			nb_pkts_out < 1)
>> >>  		return -EINVAL;
>> >>
>> >> -	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> >> -	pkts_out[0] = pkt;
>> >> +	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>> >> +				DEV_TX_OFFLOAD_TCP_TSO) !=
>> >> +			gso_ctx->gso_types) {
>> >> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> >> +		pkts_out[0] = pkt;
>> >> +		return 1;
>> >> +	}
>> >> +
>> >> +	direct_pool = gso_ctx->direct_pool;
>> >> +	indirect_pool = gso_ctx->indirect_pool;
>> >> +	gso_size = gso_ctx->gso_size;
>> >> +	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>> >> +	ol_flags = pkt->ol_flags;
>> >> +
>> >> +	if (IS_IPV4_TCP(pkt->ol_flags)) {
>> >> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> >> +		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>> >> +				direct_pool, indirect_pool,
>> >> +				pkts_out, nb_pkts_out);
>> >> +	} else {
>> >> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> >
>> >Not sure why you clear this flag if you don't support that packet type
>> >and no action was performed.
>> >Suppose you have a mix of ipv4 and ipv6 packets - gso lib would do ipv4 and
>> >someone else (HW?) can do ipv6 segmentation.
>>
>> I can't say for definite, since I didn't implement this change. However, I
>can only presume that the assumption here is that since
>> segmentation is being done in S/W that the underlying H/W does not support
>TSO.
>> Since the underlying HW can't segment the packet in HW, we should clear the
>flag; otherwise, if an mbuf marked for TCP segmentation is
>> passed to the driver of a NIC that does not support/understand that feature,
>the behavior is undefined.
>> Is this a fair assumption in your opinion, or is it the case that the packet
>would simply be transmitted un-segmented in that case, and so we
>> shouldn't clear the flag?
>
>Yes, I think we shouldn't clear the flag if we didn't do any segmentation
>(i.e. we just encountered a packet type that we don't support).
>Konstantin

Okay, thanks for clarifying - I'll update the code accordingly.
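
For illustration, here is a rough sketch of the behaviour agreed on. It
is not the library code: the sk_* names, the flag values and the
one-field mbuf stand-in are all hypothetical, chosen so that the snippet
compiles on its own. The only point is that the TSO flag is cleared on
the path where segmentation is actually performed, and left alone
otherwise.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the mbuf offload flags (not the real values). */
#define SK_TX_TCP_SEG (1ULL << 0)
#define SK_TX_IPV4    (1ULL << 1)
#define SK_TX_IPV6    (1ULL << 2)

#define SK_IS_IPV4_TCP(flag) \
	(((flag) & (SK_TX_TCP_SEG | SK_TX_IPV4)) == \
	 (SK_TX_TCP_SEG | SK_TX_IPV4))

/* Minimal stand-in for struct rte_mbuf: only the field this sketch needs. */
struct sk_mbuf {
	uint64_t ol_flags;
};

/*
 * Returns 1 if the packet type is supported (the segmentation request
 * is consumed, so the flag is cleared), or 0 if the type is unsupported
 * (nothing was done, so the flags are left untouched for a later stage).
 */
static int
sk_gso_dispatch(struct sk_mbuf *pkt)
{
	if (SK_IS_IPV4_TCP(pkt->ol_flags)) {
		pkt->ol_flags &= ~SK_TX_TCP_SEG;
		return 1;
	}
	/* Unsupported type: no segmentation was done, keep the flag. */
	return 0;
}
```

With this shape, an IPv6 packet that still carries the flag can be
handed on to HW (or another stage) that is able to segment it.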
-Mark

>
>>
>> Thanks again,
>> Mark
>>
>> >BTW, did you notice that building of shared target fails?
>> >Konstantin
>>
>> I didn't, but I'll take a look right now - thanks for the catch!
>>
>> >
>> >
>> >> +		pkts_out[0] = pkt;
>> >> +		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
>> >> +		return 1;
>> >> +	}
>> >> +
>> >> +	if (ret > 1) {
>> >> +		pkt_seg = pkt;
>> >> +		while (pkt_seg) {
>> >> +			rte_mbuf_refcnt_update(pkt_seg, -1);
>> >> +			pkt_seg = pkt_seg->next;
>> >> +		}
>> >> +	} else if (ret < 0) {
>> >> +		/* Revert the ol_flags in the event of failure. */
>> >> +		pkt->ol_flags = ol_flags;
>> >> +	}
>> >>
>> >> -	return 1;
>> >> +	return ret;
>> >>  }
>> >> --
>> >> 1.9.3


* Re: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-02 16:45           ` [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
@ 2017-10-04 15:08             ` Ananyev, Konstantin
  2017-10-04 16:23               ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 15:08 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Monday, October 2, 2017 5:46 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> From: Jiayu Hu <jiayu.hu@intel.com>
> 
> This patch adds GSO support to the csum forwarding engine. Oversized
> packets transmitted over a GSO-enabled port will undergo segmentation
> (with the exception of packet-types unsupported by the GSO library).
> GSO support is disabled by default.
> 
> GSO support may be toggled on a per-port basis, using the command:
> 
>         "set port <port_id> gso on|off"
> 
> The maximum packet length (including the packet header and payload) for
> GSO segments may be set with the command:
> 
>         "set gso segsz <length>"
> 
> Show GSO configuration for a given port with the command:
> 
> 	"show port <port_id> gso"
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> ---
>  app/test-pmd/cmdline.c                      | 178 ++++++++++++++++++++++++++++
>  app/test-pmd/config.c                       |  24 ++++
>  app/test-pmd/csumonly.c                     |  69 ++++++++++-
>  app/test-pmd/testpmd.c                      |  13 ++
>  app/test-pmd/testpmd.h                      |  10 ++
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
>  6 files changed, 335 insertions(+), 5 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index ccdf239..05b0ce8 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
>  			"    Set max flow number and max packet number per-flow"
>  			" for GRO.\n\n"
> 
> +			"set port (port_id) gso (on|off)"
> +			"    Enable or disable Generic Segmentation Offload in"
> +			" csum forwarding engine.\n\n"
> +
> +			"set gso segsz (length)\n"
> +			"    Set max packet length for output GSO segments,"
> +			" including packet header and payload.\n\n"

Probably a good future improvement would be to allow the user to specify gso_type too.

> +
> +			"show port (port_id) gso\n"
> +			"    Show GSO configuration.\n\n"
> +
>  			"set fwd (%s)\n"
>  			"    Set packet forwarding mode.\n\n"
> 
> @@ -3967,6 +3978,170 @@ struct cmd_gro_set_result {
>  	},
>  };
> 
> +/* *** ENABLE/DISABLE GSO *** */
> +struct cmd_gso_enable_result {
> +	cmdline_fixed_string_t cmd_set;
> +	cmdline_fixed_string_t cmd_port;
> +	cmdline_fixed_string_t cmd_keyword;
> +	cmdline_fixed_string_t cmd_mode;
> +	uint8_t cmd_pid;
> +};
> +
> +static void
> +cmd_gso_enable_parsed(void *parsed_result,
> +		__attribute__((unused)) struct cmdline *cl,
> +		__attribute__((unused)) void *data)
> +{
> +	struct cmd_gso_enable_result *res;
> +
> +	res = parsed_result;
> +	if (!strcmp(res->cmd_keyword, "gso"))
> +		setup_gso(res->cmd_mode, res->cmd_pid);
> +}
> +
> +cmdline_parse_token_string_t cmd_gso_enable_set =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
> +			cmd_set, "set");
> +cmdline_parse_token_string_t cmd_gso_enable_port =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
> +			cmd_port, "port");
> +cmdline_parse_token_string_t cmd_gso_enable_keyword =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
> +			cmd_keyword, "gso");
> +cmdline_parse_token_string_t cmd_gso_enable_mode =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
> +			cmd_mode, "on#off");
> +cmdline_parse_token_num_t cmd_gso_enable_pid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
> +			cmd_pid, UINT8);
> +
> +cmdline_parse_inst_t cmd_gso_enable = {
> +	.f = cmd_gso_enable_parsed,
> +	.data = NULL,
> +	.help_str = "set port <port_id> gso on|off",
> +	.tokens = {
> +		(void *)&cmd_gso_enable_set,
> +		(void *)&cmd_gso_enable_port,
> +		(void *)&cmd_gso_enable_pid,
> +		(void *)&cmd_gso_enable_keyword,
> +		(void *)&cmd_gso_enable_mode,
> +		NULL,
> +	},
> +};
> +
> +/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
> +struct cmd_gso_size_result {
> +	cmdline_fixed_string_t cmd_set;
> +	cmdline_fixed_string_t cmd_keyword;
> +	cmdline_fixed_string_t cmd_segsz;
> +	uint16_t cmd_size;
> +};
> +
> +static void
> +cmd_gso_size_parsed(void *parsed_result,
> +		       __attribute__((unused)) struct cmdline *cl,
> +		       __attribute__((unused)) void *data)
> +{
> +	struct cmd_gso_size_result *res = parsed_result;
> +
> +	if (test_done == 0) {
> +		printf("Before setting GSO segsz, please first stop fowarding\n");
> +		return;
> +	}
> +
> +	if (!strcmp(res->cmd_keyword, "gso") &&
> +			!strcmp(res->cmd_segsz, "segsz")) {
> +		if (res->cmd_size == 0) {

As your gso_size includes the packet header too, you probably shouldn't allow a gso_size less
than some minimal value of l2_len + l3_len + l4_len + ...
Another alternative: change gso_ctx.gso_size to count only the payload size.
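
A minimal sketch of such a lower bound, assuming a plain (non-tunnelled,
untagged) TCP/IPv4 packet with maximum-size IPv4 and TCP headers. The
sk_* names and constants are hypothetical and are not part of testpmd or
the GSO library; tunnelled packets would need the outer headers added on
top of this.

```c
#include <stdint.h>

/* Assumed worst-case header sizes in bytes (no VLAN tag, no tunnel). */
#define SK_ETHER_HDR_LEN 14
#define SK_IPV4_HDR_MAX  60	/* IHL = 15 32-bit words */
#define SK_TCP_HDR_MAX   60	/* data offset = 15 32-bit words */

/* Smallest segment size that still leaves one byte of payload. */
#define SK_MIN_GSO_SEGSZ \
	(SK_ETHER_HDR_LEN + SK_IPV4_HDR_MAX + SK_TCP_HDR_MAX + 1)

/* Returns non-zero if segsz leaves room for headers plus payload. */
static int
sk_gso_segsz_is_valid(uint16_t segsz)
{
	return segsz >= SK_MIN_GSO_SEGSZ;
}

/*
 * Payload bytes per output segment. Returns 0 instead of letting the
 * 16-bit subtraction wrap around when the headers do not fit.
 */
static uint16_t
sk_payload_unit_size(uint16_t gso_size, uint16_t hdr_len)
{
	if (gso_size <= hdr_len)
		return 0;
	return (uint16_t)(gso_size - hdr_len);
}
```

A check of this kind in the "set gso segsz" handler would also protect
the gso_size - hdr_offset subtraction done later in the library.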

> +			printf("gso_size should be larger than 0."
> +					" Please input a legal value\n");
> +		} else
> +			gso_max_segment_size = res->cmd_size;
> +	}
> +}
> +
> +cmdline_parse_token_string_t cmd_gso_size_set =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
> +				cmd_set, "set");
> +cmdline_parse_token_string_t cmd_gso_size_keyword =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
> +				cmd_keyword, "gso");
> +cmdline_parse_token_string_t cmd_gso_size_segsz =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
> +				cmd_segsz, "segsz");
> +cmdline_parse_token_num_t cmd_gso_size_size =
> +	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
> +				cmd_size, UINT16);
> +
> +cmdline_parse_inst_t cmd_gso_size = {
> +	.f = cmd_gso_size_parsed,
> +	.data = NULL,
> +	.help_str = "set gso segsz <length>",
> +	.tokens = {
> +		(void *)&cmd_gso_size_set,
> +		(void *)&cmd_gso_size_keyword,
> +		(void *)&cmd_gso_size_segsz,
> +		(void *)&cmd_gso_size_size,
> +		NULL,
> +	},
> +};
> +
> +/* *** SHOW GSO CONFIGURATION *** */
> +struct cmd_gso_show_result {
> +	cmdline_fixed_string_t cmd_show;
> +	cmdline_fixed_string_t cmd_port;
> +	cmdline_fixed_string_t cmd_keyword;
> +	uint8_t cmd_pid;
> +};
> +
> +static void
> +cmd_gso_show_parsed(void *parsed_result,
> +		       __attribute__((unused)) struct cmdline *cl,
> +		       __attribute__((unused)) void *data)
> +{
> +	struct cmd_gso_show_result *res = parsed_result;
> +
> +	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
> +		printf("invalid port id %u\n", res->cmd_pid);
> +		return;
> +	}
> +	if (!strcmp(res->cmd_keyword, "gso")) {
> +		if (gso_ports[res->cmd_pid].enable) {
> +			printf("Max GSO'd packet size: %uB\n"
> +					"Supported GSO types: TCP/IPv4, "
> +					"VxLAN with inner TCP/IPv4 packet, "
> +					"GRE with inner TCP/IPv4  packet\n",
> +					gso_max_segment_size);
> +		} else
> +			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
> +	}
> +}
> +
> +cmdline_parse_token_string_t cmd_gso_show_show =
> +TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
> +		cmd_show, "show");
> +cmdline_parse_token_string_t cmd_gso_show_port =
> +TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
> +		cmd_port, "port");
> +cmdline_parse_token_string_t cmd_gso_show_keyword =
> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
> +				cmd_keyword, "gso");
> +cmdline_parse_token_num_t cmd_gso_show_pid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
> +				cmd_pid, UINT8);
> +
> +cmdline_parse_inst_t cmd_gso_show = {
> +	.f = cmd_gso_show_parsed,
> +	.data = NULL,
> +	.help_str = "show port <port_id> gso",
> +	.tokens = {
> +		(void *)&cmd_gso_show_show,
> +		(void *)&cmd_gso_show_port,
> +		(void *)&cmd_gso_show_pid,
> +		(void *)&cmd_gso_show_keyword,
> +		NULL,
> +	},
> +};
> +
>  /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
>  struct cmd_set_flush_rx {
>  	cmdline_fixed_string_t set;
> @@ -14255,6 +14430,9 @@ struct cmd_cmdfile_result {
>  	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
>  	(cmdline_parse_inst_t *)&cmd_enable_gro,
>  	(cmdline_parse_inst_t *)&cmd_gro_set,
> +	(cmdline_parse_inst_t *)&cmd_gso_enable,
> +	(cmdline_parse_inst_t *)&cmd_gso_size,
> +	(cmdline_parse_inst_t *)&cmd_gso_show,
>  	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
>  	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
>  	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> index 3ae3e1c..88d09d0 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -2454,6 +2454,30 @@ struct igb_ring_desc_16_bytes {
>  	}
>  }
> 
> +void
> +setup_gso(const char *mode, uint8_t port_id)
> +{
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		printf("invalid port id %u\n", port_id);
> +		return;
> +	}
> +	if (strcmp(mode, "on") == 0) {
> +		if (test_done == 0) {
> +			printf("before enabling GSO,"
> +					" please stop forwarding first\n");
> +			return;
> +		}
> +		gso_ports[port_id].enable = 1;
> +	} else if (strcmp(mode, "off") == 0) {
> +		if (test_done == 0) {
> +			printf("before disabling GSO,"
> +					" please stop forwarding first\n");
> +			return;
> +		}
> +		gso_ports[port_id].enable = 0;
> +	}
> +}
> +
>  char*
>  list_pkt_forwarding_modes(void)
>  {
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> index 90c8119..bd1a287 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -70,6 +70,8 @@
>  #include <rte_string_fns.h>
>  #include <rte_flow.h>
>  #include <rte_gro.h>
> +#include <rte_gso.h>
> +
>  #include "testpmd.h"
> 
>  #define IP_DEFTTL  64   /* from RFC 1340. */
> @@ -91,6 +93,7 @@
>  /* structure that caches offload info for the current packet */
>  struct testpmd_offload_info {
>  	uint16_t ethertype;
> +	uint8_t gso_enable;
>  	uint16_t l2_len;
>  	uint16_t l3_len;
>  	uint16_t l4_len;
> @@ -381,6 +384,8 @@ struct simple_gre_hdr {
>  				get_udptcp_checksum(l3_hdr, tcp_hdr,
>  					info->ethertype);
>  		}
> +		if (info->gso_enable)
> +			ol_flags |= PKT_TX_TCP_SEG;
>  	} else if (info->l4_proto == IPPROTO_SCTP) {
>  		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
>  		sctp_hdr->cksum = 0;
> @@ -627,6 +632,9 @@ struct simple_gre_hdr {
>  pkt_burst_checksum_forward(struct fwd_stream *fs)
>  {
>  	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> +	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
> +	struct rte_gso_ctx *gso_ctx;
> +	struct rte_mbuf **tx_pkts_burst;
>  	struct rte_port *txp;
>  	struct rte_mbuf *m, *p;
>  	struct ether_hdr *eth_hdr;
> @@ -634,13 +642,15 @@ struct simple_gre_hdr {
>  	uint16_t nb_rx;
>  	uint16_t nb_tx;
>  	uint16_t nb_prep;
> -	uint16_t i;
> +	uint16_t i, j;
>  	uint64_t rx_ol_flags, tx_ol_flags;
>  	uint16_t testpmd_ol_flags;
>  	uint32_t retry;
>  	uint32_t rx_bad_ip_csum;
>  	uint32_t rx_bad_l4_csum;
>  	struct testpmd_offload_info info;
> +	uint16_t nb_segments = 0;
> +	int ret;
> 
>  #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
>  	uint64_t start_tsc;
> @@ -674,6 +684,8 @@ struct simple_gre_hdr {
>  	memset(&info, 0, sizeof(info));
>  	info.tso_segsz = txp->tso_segsz;
>  	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
> +	if (gso_ports[fs->tx_port].enable)
> +		info.gso_enable = 1;
> 
>  	for (i = 0; i < nb_rx; i++) {
>  		if (likely(i < nb_rx - 1))
> @@ -851,13 +863,59 @@ struct simple_gre_hdr {
>  		}
>  	}
> 
> +	if (gso_ports[fs->tx_port].enable == 0)
> +		tx_pkts_burst = pkts_burst;
> +	else {
> +		gso_ctx = &(current_fwd_lcore()->gso_ctx);
> +		gso_ctx->gso_size = gso_max_segment_size;
> +		for (i = 0; i < nb_rx; i++) {

That seems like quite a lot of code to handle an error case, which I suppose
will happen pretty rarely.
Why not just:

ret = rte_gso_segment(pkts_burst[i], gso_ctx,
		&gso_segments[nb_segments],
		RTE_DIM(gso_segments) - nb_segments);
if (ret < 0) {
	RTE_LOG(DEBUG, ....);
	rte_pktmbuf_free(pkts_burst[i]);
} else
	nb_segments += ret;
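
To show the control flow of that suggestion end to end, here is a
self-contained sketch. sk_segment() is a hypothetical stand-in for
rte_gso_segment() that simply fails when the output array is too small,
and setting 'freed' stands in for freeing the mbuf; none of the sk_*
names exist in DPDK.

```c
#include <stdint.h>

/* Stand-in packet: how many segments it needs, and whether it was dropped. */
struct sk_pkt {
	uint16_t nb_segs_wanted;
	int freed;
};

/*
 * Hypothetical segmenter: writes nb_segs_wanted entries to 'out' and
 * returns that count, or returns -1 if 'room' is too small.
 */
static int
sk_segment(struct sk_pkt *pkt, struct sk_pkt **out, uint16_t room)
{
	uint16_t i;

	if (pkt->nb_segs_wanted > room)
		return -1;
	for (i = 0; i < pkt->nb_segs_wanted; i++)
		out[i] = pkt;
	return pkt->nb_segs_wanted;
}

/* Segment a whole burst; a packet that fails is dropped, the rest go on. */
static uint16_t
sk_gso_burst(struct sk_pkt **burst, uint16_t nb_rx,
		struct sk_pkt **segs, uint16_t segs_cap)
{
	uint16_t i, nb_segments = 0;
	int ret;

	for (i = 0; i < nb_rx; i++) {
		ret = sk_segment(burst[i], &segs[nb_segments],
				(uint16_t)(segs_cap - nb_segments));
		if (ret < 0)
			burst[i]->freed = 1;	/* drop only this packet */
		else
			nb_segments += (uint16_t)ret;
	}
	return nb_segments;
}
```

Compared with the loop in the patch, a failure here costs one packet
rather than ending GSO for the remainder of the burst.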



> +			if (unlikely(nb_rx - i >= GSO_MAX_PKT_BURST -
> +						nb_segments)) {
> +				/*
> +				 * insufficient space in gso_segments,
> +				 * stop GSO.
> +				 */
> +				for (j = i; j < GSO_MAX_PKT_BURST -
> +						nb_segments; j++) {
> +					pkts_burst[j]->ol_flags &=
> +						(~PKT_TX_TCP_SEG);
> +					gso_segments[nb_segments++] =
> +						pkts_burst[j];
> +				}
> +				for (; j < nb_rx; j++)
> +					rte_pktmbuf_free(pkts_burst[j]);
> +				break;
> +			}
> +			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
> +					&gso_segments[nb_segments],
> +					GSO_MAX_PKT_BURST - nb_segments);
> +			if (ret >= 1)
> +				nb_segments += ret;
> +			else if (ret < 0) {
> +				/*
> +				 * insufficient MBUFs or space in
> +				 * gso_segments, stop GSO.
> +				 */
> +				for (j = i; j < nb_rx; j++) {
> +					pkts_burst[j]->ol_flags &=
> +						(~PKT_TX_TCP_SEG);
> +					gso_segments[nb_segments++] =
> +						pkts_burst[j];
> +				}
> +				break;
> +			}
> +		}
> +		tx_pkts_burst = gso_segments;
> +		nb_rx = nb_segments;
> +	}
> +
>  	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
> -			pkts_burst, nb_rx);
> +			tx_pkts_burst, nb_rx);
>  	if (nb_prep != nb_rx)
>  		printf("Preparing packet burst to transmit failed: %s\n",
>  				rte_strerror(rte_errno));
> 
> -	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
> +	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
>  			nb_prep);
> 
>  	/*
> @@ -868,7 +926,7 @@ struct simple_gre_hdr {
>  		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
>  			rte_delay_us(burst_tx_delay_time);
>  			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> -					&pkts_burst[nb_tx], nb_rx - nb_tx);
> +					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
>  		}
>  	}
>  	fs->tx_packets += nb_tx;
> @@ -881,9 +939,10 @@ struct simple_gre_hdr {
>  	if (unlikely(nb_tx < nb_rx)) {
>  		fs->fwd_dropped += (nb_rx - nb_tx);
>  		do {
> -			rte_pktmbuf_free(pkts_burst[nb_tx]);
> +			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
>  		} while (++nb_tx < nb_rx);
>  	}
> +
>  #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
>  	end_tsc = rte_rdtsc();
>  	core_cycles = (end_tsc - start_tsc);
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index e097ee0..97e349d 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
>   */
>  static int all_ports_started(void);
> 
> +struct gso_status gso_ports[RTE_MAX_ETHPORTS];
> +uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
> +
>  /*
>   * Helper function to check if socket is already discovered.
>   * If yes, return positive value. If not, return zero.
> @@ -570,6 +573,7 @@ static int eth_event_callback(uint8_t port_id,
>  	unsigned int nb_mbuf_per_pool;
>  	lcoreid_t  lc_id;
>  	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
> +	uint32_t gso_types = 0;
> 
>  	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
> 
> @@ -654,6 +658,8 @@ static int eth_event_callback(uint8_t port_id,
> 
>  	init_port_config();
> 
> +	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> +		DEV_TX_OFFLOAD_GRE_TNL_TSO;
>  	/*
>  	 * Records which Mbuf pool to use by each logical core, if needed.
>  	 */
> @@ -664,6 +670,13 @@ static int eth_event_callback(uint8_t port_id,
>  		if (mbp == NULL)
>  			mbp = mbuf_pool_find(0);
>  		fwd_lcores[lc_id]->mbp = mbp;
> +		/* initialize GSO context */
> +		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
> +		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
> +		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
> +		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
> +			ETHER_CRC_LEN;
> +		fwd_lcores[lc_id]->gso_ctx.ipid_flag = !RTE_GSO_IPID_FIXED;

Just
fwd_lcores[lc_id]->gso_ctx.ipid_flag = 0;
should do here.

Konstantin


* Re: [PATCH v6 3/6] gso: add VxLAN GSO support
  2017-10-04 14:12             ` Ananyev, Konstantin
  2017-10-04 14:35               ` Kavanagh, Mark B
@ 2017-10-04 16:13               ` Kavanagh, Mark B
  2017-10-04 16:17                 ` Ananyev, Konstantin
  1 sibling, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 16:13 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas

>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 3:12 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 3/6] gso: add VxLAN GSO support
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Monday, October 2, 2017 5:46 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
>> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>> Subject: [PATCH v6 3/6] gso: add VxLAN GSO support
>>
>> This patch adds a framework that allows GSO on tunneled packets.
>> Furthermore, it leverages that framework to provide GSO support for
>> VxLAN-encapsulated packets.
>>
>> Supported VxLAN packets must have an outer IPv4 header (prepended by an
>> optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
>> inner VLAN tag).
>>
>> VxLAN GSO doesn't check if input packets have correct checksums and
>> doesn't update checksums for output packets. Additionally, it doesn't
>> process IP fragmented packets.
>>
>> As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
>> output packet, which mandates support for multi-segment mbufs in the TX
>> functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
>> reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
>> are freed, the packet is freed automatically.
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> ---
>>  doc/guides/rel_notes/release_17_11.rst |   3 +
>>  lib/librte_gso/Makefile                |   1 +
>>  lib/librte_gso/gso_common.h            |  25 +++++++
>>  lib/librte_gso/gso_tunnel_tcp4.c       | 123
>+++++++++++++++++++++++++++++++++
>>  lib/librte_gso/gso_tunnel_tcp4.h       |  75 ++++++++++++++++++++
>>  lib/librte_gso/rte_gso.c               |  13 +++-
>>  6 files changed, 237 insertions(+), 3 deletions(-)
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>>
>> diff --git a/doc/guides/rel_notes/release_17_11.rst
>b/doc/guides/rel_notes/release_17_11.rst
>> index c414f73..25b8a78 100644
>> --- a/doc/guides/rel_notes/release_17_11.rst
>> +++ b/doc/guides/rel_notes/release_17_11.rst
>> @@ -48,6 +48,9 @@ New Features
>>    ones (e.g. MTU is 1500B). Supported packet types are:
>>
>>    * TCP/IPv4 packets, which may include a single VLAN tag.
>> +  * VxLAN packets, which must have an outer IPv4 header (prepended by
>> +    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
>> +    an optional VLAN tag).
>>
>>    The GSO library doesn't check if the input packets have correct
>>    checksums, and doesn't update checksums for output packets.
>> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
>> index 2be64d1..e6d41df 100644
>> --- a/lib/librte_gso/Makefile
>> +++ b/lib/librte_gso/Makefile
>> @@ -44,6 +44,7 @@ LIBABIVER := 1
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
>>
>>  # install this header file
>>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
>> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
>> index 8d9b94e..c051295 100644
>> --- a/lib/librte_gso/gso_common.h
>> +++ b/lib/librte_gso/gso_common.h
>> @@ -39,6 +39,7 @@
>>  #include <rte_mbuf.h>
>>  #include <rte_ip.h>
>>  #include <rte_tcp.h>
>> +#include <rte_udp.h>
>>
>>  #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
>>  		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
>> @@ -49,6 +50,30 @@
>>  #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
>>  		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
>>
>> +#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 |
>\
>> +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
>> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
>> +		 PKT_TX_TUNNEL_VXLAN))
>> +
>> +/**
>> + * Internal function which updates the UDP header of a packet, following
>> + * segmentation. This is required to update the header's datagram length
>field.
>> + *
>> + * @param pkt
>> + *  The packet containing the UDP header.
>> + * @param udp_offset
>> + *  The offset of the UDP header from the start of the packet.
>> + */
>> +static inline void
>> +update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
>> +{
>> +	struct udp_hdr *udp_hdr;
>> +
>> +	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			udp_offset);
>> +	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
>> +}
>> +
>>  /**
>>   * Internal function which updates the TCP header of a packet, following
>>   * segmentation. This is required to update the header's 'sent' sequence
>> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c
>b/lib/librte_gso/gso_tunnel_tcp4.c
>> new file mode 100644
>> index 0000000..34bbbd7
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tunnel_tcp4.c
>> @@ -0,0 +1,123 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include "gso_common.h"
>> +#include "gso_tunnel_tcp4.h"
>> +
>> +static void
>> +update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
>> +		struct rte_mbuf **segs, uint16_t nb_segs)
>> +{
>> +	struct ipv4_hdr *ipv4_hdr;
>> +	struct tcp_hdr *tcp_hdr;
>> +	uint32_t sent_seq;
>> +	uint16_t outer_id, inner_id, tail_idx, i;
>> +	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
>> +
>> +	outer_ipv4_offset = pkt->outer_l2_len;
>> +	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
>> +	inner_ipv4_offset = udp_offset + pkt->l2_len;
>> +	tcp_offset = inner_ipv4_offset + pkt->l3_len;
>> +
>> +	/* Outer IPv4 header. */
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			outer_ipv4_offset);
>> +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
>> +
>> +	/* Inner IPv4 header. */
>> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			inner_ipv4_offset);
>> +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
>> +
>> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
>> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
>> +	tail_idx = nb_segs - 1;
>> +
>> +	for (i = 0; i < nb_segs; i++) {
>> +		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
>> +		update_udp_header(segs[i], udp_offset);
>> +		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
>> +		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
>> +		outer_id++;
>> +		inner_id += ipid_delta;
>> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
>> +	}
>> +}
>> +
>> +int
>> +gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ipid_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out)
>> +{
>> +	struct ipv4_hdr *inner_ipv4_hdr;
>> +	uint16_t pyld_unit_size, hdr_offset;
>> +	uint16_t tcp_dl, frag_off;
>> +	int ret = 1;
>> +
>> +	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
>> +	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
>> +			hdr_offset);
>> +	/*
>> +	 * Don't process the packet whose MF bit or offset in the inner
>> +	 * IPv4 header are non-zero.
>> +	 */
>> +	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
>> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	/* Don't process the packet without data */
>> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
>> +	if (unlikely(tcp_dl == 0)) {
>
>You probably need to take outer_l2_len and outer_l3_len into account too.
>Probably better to move that check after the final hdr_offset calculation:
>
>...
>hdr_offset += pkt->l3_len + pkt->l4_len;
>if (hdr_offset >= pkt->pkt_len) {...; return 1;}
>...
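
A small sketch of the check suggested above, assuming (as the offsets in
this patch imply) that l2_len spans the UDP, VxLAN and inner Ethernet
headers. The struct is a hypothetical stand-in that carries only the
length fields of rte_mbuf, and the sk_* names are made up for the
example.

```c
#include <stdint.h>

/* Stand-in for the length fields of a tunnelled mbuf. */
struct sk_tunnel_pkt {
	uint32_t pkt_len;
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
	uint16_t l2_len;	/* UDP + VxLAN + inner Ethernet */
	uint16_t l3_len;
	uint16_t l4_len;
};

/* Returns non-zero if any TCP payload follows the full header stack. */
static int
sk_tunnel_has_payload(const struct sk_tunnel_pkt *pkt)
{
	uint32_t hdr_offset;

	hdr_offset = (uint32_t)pkt->outer_l2_len + pkt->outer_l3_len +
		pkt->l2_len + pkt->l3_len + pkt->l4_len;
	return hdr_offset < pkt->pkt_len;
}
```

For a header-only VxLAN packet of 104 bytes (14 + 20 + 30 + 20 + 20),
subtracting only l2_len, l3_len and l4_len from pkt_len would leave 34
and treat the packet as carrying data; the combined offset does not.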
>
>> +		pkts_out[0] = pkt;
>> +		return 1;
>> +	}
>> +
>> +	hdr_offset += pkt->l3_len + pkt->l4_len;
>> +	pyld_unit_size = gso_size - hdr_offset;
>> +
>> +	/* Segment the payload */
>> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
>> +			indirect_pool, pkts_out, nb_pkts_out);
>> +	if (ret <= 1)
>> +		return ret;
>> +
>> +	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
>> +
>> +	return ret;
>> +}
>> diff --git a/lib/librte_gso/gso_tunnel_tcp4.h
>b/lib/librte_gso/gso_tunnel_tcp4.h
>> new file mode 100644
>> index 0000000..3c67f0c
>> --- /dev/null
>> +++ b/lib/librte_gso/gso_tunnel_tcp4.h
>> @@ -0,0 +1,75 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _GSO_TUNNEL_TCP4_H_
>> +#define _GSO_TUNNEL_TCP4_H_
>> +
>> +#include <stdint.h>
>> +#include <rte_mbuf.h>
>> +
>> +/**
>> + * Segment a tunneling packet with inner TCP/IPv4 headers. This function
>> + * doesn't check if the input packet has correct checksums, and doesn't
>> + * update checksums for output GSO segments. Furthermore, it doesn't
>> + * process IP fragment packets.
>> + *
>> + * @param pkt
>> + *  The packet mbuf to segment.
>> + * @param gso_size
>> + *  The max length of a GSO segment, measured in bytes.
>> + * @param ipid_delta
>> + *  The increasing unit of IP ids.
>> + * @param direct_pool
>> + *  MBUF pool used for allocating direct buffers for output segments.
>> + * @param indirect_pool
>> + *  MBUF pool used for allocating indirect buffers for output segments.
>> + * @param pkts_out
>> + *  Pointer array used to store the MBUF addresses of output GSO
>> + *  segments, when it succeeds. If the memory space in pkts_out is
>> + *  insufficient, it fails and returns -EINVAL.
>> + * @param nb_pkts_out
>> + *  The max number of items that 'pkts_out' can keep.
>> + *
>> + * @return
>> + *   - The number of GSO segments filled in pkts_out on success.
>> + *   - Return -ENOMEM if run out of memory in MBUF pools.
>> + *   - Return -EINVAL for invalid parameters.
>> + */
>> +int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
>> +		uint16_t gso_size,
>> +		uint8_t ipid_delta,
>> +		struct rte_mempool *direct_pool,
>> +		struct rte_mempool *indirect_pool,
>> +		struct rte_mbuf **pkts_out,
>> +		uint16_t nb_pkts_out);
>> +#endif
>> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
>> index a4fce50..6095689 100644
>> --- a/lib/librte_gso/rte_gso.c
>> +++ b/lib/librte_gso/rte_gso.c
>> @@ -39,6 +39,7 @@
>>  #include "rte_gso.h"
>>  #include "gso_common.h"
>>  #include "gso_tcp4.h"
>> +#include "gso_tunnel_tcp4.h"
>>
>>  int
>>  rte_gso_segment(struct rte_mbuf *pkt,
>> @@ -58,8 +59,9 @@
>>  		return -EINVAL;
>>
>>  	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
>> -				DEV_TX_OFFLOAD_TCP_TSO) !=
>> -			gso_ctx->gso_types) {
>> +				(DEV_TX_OFFLOAD_TCP_TSO |
>> +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
>> +				gso_ctx->gso_types) {
>>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>>  		pkts_out[0] = pkt;
>>  		return 1;
>> @@ -71,7 +73,12 @@
>>  	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
>>  	ol_flags = pkt->ol_flags;
>>
>> -	if (IS_IPV4_TCP(pkt->ol_flags)) {
>> +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
>> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>> +		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
>> +				direct_pool, indirect_pool,
>> +				pkts_out, nb_pkts_out);
>> +	} else if (IS_IPV4_TCP(pkt->ol_flags)) {
>
>Hmm it doesn't look quite right.
>Imagine user doesn't want libgso to segment plain TCP packets with that ctx,
>just VXLAN+TCP.
>
>I think you need to merge that if and one above to something like that:
>
>if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
>    (gso_ctx->gso_types & (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>    DEV_TX_OFFLOAD_TCP_TSO)) ==
>    (DEV_TX_OFFLOAD_VXLAN_TNL_TSO | DEV_TX_OFFLOAD_TCP_TSO)) {

One question on this Konstantin - in the case of VxLAN, do we even need to check if DEV_TX_OFFLOAD_TCP_TSO is set in gso_ctx, since that pertains to plain TCP packets?
If DEV_TX_OFFLOAD_VXLAN_TNL_TSO is set, then the inner protocol is probably irrelevant, since we'll segment regardless.

Bearing that in mind, I imagine that the above code block should look as follows, but I'm interested in your thoughts on same:

if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
    (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO) ==
     DEV_TX_OFFLOAD_VXLAN_TNL_TSO) {
    ...
} else if (...
...

Thanks,
Mark

>   ...
>} else if (IS_IPV4_TCP(pkt->ol_flags) && (gso_ctx->gso_types &
>DEV_TX_OFFLOAD_TCP_TSO)) {
>   ...
>} else {
>     /* unsupported packet, skip */
>}
>
>Konstantin
>
>>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>>  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>>  				direct_pool, indirect_pool,
>> --
>> 1.9.3
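
For reference, the payload check suggested in the review above can be sketched on its own. This is a minimal illustration, not code from the patch: struct pkt_lens and both function names are hypothetical stand-ins for the rte_mbuf length fields, and it assumes (as the patch's own offset arithmetic does) that l2_len covers everything between the outer IPv4 header and the inner IPv4 header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the rte_mbuf length fields used in the patch. */
struct pkt_lens {
	uint32_t pkt_len;      /* total packet length in bytes */
	uint16_t outer_l2_len; /* outer Ethernet (+ optional VLAN) */
	uint16_t outer_l3_len; /* outer IPv4 */
	uint16_t l2_len;       /* assumed: tunnel headers + inner Ethernet */
	uint16_t l3_len;       /* inner IPv4 */
	uint16_t l4_len;       /* inner TCP */
};

/* Sum of every header in front of the inner TCP payload. */
static uint32_t
tunnel_hdr_len(const struct pkt_lens *p)
{
	return (uint32_t)p->outer_l2_len + p->outer_l3_len +
		p->l2_len + p->l3_len + p->l4_len;
}

/* Non-zero only if at least one payload byte follows the headers. */
static int
tunnel_has_payload(const struct pkt_lens *p)
{
	return tunnel_hdr_len(p) < p->pkt_len;
}
```

With lengths of 14/20/30/20/20 bytes, a 104-byte packet is all headers. The original check (pkt_len - l2_len - l3_len - l4_len) yields 34 for that packet and would let it through; the comparison against the full header length does not.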


* Re: [PATCH v6 3/6] gso: add VxLAN GSO support
  2017-10-04 16:13               ` Kavanagh, Mark B
@ 2017-10-04 16:17                 ` Ananyev, Konstantin
  0 siblings, 0 replies; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 16:17 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Wednesday, October 4, 2017 5:14 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: RE: [PATCH v6 3/6] gso: add VxLAN GSO support
> 
> >-----Original Message-----
> >From: Ananyev, Konstantin
> >Sent: Wednesday, October 4, 2017 3:12 PM
> >To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
> >Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
> >Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> >Subject: RE: [PATCH v6 3/6] gso: add VxLAN GSO support
> >
> >
> >
> >> -----Original Message-----
> >> From: Kavanagh, Mark B
> >> Sent: Monday, October 2, 2017 5:46 PM
> >> To: dev@dpdk.org
> >> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
> >Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> >> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
> ><mark.b.kavanagh@intel.com>
> >> Subject: [PATCH v6 3/6] gso: add VxLAN GSO support
> >>
> >> This patch adds a framework that allows GSO on tunneled packets.
> >> Furthermore, it leverages that framework to provide GSO support for
> >> VxLAN-encapsulated packets.
> >>
> >> Supported VxLAN packets must have an outer IPv4 header (prepended by an
> >> optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
> >> inner VLAN tag).
> >>
> >> VxLAN GSO doesn't check if input packets have correct checksums and
> >> doesn't update checksums for output packets. Additionally, it doesn't
> >> process IP fragmented packets.
> >>
> >> As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
> >> output packet, which mandates support for multi-segment mbufs in the TX
> >> functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
> >> reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
> >> are freed, the packet is freed automatically.
> >>
> >> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> >> ---
> >>  doc/guides/rel_notes/release_17_11.rst |   3 +
> >>  lib/librte_gso/Makefile                |   1 +
> >>  lib/librte_gso/gso_common.h            |  25 +++++++
> >>  lib/librte_gso/gso_tunnel_tcp4.c       | 123
> >+++++++++++++++++++++++++++++++++
> >>  lib/librte_gso/gso_tunnel_tcp4.h       |  75 ++++++++++++++++++++
> >>  lib/librte_gso/rte_gso.c               |  13 +++-
> >>  6 files changed, 237 insertions(+), 3 deletions(-)
> >>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
> >>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
> >>
> >> diff --git a/doc/guides/rel_notes/release_17_11.rst
> >b/doc/guides/rel_notes/release_17_11.rst
> >> index c414f73..25b8a78 100644
> >> --- a/doc/guides/rel_notes/release_17_11.rst
> >> +++ b/doc/guides/rel_notes/release_17_11.rst
> >> @@ -48,6 +48,9 @@ New Features
> >>    ones (e.g. MTU is 1500B). Supported packet types are:
> >>
> >>    * TCP/IPv4 packets, which may include a single VLAN tag.
> >> +  * VxLAN packets, which must have an outer IPv4 header (prepended by
> >> +    an optional VLAN tag), and contain an inner TCP/IPv4 packet (with
> >> +    an optional VLAN tag).
> >>
> >>    The GSO library doesn't check if the input packets have correct
> >>    checksums, and doesn't update checksums for output packets.
> >> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> >> index 2be64d1..e6d41df 100644
> >> --- a/lib/librte_gso/Makefile
> >> +++ b/lib/librte_gso/Makefile
> >> @@ -44,6 +44,7 @@ LIBABIVER := 1
> >>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
> >>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
> >>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
> >> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
> >>
> >>  # install this header file
> >>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> >> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> >> index 8d9b94e..c051295 100644
> >> --- a/lib/librte_gso/gso_common.h
> >> +++ b/lib/librte_gso/gso_common.h
> >> @@ -39,6 +39,7 @@
> >>  #include <rte_mbuf.h>
> >>  #include <rte_ip.h>
> >>  #include <rte_tcp.h>
> >> +#include <rte_udp.h>
> >>
> >>  #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
> >>  		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
> >> @@ -49,6 +50,30 @@
> >>  #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
> >>  		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
> >>
> >> +#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 |
> >\
> >> +				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
> >> +		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
> >> +		 PKT_TX_TUNNEL_VXLAN))
> >> +
> >> +/**
> >> + * Internal function which updates the UDP header of a packet, following
> >> + * segmentation. This is required to update the header's datagram length
> >field.
> >> + *
> >> + * @param pkt
> >> + *  The packet containing the UDP header.
> >> + * @param udp_offset
> >> + *  The offset of the UDP header from the start of the packet.
> >> + */
> >> +static inline void
> >> +update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
> >> +{
> >> +	struct udp_hdr *udp_hdr;
> >> +
> >> +	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> >> +			udp_offset);
> >> +	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
> >> +}
> >> +
> >>  /**
> >>   * Internal function which updates the TCP header of a packet, following
> >>   * segmentation. This is required to update the header's 'sent' sequence
> >> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c
> >b/lib/librte_gso/gso_tunnel_tcp4.c
> >> new file mode 100644
> >> index 0000000..34bbbd7
> >> --- /dev/null
> >> +++ b/lib/librte_gso/gso_tunnel_tcp4.c
> >> @@ -0,0 +1,123 @@
> >> +/*-
> >> + *   BSD LICENSE
> >> + *
> >> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> >> + *   All rights reserved.
> >> + *
> >> + *   Redistribution and use in source and binary forms, with or without
> >> + *   modification, are permitted provided that the following conditions
> >> + *   are met:
> >> + *
> >> + *     * Redistributions of source code must retain the above copyright
> >> + *       notice, this list of conditions and the following disclaimer.
> >> + *     * Redistributions in binary form must reproduce the above copyright
> >> + *       notice, this list of conditions and the following disclaimer in
> >> + *       the documentation and/or other materials provided with the
> >> + *       distribution.
> >> + *     * Neither the name of Intel Corporation nor the names of its
> >> + *       contributors may be used to endorse or promote products derived
> >> + *       from this software without specific prior written permission.
> >> + *
> >> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> >> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> >> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> >> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> >> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> >> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> >> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> >> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> >> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> >> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> >> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> >> + */
> >> +
> >> +#include "gso_common.h"
> >> +#include "gso_tunnel_tcp4.h"
> >> +
> >> +static void
> >> +update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
> >> +		struct rte_mbuf **segs, uint16_t nb_segs)
> >> +{
> >> +	struct ipv4_hdr *ipv4_hdr;
> >> +	struct tcp_hdr *tcp_hdr;
> >> +	uint32_t sent_seq;
> >> +	uint16_t outer_id, inner_id, tail_idx, i;
> >> +	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
> >> +
> >> +	outer_ipv4_offset = pkt->outer_l2_len;
> >> +	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
> >> +	inner_ipv4_offset = udp_offset + pkt->l2_len;
> >> +	tcp_offset = inner_ipv4_offset + pkt->l3_len;
> >> +
> >> +	/* Outer IPv4 header. */
> >> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> >> +			outer_ipv4_offset);
> >> +	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> >> +
> >> +	/* Inner IPv4 header. */
> >> +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> >> +			inner_ipv4_offset);
> >> +	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> >> +
> >> +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> >> +	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> >> +	tail_idx = nb_segs - 1;
> >> +
> >> +	for (i = 0; i < nb_segs; i++) {
> >> +		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
> >> +		update_udp_header(segs[i], udp_offset);
> >> +		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
> >> +		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
> >> +		outer_id++;
> >> +		inner_id += ipid_delta;
> >> +		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
> >> +	}
> >> +}
> >> +
> >> +int
> >> +gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
> >> +		uint16_t gso_size,
> >> +		uint8_t ipid_delta,
> >> +		struct rte_mempool *direct_pool,
> >> +		struct rte_mempool *indirect_pool,
> >> +		struct rte_mbuf **pkts_out,
> >> +		uint16_t nb_pkts_out)
> >> +{
> >> +	struct ipv4_hdr *inner_ipv4_hdr;
> >> +	uint16_t pyld_unit_size, hdr_offset;
> >> +	uint16_t tcp_dl, frag_off;
> >> +	int ret = 1;
> >> +
> >> +	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
> >> +	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> >> +			hdr_offset);
> >> +	/*
> >> +	 * Don't process the packet whose MF bit or offset in the inner
> >> +	 * IPv4 header are non-zero.
> >> +	 */
> >> +	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
> >> +	if (unlikely(IS_FRAGMENTED(frag_off))) {
> >> +		pkts_out[0] = pkt;
> >> +		return 1;
> >> +	}
> >> +
> >> +	/* Don't process the packet without data */
> >> +	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
> >> +	if (unlikely(tcp_dl == 0)) {
> >
> >You probably need to take into account outer_len* too..
> >Probably better to move that check after final hdr_offset calculations:
> >
> >...
> >hdr_offset += pkt->l3_len + pkt->l4_len;
> >if (hdr_offset >= pkt->pkt_len) {...; return 1;}
> >...
> >
> >> +		pkts_out[0] = pkt;
> >> +		return 1;
> >> +	}
> >> +
> >> +	hdr_offset += pkt->l3_len + pkt->l4_len;
> >> +	pyld_unit_size = gso_size - hdr_offset;
> >> +
> >> +	/* Segment the payload */
> >> +	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
> >> +			indirect_pool, pkts_out, nb_pkts_out);
> >> +	if (ret <= 1)
> >> +		return ret;
> >> +
> >> +	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
> >> +
> >> +	return ret;
> >> +}
> >> diff --git a/lib/librte_gso/gso_tunnel_tcp4.h
> >b/lib/librte_gso/gso_tunnel_tcp4.h
> >> new file mode 100644
> >> index 0000000..3c67f0c
> >> --- /dev/null
> >> +++ b/lib/librte_gso/gso_tunnel_tcp4.h
> >> @@ -0,0 +1,75 @@
> >> +/*-
> >> + *   BSD LICENSE
> >> + *
> >> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> >> + *   All rights reserved.
> >> + *
> >> + *   Redistribution and use in source and binary forms, with or without
> >> + *   modification, are permitted provided that the following conditions
> >> + *   are met:
> >> + *
> >> + *     * Redistributions of source code must retain the above copyright
> >> + *       notice, this list of conditions and the following disclaimer.
> >> + *     * Redistributions in binary form must reproduce the above copyright
> >> + *       notice, this list of conditions and the following disclaimer in
> >> + *       the documentation and/or other materials provided with the
> >> + *       distribution.
> >> + *     * Neither the name of Intel Corporation nor the names of its
> >> + *       contributors may be used to endorse or promote products derived
> >> + *       from this software without specific prior written permission.
> >> + *
> >> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> >> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> >> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> >> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> >> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> >> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> >> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> >> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> >> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> >> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> >> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> >> + */
> >> +
> >> +#ifndef _GSO_TUNNEL_TCP4_H_
> >> +#define _GSO_TUNNEL_TCP4_H_
> >> +
> >> +#include <stdint.h>
> >> +#include <rte_mbuf.h>
> >> +
> >> +/**
> >> + * Segment a tunneling packet with inner TCP/IPv4 headers. This function
> >> + * doesn't check if the input packet has correct checksums, and doesn't
> >> + * update checksums for output GSO segments. Furthermore, it doesn't
> >> + * process IP fragment packets.
> >> + *
> >> + * @param pkt
> >> + *  The packet mbuf to segment.
> >> + * @param gso_size
> >> + *  The max length of a GSO segment, measured in bytes.
> >> + * @param ipid_delta
> >> + *  The increasing unit of IP ids.
> >> + * @param direct_pool
> >> + *  MBUF pool used for allocating direct buffers for output segments.
> >> + * @param indirect_pool
> >> + *  MBUF pool used for allocating indirect buffers for output segments.
> >> + * @param pkts_out
> >> + *  Pointer array used to store the MBUF addresses of output GSO
> >> + *  segments, when it succeeds. If the memory space in pkts_out is
> >> + *  insufficient, it fails and returns -EINVAL.
> >> + * @param nb_pkts_out
> >> + *  The max number of items that 'pkts_out' can keep.
> >> + *
> >> + * @return
> >> + *   - The number of GSO segments filled in pkts_out on success.
> >> + *   - Return -ENOMEM if run out of memory in MBUF pools.
> >> + *   - Return -EINVAL for invalid parameters.
> >> + */
> >> +int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
> >> +		uint16_t gso_size,
> >> +		uint8_t ipid_delta,
> >> +		struct rte_mempool *direct_pool,
> >> +		struct rte_mempool *indirect_pool,
> >> +		struct rte_mbuf **pkts_out,
> >> +		uint16_t nb_pkts_out);
> >> +#endif
> >> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> >> index a4fce50..6095689 100644
> >> --- a/lib/librte_gso/rte_gso.c
> >> +++ b/lib/librte_gso/rte_gso.c
> >> @@ -39,6 +39,7 @@
> >>  #include "rte_gso.h"
> >>  #include "gso_common.h"
> >>  #include "gso_tcp4.h"
> >> +#include "gso_tunnel_tcp4.h"
> >>
> >>  int
> >>  rte_gso_segment(struct rte_mbuf *pkt,
> >> @@ -58,8 +59,9 @@
> >>  		return -EINVAL;
> >>
> >>  	if ((gso_ctx->gso_size >= pkt->pkt_len) || (gso_ctx->gso_types &
> >> -				DEV_TX_OFFLOAD_TCP_TSO) !=
> >> -			gso_ctx->gso_types) {
> >> +				(DEV_TX_OFFLOAD_TCP_TSO |
> >> +				 DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
> >> +				gso_ctx->gso_types) {
> >>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >>  		pkts_out[0] = pkt;
> >>  		return 1;
> >> @@ -71,7 +73,12 @@
> >>  	ipid_delta = (gso_ctx->ipid_flag != RTE_GSO_IPID_FIXED);
> >>  	ol_flags = pkt->ol_flags;
> >>
> >> -	if (IS_IPV4_TCP(pkt->ol_flags)) {
> >> +	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)) {
> >> +		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >> +		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> >> +				direct_pool, indirect_pool,
> >> +				pkts_out, nb_pkts_out);
> >> +	} else if (IS_IPV4_TCP(pkt->ol_flags)) {
> >
> >Hmm it doesn't look quite right.
> >Imagine user doesn't want libgso to segment plain TCP packets with that ctx,
> >just VXLAN+TCP.
> >
> >I think you need to merge that if and one above to something like that:
> >
> >if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
> >    (gso_ctx->gso_types & (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
> >    DEV_TX_OFFLOAD_TCP_TSO)) ==
> >    (DEV_TX_OFFLOAD_VXLAN_TNL_TSO | DEV_TX_OFFLOAD_TCP_TSO)) {
> 
> One question on this Konstantin - in the case of VxLAN, do we even need to check if DEV_TX_OFFLOAD_TCP_TSO is set in gso_ctx, since
> that pertains to plain TCP packets?
> If DEV_TX_OFFLOAD_VXLAN_TNL_TSO is set, then the inner protocol is probably irrelevant, since we'll segment regardless.
> 
> Bearing that in mind, I imagine that the above code block should look as follows, but I'm interested in your thoughts on same:
> 
> if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
>     (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO) ==
>      DEV_TX_OFFLOAD_VXLAN_TNL_TSO) {
>     ...
> } else if (...
> ...

Yes, I think you are right.
Thanks
Konstantin

> 
> Thanks,
> Mark
> 
> >   ...
> >} else if (IS_IPV4_TCP(pkt->ol_flags) && (gso_ctx->gso_types &
> >DEV_TX_OFFLOAD_TCP_TSO)) {
> >   ...
> >} else {
> >     /* unsupported packet, skip */
> >}
> >
> >Konstantin
> >
> >>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> >>  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> >>  				direct_pool, indirect_pool,
> >> --
> >> 1.9.3
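
The dispatch agreed on in this exchange can be sketched in isolation as follows. This is an illustration under stated assumptions, not the library code: the SK_* bit values are made up for the example (the real PKT_TX_* and DEV_TX_OFFLOAD_* values come from the DPDK headers), and enum gso_path and select_gso_path() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; NOT the real DPDK definitions. */
#define SK_TX_TCP_SEG       (1ULL << 0)
#define SK_TX_IPV4          (1ULL << 1)
#define SK_TX_OUTER_IPV4    (1ULL << 2)
#define SK_TX_TUNNEL_VXLAN  (1ULL << 3)

#define SK_OFFLOAD_TCP_TSO        (1U << 0)
#define SK_OFFLOAD_VXLAN_TNL_TSO  (1U << 1)

enum gso_path { GSO_SKIP, GSO_TCP4, GSO_VXLAN_TCP4 };

/* Pick a segmentation path from the packet flags and the ctx gso_types. */
static enum gso_path
select_gso_path(uint64_t ol_flags, uint32_t gso_types)
{
	const uint64_t tcp4 = SK_TX_TCP_SEG | SK_TX_IPV4;
	const uint64_t vxlan_tcp4 = tcp4 | SK_TX_OUTER_IPV4 |
		SK_TX_TUNNEL_VXLAN;

	/* Tunneled packets only need the VxLAN type to be enabled. */
	if ((ol_flags & vxlan_tcp4) == vxlan_tcp4)
		return (gso_types & SK_OFFLOAD_VXLAN_TNL_TSO) ?
			GSO_VXLAN_TCP4 : GSO_SKIP;

	/* Plain TCP/IPv4 packets need the TCP type to be enabled. */
	if ((ol_flags & tcp4) == tcp4 && (gso_types & SK_OFFLOAD_TCP_TSO))
		return GSO_TCP4;

	/* Unsupported packet type, or type not enabled in the ctx. */
	return GSO_SKIP;
}
```

One choice here goes beyond what the thread spells out: a VxLAN packet whose tunnel type is not enabled is skipped outright rather than falling through to the plain TCP branch. Its flags also satisfy the plain TCP/IPv4 test, so a fall-through would segment it with the wrong header offsets.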


* Re: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-04 15:08             ` Ananyev, Konstantin
@ 2017-10-04 16:23               ` Kavanagh, Mark B
  2017-10-04 16:26                 ` Ananyev, Konstantin
  0 siblings, 1 reply; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 16:23 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 4:09 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Monday, October 2, 2017 5:46 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
>> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>> Subject: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>>
>> From: Jiayu Hu <jiayu.hu@intel.com>
>>
>> This patch adds GSO support to the csum forwarding engine. Oversized
>> packets transmitted over a GSO-enabled port will undergo segmentation
>> (with the exception of packet-types unsupported by the GSO library).
>> GSO support is disabled by default.
>>
>> GSO support may be toggled on a per-port basis, using the command:
>>
>>         "set port <port_id> gso on|off"
>>
>> The maximum packet length (including the packet header and payload) for
>> GSO segments may be set with the command:
>>
>>         "set gso segsz <length>"
>>
>> Show GSO configuration for a given port with the command:
>>
>> 	"show port <port_id> gso"
>>
>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> ---
>>  app/test-pmd/cmdline.c                      | 178
>++++++++++++++++++++++++++++
>>  app/test-pmd/config.c                       |  24 ++++
>>  app/test-pmd/csumonly.c                     |  69 ++++++++++-
>>  app/test-pmd/testpmd.c                      |  13 ++
>>  app/test-pmd/testpmd.h                      |  10 ++
>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
>>  6 files changed, 335 insertions(+), 5 deletions(-)
>>
>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>> index ccdf239..05b0ce8 100644
>> --- a/app/test-pmd/cmdline.c
>> +++ b/app/test-pmd/cmdline.c
>> @@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
>>  			"    Set max flow number and max packet number per-flow"
>>  			" for GRO.\n\n"
>>
>> +			"set port (port_id) gso (on|off)\n"
>> +			"    Enable or disable Generic Segmentation Offload in"
>> +			" csum forwarding engine.\n\n"
>> +
>> +			"set gso segsz (length)\n"
>> +			"    Set max packet length for output GSO segments,"
>> +			" including packet header and payload.\n\n"
>
>Probably a good future improvement would be to allow the user to specify
>gso_type too.

Would you like to see that change implemented in time for the 17.11 release?

>
>> +
>> +			"show port (port_id) gso\n"
>> +			"    Show GSO configuration.\n\n"
>> +
>>  			"set fwd (%s)\n"
>>  			"    Set packet forwarding mode.\n\n"
>>
>> @@ -3967,6 +3978,170 @@ struct cmd_gro_set_result {
>>  	},
>>  };
>>
>> +/* *** ENABLE/DISABLE GSO *** */
>> +struct cmd_gso_enable_result {
>> +	cmdline_fixed_string_t cmd_set;
>> +	cmdline_fixed_string_t cmd_port;
>> +	cmdline_fixed_string_t cmd_keyword;
>> +	cmdline_fixed_string_t cmd_mode;
>> +	uint8_t cmd_pid;
>> +};
>> +
>> +static void
>> +cmd_gso_enable_parsed(void *parsed_result,
>> +		__attribute__((unused)) struct cmdline *cl,
>> +		__attribute__((unused)) void *data)
>> +{
>> +	struct cmd_gso_enable_result *res;
>> +
>> +	res = parsed_result;
>> +	if (!strcmp(res->cmd_keyword, "gso"))
>> +		setup_gso(res->cmd_mode, res->cmd_pid);
>> +}
>> +
>> +cmdline_parse_token_string_t cmd_gso_enable_set =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
>> +			cmd_set, "set");
>> +cmdline_parse_token_string_t cmd_gso_enable_port =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
>> +			cmd_port, "port");
>> +cmdline_parse_token_string_t cmd_gso_enable_keyword =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
>> +			cmd_keyword, "gso");
>> +cmdline_parse_token_string_t cmd_gso_enable_mode =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
>> +			cmd_mode, "on#off");
>> +cmdline_parse_token_num_t cmd_gso_enable_pid =
>> +	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
>> +			cmd_pid, UINT8);
>> +
>> +cmdline_parse_inst_t cmd_gso_enable = {
>> +	.f = cmd_gso_enable_parsed,
>> +	.data = NULL,
>> +	.help_str = "set port <port_id> gso on|off",
>> +	.tokens = {
>> +		(void *)&cmd_gso_enable_set,
>> +		(void *)&cmd_gso_enable_port,
>> +		(void *)&cmd_gso_enable_pid,
>> +		(void *)&cmd_gso_enable_keyword,
>> +		(void *)&cmd_gso_enable_mode,
>> +		NULL,
>> +	},
>> +};
>> +
>> +/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
>> +struct cmd_gso_size_result {
>> +	cmdline_fixed_string_t cmd_set;
>> +	cmdline_fixed_string_t cmd_keyword;
>> +	cmdline_fixed_string_t cmd_segsz;
>> +	uint16_t cmd_size;
>> +};
>> +
>> +static void
>> +cmd_gso_size_parsed(void *parsed_result,
>> +		       __attribute__((unused)) struct cmdline *cl,
>> +		       __attribute__((unused)) void *data)
>> +{
>> +	struct cmd_gso_size_result *res = parsed_result;
>> +
>> +	if (test_done == 0) {
>> +		printf("Before setting GSO segsz, please first stop forwarding\n");
>> +		return;
>> +	}
>> +
>> +	if (!strcmp(res->cmd_keyword, "gso") &&
>> +			!strcmp(res->cmd_segsz, "segsz")) {
>> +		if (res->cmd_size == 0) {
>
>As your gso_size includes packet header too, you probably shouldn't allow
>gso_size less
>than some minimal value of l2_len + l3_len + l4_len + ...
>Another alternative: change gso_ctx.gso_size to count only payload size.

Okay - I'll take a look at this, thanks.
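
As a rough sketch of what such a lower bound could look like (an illustration only, not a proposal for the final code): the SK_* constants assume option-free IPv4 and TCP headers and the largest header stack among the supported packet types, and gso_segsz_is_valid() and payload_unit_size() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed header sizes in bytes, without IPv4 or TCP options. */
enum {
	SK_ETHER_HDR_LEN = 14,
	SK_VLAN_TAG_LEN  = 4,
	SK_IPV4_HDR_LEN  = 20,
	SK_UDP_HDR_LEN   = 8,
	SK_VXLAN_HDR_LEN = 8,
	SK_TCP_HDR_LEN   = 20,
};

/* Largest stack considered: VxLAN with outer and inner VLAN tags. */
#define SK_MAX_HDR_LEN \
	(2 * (SK_ETHER_HDR_LEN + SK_VLAN_TAG_LEN + SK_IPV4_HDR_LEN) + \
	 SK_UDP_HDR_LEN + SK_VXLAN_HDR_LEN + SK_TCP_HDR_LEN)

/* Command-level check: segsz must leave room for at least 1 payload byte. */
static int
gso_segsz_is_valid(uint16_t segsz)
{
	return segsz > SK_MAX_HDR_LEN;
}

/*
 * Per-packet check: refuse to compute the payload unit size when
 * gso_size does not exceed the header length, since the unsigned
 * subtraction would otherwise wrap around.
 */
static int
payload_unit_size(uint16_t gso_size, uint16_t hdr_len, uint16_t *out)
{
	if (gso_size <= hdr_len)
		return -1;
	*out = (uint16_t)(gso_size - hdr_len);
	return 0;
}
```

The fixed minimum is only a coarse guard for the testpmd command. Packets carrying IP or TCP options have longer headers, so the per-packet comparison is the one that actually prevents the wrap-around in "gso_size - hdr_offset".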

>
>> +			printf("gso_size should be larger than 0."
>> +					" Please input a legal value\n");
>> +		} else
>> +			gso_max_segment_size = res->cmd_size;
>> +	}
>> +}
>> +
>> +cmdline_parse_token_string_t cmd_gso_size_set =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
>> +				cmd_set, "set");
>> +cmdline_parse_token_string_t cmd_gso_size_keyword =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
>> +				cmd_keyword, "gso");
>> +cmdline_parse_token_string_t cmd_gso_size_segsz =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
>> +				cmd_segsz, "segsz");
>> +cmdline_parse_token_num_t cmd_gso_size_size =
>> +	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
>> +				cmd_size, UINT16);
>> +
>> +cmdline_parse_inst_t cmd_gso_size = {
>> +	.f = cmd_gso_size_parsed,
>> +	.data = NULL,
>> +	.help_str = "set gso segsz <length>",
>> +	.tokens = {
>> +		(void *)&cmd_gso_size_set,
>> +		(void *)&cmd_gso_size_keyword,
>> +		(void *)&cmd_gso_size_segsz,
>> +		(void *)&cmd_gso_size_size,
>> +		NULL,
>> +	},
>> +};
>> +
>> +/* *** SHOW GSO CONFIGURATION *** */
>> +struct cmd_gso_show_result {
>> +	cmdline_fixed_string_t cmd_show;
>> +	cmdline_fixed_string_t cmd_port;
>> +	cmdline_fixed_string_t cmd_keyword;
>> +	uint8_t cmd_pid;
>> +};
>> +
>> +static void
>> +cmd_gso_show_parsed(void *parsed_result,
>> +		       __attribute__((unused)) struct cmdline *cl,
>> +		       __attribute__((unused)) void *data)
>> +{
>> +	struct cmd_gso_show_result *res = parsed_result;
>> +
>> +	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
>> +		printf("invalid port id %u\n", res->cmd_pid);
>> +		return;
>> +	}
>> +	if (!strcmp(res->cmd_keyword, "gso")) {
>> +		if (gso_ports[res->cmd_pid].enable) {
>> +			printf("Max GSO'd packet size: %uB\n"
>> +					"Supported GSO types: TCP/IPv4, "
>> +					"VxLAN with inner TCP/IPv4 packet, "
>> +					"GRE with inner TCP/IPv4 packet\n",
>> +					gso_max_segment_size);
>> +		} else
>> +			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
>> +	}
>> +}
>> +
>> +cmdline_parse_token_string_t cmd_gso_show_show =
>> +TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
>> +		cmd_show, "show");
>> +cmdline_parse_token_string_t cmd_gso_show_port =
>> +TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
>> +		cmd_port, "port");
>> +cmdline_parse_token_string_t cmd_gso_show_keyword =
>> +	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
>> +				cmd_keyword, "gso");
>> +cmdline_parse_token_num_t cmd_gso_show_pid =
>> +	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
>> +				cmd_pid, UINT8);
>> +
>> +cmdline_parse_inst_t cmd_gso_show = {
>> +	.f = cmd_gso_show_parsed,
>> +	.data = NULL,
>> +	.help_str = "show port <port_id> gso",
>> +	.tokens = {
>> +		(void *)&cmd_gso_show_show,
>> +		(void *)&cmd_gso_show_port,
>> +		(void *)&cmd_gso_show_pid,
>> +		(void *)&cmd_gso_show_keyword,
>> +		NULL,
>> +	},
>> +};
>> +
>>  /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
>>  struct cmd_set_flush_rx {
>>  	cmdline_fixed_string_t set;
>> @@ -14255,6 +14430,9 @@ struct cmd_cmdfile_result {
>>  	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
>>  	(cmdline_parse_inst_t *)&cmd_enable_gro,
>>  	(cmdline_parse_inst_t *)&cmd_gro_set,
>> +	(cmdline_parse_inst_t *)&cmd_gso_enable,
>> +	(cmdline_parse_inst_t *)&cmd_gso_size,
>> +	(cmdline_parse_inst_t *)&cmd_gso_show,
>>  	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
>>  	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
>>  	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
>> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
>> index 3ae3e1c..88d09d0 100644
>> --- a/app/test-pmd/config.c
>> +++ b/app/test-pmd/config.c
>> @@ -2454,6 +2454,30 @@ struct igb_ring_desc_16_bytes {
>>  	}
>>  }
>>
>> +void
>> +setup_gso(const char *mode, uint8_t port_id)
>> +{
>> +	if (!rte_eth_dev_is_valid_port(port_id)) {
>> +		printf("invalid port id %u\n", port_id);
>> +		return;
>> +	}
>> +	if (strcmp(mode, "on") == 0) {
>> +		if (test_done == 0) {
>> +			printf("before enabling GSO,"
>> +					" please stop forwarding first\n");
>> +			return;
>> +		}
>> +		gso_ports[port_id].enable = 1;
>> +	} else if (strcmp(mode, "off") == 0) {
>> +		if (test_done == 0) {
>> +			printf("before disabling GSO,"
>> +					" please stop forwarding first\n");
>> +			return;
>> +		}
>> +		gso_ports[port_id].enable = 0;
>> +	}
>> +}
>> +
>>  char*
>>  list_pkt_forwarding_modes(void)
>>  {
>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
>> index 90c8119..bd1a287 100644
>> --- a/app/test-pmd/csumonly.c
>> +++ b/app/test-pmd/csumonly.c
>> @@ -70,6 +70,8 @@
>>  #include <rte_string_fns.h>
>>  #include <rte_flow.h>
>>  #include <rte_gro.h>
>> +#include <rte_gso.h>
>> +
>>  #include "testpmd.h"
>>
>>  #define IP_DEFTTL  64   /* from RFC 1340. */
>> @@ -91,6 +93,7 @@
>>  /* structure that caches offload info for the current packet */
>>  struct testpmd_offload_info {
>>  	uint16_t ethertype;
>> +	uint8_t gso_enable;
>>  	uint16_t l2_len;
>>  	uint16_t l3_len;
>>  	uint16_t l4_len;
>> @@ -381,6 +384,8 @@ struct simple_gre_hdr {
>>  				get_udptcp_checksum(l3_hdr, tcp_hdr,
>>  					info->ethertype);
>>  		}
>> +		if (info->gso_enable)
>> +			ol_flags |= PKT_TX_TCP_SEG;
>>  	} else if (info->l4_proto == IPPROTO_SCTP) {
>>  		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
>>  		sctp_hdr->cksum = 0;
>> @@ -627,6 +632,9 @@ struct simple_gre_hdr {
>>  pkt_burst_checksum_forward(struct fwd_stream *fs)
>>  {
>>  	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
>> +	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
>> +	struct rte_gso_ctx *gso_ctx;
>> +	struct rte_mbuf **tx_pkts_burst;
>>  	struct rte_port *txp;
>>  	struct rte_mbuf *m, *p;
>>  	struct ether_hdr *eth_hdr;
>> @@ -634,13 +642,15 @@ struct simple_gre_hdr {
>>  	uint16_t nb_rx;
>>  	uint16_t nb_tx;
>>  	uint16_t nb_prep;
>> -	uint16_t i;
>> +	uint16_t i, j;
>>  	uint64_t rx_ol_flags, tx_ol_flags;
>>  	uint16_t testpmd_ol_flags;
>>  	uint32_t retry;
>>  	uint32_t rx_bad_ip_csum;
>>  	uint32_t rx_bad_l4_csum;
>>  	struct testpmd_offload_info info;
>> +	uint16_t nb_segments = 0;
>> +	int ret;
>>
>>  #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
>>  	uint64_t start_tsc;
>> @@ -674,6 +684,8 @@ struct simple_gre_hdr {
>>  	memset(&info, 0, sizeof(info));
>>  	info.tso_segsz = txp->tso_segsz;
>>  	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
>> +	if (gso_ports[fs->tx_port].enable)
>> +		info.gso_enable = 1;
>>
>>  	for (i = 0; i < nb_rx; i++) {
>>  		if (likely(i < nb_rx - 1))
>> @@ -851,13 +863,59 @@ struct simple_gre_hdr {
>>  		}
>>  	}
>>
>> +	if (gso_ports[fs->tx_port].enable == 0)
>> +		tx_pkts_burst = pkts_burst;
>> +	else {
>> +		gso_ctx = &(current_fwd_lcore()->gso_ctx);
>> +		gso_ctx->gso_size = gso_max_segment_size;
>> +		for (i = 0; i < nb_rx; i++) {
>
>It seems like quite a lot of code to handle an error case, which I suppose
>will happen pretty rarely.
>Why not just:
>
>ret = rte_gso_segment(pkts_burst[i], gso_ctx,
>					&gso_segments[nb_segments],
>					RTE_DIM(gso_segments) - nb_segments);
>if (ret < 0) {
>	RTE_LOG(DEBUG, ....);
>	rte_pktmbuf_free(pkts_burst[i]);
>} else
>	nb_segments += ret;

Ditto - thanks!

>
>
>
>> +			if (unlikely(nb_rx - i >= GSO_MAX_PKT_BURST -
>> +						nb_segments)) {
>> +				/*
>> +				 * insufficient space in gso_segments,
>> +				 * stop GSO.
>> +				 */
>> +				for (j = i; j < GSO_MAX_PKT_BURST -
>> +						nb_segments; j++) {
>> +					pkts_burst[j]->ol_flags &=
>> +						(~PKT_TX_TCP_SEG);
>> +					gso_segments[nb_segments++] =
>> +						pkts_burst[j];
>> +				}
>> +				for (; j < nb_rx; j++)
>> +					rte_pktmbuf_free(pkts_burst[j]);
>> +				break;
>> +			}
>> +			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
>> +					&gso_segments[nb_segments],
>> +					GSO_MAX_PKT_BURST - nb_segments);
>> +			if (ret >= 1)
>> +				nb_segments += ret;
>> +			else if (ret < 0) {
>> +				/*
>> +				 * insufficient MBUFs or space in
>> +				 * gso_segments, stop GSO.
>> +				 */
>> +				for (j = i; j < nb_rx; j++) {
>> +					pkts_burst[j]->ol_flags &=
>> +						(~PKT_TX_TCP_SEG);
>> +					gso_segments[nb_segments++] =
>> +						pkts_burst[j];
>> +				}
>> +				break;
>> +			}
>> +		}
>> +		tx_pkts_burst = gso_segments;
>> +		nb_rx = nb_segments;
>> +	}
>> +
>>  	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
>> -			pkts_burst, nb_rx);
>> +			tx_pkts_burst, nb_rx);
>>  	if (nb_prep != nb_rx)
>>  		printf("Preparing packet burst to transmit failed: %s\n",
>>  				rte_strerror(rte_errno));
>>
>> -	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
>> +	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
>>  			nb_prep);
>>
>>  	/*
>> @@ -868,7 +926,7 @@ struct simple_gre_hdr {
>>  		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
>>  			rte_delay_us(burst_tx_delay_time);
>>  			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
>> -					&pkts_burst[nb_tx], nb_rx - nb_tx);
>> +					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
>>  		}
>>  	}
>>  	fs->tx_packets += nb_tx;
>> @@ -881,9 +939,10 @@ struct simple_gre_hdr {
>>  	if (unlikely(nb_tx < nb_rx)) {
>>  		fs->fwd_dropped += (nb_rx - nb_tx);
>>  		do {
>> -			rte_pktmbuf_free(pkts_burst[nb_tx]);
>> +			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
>>  		} while (++nb_tx < nb_rx);
>>  	}
>> +
>>  #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
>>  	end_tsc = rte_rdtsc();
>>  	core_cycles = (end_tsc - start_tsc);
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index e097ee0..97e349d 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
>>   */
>>  static int all_ports_started(void);
>>
>> +struct gso_status gso_ports[RTE_MAX_ETHPORTS];
>> +uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
>> +
>>  /*
>>   * Helper function to check if socket is already discovered.
>>   * If yes, return positive value. If not, return zero.
>> @@ -570,6 +573,7 @@ static int eth_event_callback(uint8_t port_id,
>>  	unsigned int nb_mbuf_per_pool;
>>  	lcoreid_t  lc_id;
>>  	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
>> +	uint32_t gso_types = 0;
>>
>>  	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
>>
>> @@ -654,6 +658,8 @@ static int eth_event_callback(uint8_t port_id,
>>
>>  	init_port_config();
>>
>> +	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>> +		DEV_TX_OFFLOAD_GRE_TNL_TSO;
>>  	/*
>>  	 * Records which Mbuf pool to use by each logical core, if needed.
>>  	 */
>> @@ -664,6 +670,13 @@ static int eth_event_callback(uint8_t port_id,
>>  		if (mbp == NULL)
>>  			mbp = mbuf_pool_find(0);
>>  		fwd_lcores[lc_id]->mbp = mbp;
>> +		/* initialize GSO context */
>> +		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
>> +		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
>> +		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
>> +		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
>> +			ETHER_CRC_LEN;
>> +		fwd_lcores[lc_id]->gso_ctx.ipid_flag = !RTE_GSO_IPID_FIXED;
>
>Just
>fwd_lcores[lc_id]->gso_ctx.ipid_flag = 0;
>should do here.

Agreed.

Thanks again,
Mark 


>
>Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-04 16:23               ` Kavanagh, Mark B
@ 2017-10-04 16:26                 ` Ananyev, Konstantin
  2017-10-04 16:51                   ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-04 16:26 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Wednesday, October 4, 2017 5:23 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: RE: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> 
> 
> >-----Original Message-----
> >From: Ananyev, Konstantin
> >Sent: Wednesday, October 4, 2017 4:09 PM
> >To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
> >Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
> >Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> >Subject: RE: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> >
> >
> >
> >> -----Original Message-----
> >> From: Kavanagh, Mark B
> >> Sent: Monday, October 2, 2017 5:46 PM
> >> To: dev@dpdk.org
> >> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
> >Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> >> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
> ><mark.b.kavanagh@intel.com>
> >> Subject: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> >>
> >> From: Jiayu Hu <jiayu.hu@intel.com>
> >>
> >> This patch adds GSO support to the csum forwarding engine. Oversized
> >> packets transmitted over a GSO-enabled port will undergo segmentation
> >> (with the exception of packet-types unsupported by the GSO library).
> >> GSO support is disabled by default.
> >>
> >> GSO support may be toggled on a per-port basis, using the command:
> >>
> >>         "set port <port_id> gso on|off"
> >>
> >> The maximum packet length (including the packet header and payload) for
> >> GSO segments may be set with the command:
> >>
> >>         "set gso segsz <length>"
> >>
> >> Show GSO configuration for a given port with the command:
> >>
> >> 	"show port <port_id> gso"
> >>
> >> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> >> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >> ---
> >>  app/test-pmd/cmdline.c                      | 178
> >++++++++++++++++++++++++++++
> >>  app/test-pmd/config.c                       |  24 ++++
> >>  app/test-pmd/csumonly.c                     |  69 ++++++++++-
> >>  app/test-pmd/testpmd.c                      |  13 ++
> >>  app/test-pmd/testpmd.h                      |  10 ++
> >>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
> >>  6 files changed, 335 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> >> index ccdf239..05b0ce8 100644
> >> --- a/app/test-pmd/cmdline.c
> >> +++ b/app/test-pmd/cmdline.c
> >> @@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
> >>  			"    Set max flow number and max packet number per-flow"
> >>  			" for GRO.\n\n"
> >>
> >> +			"set port (port_id) gso (on|off)"
> >> +			"    Enable or disable Generic Segmentation Offload in"
> >> +			" csum forwarding engine.\n\n"
> >> +
> >> +			"set gso segsz (length)\n"
> >> +			"    Set max packet length for output GSO segments,"
> >> +			" including packet header and payload.\n\n"
> >
> >Probably a  good future improvement would be to allow user to specify gso_type
> >too.
> 
> Would you like to see that change implemented in time for the 17.11 release?

I think it's too late for such a change in 17.11.
I was thinking of 18.02 here.
Konstantin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-04 16:26                 ` Ananyev, Konstantin
@ 2017-10-04 16:51                   ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-04 16:51 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas

>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Wednesday, October 4, 2017 5:27 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Wednesday, October 4, 2017 5:23 PM
>> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>> Subject: RE: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>>
>>
>>
>> >-----Original Message-----
>> >From: Ananyev, Konstantin
>> >Sent: Wednesday, October 4, 2017 4:09 PM
>> >To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>> >Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>> >Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>> >Subject: RE: [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>> >
>> >
>> >
>> >
>> >Probably a  good future improvement would be to allow user to specify
>gso_type
>> >too.
>>
>> Would you like to see that change implemented in time for the 17.11 release?
>
>I think that's too late for such change in 17.11.
>My thought was about 18.02 here.
>Konstantin

No problem - thanks Konstantin.

^ permalink raw reply	[flat|nested] 157+ messages in thread

* [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
                             ` (5 preceding siblings ...)
  2017-10-02 16:45           ` [PATCH v6 6/6] doc: add GSO programmer's guide Mark Kavanagh
@ 2017-10-05 11:02           ` Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
                               ` (13 more replies)
  6 siblings, 14 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 11:02 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch series adds GSO support to DPDK for
specific packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine. The final patch
in the series adds GSO documentation to the programmer's guide.

Performance Testing
===================
The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Physically connect two 10Gbps ports (P0, P1) that are in the same
   machine.
b. Launch testpmd with P0 and a vhost-user port, and use the csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for the vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on the virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
    to 1518B. Run two iperf-clients in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
    two iperf-clients in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf-clients in the VM.

Throughput of the above three tests:

test-1: 9.4Gbps
test-2: 9.5Gbps
test-3: 3Mbps

Functional Testing
==================
VMs can send large TCP packets, but not large VxLAN or GRE packets: the
max length of tunneled packets from VMs is 1514B. The current experiment
method therefore can't be used to measure VxLAN and GRE GSO performance;
instead, we test the functionality by setting a small GSO segment length
(e.g. 500B).

VxLAN
-----
To test VxLAN GSO functionality, we use the following setup:

a. Physically connect two 10Gbps ports (P0, P1) that are in the same
   machine.
b. Launch testpmd with P0 and a vhost-user port, and use the csum
   forwarding engine with "retry".
c. Testpmd commands:
    - csum parse_tunnel on "P0"
    - csum parse_tunnel on "vhost-user port"
    - csum set outer-ip hw "P0"
    - csum set ip hw "P0"
    - csum set tcp hw "P0"
    - csum set tcp hw "vhost-user port"
    - set port "P0" gso on
    - set gso segsz 500
d. Launch a VM with csum and tso offloading enabled.
e. Create a VxLAN port for the virtio-net port in the VM. Run iperf-client
   on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
   max packet length is 1514B.
f. Assign P1 to the Linux kernel with kernel GRO disabled. Similarly,
   create a VxLAN port for P1, and run iperf-server on the VxLAN port.

In testpmd, we can see that the length of all packets sent from P0 is smaller
than or equal to 500B. Additionally, the packets arriving at P1 are
encapsulated and are smaller than or equal to 500B.
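
As a rough illustration of why every packet in the VxLAN test ends up at or
below 500B, the arithmetic can be sketched in plain C. This is not code from
the series: toy_gso_segment_lens() is a hypothetical helper, and it assumes
that each output segment carries a full copy of the headers plus at most
(segsz - header length) bytes of payload. The 104B header length used in the
example that follows is also an assumption (outer Ethernet 14 + outer IPv4 20
+ UDP 8 + VxLAN 8 + inner Ethernet 14 + inner IPv4 20 + inner TCP 20, with no
VLAN tags and no IP/TCP options), as is reading the 1514B limit as the length
of the encapsulated packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of librte_gso: computes the length of
 * each output segment produced from one input packet.
 *
 * pkt_len - total length of the input packet, headers included
 * hdr_len - length of the headers copied into every output segment
 * segsz   - max length of one output segment, headers included
 * out     - receives the length of each output segment
 * out_cap - capacity of out, in entries
 *
 * Returns the number of segments, or 0 if the packet cannot be split
 * (segsz leaves no room for payload, or out is too small).
 */
static size_t
toy_gso_segment_lens(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz,
		uint32_t *out, size_t out_cap)
{
	uint32_t payload, per_seg;
	size_t n = 0;

	assert(pkt_len >= hdr_len);
	if (segsz <= hdr_len || out_cap == 0)
		return 0;

	/* A packet that already fits is passed through unchanged. */
	if (pkt_len <= segsz) {
		out[0] = pkt_len;
		return 1;
	}

	payload = pkt_len - hdr_len;   /* bytes to distribute */
	per_seg = segsz - hdr_len;     /* payload bytes per segment */
	while (payload > 0) {
		uint32_t chunk = payload < per_seg ? payload : per_seg;

		if (n == out_cap)
			return 0;
		out[n++] = hdr_len + chunk;
		payload -= chunk;
	}
	return n;
}
```

Under those assumptions, toy_gso_segment_lens(1514, 104, 500, lens, 8)
yields four segments of 500B, 500B, 500B and 326B: 1410B of payload split
into chunks of at most 396B, each prefixed with 104B of headers.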

GRE
---
The same process may be used to test GRE functionality, except that the
tunnel type created for both the guest's virtio-net interface and the host's
kernel interface is GRE:
   `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`

As in the VxLAN testcase, the length of packets sent from P0, and received on
P1, is less than or equal to 500B.

Change log
==========
v7:
- add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
- rename 'ipid_flag' member of gso_ctx to 'flag'.
- remove mention of VLAN tags in supported packet types.
- don't clear PKT_TX_TCP_SEG flag if GSO fails.
- take all packet overhead into account when checking for empty packet.
- ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
  TCP/IPv4 case from tunneled case).
- validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
- simplify error-checking/handling for GSO failure case in testpmd csum engine.
- use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.

v6:
- rebase to HEAD of master (i5dce9fcA)
- remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'

v5:
- add GSO section to the programmer's guide.
- use MF or (previously 'and') offset to check if a packet is IP
  fragmented.
- move 'update_header' helper functions to gso_common.h.
- move tcp/ipv4 'update_header' function to gso_tcp4.c.
- move tunnel 'update_header' function to gso_tunnel_tcp4.c.
- add offset parameter to 'update_header' functions.
- combine GRE and VxLAN tunnel header update functions into a single
  function.
- correct typos and errors in comments/commit messages.

v4:
- use ol_flags instead of packet_type to decide which segmentation
  function to use.
- use MF and offset to check if a packet is IP fragmented, instead of
  using DF.
- remove ETHER_CRC_LEN from gso segment payload length calculation.
- refactor internal header update and other functions.
- remove RTE_GSO_IPID_INCREASE.
- add some GSO documentation.
- set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
  packets sent from GSO-enabled ports in testpmd.
v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
  UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (3):
  gso: add VxLAN GSO support
  gso: add GRE GSO support
  doc: add GSO programmer's guide

 MAINTAINERS                                        |   6 +
 app/test-pmd/cmdline.c                             | 179 ++++++++
 app/test-pmd/config.c                              |  24 ++
 app/test-pmd/csumonly.c                            |  43 +-
 app/test-pmd/testpmd.c                             |  13 +
 app/test-pmd/testpmd.h                             |  10 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/rel_notes/release_17_11.rst             |  17 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
 lib/Makefile                                       |   2 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_gso/Makefile                            |  52 +++
 lib/librte_gso/gso_common.c                        | 153 +++++++
 lib/librte_gso/gso_common.h                        | 171 ++++++++
 lib/librte_gso/gso_tcp4.c                          | 104 +++++
 lib/librte_gso/gso_tcp4.h                          |  74 ++++
 lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
 lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
 lib/librte_gso/rte_gso.c                           | 111 +++++
 lib/librte_gso/rte_gso.h                           | 148 +++++++
 lib/librte_gso/rte_gso_version.map                 |   7 +
 mk/rte.app.mk                                      |   1 +
 28 files changed, 2413 insertions(+), 4 deletions(-)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* [PATCH v7 1/6] gso: add Generic Segmentation Offload API framework
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
@ 2017-10-05 11:02             ` Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
                               ` (12 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 11:02 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set the proper flags in mbuf->ol_flags; these are the same flags
as those used for TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmitting multi-segment packets.
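
The segment layout and reference counting described above can be modelled
with a small, self-contained toy in plain C. This is only a sketch and does
not use the DPDK mbuf API: struct toy_pkt, struct toy_seg, toy_segment() and
toy_seg_free() are hypothetical names, the refcount is a plain int, and a
refcount of 0 stands in for "the input packet has been freed".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HDR_MAX 128

/* Toy stand-in for the input packet: a flat buffer plus a refcount. */
struct toy_pkt {
	uint8_t *data;
	uint32_t len;
	int refcnt;                /* 0 means "freed" in this model */
};

/* Toy stand-in for one two-part GSO segment. */
struct toy_seg {
	uint8_t hdr[TOY_HDR_MAX];  /* private copy of the packet header */
	uint32_t hdr_len;
	const uint8_t *payload;    /* points into the input packet's data */
	uint32_t payload_len;
	struct toy_pkt *src;       /* packet that owns the payload bytes */
};

/* Returns the number of segments written to out, or -1 on failure. */
static int
toy_segment(struct toy_pkt *pkt, uint32_t hdr_len, uint32_t segsz,
		struct toy_seg *out, int out_cap)
{
	uint32_t off = hdr_len, per_seg;
	int n = 0, needed;

	if (hdr_len > TOY_HDR_MAX || segsz <= hdr_len || pkt->len <= hdr_len)
		return -1;
	per_seg = segsz - hdr_len;
	needed = (int)((pkt->len - hdr_len + per_seg - 1) / per_seg);
	if (needed > out_cap)
		return -1;         /* nothing has been modified yet */

	while (off < pkt->len) {
		uint32_t chunk = pkt->len - off;

		if (chunk > per_seg)
			chunk = per_seg;
		memcpy(out[n].hdr, pkt->data, hdr_len);
		out[n].hdr_len = hdr_len;
		out[n].payload = pkt->data + off;
		out[n].payload_len = chunk;
		out[n].src = pkt;
		pkt->refcnt++;     /* each indirect part holds a reference */
		off += chunk;
		n++;
	}
	pkt->refcnt--;             /* drop the caller's own reference */
	return n;
}

/* Releasing a segment releases its reference on the input packet. */
static void
toy_seg_free(struct toy_seg *seg)
{
	seg->src->refcnt--;
	seg->src = NULL;
}
```

In this model, a 1000B packet with a 54B header and a 400B segment size
produces three segments and leaves the input's refcount at 3; it reaches 0
only once all three segments have been released, which mirrors the
"freed automatically" behaviour described in the commit message.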

The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                     |   5 ++
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf                  |   1 +
 doc/guides/rel_notes/release_17_11.rst |   1 +
 lib/Makefile                           |   2 +
 lib/librte_gso/Makefile                |  49 +++++++++++
 lib/librte_gso/rte_gso.c               |  52 ++++++++++++
 lib/librte_gso/rte_gso.h               | 143 +++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map     |   7 ++
 mk/rte.app.mk                          |   1 +
 10 files changed, 262 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 12f6be9..58ca5c0 100644
--- a/config/common_base
+++ b/config/common_base
@@ -653,6 +653,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..6512918 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -101,6 +101,7 @@ The public API headers are grouped by topics:
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
   [GRO]                (@ref rte_gro.h),
+  [GSO]                (@ref rte_gso.h),
   [frag/reass]         (@ref rte_ip_frag.h),
   [LPM IPv4 route]     (@ref rte_lpm.h),
   [LPM IPv6 route]     (@ref rte_lpm6.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..408f2e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_ether \
                           lib/librte_eventdev \
                           lib/librte_gro \
+                          lib/librte_gso \
                           lib/librte_hash \
                           lib/librte_ip_frag \
                           lib/librte_jobstats \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index f6f9169..5bb36b7 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_ethdev.so.7
      librte_eventdev.so.2
      librte_gro.so.1
+   + librte_gso.so.1
      librte_hash.so.2
      librte_ip_frag.so.1
      librte_jobstats.so.1
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b773636
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
+			nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..7d343d7
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,143 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO flags for rte_gso_ctx. */
+#define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
+/**< Use fixed IP ids for output GSO segments. Setting
+ * !RTE_GSO_FLAG_IPID_FIXED indicates using incremental IP ids.
+ */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t flag;
+	/**< flag that controls specific attributes of output segments,
+	 * such as the type of IP ID generated (i.e. fixed or incremental).
+	 */
+	uint32_t gso_types;
+	/**< the bit mask of required GSO types. The GSO library
+	 * uses the same macros as those that describe device TX
+	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
+	 * gso_types.
+	 *
+	 * For example, if applications want to segment TCP/IPv4
+	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi- MBUF packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
+ * input packet has correct checksums, and doesn't update checksums for
+ * output GSO segments. Additionally, it doesn't process IP fragment
+ * packets.
+ *
+ * Before calling rte_gso_segment(), applications must set proper ol_flags
+ * for the packet. The GSO library uses the same macros as TSO.
+ * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
+ * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
+ * flag is removed for all GSO segments and the input packet.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
+ * the GSO segments are sent to should support transmission of multi-segment
+ * packets.
+ *
+ * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object pointer.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
1.9.3


* [PATCH v7 2/6] gso: add TCP/IPv4 GSO support
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
@ 2017-10-05 11:02             ` Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 3/6] gso: add VxLAN " Mark Kavanagh
From: Mark Kavanagh @ 2017-10-05 11:02 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
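
Because checksum updates are left to the application, each output segment
needs its IPv4 header checksum recomputed (in software or by NIC offload)
after total_length and packet_id change. A minimal software sketch of the
RFC 1071 one's-complement sum follows; the helper name ipv4_hdr_cksum is
hypothetical and is not part of this patch or of the GSO library:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: RFC 1071 checksum over an IPv4 header.
 * 'hdr' points to the header bytes with the checksum field
 * (bytes 10-11) set to zero; 'len' is the header length in bytes.
 * Returns the checksum in host byte order; store it big-endian.
 */
uint16_t
ipv4_hdr_cksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Sum the header as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

	/* IPv4 headers have even length; handle an odd byte anyway. */
	if (i < len)
		sum += (uint32_t)hdr[i] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

The TCP checksum must be refreshed as well, since the sequence number,
flags and payload differ per segment. The test setup in the cover letter
relies on HW checksum offload for this rather than a software routine.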

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect MBUF simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSO'd segments are freed, the packet is freed
automatically.
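
The per-segment arithmetic described above can be sketched without any
DPDK dependency. The struct and function names below (gso_seg_plan,
gso_plan_segments) are hypothetical and only model the bookkeeping: the
payload unit is gso_size minus the header length, the TCP sequence number
advances by the payload carried, and the IP ID advances by ipid_delta
(assumed here to be 0 for fixed IDs and 1 for incremental IDs). Unlike
the patch, which passes a payload-less packet through unchanged, this
sketch simply rejects it:

```c
#include <stdint.h>

/* Hypothetical record of the header values one GSO segment would get. */
struct gso_seg_plan {
	uint16_t ip_id;    /* IPv4 packet_id */
	uint32_t tcp_seq;  /* TCP sequence number */
	uint16_t pyld_len; /* payload bytes carried by this segment */
	uint8_t non_tail;  /* 1 if PSH/FIN must be cleared */
};

/*
 * Fill plan[] for a packet of pkt_len bytes whose L2+L3+L4 headers
 * occupy hdr_len bytes. Returns the number of segments, or -1 if
 * gso_size leaves no room for payload, the packet has no payload,
 * or plan[] is too small.
 */
int
gso_plan_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size,
		uint16_t first_id, uint8_t ipid_delta, uint32_t first_seq,
		struct gso_seg_plan *plan, int max_segs)
{
	uint32_t remaining;
	uint16_t unit;
	int n = 0;

	if (gso_size <= hdr_len || pkt_len <= hdr_len)
		return -1;

	unit = gso_size - hdr_len;      /* max payload per segment */
	remaining = pkt_len - hdr_len;  /* total payload to split */

	while (remaining > 0) {
		uint16_t len = remaining < unit ? (uint16_t)remaining : unit;

		if (n >= max_segs)
			return -1;

		plan[n].ip_id = (uint16_t)(first_id + n * ipid_delta);
		plan[n].tcp_seq = first_seq;
		plan[n].pyld_len = len;
		remaining -= len;
		plan[n].non_tail = remaining > 0;

		first_seq += len;
		n++;
	}
	return n;
}
```

For example, a 3054-byte packet with 54 bytes of headers and a gso_size
of 1514 yields three segments carrying 1460, 1460 and 80 payload bytes.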

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 +++
 lib/Makefile                            |   2 +-
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
 lib/librte_gso/rte_gso.c                |  52 ++++++++++-
 lib/librte_gso/rte_gso.h                |   7 +-
 10 files changed, 543 insertions(+), 5 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 5bb36b7..dd37169 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,18 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. MTU is 64KB) into small
+  ones (e.g. MTU is 1500B). Supported packet types are:
+
+  * TCP/IPv4 packets.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/Makefile b/lib/Makefile
index 3d123f4..5ecd1b3 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -109,7 +109,7 @@ DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
 DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
-DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net librte_mempool
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ struct rte_logs {
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..ee75d4c
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,153 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..a8ad638
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates the TCP header of a packet, following
+ * segmentation. This is required to update the header's 'sent' sequence
+ * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
+ *
+ * @param pkt
+ *  The packet containing the TCP header.
+ * @param l4_offset
+ *  The offset of the TCP header from the start of the packet.
+ * @param sent_seq
+ *  The sent sequence number.
+ * @param non_tail
+ *  Non-zero for a non-tail segment; zero for the tail segment.
+ */
+static inline void
+update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
+		uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l4_offset);
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+/**
+ * Internal function which updates the IPv4 header of a packet, following
+ * segmentation. This is required to update the header's 'total_length' field,
+ * to reflect the reduced length of the now-segmented packet. Furthermore, the
+ * header's 'packet_id' field must be updated to reflect the new ID of the
+ * now-segmented packet.
+ *
+ * @param pkt
+ *  The packet containing the IPv4 header.
+ * @param l3_offset
+ *  The offset of the IPv4 header from the start of the packet.
+ * @param id
+ *  The new ID of the packet.
+ */
+static inline void
+update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l3_offset);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..d83e610
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,104 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+static void
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t id, tail_idx, i;
+	uint16_t l3_offset = pkt->l2_len;
+	uint16_t l4_offset = l3_offset + pkt->l3_len;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
+			l3_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], l3_offset, id);
+		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret;
+
+	/* Don't process the fragmented packet */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..1c57441
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount added to the IP ID of each successive output segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when the function succeeds. If the memory space in
+ *  pkts_out is insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..2b6fc2d 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,53 @@
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if ((gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN) ||
+			(gso_ctx->gso_size >= pkt->pkt_len) ||
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) !=
+			gso_ctx->gso_types) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkts_out[0] = pkt;
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
index 7d343d7..7ca2d81 100644
--- a/lib/librte_gso/rte_gso.h
+++ b/lib/librte_gso/rte_gso.h
@@ -46,6 +46,10 @@
 #include <stdint.h>
 #include <rte_mbuf.h>
 
+/* Minimum GSO segment size. */
+#define RTE_GSO_SEG_SIZE_MIN (sizeof(struct ether_hdr) + \
+		sizeof(struct ipv4_hdr) + sizeof(struct tcp_hdr) + 1)
+
 /* GSO flags for rte_gso_ctx. */
 #define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
 /**< Use fixed IP ids for output GSO segments. Setting
@@ -81,7 +85,8 @@ struct rte_gso_ctx {
 	 */
 	uint16_t gso_size;
 	/**< maximum size of an output GSO segment, including packet
-	 * header and payload, measured in bytes.
+	 * header and payload, measured in bytes. Must not be less than
+	 * RTE_GSO_SEG_SIZE_MIN.
 	 */
 };
 
-- 
1.9.3

* [PATCH v7 3/6] gso: add VxLAN GSO support
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
@ 2017-10-05 11:02             ` Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 4/6] gso: add GRE " Mark Kavanagh
                               ` (10 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 11:02 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds a framework that allows GSO on tunneled packets.
Furthermore, it leverages that framework to provide GSO support for
VxLAN-encapsulated packets.

Supported VxLAN packets must have an outer IPv4 header (prepended by an
optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
inner VLAN tag).
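
As a rough illustration of the layout described above, the following sketch
computes where each header starts in a supported VxLAN packet and how much
payload fits in one output segment. The `struct pkt_lens` type and the helper
names are hypothetical stand-ins for the corresponding `rte_mbuf` length
fields, not part of the library. The sketch assumes that `l2_len` covers the
UDP, VxLAN and inner Ethernet headers, as the offset arithmetic in this patch
implies.

```c
#include <stdint.h>

/* Hypothetical stand-in for the header-length fields of struct rte_mbuf. */
struct pkt_lens {
	uint16_t outer_l2_len; /* outer Ethernet (+ optional VLAN tag) */
	uint16_t outer_l3_len; /* outer IPv4 */
	uint16_t l2_len;       /* UDP + VxLAN + inner Ethernet (+ optional VLAN tag) */
	uint16_t l3_len;       /* inner IPv4 */
	uint16_t l4_len;       /* inner TCP */
};

/* Offset of the outer UDP header from the start of the packet. */
static inline uint16_t
udp_offset(const struct pkt_lens *p)
{
	return p->outer_l2_len + p->outer_l3_len;
}

/* Offset of the inner IPv4 header from the start of the packet. */
static inline uint16_t
inner_ipv4_offset(const struct pkt_lens *p)
{
	return udp_offset(p) + p->l2_len;
}

/* Total length of all headers that precede the TCP payload. */
static inline uint16_t
total_hdr_len(const struct pkt_lens *p)
{
	return inner_ipv4_offset(p) + p->l3_len + p->l4_len;
}

/* Payload bytes carried by each full-sized output segment. */
static inline uint16_t
payload_unit_size(const struct pkt_lens *p, uint16_t gso_size)
{
	return gso_size - total_hdr_len(p);
}
```

For untagged headers without options (14 + 20 + 30 + 20 + 20 bytes), the
headers occupy 104 bytes, so a gso_size of 1500 leaves 1396 bytes of TCP
payload per segment.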

VxLAN GSO doesn't check if input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.
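
The fragment test can be sketched as follows. This is a minimal,
self-contained approximation of the IS_FRAGMENTED() check in gso_common.h.
The constant and function names here are illustrative, with values taken from
the IPv4 header layout (13-bit fragment offset, "more fragments" bit at
0x2000), and the input is assumed to be already converted to host byte order.

```c
#include <stdint.h>

/* Illustrative names; values follow the IPv4 flags/fragment-offset field. */
#define TOY_IPV4_OFFSET_MASK 0x1fff /* 13-bit fragment offset */
#define TOY_IPV4_MF_FLAG     0x2000 /* "more fragments" bit */

/*
 * Return non-zero if the packet is an IP fragment, i.e. it either has a
 * non-zero fragment offset or has the MF bit set. The DF bit (0x4000) is
 * deliberately ignored.
 */
static inline int
ipv4_is_fragmented(uint16_t frag_off_host)
{
	return (frag_off_host & TOY_IPV4_OFFSET_MASK) != 0 ||
		(frag_off_host & TOY_IPV4_MF_FLAG) != 0;
}
```

A packet for which this returns non-zero is passed through unsegmented, as
a single output packet.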

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSOed segments
are freed, the packet is freed automatically.
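
The reference-counting behaviour can be pictured with a toy model. This is
only a sketch, not the DPDK mbuf implementation: `struct toy_mbuf` and the
helpers are hypothetical, and it assumes a single-segment input packet whose
payload is referenced by exactly one indirect mbuf per output segment.

```c
#include <stdint.h>

/* Toy model of an mbuf reference count; not the DPDK implementation. */
struct toy_mbuf {
	uint16_t refcnt;
	int freed;
};

/* An indirect mbuf attaching to the packet takes one reference. */
static void
toy_attach(struct toy_mbuf *m)
{
	m->refcnt++;
}

/* Dropping the last reference frees the packet. */
static void
toy_release(struct toy_mbuf *m)
{
	if (--m->refcnt == 0)
		m->freed = 1;
}

/*
 * Walk through the lifetime of a packet that is split into nb_segs
 * (nb_segs > 1) output segments. Returns 1 if the input packet is freed
 * exactly when the last output segment is released, 0 otherwise.
 */
static int
toy_gso_lifetime(uint16_t nb_segs)
{
	struct toy_mbuf pkt = { .refcnt = 1, .freed = 0 };
	uint16_t i;

	/* Each output segment references the original payload. */
	for (i = 0; i < nb_segs; i++)
		toy_attach(&pkt);

	/* After a successful split, GSO drops the caller's reference. */
	toy_release(&pkt);

	/* The application (or the driver) frees the output segments. */
	for (i = 0; i < nb_segs; i++) {
		if (pkt.freed)
			return 0; /* freed too early */
		toy_release(&pkt);
	}
	return pkt.freed;
}
```

In this model the caller never frees the input packet itself after a
successful split; freeing the output segments is enough.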

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |   2 +
 lib/librte_gso/Makefile                |   1 +
 lib/librte_gso/gso_common.h            |  25 +++++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 120 +++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h       |  75 +++++++++++++++++++++
 lib/librte_gso/rte_gso.c               |  14 +++-
 6 files changed, 235 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index dd37169..c58eeb1 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -48,6 +48,8 @@ New Features
   ones (e.g. MTU is 1500B). Supported packet types are:
 
   * TCP/IPv4 packets.
+  * VxLAN packets, which must have an outer IPv4 header, and contain
+    an inner TCP/IPv4 packet.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index a8ad638..95d54e7 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -39,6 +39,7 @@
 #include <rte_mbuf.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
 		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
@@ -49,6 +50,30 @@
 #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
 
+#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_VXLAN))
+
+/**
+ * Internal function which updates the UDP header of a packet, following
+ * segmentation. This is required to update the header's datagram length field.
+ *
+ * @param pkt
+ *  The packet containing the UDP header.
+ * @param udp_offset
+ *  The offset of the UDP header from the start of the packet.
+ */
+static inline void
+update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
+{
+	struct udp_hdr *udp_hdr;
+
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			udp_offset);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
+}
+
 /**
  * Internal function which updates the TCP header of a packet, following
  * segmentation. This is required to update the header's 'sent' sequence
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..5e8c8e5
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,120 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+static void
+update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t outer_id, inner_id, tail_idx, i;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+
+	outer_ipv4_offset = pkt->outer_l2_len;
+	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	tcp_offset = inner_ipv4_offset + pkt->l3_len;
+
+	/* Outer IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			outer_ipv4_offset);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	/* Inner IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_ipv4_offset);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
+		update_udp_header(segs[i], udp_offset);
+		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
+		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
+		outer_id++;
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset, frag_off;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process the packet if the MF bit or the fragment offset in
+	 * the inner IPv4 header is non-zero.
+	 */
+	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	/* Don't process the packet without data */
+	if (hdr_offset >= pkt->pkt_len) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret <= 1)
+		return ret;
+
+	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..3c67f0c
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,75 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneling packet with inner TCP/IPv4 headers. This function
+ * doesn't check if the input packet has correct checksums, and doesn't
+ * update checksums for output GSO segments. Furthermore, it doesn't
+ * process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount added to the inner IP ID of each successive output segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when it succeeds. If the memory space in pkts_out is
+ *  insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 2b6fc2d..a6f38e2 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -39,6 +39,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -59,7 +60,8 @@
 
 	if ((gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN) ||
 			(gso_ctx->gso_size >= pkt->pkt_len) ||
-			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) !=
+			(gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
+			                       DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
 			gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
@@ -72,12 +74,20 @@
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_TCP(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
+		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
 	} else {
+		/* unsupported packet, skip */
 		pkts_out[0] = pkt;
 		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
 		return 1;
-- 
1.9.3

* [PATCH v7 4/6] gso: add GRE GSO support
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (2 preceding siblings ...)
  2017-10-05 11:02             ` [PATCH v7 3/6] gso: add VxLAN " Mark Kavanagh
@ 2017-10-05 11:02             ` Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
                               ` (9 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 11:02 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.
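
One difference from the VxLAN path is that a GRE packet has no outer UDP
header, so there is no datagram length to rewrite after segmentation. A
minimal sketch of that decision follows. The enum and helper are hypothetical
and only mirror the arithmetic used by update_udp_header(); the result is
left in host byte order, whereas the patch stores it big-endian in the
header.

```c
#include <stdint.h>

/* Hypothetical tunnel tag; the library tests PKT_TX_TUNNEL_* flags instead. */
enum toy_tunnel { TOY_TUNNEL_VXLAN, TOY_TUNNEL_GRE };

/*
 * Compute the UDP datagram length for one output segment.
 * Returns 1 and fills *dgram_len for VxLAN; returns 0 and leaves
 * *dgram_len untouched for GRE, which carries no UDP header.
 */
static inline int
tunnel_udp_len(enum toy_tunnel type, uint32_t seg_pkt_len,
		uint16_t udp_gre_offset, uint16_t *dgram_len)
{
	if (type != TOY_TUNNEL_VXLAN)
		return 0;

	/* Everything from the UDP header to the end of the segment. */
	*dgram_len = (uint16_t)(seg_pkt_len - udp_gre_offset);
	return 1;
}
```

With a 14-byte outer Ethernet header and a 20-byte outer IPv4 header, a
1500-byte VxLAN segment yields a datagram length of 1466.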

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |  2 ++
 lib/librte_gso/gso_common.h            |  5 +++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 14 ++++++++++----
 lib/librte_gso/rte_gso.c               |  9 ++++++---
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index c58eeb1..2faa630 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -50,6 +50,8 @@ New Features
   * TCP/IPv4 packets.
   * VxLAN packets, which must have an outer IPv4 header, and contain
     an inner TCP/IPv4 packet.
+  * GRE packets, which must contain an outer IPv4 header, and inner
+    TCP/IPv4 headers.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 95d54e7..145ea49 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -55,6 +55,11 @@
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
 		 PKT_TX_TUNNEL_VXLAN))
 
+#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_GRE))
+
 /**
  * Internal function which updates the UDP header of a packet, following
  * segmentation. This is required to update the header's datagram length field.
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
index 5e8c8e5..8d0cfd7 100644
--- a/lib/librte_gso/gso_tunnel_tcp4.c
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -42,11 +42,13 @@
 	struct tcp_hdr *tcp_hdr;
 	uint32_t sent_seq;
 	uint16_t outer_id, inner_id, tail_idx, i;
-	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset;
+	uint16_t udp_gre_offset, tcp_offset;
+	uint8_t update_udp_hdr;
 
 	outer_ipv4_offset = pkt->outer_l2_len;
-	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
-	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	udp_gre_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_gre_offset + pkt->l2_len;
 	tcp_offset = inner_ipv4_offset + pkt->l3_len;
 
 	/* Outer IPv4 header. */
@@ -63,9 +65,13 @@
 	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
 	tail_idx = nb_segs - 1;
 
+	/* Only update UDP header for VxLAN packets. */
+	update_udp_hdr = (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN) ? 1 : 0;
+
 	for (i = 0; i < nb_segs; i++) {
 		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
-		update_udp_header(segs[i], udp_offset);
+		if (update_udp_hdr)
+			update_udp_header(segs[i], udp_gre_offset);
 		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
 		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
 		outer_id++;
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index a6f38e2..1d4082a 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -61,7 +61,8 @@
 	if ((gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN) ||
 			(gso_ctx->gso_size >= pkt->pkt_len) ||
 			(gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
-			                       DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
+			                       DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+			                       DEV_TX_OFFLOAD_GRE_TNL_TSO)) !=
 			gso_ctx->gso_types) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		pkts_out[0] = pkt;
@@ -74,8 +75,10 @@
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
-		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+	if ((IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) ||
+			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
+			 (gso_ctx->gso_types & DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
1.9.3

* [PATCH v7 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (3 preceding siblings ...)
  2017-10-05 11:02             ` [PATCH v7 4/6] gso: add GRE " Mark Kavanagh
@ 2017-10-05 11:02             ` Mark Kavanagh
  2017-10-05 11:02             ` [PATCH v7 6/6] doc: add GSO programmer's guide Mark Kavanagh
                               ` (8 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 11:02 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.
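
Whether a packet counts as "oversized" can be sketched from the size checks
in rte_gso_segment(). The snippet below is a simplified approximation: the
header sizes are assumed values standing in for sizeof(struct ether_hdr),
sizeof(struct ipv4_hdr) and sizeof(struct tcp_hdr), the names are
illustrative, and the packet-type and gso_types checks are left out.

```c
#include <stdint.h>

/* Assumed sizes of option-less headers, standing in for the DPDK structs. */
#define TOY_ETHER_HDR_LEN 14
#define TOY_IPV4_HDR_LEN  20
#define TOY_TCP_HDR_LEN   20

/* Mirrors RTE_GSO_SEG_SIZE_MIN: all headers plus one byte of payload. */
#define TOY_GSO_SEG_SIZE_MIN \
	(TOY_ETHER_HDR_LEN + TOY_IPV4_HDR_LEN + TOY_TCP_HDR_LEN + 1)

/*
 * Return non-zero if a packet of pkt_len bytes would be segmented for the
 * given gso_size. Packets that already fit in one segment, or a gso_size
 * below the minimum, result in the packet being passed through unchanged.
 */
static inline int
toy_should_segment(uint32_t pkt_len, uint16_t gso_size)
{
	return gso_size >= TOY_GSO_SEG_SIZE_MIN && gso_size < pkt_len;
}
```

Under these assumptions the minimum segment size is 55 bytes, and a
1500-byte packet is left alone when gso_size is also 1500.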

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c                      | 179 ++++++++++++++++++++++++++++
 app/test-pmd/config.c                       |  24 ++++
 app/test-pmd/csumonly.c                     |  43 ++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
 6 files changed, 311 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ccdf239..92e6171 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3967,6 +3978,171 @@ struct cmd_gro_set_result {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please first stop forwarding\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size < RTE_GSO_SEG_SIZE_MIN)
+			printf("gso_size should be at least %zu."
+					" Please input a legal value\n",
+					RTE_GSO_SEG_SIZE_MIN);
+		else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO'd packet size: %uB\n"
+					"Supported GSO types: TCP/IPv4, "
+					"VxLAN with inner TCP/IPv4 packet, "
+					"GRE with inner TCP/IPv4 packet\n",
+					gso_max_segment_size);
+		} else
+			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14255,6 +14431,9 @@ struct cmd_cmdfile_result {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..88d09d0 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ struct igb_ring_desc_16_bytes {
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..81a631c 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,8 @@
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
+
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -91,6 +93,7 @@
 /* structure that caches offload info for the current packet */
 struct testpmd_offload_info {
 	uint16_t ethertype;
+	uint8_t gso_enable;
 	uint16_t l2_len;
 	uint16_t l3_len;
 	uint16_t l4_len;
@@ -381,6 +384,8 @@ struct simple_gre_hdr {
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
 		}
+		if (info->gso_enable)
+			ol_flags |= PKT_TX_TCP_SEG;
 	} else if (info->l4_proto == IPPROTO_SCTP) {
 		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
 		sctp_hdr->cksum = 0;
@@ -627,6 +632,9 @@ struct simple_gre_hdr {
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -641,6 +649,8 @@ struct simple_gre_hdr {
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -674,6 +684,8 @@ struct simple_gre_hdr {
 	memset(&info, 0, sizeof(info));
 	info.tso_segsz = txp->tso_segsz;
 	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
+	if (gso_ports[fs->tx_port].enable)
+		info.gso_enable = 1;
 
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
@@ -851,13 +863,35 @@ struct simple_gre_hdr {
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
+					&gso_segments[nb_segments],
+					RTE_DIM(gso_segments) - nb_segments);
+			if (ret < 0) {
+				RTE_LOG(DEBUG, USER1,
+						"Unable to segment packet %d of %d\n",
+						i, nb_rx);
+				rte_pktmbuf_free(pkts_burst[i]);
+			} else
+				nb_segments += ret;
+		}
+
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +902,7 @@ struct simple_gre_hdr {
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +915,10 @@ struct simple_gre_hdr {
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e097ee0..b9ee77c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ static int eth_event_callback(uint8_t port_id,
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,8 @@ static int eth_event_callback(uint8_t port_id,
 
 	init_port_config();
 
+	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+		DEV_TX_OFFLOAD_GRE_TNL_TSO;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +670,13 @@ static int eth_event_callback(uint8_t port_id,
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
+			ETHER_CRC_LEN;
+		fwd_lcores[lc_id]->gso_ctx.flag = 0;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..ff842a1 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -642,6 +651,7 @@ void port_rss_hash_key_update(portid_t port_id, char rss_type[],
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 2ed62f5..f9b5bda 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -932,6 +932,52 @@ number of packets a GRO table can store.
 If current packet number is greater than or equal to the max value, GRO
 will stop processing incoming packets.
 
+set port - gso
+~~~~~~~~~~~~~~
+
+Toggle per-port GSO support in ``csum`` forwarding engine::
+
+   testpmd> set port <port_id> gso on|off
+
+If enabled, the csum forwarding engine will perform GSO on supported IPv4
+packets transmitted on the given port.
+
+If disabled, packets transmitted on the given port will not undergo GSO.
+By default, GSO is disabled for all ports.
+
+.. note::
+
+   When GSO is enabled on a port, supported IPv4 packets transmitted on that
+   port undergo GSO. Afterwards, the segmented packets are represented by
+   multi-segment mbufs; however, the csum forwarding engine doesn't calculate
+   checksums for GSO'd segments in SW. As a result, if users want correct
+   checksums in GSO segments, they should enable HW checksum calculation for
+   GSO-enabled ports.
+
+   For example, HW checksum calculation for VxLAN GSO'd packets may be enabled
+   by setting the following options in the csum forwarding engine:
+
+   testpmd> csum set outer_ip hw <port_id>
+
+   testpmd> csum set ip hw <port_id>
+
+   testpmd> csum set tcp hw <port_id>
+
+set gso segsz
+~~~~~~~~~~~~~
+
+Set the maximum GSO segment size (measured in bytes), which includes the
+packet header and the packet payload for GSO-enabled ports (global)::
+
+   testpmd> set gso segsz <length>
+
+show port - gso
+~~~~~~~~~~~~~~~
+
+Display the status of Generic Segmentation Offload for a given port::
+
+   testpmd> show port <port_id> gso
+
 mac_addr add
 ~~~~~~~~~~~~
 
-- 
1.9.3


* [PATCH v7 6/6] doc: add GSO programmer's guide
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (4 preceding siblings ...)
  2017-10-05 11:02             ` [PATCH v7 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
@ 2017-10-05 11:02             ` Mark Kavanagh
  2017-10-05 13:22             ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Ananyev, Konstantin
                               ` (7 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 11:02 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Add programmer's guide doc to explain the design and use of the
GSO library.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 MAINTAINERS                                        |   6 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 5 files changed, 1053 insertions(+)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
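
Note for reviewers: the segment arithmetic described in the guide's
"GSO Output Segment Format" section can be sketched in plain C as below.
This is only an illustration, not code from the library:
gso_nb_segments() and gso_nb_parts() are hypothetical helpers, and the
sketch assumes that segsz covers headers plus payload (as the guide states
for the GSO context), so each output segment carries segsz - hdr_len
payload bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of output segments for one packet.
 * Assumption: segsz includes the headers, so every output segment except
 * possibly the last carries (segsz - hdr_len) payload bytes.
 */
static uint32_t
gso_nb_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t segsz)
{
	uint32_t payload_len, per_seg;

	assert(segsz > hdr_len && pkt_len >= hdr_len);
	if (pkt_len <= segsz)
		return 1; /* packet already fits; no segmentation needed */
	payload_len = pkt_len - hdr_len;
	per_seg = segsz - hdr_len;
	/* round up: a shorter final segment holds the remainder */
	return (payload_len + per_seg - 1) / per_seg;
}

/*
 * Hypothetical helper: number of mbuf parts in output segment 'idx'.
 * One direct mbuf holds the header copy; one indirect mbuf is needed for
 * each input segment overlapped by the payload range
 * [idx * per_seg, (idx + 1) * per_seg).
 * in_payload[i] is the payload byte count of input segment i (headers
 * excluded).
 */
static unsigned int
gso_nb_parts(const uint16_t *in_payload, size_t nb_in, uint16_t per_seg,
	     uint32_t idx)
{
	uint32_t start = idx * per_seg;
	uint32_t end = start + per_seg;
	uint32_t off = 0;
	unsigned int parts = 1; /* the header part */
	size_t i;

	for (i = 0; i < nb_in; i++) {
		uint32_t seg_start = off;
		uint32_t seg_end = off + in_payload[i];

		/* count input segments that contribute at least one byte */
		if (in_payload[i] != 0 && seg_start < end && seg_end > start)
			parts++;
		off = seg_end;
	}
	return parts;
}
```

With 54 bytes of headers, a segsz of 654 and a 2054-byte packet, the sketch
yields four output segments. If that payload is split evenly across two
input mbuf segments, the second output segment straddles the boundary and
needs three parts (header plus two indirect mbufs), which is the case shown
in the three-part figure.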

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df2a7f..8f0a4bd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -644,6 +644,12 @@ M: Jiayu Hu <jiayu.hu@intel.com>
 F: lib/librte_gro/
 F: doc/guides/prog_guide/generic_receive_offload_lib.rst
 
+Generic Segmentation Offload
+M: Jiayu Hu <jiayu.hu@intel.com>
+M: Mark Kavanagh <mark.b.kavanagh@intel.com>
+F: lib/librte_gso/
+F: doc/guides/prog_guide/generic_segmentation_offload_lib.rst
+
 Distributor
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: David Hunt <david.hunt@intel.com>
diff --git a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
new file mode 100644
index 0000000..5e78f16
--- /dev/null
+++ b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
@@ -0,0 +1,256 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Generic Segmentation Offload Library
+====================================
+
+Overview
+--------
+Generic Segmentation Offload (GSO) is a widely used software implementation of
+TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
+Much like TSO, GSO gains performance by enabling upper layer applications to
+process a smaller number of large packets (e.g. MTU size of 64KB), instead of
+processing higher numbers of small packets (e.g. MTU size of 1500B), thus
+reducing per-packet overhead.
+
+For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
+that far exceed the kernel interface's MTU; this eliminates the need to segment
+packets within the guest, and improves the data-to-overhead ratio of both the
+guest-host link, and PCI bus. The expectation of the guest network stack in this
+scenario is that segmentation of egress frames will take place either in the NIC
+HW or, where that hardware capability is unavailable, in the host application
+or network stack.
+
+Bearing that in mind, the GSO library enables DPDK applications to segment
+packets in software. Note however, that GSO is implemented as a standalone
+library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
+in the underlying hardware); that is, applications must explicitly invoke the
+GSO library to segment packets. The size of GSO segments ``(segsz)`` is
+configurable by the application.
+
+Limitations
+-----------
+
+#. The GSO library doesn't check if input packets have correct checksums.
+
+#. In addition, the GSO library doesn't re-calculate checksums for segmented
+   packets (that task is left to the application).
+
+#. IP fragments are unsupported by the GSO library.
+
+#. The egress interface's driver must support multi-segment packets.
+
+#. Currently, the GSO library supports the following IPv4 packet types:
+
+   - TCP
+   - VxLAN
+   - GRE
+
+   See `Supported GSO Packet Types`_ for further details.
+
+Packet Segmentation
+-------------------
+
+The ``rte_gso_segment()`` function is the GSO library's primary
+segmentation API.
+
+Before performing segmentation, an application must create a GSO context object
+``(struct rte_gso_ctx)``, which provides the library with some of the
+information required to understand how the packet should be segmented. Refer to
+`How to Segment a Packet`_ for additional details. Once the GSO context
+has been created, and populated, the application can then use the
+``rte_gso_segment()`` function to segment packets.
+
+The GSO library typically stores each segment that it creates in two parts: the
+first part contains a copy of the original packet's headers, while the second
+part contains a pointer to an offset within the original packet. This mechanism
+is explained in more detail in `GSO Output Segment Format`_.
+
+The GSO library supports both single- and multi-segment input mbufs.
+
+GSO Output Segment Format
+~~~~~~~~~~~~~~~~~~~~~~~~~
+To reduce the number of expensive memcpy operations required when segmenting a
+packet, the GSO library typically stores each segment that it creates as a
+two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
+the elements produced by the API are also called 'segments', for clarity the
+term 'part' is used here instead).
+
+The first part of each output segment is a direct mbuf and contains a copy of
+the original packet's headers, which must be prepended to each output segment.
+These headers are copied from the original packet into each output segment.
+
+The second part of each output segment represents a section of data from the
+original packet, i.e. a data segment. Rather than copy the data directly from
+the original packet into the output segment (which would impact performance
+considerably), the second part of each output segment is an indirect mbuf,
+which contains no actual data, but simply points to an offset within the
+original packet.
+
+The combination of the 'header' part and the 'data' part constitutes a
+single logical output GSO segment of the original packet. This is illustrated
+in :numref:`figure_gso-output-segment-format`.
+
+.. _figure_gso-output-segment-format:
+
+.. figure:: img/gso-output-segment-format.svg
+   :align: center
+
+   Two-part GSO output segment
+
+In one situation, the output segment may contain additional 'data' parts.
+This only occurs when:
+
+- the input packet on which GSO is to be performed is represented by a
+  multi-segment mbuf.
+
+- the output segment is required to contain data that spans the boundaries
+  between segments of the input multi-segment mbuf.
+
+The GSO library traverses each segment of the input packet, and produces
+numerous output segments; for optimal performance, the number of output
+segments is kept to a minimum. Consequently, the GSO library maximizes the
+amount of data contained within each output segment; i.e. each output segment
+contains ``segsz`` bytes of data. The only exception to this is in the case of
+the very final output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the
+final segment is smaller than the rest.
+
+In order for an output segment to meet its MSS, it may need to include data from
+multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
+can point to only one direct mbuf), the solution here is to add another indirect
+mbuf to the output segment; this additional segment then points to the next
+input segment. If necessary, this chaining process is repeated, until the sum of
+all of the data 'contained' in the output segment reaches ``segsz``. This
+ensures that the amount of data contained within each output segment is uniform,
+with the possible exception of the last segment, as previously described.
+
+:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
+output segment. In this example, the output segment needs to include data from
+the end of one input segment, and the beginning of another. To achieve this,
+an additional indirect mbuf is chained to the second part of the output segment,
+and is attached to the next input segment (i.e. it points to the data in the
+next input segment).
+
+.. _figure_gso-three-seg-mbuf:
+
+.. figure:: img/gso-three-seg-mbuf.svg
+   :align: center
+
+   Three-part GSO output segment
+
+Supported GSO Packet Types
+--------------------------
+
+TCP/IPv4 GSO
+~~~~~~~~~~~~
+TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
+may also contain an optional VLAN tag.
+
+VxLAN GSO
+~~~~~~~~~
+VxLAN GSO supports segmentation of suitably large VxLAN packets,
+which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
+inner and/or outer VLAN tag(s).
+
+GRE GSO
+~~~~~~~
+GRE GSO supports segmentation of suitably large GRE packets, which contain
+an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.
+
+How to Segment a Packet
+-----------------------
+
+To segment an outgoing packet, an application must:
+
+#. First create a GSO context ``(struct rte_gso_ctx)``; this contains:
+
+   - a pointer to the mbuf pool for allocating the direct buffers, which are
+     used to store the GSO segments' packet headers.
+
+   - a pointer to the mbuf pool for allocating indirect buffers, which are
+     used to locate GSO segments' packet payloads.
+
+   .. note::
+
+      An application may use the same pool for both direct and indirect
+      buffers. However, since each indirect mbuf simply stores a pointer,
+      the application may reduce its memory consumption by creating a
+      separate memory pool, containing smaller elements, for the indirect pool.
+
+   - the size of each output segment, including packet headers and payload,
+     measured in bytes.
+
+   - the bit mask of required GSO types. The GSO library uses the same macros as
+     those that describe a physical device's TX offloading capabilities (i.e.
+     ``DEV_TX_OFFLOAD_*_TSO``) for gso_types. For example, if an application
+     wants to segment TCP/IPv4 packets, it should set gso_types to
+     ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values currently
+     supported for gso_types are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO`` and
+     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
+     allowed.
+
+   - a flag that indicates whether the IPv4 headers of output segments should
+     contain fixed or incremental ID values.
+
+#. Set the appropriate ol_flags in the mbuf.
+
+   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute
+     to determine how a packet should be segmented. It is the application's
+     responsibility to ensure that these flags are set.
+
+   - For example, in order to segment TCP/IPv4 packets, the application should
+     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
+     ol_flags.
+
+   - If checksum calculation in hardware is required, the application should
+     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.
+
+#. Check if the packet should be processed. Packets with one of the
+   following properties are not processed and are returned immediately:
+
+   - Packet length is less than ``segsz`` (i.e. GSO is not required).
+
+   - Packet type is not supported by GSO library (see
+     `Supported GSO Packet Types`_).
+
+   - Application has not enabled GSO support for the packet type.
+
+   - Packet's ol_flags have been incorrectly set.
+
+#. Allocate space in which to store the output GSO segments. If the amount of
+   space allocated by the application is insufficient, segmentation will fail.
+
+#. Invoke the GSO segmentation API, ``rte_gso_segment()``.
+
+#. If required, update the L3 and L4 checksums of the newly-created segments.
+   For tunneled packets, the outer IPv4 headers' checksums should also be
+   updated. Alternatively, the application may offload checksum calculation
+   to HW.
+
diff --git a/doc/guides/prog_guide/img/gso-output-segment-format.svg b/doc/guides/prog_guide/img/gso-output-segment-format.svg
new file mode 100644
index 0000000..bdb5ec3
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-output-segment-format.svg
@@ -0,0 +1,313 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-output-segment-format.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="19.3975in" height="8.21796in"
+		viewBox="0 0 1396.62 591.693" xml:space="preserve" color-interpolation-filters="sRGB" class="st21">
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st2 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st3 {stroke:#c3d600;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.68828}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Intel Clear;font-size:1.99999em;font-weight:bold}
+		.st10 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st11 {fill:#ffffff;font-family:Intel Clear;font-size:2.44732em;font-weight:bold}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:5.52552}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.15291em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.8401em}
+		.st15 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st16 {fill:#c3d600;font-family:Intel Clear;font-size:2.44732em}
+		.st17 {fill:#ffc000;font-family:Intel Clear;font-size:2.44732em}
+		.st18 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.8401em}
+		.st20 {fill:#006fc5;font-family:Intel Clear;font-size:1.61927em}
+		.st21 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<g id="shape3-1" v:mID="3" v:groupContext="shape" transform="translate(577.244,-560.42)">
+			<title>Sheet.3</title>
+			<path d="M9.24 585.29 L16.32 585.29 L16.32 587.06 L9.24 587.06 L9.24 585.29 L9.24 585.29 ZM21.63 585.29 L23.4 585.29
+						 L23.4 587.06 L21.63 587.06 L21.63 585.29 L21.63 585.29 ZM28.7 585.29 L35.78 585.29 L35.78 587.06 L28.7 587.06
+						 L28.7 585.29 L28.7 585.29 ZM41.09 585.29 L42.86 585.29 L42.86 587.06 L41.09 587.06 L41.09 585.29 L41.09
+						 585.29 ZM48.17 585.29 L55.25 585.29 L55.25 587.06 L48.17 587.06 L48.17 585.29 L48.17 585.29 ZM60.56 585.29
+						 L62.33 585.29 L62.33 587.06 L60.56 587.06 L60.56 585.29 L60.56 585.29 ZM67.64 585.29 L74.72 585.29 L74.72
+						 587.06 L67.64 587.06 L67.64 585.29 L67.64 585.29 ZM80.03 585.29 L81.8 585.29 L81.8 587.06 L80.03 587.06
+						 L80.03 585.29 L80.03 585.29 ZM87.11 585.29 L94.19 585.29 L94.19 587.06 L87.11 587.06 L87.11 585.29 L87.11
+						 585.29 ZM99.5 585.29 L101.27 585.29 L101.27 587.06 L99.5 587.06 L99.5 585.29 L99.5 585.29 ZM106.58 585.29
+						 L113.66 585.29 L113.66 587.06 L106.58 587.06 L106.58 585.29 L106.58 585.29 ZM118.97 585.29 L120.74 585.29
+						 L120.74 587.06 L118.97 587.06 L118.97 585.29 L118.97 585.29 ZM126.05 585.29 L133.13 585.29 L133.13 587.06
+						 L126.05 587.06 L126.05 585.29 L126.05 585.29 ZM138.43 585.29 L140.2 585.29 L140.2 587.06 L138.43 587.06
+						 L138.43 585.29 L138.43 585.29 ZM145.51 585.29 L152.59 585.29 L152.59 587.06 L145.51 587.06 L145.51 585.29
+						 L145.51 585.29 ZM157.9 585.29 L159.67 585.29 L159.67 587.06 L157.9 587.06 L157.9 585.29 L157.9 585.29 ZM164.98
+						 585.29 L172.06 585.29 L172.06 587.06 L164.98 587.06 L164.98 585.29 L164.98 585.29 ZM177.37 585.29 L179.14
+						 585.29 L179.14 587.06 L177.37 587.06 L177.37 585.29 L177.37 585.29 ZM184.45 585.29 L191.53 585.29 L191.53
+						 587.06 L184.45 587.06 L184.45 585.29 L184.45 585.29 ZM196.84 585.29 L198.61 585.29 L198.61 587.06 L196.84
+						 587.06 L196.84 585.29 L196.84 585.29 ZM203.92 585.29 L211 585.29 L211 587.06 L203.92 587.06 L203.92 585.29
+						 L203.92 585.29 ZM216.31 585.29 L218.08 585.29 L218.08 587.06 L216.31 587.06 L216.31 585.29 L216.31 585.29
+						 ZM223.39 585.29 L230.47 585.29 L230.47 587.06 L223.39 587.06 L223.39 585.29 L223.39 585.29 ZM235.78 585.29
+						 L237.55 585.29 L237.55 587.06 L235.78 587.06 L235.78 585.29 L235.78 585.29 ZM242.86 585.29 L249.93 585.29
+						 L249.93 587.06 L242.86 587.06 L242.86 585.29 L242.86 585.29 ZM255.24 585.29 L257.01 585.29 L257.01 587.06
+						 L255.24 587.06 L255.24 585.29 L255.24 585.29 ZM262.32 585.29 L269.4 585.29 L269.4 587.06 L262.32 587.06
+						 L262.32 585.29 L262.32 585.29 ZM274.71 585.29 L276.48 585.29 L276.48 587.06 L274.71 587.06 L274.71 585.29
+						 L274.71 585.29 ZM281.79 585.29 L288.87 585.29 L288.87 587.06 L281.79 587.06 L281.79 585.29 L281.79 585.29
+						 ZM294.18 585.29 L295.95 585.29 L295.95 587.06 L294.18 587.06 L294.18 585.29 L294.18 585.29 ZM301.26 585.29
+						 L308.34 585.29 L308.34 587.06 L301.26 587.06 L301.26 585.29 L301.26 585.29 ZM313.65 585.29 L315.42 585.29
+						 L315.42 587.06 L313.65 587.06 L313.65 585.29 L313.65 585.29 ZM320.73 585.29 L324.99 585.29 L324.99 587.06
+						 L320.73 587.06 L320.73 585.29 L320.73 585.29 ZM11.06 591.69 L0 586.17 L11.06 580.65 L11.06 591.69 L11.06
+						 591.69 ZM323.16 580.65 L334.22 586.17 L323.16 591.69 L323.16 580.65 L323.16 580.65 Z" class="st1"/>
+		</g>
+		<g id="shape4-3" v:mID="4" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.4</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43
+						 Z" class="st2"/>
+		</g>
+		<g id="shape5-5" v:mID="5" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.5</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43"
+					class="st3"/>
+		</g>
+		<g id="shape6-8" v:mID="6" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.6</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape7-10" v:mID="7" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.7</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape10-13" v:mID="10" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.10</title>
+			<path d="M0 510.21 L0 591.69 L822.53 591.69 L822.53 510.21 L0 510.21 L0 510.21 Z" class="st6"/>
+		</g>
+		<g id="shape11-15" v:mID="11" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.11</title>
+			<path d="M0 510.21 L822.53 510.21 L822.53 591.69 L0 591.69 L0 510.21" class="st7"/>
+		</g>
+		<g id="shape12-18" v:mID="12" v:groupContext="shape" transform="translate(255.478,-470.123)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="157.315" cy="574.07" width="314.63" height="35.245"/>
+			<path d="M314.63 556.45 L0 556.45 L0 591.69 L314.63 591.69 L314.63 556.45" class="st8"/>
+			<text x="102.08" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-22" v:mID="13" v:groupContext="shape" transform="translate(577.354,-470.123)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="167.112" cy="574.07" width="334.23" height="35.245"/>
+			<path d="M334.22 556.45 L0 556.45 L0 591.69 L334.22 591.69 L334.22 556.45" class="st8"/>
+			<text x="111.88" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape14-26" v:mID="14" v:groupContext="shape" transform="translate(910.635,-470.956)">
+			<title>Sheet.14</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="26.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape15-30" v:mID="15" v:groupContext="shape" transform="translate(909.144,-453.824)">
+			<title>Sheet.15</title>
+			<path d="M1.16 453.85 L1.05 465.33 L3.93 465.39 L4.04 453.91 L1.16 453.85 L1.16 453.85 ZM1 473.95 L0.94 476.82 L3.82
+						 476.87 L3.87 474 L1 473.95 L1 473.95 ZM0.88 485.43 L0.77 496.91 L3.65 496.96 L3.76 485.48 L0.88 485.43 L0.88
+						 485.43 ZM0.72 505.52 L0.72 508.39 L3.59 508.45 L3.59 505.58 L0.72 505.52 L0.72 505.52 ZM0.61 517 L0.55 528.49
+						 L3.43 528.54 L3.48 517.06 L0.61 517 L0.61 517 ZM0.44 537.1 L0.44 539.97 L3.32 540.02 L3.32 537.15 L0.44
+						 537.1 L0.44 537.1 ZM0.39 548.58 L0.28 560.06 L3.15 560.12 L3.26 548.63 L0.39 548.58 L0.39 548.58 ZM0.22
+						 568.67 L0.17 571.54 L3.04 571.6 L3.1 568.73 L0.22 568.67 L0.22 568.67 ZM0.11 580.16 L0 591.64 L2.88 591.69
+						 L2.99 580.21 L0.11 580.16 L0.11 580.16 Z" class="st10"/>
+		</g>
+		<g id="shape16-32" v:mID="16" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.16</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape17-34" v:mID="17" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.17</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape18-37" v:mID="18" v:groupContext="shape" transform="translate(121.944,-471.034)">
+			<title>Sheet.18</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="20.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape19-41" v:mID="19" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.19</title>
+			<path d="M0 510.43 L0 591.69 L289.81 591.69 L289.81 510.43 L0 510.43 L0 510.43 Z" class="st4"/>
+		</g>
+		<g id="shape20-43" v:mID="20" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.20</title>
+			<path d="M0 510.43 L289.81 510.43 L289.81 591.69 L0 591.69 L0 510.43" class="st5"/>
+		</g>
+		<g id="shape21-46" v:mID="21" v:groupContext="shape" transform="translate(424.908,-21.567)">
+			<title>Sheet.21</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="11.55" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape22-50" v:mID="22" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.22</title>
+			<path d="M0 510.43 L0 591.69 L453.74 591.69 L453.74 510.43 L0 510.43 L0 510.43 Z" class="st6"/>
+		</g>
+		<g id="shape23-52" v:mID="23" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.23</title>
+			<path d="M0 510.43 L453.74 510.43 L453.74 591.69 L0 591.69 L0 510.43" class="st7"/>
+		</g>
+		<g id="shape24-55" v:mID="24" v:groupContext="shape" transform="translate(778.624,-21.5672)">
+			<title>Sheet.24</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="14.26" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape25-59" v:mID="25" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.25</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st6"/>
+		</g>
+		<g id="shape26-61" v:mID="26" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.26</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape27-63" v:mID="27" v:groupContext="shape" transform="translate(813.057,-150.108)">
+			<title>Sheet.27</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="94.1386" cy="576.19" width="188.28" height="31.0055"/>
+			<path d="M188.28 560.69 L0 560.69 L0 591.69 L188.28 591.69 L188.28 560.69" class="st8"/>
+			<text x="15.43" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape28-67" v:mID="28" v:groupContext="shape" transform="translate(810.845,-123.854)">
+			<title>Sheet.28</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="95.5065" cy="578.442" width="191.02" height="26.501"/>
+			<path d="M191.01 565.19 L0 565.19 L0 591.69 L191.01 591.69 L191.01 565.19" class="st8"/>
+			<text x="15.15" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape29-71" v:mID="29" v:groupContext="shape" transform="translate(573.151,-149.601)">
+			<title>Sheet.29</title>
+			<path d="M0 584.74 L127.76 584.74 L127.76 587.61 L0 587.61 L0 584.74 L0 584.74 ZM125.91 580.65 L136.97 586.17 L125.91
+						 591.69 L125.91 580.65 L125.91 580.65 Z" class="st15"/>
+		</g>
+		<g id="shape30-73" v:mID="30" v:groupContext="shape" transform="translate(0,-309.671)">
+			<title>Sheet.30</title>
+			<desc>Memory copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="108.076" cy="574.07" width="216.16" height="35.245"/>
+			<path d="M216.15 556.45 L0 556.45 L0 591.69 L216.15 591.69 L216.15 556.45" class="st8"/>
+			<text x="17.68" y="582.88" class="st16" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Memory copy</text>		</g>
+		<g id="shape31-77" v:mID="31" v:groupContext="shape" transform="translate(680.77,-305.707)">
+			<title>Sheet.31</title>
+			<desc>No Memory Copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="136.547" cy="574.07" width="273.1" height="35.245"/>
+			<path d="M273.09 556.45 L0 556.45 L0 591.69 L273.09 591.69 L273.09 556.45" class="st8"/>
+			<text x="21.4" y="582.88" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>No Memory Copy</text>		</g>
+		<g id="shape32-81" v:mID="32" v:groupContext="shape" transform="translate(1102.72,-26.7532)">
+			<title>Sheet.32</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="138.243" cy="578.442" width="276.49" height="26.501"/>
+			<path d="M276.49 565.19 L0 565.19 L0 591.69 L276.49 591.69 L276.49 565.19" class="st8"/>
+			<text x="20.73" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape36-85" v:mID="36" v:groupContext="shape" transform="translate(1106.81,-138.647)">
+			<title>Sheet.36</title>
+			<desc>Two-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="144.906" cy="578.442" width="289.82" height="26.501"/>
+			<path d="M289.81 565.19 L0 565.19 L0 591.69 L289.81 591.69 L289.81 565.19" class="st8"/>
+			<text x="16.56" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Two-part output segment</text>		</g>
+		<g id="shape37-89" v:mID="37" v:groupContext="shape" transform="translate(575.916,-453.879)">
+			<title>Sheet.37</title>
+			<path d="M2.88 453.91 L2.9 465.39 L0.03 465.39 L0 453.91 L2.88 453.91 L2.88 453.91 ZM2.9 474 L2.9 476.87 L0.03 476.87
+						 L0.03 474 L2.9 474 L2.9 474 ZM2.9 485.48 L2.9 496.96 L0.03 496.96 L0.03 485.48 L2.9 485.48 L2.9 485.48 ZM2.9
+						 505.58 L2.9 508.45 L0.03 508.45 L0.03 505.58 L2.9 505.58 L2.9 505.58 ZM2.9 517.06 L2.9 528.54 L0.03 528.54
+						 L0.03 517.06 L2.9 517.06 L2.9 517.06 ZM2.9 537.15 L2.9 540.02 L0.03 540.02 L0.03 537.15 L2.9 537.15 L2.9
+						 537.15 ZM2.9 548.63 L2.9 560.12 L0.03 560.12 L0.03 548.63 L2.9 548.63 L2.9 548.63 ZM2.9 568.73 L2.9 571.6
+						 L0.03 571.6 L0.03 568.73 L2.9 568.73 L2.9 568.73 ZM2.9 580.21 L2.9 591.69 L0.03 591.69 L0.03 580.21 L2.9
+						 580.21 L2.9 580.21 Z" class="st18"/>
+		</g>
+		<g id="shape38-91" v:mID="38" v:groupContext="shape" transform="translate(577.354,-193.764)">
+			<title>Sheet.38</title>
+			<path d="M5.59 347.01 L10.92 357.16 L8.38 358.52 L3.04 348.36 L5.59 347.01 L5.59 347.01 ZM14.96 364.78 L16.29 367.32
+						 L13.74 368.67 L12.42 366.13 L14.96 364.78 L14.96 364.78 ZM20.33 374.97 L25.66 385.12 L23.12 386.45 L17.78
+						 376.29 L20.33 374.97 L20.33 374.97 ZM29.7 392.74 L31.03 395.28 L28.48 396.61 L27.16 394.07 L29.7 392.74
+						 L29.7 392.74 ZM35.04 402.9 L40.4 413.06 L37.86 414.38 L32.49 404.22 L35.04 402.9 L35.04 402.9 ZM44.41 420.67
+						 L45.77 423.21 L43.22 424.57 L41.87 422.03 L44.41 420.67 L44.41 420.67 ZM49.78 430.83 L55.14 440.99 L52.6
+						 442.34 L47.23 432.18 L49.78 430.83 L49.78 430.83 ZM59.15 448.61 L60.51 451.15 L57.96 452.5 L56.61 449.96
+						 L59.15 448.61 L59.15 448.61 ZM64.52 458.79 L69.88 468.95 L67.34 470.27 L61.97 460.12 L64.52 458.79 L64.52
+						 458.79 ZM73.89 476.57 L75.25 479.11 L72.7 480.43 L71.35 477.89 L73.89 476.57 L73.89 476.57 ZM79.26 486.72
+						 L84.62 496.88 L82.08 498.21 L76.71 488.05 L79.26 486.72 L79.26 486.72 ZM88.63 504.5 L89.96 507.04 L87.41
+						 508.39 L86.09 505.85 L88.63 504.5 L88.63 504.5 ZM94 514.66 L99.33 524.81 L96.79 526.17 L91.45 516.01 L94
+						 514.66 L94 514.66 ZM103.37 532.43 L104.7 534.97 L102.15 536.32 L100.83 533.79 L103.37 532.43 L103.37 532.43
+						 ZM108.73 542.62 L114.07 552.77 L111.53 554.1 L106.19 543.94 L108.73 542.62 L108.73 542.62 ZM118.11 560.39
+						 L119.44 562.93 L116.89 564.26 L115.57 561.72 L118.11 560.39 L118.11 560.39 ZM123.45 570.55 L128.81 580.71
+						 L126.27 582.03 L120.9 571.87 L123.45 570.55 L123.45 570.55 ZM132.82 588.33 L133.9 590.37 L131.36 591.69
+						 L130.28 589.68 L132.82 588.33 L132.82 588.33 ZM0.28 351.89 L0 339.53 L10.07 346.73 L0.28 351.89 L0.28 351.89
+						 Z" class="st18"/>
+		</g>
+		<g id="shape39-93" v:mID="39" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.39</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st4"/>
+		</g>
+		<g id="shape40-95" v:mID="40" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.40</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape41-97" v:mID="41" v:groupContext="shape" transform="translate(368.774,-150.453)">
+			<title>Sheet.41</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="82.7002" cy="576.19" width="165.41" height="31.0055"/>
+			<path d="M165.4 560.69 L0 560.69 L0 591.69 L165.4 591.69 L165.4 560.69" class="st8"/>
+			<text x="13.94" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape42-101" v:mID="42" v:groupContext="shape" transform="translate(351.856,-123.854)">
+			<title>Sheet.42</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="102.121" cy="578.442" width="204.25" height="26.501"/>
+			<path d="M204.24 565.19 L0 565.19 L0 591.69 L204.24 591.69 L204.24 565.19" class="st8"/>
+			<text x="16.02" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape43-105" v:mID="43" v:groupContext="shape" transform="translate(619.797,-155.563)">
+			<title>Sheet.43</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="28.011" cy="578.442" width="56.03" height="26.501"/>
+			<path d="M56.02 565.19 L0 565.19 L0 591.69 L56.02 591.69 L56.02 565.19" class="st8"/>
+			<text x="6.35" y="585.07" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape44-109" v:mID="44" v:groupContext="shape" transform="translate(700.911,-551.367)">
+			<title>Sheet.44</title>
+			<path d="M0 559.23 L0 591.69 L84.29 591.69 L84.29 559.23 L0 559.23 L0 559.23 Z" class="st2"/>
+		</g>
+		<g id="shape45-111" v:mID="45" v:groupContext="shape" transform="translate(709.883,-555.163)">
+			<title>Sheet.45</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="30.7501" cy="580.032" width="61.51" height="23.3211"/>
+			<path d="M61.5 568.37 L0 568.37 L0 591.69 L61.5 591.69 L61.5 568.37" class="st8"/>
+			<text x="6.38" y="585.86" class="st20" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape46-115" v:mID="46" v:groupContext="shape" transform="translate(1111.54,-477.36)">
+			<title>Sheet.46</title>
+			<desc>Input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="74.9" cy="578.442" width="149.8" height="26.501"/>
+			<path d="M149.8 565.19 L0 565.19 L0 591.69 L149.8 591.69 L149.8 565.19" class="st8"/>
+			<text x="12.47" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Input packet</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
new file mode 100644
index 0000000..f18a327
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
@@ -0,0 +1,477 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-three-seg-mbuf.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="21.8589in" height="9.63966in"
+		viewBox="0 0 1573.84 694.055" xml:space="preserve" color-interpolation-filters="sRGB" class="st23">
+	<title>GSO three-part output segment</title>
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st2 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st3 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.42236}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Calibri;font-size:2.08333em;font-weight:bold}
+		.st10 {fill:#ffffff;font-family:Intel Clear;font-size:2.91502em;font-weight:bold}
+		.st11 {fill:#000000;font-family:Intel Clear;font-size:2.19175em}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:6.58146}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.50001em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.99999em}
+		.st15 {fill:#0070c0;font-family:Intel Clear;font-size:2.19175em}
+		.st16 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st17 {fill:#006fc5;font-family:Intel Clear;font-size:1.92874em}
+		.st18 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.5em}
+		.st20 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0658146}
+		.st21 {fill:#000000;font-family:Intel Clear;font-size:1.81915em}
+		.st22 {fill:#000000;font-family:Intel Clear;font-size:1.49785em}
+		.st23 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<v:layer v:name="top" v:index="0"/>
+		<v:layer v:name="middle" v:index="1"/>
+		<g id="shape111-1" v:mID="111" v:groupContext="shape" v:layerMember="0" transform="translate(787.208,-220.973)">
+			<title>Sheet.111</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape110-3" v:mID="110" v:groupContext="shape" v:layerMember="0" transform="translate(685.078,-560.166)">
+			<title>Sheet.110</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape4-5" v:mID="4" v:groupContext="shape" transform="translate(718.715,-469.955)">
+			<title>Sheet.4</title>
+			<path d="M0 655.13 L0 678.22 C0 686.97 11.69 694.06 26.05 694.06 C40.45 694.06 52.11 686.97 52.11 678.22 L52.11 673.91
+						 L59.55 673.91 L44.66 664.86 L29.78 673.91 L37.22 673.91 L37.22 678.22 C37.22 681.98 32.25 685 26.05 685
+						 C19.89 685 14.89 681.98 14.89 678.22 L14.89 655.13 L0 655.13 Z" class="st3"/>
+		</g>
+		<g id="shape5-7" v:mID="5" v:groupContext="shape" transform="translate(547.831,-656.823)">
+			<title>Sheet.5</title>
+			<path d="M11 686.43 L19.43 686.43 L19.43 688.53 L11 688.53 L11 686.43 L11 686.43 ZM25.76 686.43 L27.87 686.43 L27.87
+						 688.53 L25.76 688.53 L25.76 686.43 L25.76 686.43 ZM34.19 686.43 L42.62 686.43 L42.62 688.53 L34.19 688.53
+						 L34.19 686.43 L34.19 686.43 ZM48.95 686.43 L51.05 686.43 L51.05 688.53 L48.95 688.53 L48.95 686.43 L48.95
+						 686.43 ZM57.38 686.43 L65.81 686.43 L65.81 688.53 L57.38 688.53 L57.38 686.43 L57.38 686.43 ZM72.14 686.43
+						 L74.24 686.43 L74.24 688.53 L72.14 688.53 L72.14 686.43 L72.14 686.43 ZM80.57 686.43 L89 686.43 L89 688.53
+						 L80.57 688.53 L80.57 686.43 L80.57 686.43 ZM95.32 686.43 L97.43 686.43 L97.43 688.53 L95.32 688.53 L95.32
+						 686.43 L95.32 686.43 ZM103.76 686.43 L112.19 686.43 L112.19 688.53 L103.76 688.53 L103.76 686.43 L103.76
+						 686.43 ZM118.51 686.43 L120.62 686.43 L120.62 688.53 L118.51 688.53 L118.51 686.43 L118.51 686.43 ZM126.94
+						 686.43 L135.38 686.43 L135.38 688.53 L126.94 688.53 L126.94 686.43 L126.94 686.43 ZM141.7 686.43 L143.81
+						 686.43 L143.81 688.53 L141.7 688.53 L141.7 686.43 L141.7 686.43 ZM150.13 686.43 L158.57 686.43 L158.57 688.53
+						 L150.13 688.53 L150.13 686.43 L150.13 686.43 ZM164.89 686.43 L167 686.43 L167 688.53 L164.89 688.53 L164.89
+						 686.43 L164.89 686.43 ZM173.32 686.43 L181.75 686.43 L181.75 688.53 L173.32 688.53 L173.32 686.43 L173.32
+						 686.43 ZM188.08 686.43 L190.19 686.43 L190.19 688.53 L188.08 688.53 L188.08 686.43 L188.08 686.43 ZM196.51
+						 686.43 L204.94 686.43 L204.94 688.53 L196.51 688.53 L196.51 686.43 L196.51 686.43 ZM211.27 686.43 L213.38
+						 686.43 L213.38 688.53 L211.27 688.53 L211.27 686.43 L211.27 686.43 ZM219.7 686.43 L228.13 686.43 L228.13
+						 688.53 L219.7 688.53 L219.7 686.43 L219.7 686.43 ZM234.46 686.43 L236.56 686.43 L236.56 688.53 L234.46 688.53
+						 L234.46 686.43 L234.46 686.43 ZM242.89 686.43 L251.32 686.43 L251.32 688.53 L242.89 688.53 L242.89 686.43
+						 L242.89 686.43 ZM257.64 686.43 L259.75 686.43 L259.75 688.53 L257.64 688.53 L257.64 686.43 L257.64 686.43
+						 ZM266.08 686.43 L274.51 686.43 L274.51 688.53 L266.08 688.53 L266.08 686.43 L266.08 686.43 ZM280.83 686.43
+						 L282.94 686.43 L282.94 688.53 L280.83 688.53 L280.83 686.43 L280.83 686.43 ZM289.27 686.43 L297.7 686.43
+						 L297.7 688.53 L289.27 688.53 L289.27 686.43 L289.27 686.43 ZM304.02 686.43 L306.13 686.43 L306.13 688.53
+						 L304.02 688.53 L304.02 686.43 L304.02 686.43 ZM312.45 686.43 L320.89 686.43 L320.89 688.53 L312.45 688.53
+						 L312.45 686.43 L312.45 686.43 ZM327.21 686.43 L329.32 686.43 L329.32 688.53 L327.21 688.53 L327.21 686.43
+						 L327.21 686.43 ZM335.64 686.43 L344.08 686.43 L344.08 688.53 L335.64 688.53 L335.64 686.43 L335.64 686.43
+						 ZM350.4 686.43 L352.51 686.43 L352.51 688.53 L350.4 688.53 L350.4 686.43 L350.4 686.43 ZM358.83 686.43 L367.26
+						 686.43 L367.26 688.53 L358.83 688.53 L358.83 686.43 L358.83 686.43 ZM373.59 686.43 L375.7 686.43 L375.7
+						 688.53 L373.59 688.53 L373.59 686.43 L373.59 686.43 ZM382.02 686.43 L387.06 686.43 L387.06 688.53 L382.02
+						 688.53 L382.02 686.43 L382.02 686.43 ZM13.18 694.06 L0 687.48 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM384.89
+						 680.9 L398.06 687.48 L384.89 694.06 L384.89 680.9 L384.89 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape6-9" v:mID="6" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.6</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape7-11" v:mID="7" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.7</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape10-14" v:mID="10" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.10</title>
+			<path d="M0 597.01 L0 694.06 L563.73 694.06 L563.73 597.01 L0 597.01 L0 597.01 Z" class="st6"/>
+		</g>
+		<g id="shape11-16" v:mID="11" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.11</title>
+			<path d="M0 597.01 L563.73 597.01 L563.73 694.06 L0 694.06 L0 597.01" class="st7"/>
+		</g>
+		<g id="shape12-19" v:mID="12" v:groupContext="shape" transform="translate(262.039,-549.269)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-23" v:mID="13" v:groupContext="shape" transform="translate(547.615,-549.269)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="87.5716" cy="673.065" width="175.15" height="41.9798"/>
+			<path d="M175.14 652.08 L0 652.08 L0 694.06 L175.14 694.06 L175.14 652.08" class="st8"/>
+			<text x="37" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape15-27" v:mID="15" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.15</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape16-29" v:mID="16" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.16</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape17-32" v:mID="17" v:groupContext="shape" transform="translate(6.52106,-546.331)">
+			<title>Sheet.17</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="34.98" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape23-36" v:mID="23" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.23</title>
+			<path d="M0 597.27 L0 694.06 L345.2 694.06 L345.2 597.27 L0 597.27 L0 597.27 Z" class="st4"/>
+		</g>
+		<g id="shape24-38" v:mID="24" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.24</title>
+			<path d="M0 597.27 L345.2 597.27 L345.2 694.06 L0 694.06 L0 597.27" class="st5"/>
+		</g>
+		<g id="shape25-41" v:mID="25" v:groupContext="shape" transform="translate(399.834,-25.6887)">
+			<title>Sheet.25</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="13.76" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape31-45" v:mID="31" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.31</title>
+			<path d="M0 597.27 L0 694.06 L516.21 694.06 L516.21 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape32-47" v:mID="32" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.32</title>
+			<path d="M0 597.27 L516.21 597.27 L516.21 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape33-50" v:mID="33" v:groupContext="shape" transform="translate(809.035,-25.6889)">
+			<title>Sheet.33</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="16.99" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape35-54" v:mID="35" v:groupContext="shape" transform="translate(1199.29,-21.1708)">
+			<title>Sheet.35</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="164.662" cy="678.273" width="329.33" height="31.5648"/>
+			<path d="M329.32 662.49 L0 662.49 L0 694.06 L329.32 694.06 L329.32 662.49" class="st8"/>
+			<text x="24.69" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape38-58" v:mID="38" v:groupContext="shape" transform="translate(1204.65,-254.446)">
+			<title>Sheet.38</title>
+			<desc>Three-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="19.51" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Three-part output segment</text>		</g>
+		<g id="shape39-62" v:mID="39" v:groupContext="shape" transform="translate(546.25,-529.921)">
+			<title>Sheet.39</title>
+			<path d="M3.43 529.94 L3.46 543.61 L0.03 543.61 L0 529.94 L3.43 529.94 L3.43 529.94 ZM3.46 553.87 L3.46 557.29 L0.03
+						 557.29 L0.03 553.87 L3.46 553.87 L3.46 553.87 ZM3.46 567.55 L3.46 581.22 L0.03 581.22 L0.03 567.55 L3.46
+						 567.55 L3.46 567.55 ZM3.46 591.48 L3.46 594.9 L0.03 594.9 L0.03 591.48 L3.46 591.48 L3.46 591.48 ZM3.46
+						 605.16 L3.46 618.83 L0.03 618.83 L0.03 605.16 L3.46 605.16 L3.46 605.16 ZM3.46 629.09 L3.46 632.51 L0.03
+						 632.51 L0.03 629.09 L3.46 629.09 L3.46 629.09 ZM3.46 642.77 L3.46 656.45 L0.03 656.45 L0.03 642.77 L3.46
+						 642.77 L3.46 642.77 ZM3.46 666.7 L3.46 670.12 L0.03 670.12 L0.03 666.7 L3.46 666.7 L3.46 666.7 ZM3.46 680.38
+						 L3.46 694.06 L0.03 694.06 L0.03 680.38 L3.46 680.38 L3.46 680.38 Z" class="st1"/>
+		</g>
+		<g id="shape40-64" v:mID="40" v:groupContext="shape" transform="translate(549.097,-223.749)">
+			<title>Sheet.40</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape46-66" v:mID="46" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.46</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st4"/>
+		</g>
+		<g id="shape47-68" v:mID="47" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.47</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape48-70" v:mID="48" v:groupContext="shape" transform="translate(113.27,-263.667)">
+			<title>Sheet.48</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="98.5041" cy="675.59" width="197.01" height="36.9302"/>
+			<path d="M197.01 657.13 L0 657.13 L0 694.06 L197.01 694.06 L197.01 657.13" class="st8"/>
+			<text x="18.66" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape51-74" v:mID="51" v:groupContext="shape" transform="translate(85.817,-233.439)">
+			<title>Sheet.51</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="127.916" cy="678.273" width="255.84" height="31.5648"/>
+			<path d="M255.83 662.49 L0 662.49 L0 694.06 L255.83 694.06 L255.83 662.49" class="st8"/>
+			<text x="34.33" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape53-78" v:mID="53" v:groupContext="shape" transform="translate(371.944,-275.998)">
+			<title>Sheet.53</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape54-82" v:mID="54" v:groupContext="shape" transform="translate(695.132,-646.04)">
+			<title>Sheet.54</title>
+			<path d="M0 655.39 L0 694.06 L100.4 694.06 L100.4 655.39 L0 655.39 L0 655.39 Z" class="st16"/>
+		</g>
+		<g id="shape55-84" v:mID="55" v:groupContext="shape" transform="translate(709.033,-648.946)">
+			<title>Sheet.55</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="36.6265" cy="680.167" width="73.26" height="27.7775"/>
+			<path d="M73.25 666.28 L0 666.28 L0 694.06 L73.25 694.06 L73.25 666.28" class="st8"/>
+			<text x="7.6" y="687.11" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape56-88" v:mID="56" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.56</title>
+			<path d="M0 597.27 L0 694.06 L363.41 694.06 L363.41 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape57-90" v:mID="57" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.57</title>
+			<path d="M0 597.27 L363.41 597.27 L363.41 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape58-93" v:mID="58" v:groupContext="shape" v:layerMember="0" transform="translate(943.158,-529.889)">
+			<title>Sheet.58</title>
+			<path d="M1.35 529.91 L1.25 543.58 L4.68 543.61 L4.78 529.94 L1.35 529.91 L1.35 529.91 ZM1.15 553.84 L1.12 557.26 L4.55
+						 557.29 L4.58 553.87 L1.15 553.84 L1.15 553.84 ZM1.05 567.52 L0.92 581.19 L4.35 581.22 L4.48 567.55 L1.05
+						 567.52 L1.05 567.52 ZM0.86 591.45 L0.82 594.87 L4.25 594.9 L4.28 591.48 L0.86 591.45 L0.86 591.45 ZM0.72
+						 605.13 L0.63 618.8 L4.05 618.83 L4.15 605.16 L0.72 605.13 L0.72 605.13 ZM0.53 629.06 L0.53 632.48 L3.95
+						 632.51 L3.95 629.09 L0.53 629.06 L0.53 629.06 ZM0.43 642.74 L0.33 656.41 L3.75 656.45 L3.85 642.77 L0.43
+						 642.74 L0.43 642.74 ZM0.23 666.67 L0.2 670.09 L3.62 670.12 L3.66 666.7 L0.23 666.67 L0.23 666.67 ZM0.13
+						 680.35 L0 694.02 L3.43 694.06 L3.56 680.38 L0.13 680.35 L0.13 680.35 Z" class="st18"/>
+		</g>
+		<g id="shape59-95" v:mID="59" v:groupContext="shape" transform="translate(785.874,-549.473)">
+			<title>Sheet.59</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="77.3395" cy="673.065" width="154.68" height="41.9798"/>
+			<path d="M154.68 652.08 L0 652.08 L0 694.06 L154.68 694.06 L154.68 652.08" class="st8"/>
+			<text x="26.77" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape60-99" v:mID="60" v:groupContext="shape" transform="translate(952.97,-548.822)">
+			<title>Sheet.60</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape63-103" v:mID="63" v:groupContext="shape" transform="translate(1210.43,-551.684)">
+			<title>Sheet.63</title>
+			<desc>Multi-segment input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="17.75" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Multi-segment input packet</text>		</g>
+		<g id="shape70-107" v:mID="70" v:groupContext="shape" v:layerMember="1" transform="translate(455.049,-221.499)">
+			<title>Sheet.70</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape71-109" v:mID="71" v:groupContext="shape" transform="translate(455.049,-221.499)">
+			<title>Sheet.71</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape72-111" v:mID="72" v:groupContext="shape" transform="translate(489.065,-263.434)">
+			<title>Sheet.72</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape75-115" v:mID="75" v:groupContext="shape" transform="translate(849.065,-281.435)">
+			<title>Sheet.75</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="4.49" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape77-119" v:mID="77" v:groupContext="shape" transform="translate(717.742,-563.523)">
+			<title>Sheet.77</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="15.71" y="683.67" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape78-123" v:mID="78" v:groupContext="shape" transform="translate(1148.17,-529.067)">
+			<title>Sheet.78</title>
+			<path d="M1.38 529.87 L1.25 543.55 L4.68 543.61 L4.81 529.94 L1.38 529.87 L1.38 529.87 ZM1.19 553.81 L1.12 557.23 L4.55
+						 557.29 L4.61 553.87 L1.19 553.81 L1.19 553.81 ZM1.05 567.48 L0.92 581.16 L4.35 581.22 L4.48 567.55 L1.05
+						 567.48 L1.05 567.48 ZM0.86 591.42 L0.86 594.84 L4.28 594.9 L4.28 591.48 L0.86 591.42 L0.86 591.42 ZM0.72
+						 605.09 L0.66 618.77 L4.08 618.83 L4.15 605.16 L0.72 605.09 L0.72 605.09 ZM0.53 629.03 L0.53 632.45 L3.95
+						 632.51 L3.95 629.09 L0.53 629.03 L0.53 629.03 ZM0.46 642.7 L0.33 656.38 L3.75 656.45 L3.89 642.77 L0.46
+						 642.7 L0.46 642.7 ZM0.26 666.64 L0.2 670.06 L3.62 670.12 L3.69 666.7 L0.26 666.64 L0.26 666.64 ZM0.13 680.31
+						 L0 693.99 L3.43 694.06 L3.56 680.38 L0.13 680.31 L0.13 680.31 Z" class="st20"/>
+		</g>
+		<g id="shape79-125" v:mID="79" v:groupContext="shape" transform="translate(946.254,-657.81)">
+			<title>Sheet.79</title>
+			<path d="M11 686.69 L17.33 686.69 L17.33 688.27 L11 688.27 L11 686.69 L11 686.69 ZM22.07 686.69 L23.65 686.69 L23.65
+						 688.27 L22.07 688.27 L22.07 686.69 L22.07 686.69 ZM28.39 686.69 L34.72 686.69 L34.72 688.27 L28.39 688.27
+						 L28.39 686.69 L28.39 686.69 ZM39.46 686.69 L41.04 686.69 L41.04 688.27 L39.46 688.27 L39.46 686.69 L39.46
+						 686.69 ZM45.78 686.69 L52.11 686.69 L52.11 688.27 L45.78 688.27 L45.78 686.69 L45.78 686.69 ZM56.85 686.69
+						 L58.43 686.69 L58.43 688.27 L56.85 688.27 L56.85 686.69 L56.85 686.69 ZM63.18 686.69 L69.5 686.69 L69.5
+						 688.27 L63.18 688.27 L63.18 686.69 L63.18 686.69 ZM74.24 686.69 L75.82 686.69 L75.82 688.27 L74.24 688.27
+						 L74.24 686.69 L74.24 686.69 ZM80.57 686.69 L86.89 686.69 L86.89 688.27 L80.57 688.27 L80.57 686.69 L80.57
+						 686.69 ZM91.63 686.69 L93.22 686.69 L93.22 688.27 L91.63 688.27 L91.63 686.69 L91.63 686.69 ZM97.96 686.69
+						 L104.28 686.69 L104.28 688.27 L97.96 688.27 L97.96 686.69 L97.96 686.69 ZM109.03 686.69 L110.61 686.69 L110.61
+						 688.27 L109.03 688.27 L109.03 686.69 L109.03 686.69 ZM115.35 686.69 L121.67 686.69 L121.67 688.27 L115.35
+						 688.27 L115.35 686.69 L115.35 686.69 ZM126.42 686.69 L128 686.69 L128 688.27 L126.42 688.27 L126.42 686.69
+						 L126.42 686.69 ZM132.74 686.69 L139.07 686.69 L139.07 688.27 L132.74 688.27 L132.74 686.69 L132.74 686.69
+						 ZM143.81 686.69 L145.39 686.69 L145.39 688.27 L143.81 688.27 L143.81 686.69 L143.81 686.69 ZM150.13 686.69
+						 L156.46 686.69 L156.46 688.27 L150.13 688.27 L150.13 686.69 L150.13 686.69 ZM161.2 686.69 L162.78 686.69
+						 L162.78 688.27 L161.2 688.27 L161.2 686.69 L161.2 686.69 ZM167.53 686.69 L173.85 686.69 L173.85 688.27 L167.53
+						 688.27 L167.53 686.69 L167.53 686.69 ZM178.59 686.69 L180.17 686.69 L180.17 688.27 L178.59 688.27 L178.59
+						 686.69 L178.59 686.69 ZM184.92 686.69 L189.4 686.69 L189.4 688.27 L184.92 688.27 L184.92 686.69 L184.92
+						 686.69 ZM13.18 694.06 L0 687.41 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM187.22 680.9 L200.4 687.48 L187.22
+						 694.06 L187.22 680.9 L187.22 680.9 Z" class="st20"/>
+		</g>
+		<g id="shape80-127" v:mID="80" v:groupContext="shape" transform="translate(982.882,-643.673)">
+			<title>Sheet.80</title>
+			<path d="M0 655.13 L0 694.06 L127.01 694.06 L127.01 655.13 L0 655.13 L0 655.13 Z" class="st16"/>
+		</g>
+		<g id="shape81-129" v:mID="81" v:groupContext="shape" transform="translate(1003.39,-660.621)">
+			<title>Sheet.81</title>
+			<desc>pkt_len</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="48.6041" cy="680.956" width="97.21" height="26.1994"/>
+			<path d="M97.21 667.86 L0 667.86 L0 694.06 L97.21 694.06 L97.21 667.86" class="st8"/>
+			<text x="11.67" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>pkt_len  </text>		</g>
+		<g id="shape82-133" v:mID="82" v:groupContext="shape" transform="translate(1001.18,-634.321)">
+			<title>Sheet.82</title>
+			<desc>% segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="49.2945" cy="680.956" width="98.59" height="26.1994"/>
+			<path d="M98.59 667.86 L0 667.86 L0 694.06 L98.59 694.06 L98.59 667.86" class="st8"/>
+			<text x="9.09" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>% segsz</text>		</g>
+		<g id="shape34-137" v:mID="34" v:groupContext="shape" v:layerMember="0" transform="translate(356.703,-264.106)">
+			<title>Sheet.34</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape85-139" v:mID="85" v:groupContext="shape" v:layerMember="0" transform="translate(78.5359,-282.66)">
+			<title>Sheet.85</title>
+			<path d="M0 680.87 C-0 673.59 6.88 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.88 694.06 0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape87-141" v:mID="87" v:groupContext="shape" v:layerMember="0" transform="translate(85.4791,-284.062)">
+			<title>Sheet.87</title>
+			<desc>1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>1</text>		</g>
+		<g id="shape88-145" v:mID="88" v:groupContext="shape" v:layerMember="0" transform="translate(468.906,-282.66)">
+			<title>Sheet.88</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape90-147" v:mID="90" v:groupContext="shape" v:layerMember="0" transform="translate(474.575,-284.062)">
+			<title>Sheet.90</title>
+			<desc>2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>2</text>		</g>
+		<g id="shape95-151" v:mID="95" v:groupContext="shape" v:layerMember="0" transform="translate(764.026,-275.998)">
+			<title>Sheet.95</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape97-155" v:mID="97" v:groupContext="shape" v:layerMember="0" transform="translate(889.755,-220.915)">
+			<title>Sheet.97</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape100-157" v:mID="100" v:groupContext="shape" v:layerMember="0" transform="translate(751.857,-262.528)">
+			<title>Sheet.100</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape104-159" v:mID="104" v:groupContext="shape" v:layerMember="1" transform="translate(851.429,-218.08)">
+			<title>Sheet.104</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape105-161" v:mID="105" v:groupContext="shape" v:layerMember="0" transform="translate(885.444,-260.015)">
+			<title>Sheet.105</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape106-165" v:mID="106" v:groupContext="shape" v:layerMember="0" transform="translate(895.672,-229.419)">
+			<title>Sheet.106</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape107-169" v:mID="107" v:groupContext="shape" v:layerMember="0" transform="translate(863.297,-280.442)">
+			<title>Sheet.107</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape108-171" v:mID="108" v:groupContext="shape" v:layerMember="0" transform="translate(870.001,-281.547)">
+			<title>Sheet.108</title>
+			<desc>3</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>3</text>		</g>
+		<g id="shape109-175" v:mID="109" v:groupContext="shape" v:layerMember="0" transform="translate(500.959,-231.87)">
+			<title>Sheet.109</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 40f04a1..c7c8b17 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -56,6 +56,7 @@ Programmer's Guide
     reorder_lib
     ip_fragment_reassembly_lib
     generic_receive_offload_lib
+    generic_segmentation_offload_lib
     pdump_lib
     multi_proc_support
     kernel_nic_interface
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 157+ messages in thread

* Re: [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (5 preceding siblings ...)
  2017-10-05 11:02             ` [PATCH v7 6/6] doc: add GSO programmer's guide Mark Kavanagh
@ 2017-10-05 13:22             ` Ananyev, Konstantin
  2017-10-05 14:39               ` Kavanagh, Mark B
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
                               ` (6 subsequent siblings)
  13 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-05 13:22 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas

Hi Mark,

> 
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To enable more flexibility to applications, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. This patch adds GSO support to DPDK for specific
> packet types: specifically, TCP/IPv4, VxLAN, and GRE.
> 
> The first patch introduces the GSO API framework. The second patch
> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> tag). The third patch adds GSO support for VxLAN packets that contain
> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
> outer VLAN tags). The fourth patch adds GSO support for GRE packets
> that contain outer IPv4, and inner TCP/IPv4 headers (with optional
> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
> and GRE GSO in testpmd's checksum forwarding engine. The final patch
> in the series adds GSO documentation to the programmer's guide.
> 
> Performance Testing
> ===================
> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> iperf. Setup for the test is described as follows:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>    machine, together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum
>    forwarding engine with "retry".
> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>    checksum calculation for vhost-user port.
> d. Launch a VM with csum and tso offloading enabled.
> e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>    With csum and tso enabled, the VM can send large TCP/IPv4 packets
>    (mss is up to 64KB).
> f. P1 is assigned to the Linux kernel, with kernel GRO enabled. Run
>    iperf-server on P1.
> 
> We conduct three iperf tests:
> 
> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>     to 1518B. Run two iperf-client in the VM.
> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>     two iperf-client in the VM.
> test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.
> 
> Throughput of the above three tests:
> 
> test-1: 9.4Gbps
> test-2: 9.5Gbps
> test-3: 3Mbps
> 
> Functional Testing
> ==================
> Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
> length of tunneled packets from VMs is 1514B. So the current experiment
> method can't be used to measure VxLAN and GRE GSO performance; we simply
> test the functionality by setting a small GSO segment length (e.g. 500B).
> 
> VxLAN
> -----
> To test VxLAN GSO functionality, we use the following setup:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>    machine, together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>    engine with "retry".
> c. Testpmd commands:
>     - csum parse_tunnel on "P0"
>     - csum parse_tunnel on "vhost-user port"
>     - csum set outer-ip hw "P0"
>     - csum set ip hw "P0"
>     - csum set tcp hw "P0"
>     - csum set tcp hw "vhost-user port"
>     - set port "P0" gso on
>     - set gso segsz 500
> d. Launch a VM with csum and tso offloading enabled.
> e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>    max packet length is 1514B.
> f. P1 is assigned to linux kernel and kernel GRO is disabled. Similarly,
>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
> 
> In testpmd, we can see the length of all packets sent from P0 is smaller
> than or equal to 500B. Additionally, the packets arriving in P1 are
> encapsulated and are smaller than or equal to 500B.
> 
> GRE
> ---
> The same process may be used to test GRE functionality, with the exception that
> the tunnel type created for both the guest's virtio-net, and the host's kernel
> interfaces is GRE:
>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
> 
> As in the VxLAN testcase, the length of packets sent from P0, and received on
> P1, is less than 500B.
> 
> Change log
> ==========
> v7:
> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
> - rename 'ipid_flag' member of gso_ctx to 'flag'.
> - remove mention of VLAN tags in supported packet types.
> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
> - take all packet overhead into account when checking for empty packet.
> - ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
>   TCP/IPv4 case from tunneled case).
> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
> - simplify error-checking/handling for GSO failure case in testpmd csum engine.
> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.

Looks ok in general, just a few nits below.
Konstantin

1. There are a few checkpatch errors regarding indentation.
2. [dpdk-dev,v7,2/6] gso: add TCP/IPv4 GSO support

int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,53 @@ 
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if ((gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN) ||
+			(gso_ctx->gso_size >= pkt->pkt_len) ||
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) !=
+			gso_ctx->gso_types) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
The first and third are just checks for gso_ctx misconfiguration.
I think we don't need to reset the PKT_TX_TCP_SEG bit in ol_flags in that case.
I'd suggest either removing them altogether or merging them with the invalid parameters check above:

if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1 ||
			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
 			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
 		return -EINVAL;

if (gso_ctx->gso_size >= pkt->pkt_len) {
   pkt->ol_flags &= (~PKT_TX_TCP_SEG);
   pkts_out[0] = pkt;
   return 1;
}
....
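
To make the split above concrete, here is a minimal standalone sketch. It does not use the DPDK headers: the sketch_* names, the struct layouts and the numeric macro values are all stand-ins assumed for illustration, and the return value 0 ("go on and segment") is a convention of this sketch only, not of rte_gso_segment().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the DPDK definitions; the values are illustrative only. */
#define SKETCH_TX_TCP_SEG      (1ULL << 50) /* role of PKT_TX_TCP_SEG */
#define SKETCH_OFFLOAD_TCP_TSO 0x20u        /* role of DEV_TX_OFFLOAD_TCP_TSO */
#define SKETCH_SEG_SIZE_MIN    64u          /* role of RTE_GSO_SEG_SIZE_MIN */

struct sketch_mbuf {
	uint64_t ol_flags;
	uint32_t pkt_len;
};

struct sketch_gso_ctx {
	uint32_t gso_types;
	uint16_t gso_size;
};

/*
 * Returns -EINVAL for bad arguments or a misconfigured ctx (packet untouched),
 * 1 when the packet is passed through unsegmented, and 0 when segmentation
 * would go ahead (sketch-only convention).
 */
static int
sketch_gso_precheck(struct sketch_mbuf *pkt, const struct sketch_gso_ctx *ctx,
		struct sketch_mbuf **pkts_out, uint16_t nb_pkts_out)
{
	/* Argument and ctx misconfiguration checks: ol_flags is not modified. */
	if (pkt == NULL || pkts_out == NULL || ctx == NULL ||
			nb_pkts_out < 1 ||
			ctx->gso_size < SKETCH_SEG_SIZE_MIN ||
			ctx->gso_types != SKETCH_OFFLOAD_TCP_TSO)
		return -EINVAL;

	/* Packet already fits in one segment: clear the flag, pass it through. */
	if (ctx->gso_size >= pkt->pkt_len) {
		pkt->ol_flags &= ~SKETCH_TX_TCP_SEG;
		pkts_out[0] = pkt;
		return 1;
	}

	return 0;
}

/* Small usage example: a too-small gso_size is rejected, flag left intact. */
static void
sketch_precheck_demo(void)
{
	struct sketch_mbuf m = { .ol_flags = SKETCH_TX_TCP_SEG, .pkt_len = 1000 };
	struct sketch_mbuf *out[1] = { NULL };
	struct sketch_gso_ctx bad = {
		.gso_types = SKETCH_OFFLOAD_TCP_TSO, .gso_size = 10 };

	assert(sketch_gso_precheck(&m, &bad, out, 1) == -EINVAL);
	assert((m.ol_flags & SKETCH_TX_TCP_SEG) != 0);
}
```

The point of the split is that the flag is only cleared on the pass-through path; a misconfigured ctx is reported to the caller without touching the packet.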

3.    [dpdk-dev,v7,3/6] gso: add VxLAN GSO support


lib/librte_gso/rte_gso.c

int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -59,7 +60,8 @@ 
 
 ...
+			(gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
+			                       DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
 			gso_ctx->gso_types) {

I think the check should be just:
(gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) != 0

As we do want to allow a ctx that enables only VxLAN segmentation.

4. [dpdk-dev,v7,4/6] gso: add GRE GSO support
Same comment as above.
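
For comparison, the two forms of the gso_types test discussed in 3. and 4. behave as follows on plain bitmasks. This is only a sketch of the bit arithmetic: the SKETCH_* bit values and the function names are made up for illustration and are not the DEV_TX_OFFLOAD_* values, and which form the library should use is the open question above, not something this snippet decides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; not the real DEV_TX_OFFLOAD_* constants. */
#define SKETCH_TCP_TSO       0x1u
#define SKETCH_VXLAN_TNL_TSO 0x2u
#define SKETCH_OTHER_TSO     0x4u /* some type the library does not handle */
#define SKETCH_SUPPORTED     (SKETCH_TCP_TSO | SKETCH_VXLAN_TNL_TSO)

/* Subset form: every requested bit must be a supported one. */
static bool
sketch_types_subset(uint32_t gso_types)
{
	return (gso_types & SKETCH_SUPPORTED) == gso_types;
}

/* Any-bit form: at least one supported bit must be requested. */
static bool
sketch_types_any(uint32_t gso_types)
{
	return (gso_types & SKETCH_SUPPORTED) != 0;
}

/* Usage example: a VxLAN-only ctx is accepted by both forms. */
static void
sketch_types_demo(void)
{
	assert(sketch_types_subset(SKETCH_VXLAN_TNL_TSO));
	assert(sketch_types_any(SKETCH_VXLAN_TNL_TSO));
}
```

The two forms differ at the edges: an empty gso_types passes the subset form but fails the any-bit form, while a mask mixing a supported and an unsupported type fails the subset form but passes the any-bit form.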

> 
> v6:
> - rebase to HEAD of master (i5dce9fcA)
> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
> 
> v5:
> - add GSO section to the programmer's guide.
> - use MF or (previously 'and') offset to check if a packet is IP
>   fragmented.
> - move 'update_header' helper functions to gso_common.h.
> - move tcp/ipv4 'update_header' function to gso_tcp4.c.
> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
> - add offset parameter to 'update_header' functions.
> - combine GRE and VxLAN tunnel header update functions into a single
>   function.
> - correct typos and errors in comments/commit messages.
> 
> v4:
> - use ol_flags instead of packet_type to decide which segmentation
>   function to use.
> - use MF and offset to check if a packet is IP fragmented, instead of
>   using DF.
> - remove ETHER_CRC_LEN from gso segment payload length calculation.
> - refactor internal header update and other functions.
> - remove RTE_GSO_IPID_INCREASE.
> - add some GSO documentation.
> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>   packets sent from GSO-enabled ports in testpmd.
> v3:
> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
>   UNKNOWN.
> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
> - store the input packet into pkts_out inside gso_tcp4_segment() and
>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>   is performed.
> - add missing includes.
> - optimize file names, function names and function description.
> - fix one bug in testpmd.
> v2:
> - merge data segments whose data_len is less than mss into a large data
>   segment in gso_do_segment().
> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>   header in rte_gso_segment().
> - provide IP id macros for applications to select fixed or incremental IP
>   ids.
> 
> Jiayu Hu (3):
>   gso: add Generic Segmentation Offload API framework
>   gso: add TCP/IPv4 GSO support
>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> Mark Kavanagh (3):
>   gso: add VxLAN GSO support
>   gso: add GRE GSO support
>   doc: add GSO programmer's guide
> 
>  MAINTAINERS                                        |   6 +
>  app/test-pmd/cmdline.c                             | 179 ++++++++
>  app/test-pmd/config.c                              |  24 ++
>  app/test-pmd/csumonly.c                            |  43 +-
>  app/test-pmd/testpmd.c                             |  13 +
>  app/test-pmd/testpmd.h                             |  10 +
>  config/common_base                                 |   5 +
>  doc/api/doxy-api-index.md                          |   1 +
>  doc/api/doxy-api.conf                              |   1 +
>  .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
>  .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
>  doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
>  doc/guides/prog_guide/index.rst                    |   1 +
>  doc/guides/rel_notes/release_17_11.rst             |  17 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
>  lib/Makefile                                       |   2 +
>  lib/librte_eal/common/include/rte_log.h            |   1 +
>  lib/librte_gso/Makefile                            |  52 +++
>  lib/librte_gso/gso_common.c                        | 153 +++++++
>  lib/librte_gso/gso_common.h                        | 171 ++++++++
>  lib/librte_gso/gso_tcp4.c                          | 104 +++++
>  lib/librte_gso/gso_tcp4.h                          |  74 ++++
>  lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
>  lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
>  lib/librte_gso/rte_gso.c                           | 111 +++++
>  lib/librte_gso/rte_gso.h                           | 148 +++++++
>  lib/librte_gso/rte_gso_version.map                 |   7 +
>  mk/rte.app.mk                                      |   1 +
>  28 files changed, 2413 insertions(+), 4 deletions(-)
>  create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
>  create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
>  create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
>  create mode 100644 lib/librte_gso/Makefile
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tcp4.h
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>  create mode 100644 lib/librte_gso/rte_gso.c
>  create mode 100644 lib/librte_gso/rte_gso.h
>  create mode 100644 lib/librte_gso/rte_gso_version.map
> 
> --
> 1.9.3


* Re: [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 13:22             ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Ananyev, Konstantin
@ 2017-10-05 14:39               ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-05 14:39 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>From: Ananyev, Konstantin
>Sent: Thursday, October 5, 2017 2:23 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
>
>Hi Mark,
>
>>
>> Generic Segmentation Offload (GSO) is a SW technique to split large
>> packets into small ones. Akin to TSO, GSO enables applications to
>> operate on large packets, thus reducing per-packet processing overhead.
>>
>> To enable more flexibility to applications, DPDK GSO is implemented
>> as a standalone library. Applications explicitly use the GSO library
>> to segment packets. This patch adds GSO support to DPDK for specific
>> packet types: specifically, TCP/IPv4, VxLAN, and GRE.
>>
>> The first patch introduces the GSO API framework. The second patch
>> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
>> tag). The third patch adds GSO support for VxLAN packets that contain
>> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
>> outer VLAN tags). The fourth patch adds GSO support for GRE packets
>> that contain outer IPv4, and inner TCP/IPv4 headers (with optional
>> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
>> and GRE GSO in testpmd's checksum forwarding engine. The final patch
>> in the series adds GSO documentation to the programmer's guide.
>>
>> Performance Testing
>> ===================
>> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
>> iperf. Setup for the test is described as follows:
>>
>> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>>    machine, together physically.
>> b. Launch testpmd with P0 and a vhost-user port, and use csum
>>    forwarding engine with "retry".
>> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>>    checksum calculation for vhost-user port.
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>>    With enabling csum and tso, the VM can send large TCP/IPv4 packets
>>    (mss is up to 64KB).
>> f. P1 is assigned to linux kernel and enabled kernel GRO. Run
>>    iperf-server on P1.
>>
>> We conduct three iperf tests:
>>
>> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>>     to 1518B. Run two iperf-client in the VM.
>> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>>     two iperf-client in the VM.
>> test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.
>>
>> Throughput of the above three tests:
>>
>> test-1: 9.4Gbps
>> test-2: 9.5Gbps
>> test-3: 3Mbps
>>
>> Functional Testing
>> ==================
>> Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
>> length of tunneled packets from VMs is 1514B. So current experiment
>> method can't be used to measure VxLAN and GRE GSO performance, but simply
>> test the functionality via setting small GSO segment length (e.g. 500B).
>>
>> VxLAN
>> -----
>> To test VxLAN GSO functionality, we use the following setup:
>>
>> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>>    machine, together physically.
>> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>>    engine with "retry".
>> c. Testpmd commands:
>>     - csum parse_tunnel on "P0"
>>     - csum parse_tunnel on "vhost-user port"
>>     - csum set outer-ip hw "P0"
>>     - csum set ip hw "P0"
>>     - csum set tcp hw "P0"
>>     - csum set tcp hw "vhost-user port"
>>     - set port "P0" gso on
>>     - set gso segsz 500
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
>>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>>    max packet length is 1514B.
>> f. P1 is assigned to linux kernel and kernel GRO is disabled. Similarly,
>>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
>>
>> In testpmd, we can see the length of all packets sent from P0 is smaller
>> than or equal to 500B. Additionally, the packets arriving in P1 is
>> encapsulated and is smaller than or equal to 500B.
>>
>> GRE
>> ---
>> The same process may be used to test GRE functionality, with the exception
>> that the tunnel type created for both the guest's virtio-net, and the host's
>> kernel interfaces is GRE:
>>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
>>
>> As in the VxLAN testcase, the length of packets sent from P0, and received
>> on P1, is less than 500B.
>>
>> Change log
>> ==========
>> v7:
>> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
>> - rename 'ipid_flag' member of gso_ctx to 'flag'.
>> - remove mention of VLAN tags in supported packet types.
>> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
>> - take all packet overhead into account when checking for empty packet.
>> - ensure that only enabled GSO types are enacted upon (i.e. no fall-through
>>   to TCP/IPv4 case from tunneled case).
>> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in
>>   testpmd.
>> - simplify error-checking/handling for GSO failure case in testpmd csum
>>   engine.
>> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.
>
>Looks ok in general, just a few nits below.
>Konstantin

Thanks Konstantin - I'll address those issues and spin up a new patchset right now.
Responses to comments are inline, as usual.

Thanks again,
Mark

>
>1. there are few checkpatch errors regarding indentation.

I saw the issues regarding indentation  - apologies :(

The issue reported with the MAINTAINERS file in patch 6/6 is a bit of a mystery to me though - any ideas?

>2. [dpdk-dev,v7,2/6] gso: add TCP/IPv4 GSO support
>
>int
> rte_gso_segment(struct rte_mbuf *pkt,
>@@ -41,12 +46,53 @@
> 		struct rte_mbuf **pkts_out,
> 		uint16_t nb_pkts_out)
> {
>+	struct rte_mempool *direct_pool, *indirect_pool;
>+	struct rte_mbuf *pkt_seg;
>+	uint64_t ol_flags;
>+	uint16_t gso_size;
>+	uint8_t ipid_delta;
>+	int ret = 1;
>+
> 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
> 			nb_pkts_out < 1)
> 		return -EINVAL;
>
>-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>-	pkts_out[0] = pkt;
>+	if ((gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN) ||
>+			(gso_ctx->gso_size >= pkt->pkt_len) ||
>+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO) !=
>+			gso_ctx->gso_types) {
>+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>+		pkts_out[0] = pkt;
>+		return 1;
>+	}
>First and third are just checks for gso_ctx misconfiguration.
>I think we don't need to reset PKT_TX_TCP_SEG bit in ol_flags in that case.

Yes, you're right - that error appears to have been introduced when I manually resolved a merge conflict during a rebase.

>I'd suggest either remove them at all or merge with the invalid parameters
>check above:
>
>if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
> 			nb_pkts_out < 1 ||
>			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
> 			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
> 		return -EINVAL;
>
>if (gso_ctx->gso_size >= pkt->pkt_len) {
>   pkt->ol_flags &= (~PKT_TX_TCP_SEG);
>   pkts_out[0] = pkt;
>   return 1;
>}
>....

Agreed.

>
>3.    [dpdk-dev,v7,3/6] gso: add VxLAN GSO support
>
>
>lib/librte_gso/rte_gso.c
>
>int
> rte_gso_segment(struct rte_mbuf *pkt,
>@@ -59,7 +60,8 @@
>
> ...
>+			(gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
>+			                       DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) !=
> 			gso_ctx->gso_types) {
>
>I think the check should be just:
>(gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO))
>!= 0
>
>As we do want to allow only VXLAN segmentation for ctx.

Ah yes, good catch.

In that case though, wouldn't the statement be:

        if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
                        nb_pkts_out < 1 ||
                        gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
                        (gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
                        DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) == 0)
                return -EINVAL;
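
A note on parenthesisation for a mask test like this one: in C, `==` binds
more tightly than `&`, so the mask has to be closed off before the comparison.
The following is a minimal standalone sketch, not code from the patch. The
MOCK_* flag values and both function names are placeholders chosen for
illustration; the real DEV_TX_OFFLOAD_* constants come from rte_ethdev.h.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; these are NOT the real
 * DEV_TX_OFFLOAD_* constants. */
#define MOCK_TCP_TSO        (UINT64_C(1) << 0)
#define MOCK_VXLAN_TNL_TSO  (UINT64_C(1) << 1)
#define MOCK_SUPPORTED      (MOCK_TCP_TSO | MOCK_VXLAN_TNL_TSO)

/*
 * How the compiler parses "types & MASK == 0": the comparison is
 * evaluated first, so the expression becomes "types & 0" and the
 * check can never fire, whatever 'types' holds.
 */
static int no_supported_type_misparsed(uint64_t types)
{
	return (types & (MOCK_SUPPORTED == 0)) != 0;
}

/* Intended test: true only when none of the supported bits is set. */
static int no_supported_type(uint64_t types)
{
	return (types & MOCK_SUPPORTED) == 0;
}
```

With these definitions, `no_supported_type(0)` is true and
`no_supported_type(MOCK_VXLAN_TNL_TSO)` is false, so a context that enables
only the tunnel type passes validation. `no_supported_type_misparsed()`
returns false for every input, so it would never reject an empty gso_types.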
>
>4. [dpdk-dev,v7,4/6] gso: add GRE GSO support
>Same comment as above.
>
>>
>> v6:
>> - rebase to HEAD of master (i5dce9fcA)
>> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
>>
>> v5:
>> - add GSO section to the programmer's guide.
>> - use MF or (previously 'and') offset to check if a packet is IP
>>   fragmented.
>> - move 'update_header' helper functions to gso_common.h.
>> - move tcp/ipv4 'update_header' function to gso_tcp4.c.
>> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
>> - add offset parameter to 'update_header' functions.
>> - combine GRE and VxLAN tunnel header update functions into a single
>>   function.
>> - correct typos and errors in comments/commit messages.
>>
>> v4:
>> - use ol_flags instead of packet_type to decide which segmentation
>>   function to use.
>> - use MF and offset to check if a packet is IP fragmented, instead of
>>   using DF.
>> - remove ETHER_CRC_LEN from gso segment payload length calculation.
>> - refactor internal header update and other functions.
>> - remove RTE_GSO_IPID_INCREASE.
>> - add some of GSO documents.
>> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>>   packets sent from GSO-enabled ports in testpmd.
>> v3:
>> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
>>   UNKNOWN.
>> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
>> - store the input packet into pkts_out inside gso_tcp4_segment() and
>>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>>   is performed.
>> - add missing includes.
>> - optimize file names, function names and function description.
>> - fix one bug in testpmd.
>> v2:
>> - merge data segments whose data_len is less than mss into a large data
>>   segment in gso_do_segment().
>> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>>   header in rte_gso_segment().
>> - provide IP id macros for applications to select fixed or incremental IP
>>   ids.
>>
>> Jiayu Hu (3):
>>   gso: add Generic Segmentation Offload API framework
>>   gso: add TCP/IPv4 GSO support
>>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>>
>> Mark Kavanagh (3):
>>   gso: add VxLAN GSO support
>>   gso: add GRE GSO support
>>   doc: add GSO programmer's guide
>>
>>  MAINTAINERS                                        |   6 +
>>  app/test-pmd/cmdline.c                             | 179 ++++++++
>>  app/test-pmd/config.c                              |  24 ++
>>  app/test-pmd/csumonly.c                            |  43 +-
>>  app/test-pmd/testpmd.c                             |  13 +
>>  app/test-pmd/testpmd.h                             |  10 +
>>  config/common_base                                 |   5 +
>>  doc/api/doxy-api-index.md                          |   1 +
>>  doc/api/doxy-api.conf                              |   1 +
>>  .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
>>  .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
>>  doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
>>  doc/guides/prog_guide/index.rst                    |   1 +
>>  doc/guides/rel_notes/release_17_11.rst             |  17 +
>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
>>  lib/Makefile                                       |   2 +
>>  lib/librte_eal/common/include/rte_log.h            |   1 +
>>  lib/librte_gso/Makefile                            |  52 +++
>>  lib/librte_gso/gso_common.c                        | 153 +++++++
>>  lib/librte_gso/gso_common.h                        | 171 ++++++++
>>  lib/librte_gso/gso_tcp4.c                          | 104 +++++
>>  lib/librte_gso/gso_tcp4.h                          |  74 ++++
>>  lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
>>  lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
>>  lib/librte_gso/rte_gso.c                           | 111 +++++
>>  lib/librte_gso/rte_gso.h                           | 148 +++++++
>>  lib/librte_gso/rte_gso_version.map                 |   7 +
>>  mk/rte.app.mk                                      |   1 +
>>  28 files changed, 2413 insertions(+), 4 deletions(-)
>>  create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
>>  create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
>>  create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
>>  create mode 100644 lib/librte_gso/Makefile
>>  create mode 100644 lib/librte_gso/gso_common.c
>>  create mode 100644 lib/librte_gso/gso_common.h
>>  create mode 100644 lib/librte_gso/gso_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tcp4.h
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>>  create mode 100644 lib/librte_gso/rte_gso.c
>>  create mode 100644 lib/librte_gso/rte_gso.h
>>  create mode 100644 lib/librte_gso/rte_gso_version.map
>>
>> --
>> 1.9.3


* [PATCH v8 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (6 preceding siblings ...)
  2017-10-05 13:22             ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Ananyev, Konstantin
@ 2017-10-05 15:43             ` Mark Kavanagh
  2017-10-05 17:12               ` Ananyev, Konstantin
                                 ` (7 more replies)
  2017-10-05 15:43             ` [PATCH v8 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
                               ` (5 subsequent siblings)
  13 siblings, 8 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 15:43 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch set adds GSO support to DPDK for specific
packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine. The final patch
in the series adds GSO documentation to the programmer's guide.

Performance Testing
===================
The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. Assign P1 to the Linux kernel, with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
    to 1518B. Run two iperf-client in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
    two iperf-client in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.

Throughput of the above three tests:

test-1: 9.4Gbps
test-2: 9.5Gbps
test-3: 3Mbps

Functional Testing
==================
VMs can send large TCP packets, but not large VxLAN or GRE packets. The max
length of tunneled packets from VMs is 1514B. The current experiment
method therefore can't be used to measure VxLAN and GRE GSO performance;
instead, we test functionality by setting a small GSO segment length
(e.g. 500B).

VxLAN
-----
To test VxLAN GSO functionality, we use the following setup:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
   engine with "retry".
c. Testpmd commands:
    - csum parse_tunnel on "P0"
    - csum parse_tunnel on "vhost-user port"
    - csum set outer-ip hw "P0"
    - csum set ip hw "P0"
    - csum set tcp hw "P0"
    - csum set tcp hw "vhost-user port"
    - set port "P0" gso on
    - set gso segsz 500
d. Launch a VM with csum and tso offloading enabled.
e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
   on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
   max packet length is 1514B.
f. P1 is assigned to the Linux kernel, with kernel GRO disabled. Similarly,
   create a VxLAN port for P1, and run iperf-server on the VxLAN port.

In testpmd, we can see that the length of every packet sent from P0 is
smaller than or equal to 500B. Additionally, the packets arriving at P1
are encapsulated and are smaller than or equal to 500B.
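
To make the expected output sizes concrete, the arithmetic can be sketched as
below. This is a hypothetical helper, not part of librte_gso. It assumes that
the segment size bounds the total length of each output packet (headers
included), that every output packet carries a full copy of the headers, and
that the VxLAN header stack is 104 bytes (outer Ethernet 14 + IPv4 20 + UDP 8
+ VxLAN 8, inner Ethernet 14 + IPv4 20 + TCP 20, with no VLAN tags or
IP/TCP options).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a librte_gso structure. */
struct seg_plan {
	uint32_t nb_segs;  /* number of output packets */
	uint32_t last_len; /* total length of the final output packet */
};

/*
 * Work out how many output packets a packet of pkt_len bytes yields when
 * each output packet may be at most segsz bytes long, of which hdr_len
 * bytes are a copy of the headers.
 */
static struct seg_plan plan_segments(uint32_t pkt_len, uint32_t hdr_len,
		uint32_t segsz)
{
	struct seg_plan plan;
	uint32_t payload, per_seg, rem;

	/* Each output packet must have room for at least one payload byte. */
	assert(hdr_len < segsz && hdr_len < pkt_len);

	payload = pkt_len - hdr_len; /* bytes to distribute */
	per_seg = segsz - hdr_len;   /* payload bytes per full output packet */

	plan.nb_segs = (payload + per_seg - 1) / per_seg; /* round up */
	rem = payload % per_seg;
	plan.last_len = hdr_len + (rem != 0 ? rem : per_seg);
	return plan;
}
```

Under those assumptions, `plan_segments(1514, 104, 500)` gives four output
packets: three of 500B and a final one of 326B, all within the 500B limit
observed on P0 and P1.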

GRE
---
The same process may be used to test GRE functionality, with the exception that
the tunnel type created for both the guest's virtio-net, and the host's kernel
interfaces is GRE:
   `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`

As in the VxLAN testcase, the length of packets sent from P0, and received on
P1, is smaller than or equal to 500B.

Change log
==========
v8:
- resolve coding style infractions (indentation).
- centralize invalid parameter checking for rte_gso_segment() into a single
  'if' statement.
- don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
  on account of invalid params.
- allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
  statement condition).

v7:
- add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
- rename 'ipid_flag' member of gso_ctx to 'flag'.
- remove mention of VLAN tags in supported packet types.
- don't clear PKT_TX_TCP_SEG flag if GSO fails.
- take all packet overhead into account when checking for empty packet.
- ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
  TCP/IPv4 case from tunneled case).
- validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
- simplify error-checking/handling for GSO failure case in testpmd csum engine.
- use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.

v6:
- rebase to HEAD of master (i5dce9fcA)
- remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'

v5:
- add GSO section to the programmer's guide.
- use MF or (previously 'and') offset to check if a packet is IP
  fragmented.
- move 'update_header' helper functions to gso_common.h.
- move tcp/ipv4 'update_header' function to gso_tcp4.c.
- move tunnel 'update_header' function to gso_tunnel_tcp4.c.
- add offset parameter to 'update_header' functions.
- combine GRE and VxLAN tunnel header update functions into a single
  function.
- correct typos and errors in comments/commit messages.

v4:
- use ol_flags instead of packet_type to decide which segmentation
  function to use.
- use MF and offset to check if a packet is IP fragmented, instead of
  using DF.
- remove ETHER_CRC_LEN from gso segment payload length calculation.
- refactor internal header update and other functions.
- remove RTE_GSO_IPID_INCREASE.
- add some of GSO documents.
- set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
  packets sent from GSO-enabled ports in testpmd.
v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
  UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (3):
  gso: add VxLAN GSO support
  gso: add GRE GSO support
  doc: add GSO programmer's guide

 MAINTAINERS                                        |   6 +
 app/test-pmd/cmdline.c                             | 179 ++++++++
 app/test-pmd/config.c                              |  24 ++
 app/test-pmd/csumonly.c                            |  42 +-
 app/test-pmd/testpmd.c                             |  13 +
 app/test-pmd/testpmd.h                             |  10 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/rel_notes/release_17_11.rst             |  17 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
 lib/Makefile                                       |   2 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_gso/Makefile                            |  52 +++
 lib/librte_gso/gso_common.c                        | 153 +++++++
 lib/librte_gso/gso_common.h                        | 171 ++++++++
 lib/librte_gso/gso_tcp4.c                          | 104 +++++
 lib/librte_gso/gso_tcp4.h                          |  74 ++++
 lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
 lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
 lib/librte_gso/rte_gso.c                           | 110 +++++
 lib/librte_gso/rte_gso.h                           | 148 +++++++
 lib/librte_gso/rte_gso_version.map                 |   7 +
 mk/rte.app.mk                                      |   1 +
 28 files changed, 2411 insertions(+), 4 deletions(-)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
1.9.3


* [PATCH v8 1/6] gso: add Generic Segmentation Offload API framework
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (7 preceding siblings ...)
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
@ 2017-10-05 15:43             ` Mark Kavanagh
  2017-10-05 15:44             ` [PATCH v8 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
                               ` (4 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 15:43 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set the proper flags in mbuf->ol_flags, where the flags are the same
as those used for TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmitting multi-segment packets.
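
The reference-counting behaviour described above can be modelled with a toy
buffer type. This is only a sketch: struct toy_buf and the toy_* functions are
hypothetical stand-ins for struct rte_mbuf and the mbuf API, and each GSO
segment is reduced to its indirect part (the direct MBUF holding the header
copy is omitted).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for struct rte_mbuf; all names here are hypothetical. */
struct toy_buf {
	uint16_t refcnt;
	struct toy_buf *attached_to; /* set for indirect buffers only */
	int *freed_flag;             /* optional: set to 1 on release */
};

static struct toy_buf *toy_alloc(void)
{
	struct toy_buf *b = calloc(1, sizeof(*b));

	assert(b != NULL);
	b->refcnt = 1;
	return b;
}

/* Drop one reference; release the buffer when the count reaches zero. */
static void toy_free(struct toy_buf *b)
{
	assert(b->refcnt > 0);
	if (--b->refcnt > 0)
		return;
	if (b->attached_to != NULL)
		toy_free(b->attached_to); /* give back the borrowed reference */
	if (b->freed_flag != NULL)
		*b->freed_flag = 1;
	free(b);
}

/* An indirect buffer takes one reference on the buffer it points into. */
static struct toy_buf *toy_attach(struct toy_buf *direct)
{
	struct toy_buf *ind = toy_alloc();

	ind->attached_to = direct;
	direct->refcnt++;
	return ind;
}

/*
 * Model of segmentation: create nb_segs indirect views of the input,
 * then drop the caller's own reference to the input.
 */
static void toy_segment(struct toy_buf *in, struct toy_buf **out,
		unsigned int nb_segs)
{
	unsigned int i;

	for (i = 0; i < nb_segs; i++)
		out[i] = toy_attach(in);
	toy_free(in); /* the "reduces the refcnt by 1" step */
}
```

In this model, after toy_segment() produces three segments the input's refcnt
is 3, and the input is released only when the last of the three segments is
freed. The caller therefore frees the segments and never the input itself.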

The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                     |   5 ++
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf                  |   1 +
 doc/guides/rel_notes/release_17_11.rst |   1 +
 lib/Makefile                           |   2 +
 lib/librte_gso/Makefile                |  49 +++++++++++
 lib/librte_gso/rte_gso.c               |  52 ++++++++++++
 lib/librte_gso/rte_gso.h               | 143 +++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map     |   7 ++
 mk/rte.app.mk                          |   1 +
 10 files changed, 262 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 12f6be9..58ca5c0 100644
--- a/config/common_base
+++ b/config/common_base
@@ -653,6 +653,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..6512918 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -101,6 +101,7 @@ The public API headers are grouped by topics:
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
   [GRO]                (@ref rte_gro.h),
+  [GSO]                (@ref rte_gso.h),
   [frag/reass]         (@ref rte_ip_frag.h),
   [LPM IPv4 route]     (@ref rte_lpm.h),
   [LPM IPv6 route]     (@ref rte_lpm6.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..408f2e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_ether \
                           lib/librte_eventdev \
                           lib/librte_gro \
+                          lib/librte_gso \
                           lib/librte_hash \
                           lib/librte_ip_frag \
                           lib/librte_jobstats \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index f6f9169..5bb36b7 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_ethdev.so.7
      librte_eventdev.so.2
      librte_gro.so.1
+   + librte_gso.so.1
      librte_hash.so.2
      librte_ip_frag.so.1
      librte_jobstats.so.1
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b773636
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
+			nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..7d343d7
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,143 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO flags for rte_gso_ctx. */
+#define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
+/**< Use fixed IP ids for output GSO segments. Leaving
+ * RTE_GSO_FLAG_IPID_FIXED unset selects incremental IP ids.
+ */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t flag;
+	/**< flag that controls specific attributes of output segments,
+	 * such as the type of IP ID generated (i.e. fixed or incremental).
+	 */
+	uint32_t gso_types;
+	/**< the bit mask of required GSO types. The GSO library
+	 * uses the same macros that describe device TX offloading
+	 * capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
+	 * gso_types.
+	 *
+	 * For example, if applications want to segment TCP/IPv4
+	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi-MBUF packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
+ * input packet has correct checksums, and doesn't update checksums for
+ * output GSO segments. Additionally, it doesn't process IP fragment
+ * packets.
+ *
+ * Before calling rte_gso_segment(), applications must set proper ol_flags
+ * for the packet. The GSO library uses the same macros as TSO.
+ * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
+ * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
+ * flag is removed for all GSO segments and the input packet.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
+ * the GSO segments are sent to should support transmission of multi-segment
+ * packets.
+ *
+ * If the input packet is GSO'd, its mbuf refcnt is reduced by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object pointer.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if the MBUF pools run out of memory.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
1.9.3
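
Note on sizing pkts_out: the header comment above says rte_gso_segment()
fails when pkts_out is too small, and that gso_size covers both header and
payload. A rough, DPDK-free sketch of the slot count one packet would need
follows. It assumes each output segment carries a full header copy plus up
to (gso_size - hdr_len) payload bytes; gso_segs_needed() and its parameters
are hypothetical names for illustration, not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not librte_gso API): number of pkts_out slots
 * needed to segment one packet, or 0 if gso_size cannot hold the
 * headers plus at least one payload byte.
 */
static uint32_t
gso_segs_needed(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t payload, unit;

	assert(pkt_len >= hdr_len);
	if (gso_size <= hdr_len)
		return 0;

	payload = pkt_len - hdr_len;
	unit = (uint32_t)gso_size - hdr_len;

	/* Assumption: a packet without payload occupies one output slot. */
	if (payload == 0)
		return 1;

	return (payload + unit - 1) / unit;	/* round up */
}
```

For example, with 54 bytes of Ethernet/IPv4/TCP headers and gso_size set to
1514, a packet carrying 14600 payload bytes needs 10 slots, and one more
payload byte pushes that to 11.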


* [PATCH v8 2/6] gso: add TCP/IPv4 GSO support
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (8 preceding siblings ...)
  2017-10-05 15:43             ` [PATCH v8 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
@ 2017-10-05 15:44             ` Mark Kavanagh
  2017-10-05 15:44             ` [PATCH v8 3/6] gso: add VxLAN " Mark Kavanagh
                               ` (3 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 15:44 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect MBUF simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSO'd segments are freed, the packet is freed
automatically.
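
To illustrate the refcnt behaviour described above, here is a toy model
that does not use DPDK. It assumes a single-MBUF input packet and exactly
one indirect MBUF per output segment, with each attach taking one reference
on the input buffer. struct toy_buf, toy_release() and toy_input_freed()
are made-up names for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for an mbuf data buffer; not the real struct rte_mbuf. */
struct toy_buf {
	uint16_t refcnt;
	int freed;
};

/* Drop one reference; the buffer is freed when the count reaches zero. */
static void
toy_release(struct toy_buf *b)
{
	assert(b->refcnt > 0);
	if (--b->refcnt == 0)
		b->freed = 1;
}

/*
 * Segment an input buffer into nb_segs segments, then free nb_freed of
 * them. Returns 1 if the input buffer has been freed, 0 otherwise.
 */
static int
toy_input_freed(uint16_t nb_segs, uint16_t nb_freed)
{
	struct toy_buf in = { 1, 0 };	/* the caller holds one reference */
	uint16_t i;

	assert(nb_freed <= nb_segs);

	for (i = 0; i < nb_segs; i++)
		in.refcnt++;		/* one indirect MBUF attaches per segment */

	toy_release(&in);		/* GSO drops the caller's reference */

	for (i = 0; i < nb_freed; i++)
		toy_release(&in);	/* application frees a segment */

	return in.freed;
}
```

In this model the input buffer survives while any segment is outstanding
and is released together with the last one, so the application only has to
free the output segments.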

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 +++
 lib/Makefile                            |   2 +-
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
 lib/librte_gso/rte_gso.c                |  53 ++++++++++-
 lib/librte_gso/rte_gso.h                |   7 +-
 10 files changed, 543 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
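
For reviewers, the per-segment header fix-up done in gso_tcp4.c can be
summarised with a small standalone model. It mirrors the loop in
update_ipv4_tcp_headers(): the IP ID advances by ipid_delta, the TCP
sequence number advances by the payload carried in the previous segment,
and PSH/FIN survive only on the tail segment. struct seg_hdr and
plan_tcp4_segments() are hypothetical and operate on plain integers in host
byte order rather than on real headers. That ipid_delta is 0 for fixed IDs
and 1 for incremental IDs is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_PSH 0x08
#define TCP_FIN 0x01

/* Hypothetical per-segment summary; the library edits real headers. */
struct seg_hdr {
	uint16_t ip_id;		/* IPv4 packet_id */
	uint16_t ip_total_len;	/* IPv4 total_length */
	uint32_t tcp_seq;	/* TCP sequence number */
	uint8_t tcp_flags;
};

/*
 * Plan the headers for `payload` bytes split into units of `unit` bytes.
 * Returns the number of segments written to out[], or 0 if unit is zero
 * or the segments do not fit in max_out entries.
 */
static uint16_t
plan_tcp4_segments(uint32_t payload, uint16_t unit,
		uint16_t l3_len, uint16_t l4_len,
		uint16_t ip_id, uint8_t ipid_delta,
		uint32_t tcp_seq, uint8_t tcp_flags,
		struct seg_hdr *out, uint16_t max_out)
{
	uint16_t n = 0;

	assert(out != NULL);
	if (unit == 0)
		return 0;

	while (payload > 0) {
		uint16_t len = payload < unit ? (uint16_t)payload : unit;

		if (n == max_out)
			return 0;

		out[n].ip_id = ip_id;
		out[n].ip_total_len = (uint16_t)(l3_len + l4_len + len);
		out[n].tcp_seq = tcp_seq;
		out[n].tcp_flags = tcp_flags;
		/* Only the tail segment may keep PSH and FIN. */
		if (payload > len)
			out[n].tcp_flags &= (uint8_t)~(TCP_PSH | TCP_FIN);

		ip_id = (uint16_t)(ip_id + ipid_delta);
		tcp_seq += len;
		payload -= len;
		n++;
	}
	return n;
}
```

Checksums are left out of the model on purpose, since the patch leaves
them to the application.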

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 5bb36b7..dd37169 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,18 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. 64KB) into small ones
+  that fit a smaller MTU (e.g. 1500B). Supported packet types are:
+
+  * TCP/IPv4 packets.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/Makefile b/lib/Makefile
index 3d123f4..5ecd1b3 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -109,7 +109,7 @@ DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
 DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
-DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net librte_mempool
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ struct rte_logs {
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..ee75d4c
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,153 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..a8ad638
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates the TCP header of a packet, following
+ * segmentation. This is required to update the header's 'sent' sequence
+ * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
+ *
+ * @param pkt
+ *  The packet containing the TCP header.
+ * @param l4_offset
+ *  The offset of the TCP header from the start of the packet.
+ * @param sent_seq
+ *  The sent sequence number.
+ * @param non_tail
+ *  Non-zero if this is not the tail (last) segment.
+ */
+static inline void
+update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
+		uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l4_offset);
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+/**
+ * Internal function which updates the IPv4 header of a packet, following
+ * segmentation. This is required to update the header's 'total_length' field,
+ * to reflect the reduced length of the now-segmented packet. Furthermore, the
+ * header's 'packet_id' field must be updated to reflect the new ID of the
+ * now-segmented packet.
+ *
+ * @param pkt
+ *  The packet containing the IPv4 header.
+ * @param l3_offset
+ *  The offset of the IPv4 header from the start of the packet.
+ * @param id
+ *  The new ID of the packet.
+ */
+static inline void
+update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l3_offset);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if the MBUF pools run out of memory.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..d83e610
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,104 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+static void
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t id, tail_idx, i;
+	uint16_t l3_offset = pkt->l2_len;
+	uint16_t l4_offset = l3_offset + pkt->l3_len;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
+			l3_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], l3_offset, id);
+		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret;
+
+	/* Don't process the fragmented packet */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..1c57441
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount by which the IP ID is incremented for each output segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when the function succeeds. If the memory space in
+ *  pkts_out is insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..e414df4 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,52 @@
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
-			nb_pkts_out < 1)
+			nb_pkts_out < 1 ||
+			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
+			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if (gso_ctx->gso_size >= pkt->pkt_len) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkts_out[0] = pkt;
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
index 7d343d7..7ca2d81 100644
--- a/lib/librte_gso/rte_gso.h
+++ b/lib/librte_gso/rte_gso.h
@@ -46,6 +46,10 @@
 #include <stdint.h>
 #include <rte_mbuf.h>
 
+/* Minimum GSO segment size. */
+#define RTE_GSO_SEG_SIZE_MIN (sizeof(struct ether_hdr) + \
+		sizeof(struct ipv4_hdr) + sizeof(struct tcp_hdr) + 1)
+
 /* GSO flags for rte_gso_ctx. */
 #define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
 /**< Use fixed IP ids for output GSO segments. Setting
@@ -81,7 +85,8 @@ struct rte_gso_ctx {
 	 */
 	uint16_t gso_size;
 	/**< maximum size of an output GSO segment, including packet
-	 * header and payload, measured in bytes.
+	 * header and payload, measured in bytes. Must be at least
+	 * RTE_GSO_SEG_SIZE_MIN.
 	 */
 };
 
-- 
1.9.3


* [PATCH v8 3/6] gso: add VxLAN GSO support
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (9 preceding siblings ...)
  2017-10-05 15:44             ` [PATCH v8 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
@ 2017-10-05 15:44             ` Mark Kavanagh
  2017-10-05 15:44             ` [PATCH v8 4/6] gso: add GRE " Mark Kavanagh
                               ` (2 subsequent siblings)
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 15:44 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds a framework that allows GSO on tunneled packets.
Furthermore, it leverages that framework to provide GSO support for
VxLAN-encapsulated packets.

Supported VxLAN packets must have an outer IPv4 header (prepended by an
optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
inner VLAN tag).

VxLAN GSO doesn't check if input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSOed segments
are freed, the packet is freed automatically.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
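Note (illustration only, not part of the commit): the offset arithmetic
that this patch applies to a VxLAN packet can be sketched in standalone
C as below. 'struct pkt_lens', tunnel_offsets_of() and
expected_nb_segs() are hypothetical names that stand in for the
rte_mbuf length fields; the ceiling division is an assumption about how
gso_do_segment() splits the payload, since that function is not shown
in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the length fields of struct rte_mbuf. */
struct pkt_lens {
	uint16_t outer_l2_len; /* outer Ethernet (+ optional VLAN) */
	uint16_t outer_l3_len; /* outer IPv4 */
	uint16_t l2_len;       /* UDP + VxLAN + inner Ethernet (+ VLAN) */
	uint16_t l3_len;       /* inner IPv4 */
	uint16_t l4_len;       /* inner TCP */
	uint32_t pkt_len;      /* total packet length */
};

/* Byte offsets of each header from the start of the packet. */
struct tunnel_offsets {
	uint16_t outer_ipv4;
	uint16_t udp;
	uint16_t inner_ipv4;
	uint16_t tcp;
	uint16_t payload;
};

/* Same additions as update_tunnel_ipv4_tcp_headers() in the patch. */
static struct tunnel_offsets
tunnel_offsets_of(const struct pkt_lens *p)
{
	struct tunnel_offsets o;

	o.outer_ipv4 = p->outer_l2_len;
	o.udp = o.outer_ipv4 + p->outer_l3_len;
	o.inner_ipv4 = o.udp + p->l2_len;
	o.tcp = o.inner_ipv4 + p->l3_len;
	o.payload = o.tcp + p->l4_len;
	return o;
}

/* Assumed segment count: ceil(payload length / payload unit size). */
static uint32_t
expected_nb_segs(const struct pkt_lens *p, uint16_t gso_size)
{
	struct tunnel_offsets o = tunnel_offsets_of(p);
	uint32_t payload_len, unit;

	assert(gso_size > o.payload);   /* room for at least 1 payload byte */
	assert(p->pkt_len > o.payload); /* packet carries data */
	payload_len = p->pkt_len - o.payload;
	unit = (uint32_t)gso_size - o.payload;
	return (payload_len + unit - 1) / unit;
}
```

For example, with a 14B outer Ethernet header, 20B outer IPv4, 30B of
UDP + VxLAN + inner Ethernet, 20B inner IPv4 and 20B TCP, the payload
starts at byte 104, so a gso_size of 1500 leaves 1396B of payload per
output segment.
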
 doc/guides/rel_notes/release_17_11.rst |   2 +
 lib/librte_gso/Makefile                |   1 +
 lib/librte_gso/gso_common.h            |  25 +++++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 120 +++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h       |  75 +++++++++++++++++++++
 lib/librte_gso/rte_gso.c               |  14 +++-
 6 files changed, 235 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index dd37169..c58eeb1 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -48,6 +48,8 @@ New Features
   ones (e.g. MTU is 1500B). Supported packet types are:
 
   * TCP/IPv4 packets.
+  * VxLAN packets, which must have an outer IPv4 header, and contain
+    an inner TCP/IPv4 packet.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index a8ad638..95d54e7 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -39,6 +39,7 @@
 #include <rte_mbuf.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
 		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
@@ -49,6 +50,30 @@
 #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
 
+#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_VXLAN))
+
+/**
+ * Internal function which updates the UDP header of a packet, following
+ * segmentation. This is required to update the header's datagram length field.
+ *
+ * @param pkt
+ *  The packet containing the UDP header.
+ * @param udp_offset
+ *  The offset of the UDP header from the start of the packet.
+ */
+static inline void
+update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
+{
+	struct udp_hdr *udp_hdr;
+
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			udp_offset);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
+}
+
 /**
  * Internal function which updates the TCP header of a packet, following
  * segmentation. This is required to update the header's 'sent' sequence
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..5e8c8e5
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,120 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+static void
+update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t outer_id, inner_id, tail_idx, i;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+
+	outer_ipv4_offset = pkt->outer_l2_len;
+	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	tcp_offset = inner_ipv4_offset + pkt->l3_len;
+
+	/* Outer IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			outer_ipv4_offset);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	/* Inner IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_ipv4_offset);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
+		update_udp_header(segs[i], udp_offset);
+		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
+		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
+		outer_id++;
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset, frag_off;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process packets whose inner IPv4 header has the MF bit
+	 * set or a non-zero fragment offset.
+	 */
+	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	/* Don't process packets that carry no payload */
+	if (hdr_offset >= pkt->pkt_len) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret <= 1)
+		return ret;
+
+	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..3c67f0c
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,75 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneling packet with inner TCP/IPv4 headers. This function
+ * doesn't check if the input packet has correct checksums, and doesn't
+ * update checksums for output GSO segments. Furthermore, it doesn't
+ * process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount by which the IP ID is incremented for each output segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when it succeeds. If the memory space in pkts_out is
+ *  insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index e414df4..b4c3e34 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -39,6 +39,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -56,7 +57,8 @@
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1 ||
 			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
-			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
+			((gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
+			DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) == 0))
 		return -EINVAL;
 
 	if (gso_ctx->gso_size >= pkt->pkt_len) {
@@ -71,12 +73,20 @@
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_TCP(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
+		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
 	} else {
+		/* unsupported packet, skip */
 		pkts_out[0] = pkt;
 		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
 		return 1;
-- 
1.9.3


* [PATCH v8 4/6] gso: add GRE GSO support
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (10 preceding siblings ...)
  2017-10-05 15:44             ` [PATCH v8 3/6] gso: add VxLAN " Mark Kavanagh
@ 2017-10-05 15:44             ` Mark Kavanagh
  2017-10-05 15:44             ` [PATCH v8 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
  2017-10-05 15:44             ` [PATCH v8 6/6] doc: add GSO programmer's guide Mark Kavanagh
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 15:44 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
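Note (illustration only, not part of the commit): the order of the
checks in rte_gso_segment() matters, because a tunneled packet also
carries the plain TCP/IPv4 flag bits. The standalone sketch below
models that dispatch. All F_* and T_* values, 'enum gso_path' and
classify() are hypothetical; the real PKT_TX_* and DEV_TX_OFFLOAD_*
constants have different values and encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-bit packet flags (not the real PKT_TX_* values). */
#define F_TCP_SEG    (1ULL << 0)
#define F_IPV4       (1ULL << 1)
#define F_OUTER_IPV4 (1ULL << 2)
#define F_TUN_VXLAN  (1ULL << 3)
#define F_TUN_GRE    (1ULL << 4)

/* Hypothetical enabled-type bits (not the real DEV_TX_OFFLOAD_* values). */
#define T_TCP   (1ULL << 0)
#define T_VXLAN (1ULL << 1)
#define T_GRE   (1ULL << 2)

#define TCP4_MASK   (F_TCP_SEG | F_IPV4)
#define TUNNEL_MASK (TCP4_MASK | F_OUTER_IPV4)
#define HAS_ALL(flags, mask) (((flags) & (mask)) == (mask))

enum gso_path { GSO_PATH_NONE, GSO_PATH_TCP4, GSO_PATH_TUNNEL_TCP4 };

/* Pick the segmentation path; tunnel checks come first on purpose. */
static enum gso_path
classify(uint64_t ol_flags, uint64_t gso_types)
{
	/* Only the three modelled GSO types may be requested. */
	assert((gso_types & ~(T_TCP | T_VXLAN | T_GRE)) == 0);

	if ((HAS_ALL(ol_flags, TUNNEL_MASK | F_TUN_VXLAN) &&
			(gso_types & T_VXLAN)) ||
			(HAS_ALL(ol_flags, TUNNEL_MASK | F_TUN_GRE) &&
			(gso_types & T_GRE)))
		return GSO_PATH_TUNNEL_TCP4;
	if (HAS_ALL(ol_flags, TCP4_MASK) && (gso_types & T_TCP))
		return GSO_PATH_TCP4;
	return GSO_PATH_NONE;
}
```

If the plain TCP/IPv4 test came first, every VxLAN or GRE packet would
match it and never reach the tunnel path. In this model, as in the
if/else chain of the patch, a tunneled packet still falls through to
the plain TCP/IPv4 branch when only the TCP type is enabled.
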
 doc/guides/rel_notes/release_17_11.rst |  2 ++
 lib/librte_gso/gso_common.h            |  5 +++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 14 ++++++++++----
 lib/librte_gso/rte_gso.c               |  9 ++++++---
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index c58eeb1..2faa630 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -50,6 +50,8 @@ New Features
   * TCP/IPv4 packets.
   * VxLAN packets, which must have an outer IPv4 header, and contain
     an inner TCP/IPv4 packet.
+  * GRE packets, which must contain an outer IPv4 header, and inner
+    TCP/IPv4 headers.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 95d54e7..145ea49 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -55,6 +55,11 @@
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
 		 PKT_TX_TUNNEL_VXLAN))
 
+#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_GRE))
+
 /**
  * Internal function which updates the UDP header of a packet, following
  * segmentation. This is required to update the header's datagram length field.
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
index 5e8c8e5..8d0cfd7 100644
--- a/lib/librte_gso/gso_tunnel_tcp4.c
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -42,11 +42,13 @@
 	struct tcp_hdr *tcp_hdr;
 	uint32_t sent_seq;
 	uint16_t outer_id, inner_id, tail_idx, i;
-	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset;
+	uint16_t udp_gre_offset, tcp_offset;
+	uint8_t update_udp_hdr;
 
 	outer_ipv4_offset = pkt->outer_l2_len;
-	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
-	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	udp_gre_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_gre_offset + pkt->l2_len;
 	tcp_offset = inner_ipv4_offset + pkt->l3_len;
 
 	/* Outer IPv4 header. */
@@ -63,9 +65,13 @@
 	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
 	tail_idx = nb_segs - 1;
 
+	/* Only update UDP header for VxLAN packets. */
+	update_udp_hdr = (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN) ? 1 : 0;
+
 	for (i = 0; i < nb_segs; i++) {
 		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
-		update_udp_header(segs[i], udp_offset);
+		if (update_udp_hdr)
+			update_udp_header(segs[i], udp_gre_offset);
 		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
 		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
 		outer_id++;
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b4c3e34..4fc2dd2 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -58,7 +58,8 @@
 			nb_pkts_out < 1 ||
 			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
 			((gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
-			DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) == 0))
+			DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+			DEV_TX_OFFLOAD_GRE_TNL_TSO)) == 0))
 		return -EINVAL;
 
 	if (gso_ctx->gso_size >= pkt->pkt_len) {
@@ -73,8 +74,10 @@
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
-		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+	if ((IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) ||
+			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
+			 (gso_ctx->gso_types & DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
1.9.3


* [PATCH v8 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (11 preceding siblings ...)
  2017-10-05 15:44             ` [PATCH v8 4/6] gso: add GRE " Mark Kavanagh
@ 2017-10-05 15:44             ` Mark Kavanagh
  2017-10-05 15:44             ` [PATCH v8 6/6] doc: add GSO programmer's guide Mark Kavanagh
  13 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 15:44 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
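Note (illustration only, not part of the commit): the checks behind
"set gso segsz <length>" can be sketched in standalone C as below. The
SKETCH_* constants and set_gso_segsz() are hypothetical names; the
header sizes assume option-less Ethernet, IPv4 and TCP headers of 14,
20 and 20 bytes, which is what RTE_GSO_SEG_SIZE_MIN adds up before
requiring one byte of payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header sizes, without VLAN tags or IP/TCP options. */
enum {
	SKETCH_ETHER_HDR_LEN = 14,
	SKETCH_IPV4_HDR_LEN = 20,
	SKETCH_TCP_HDR_LEN = 20
};

/* Headers plus at least one byte of payload. */
#define SKETCH_GSO_SEG_SIZE_MIN ((size_t)(SKETCH_ETHER_HDR_LEN + \
		SKETCH_IPV4_HDR_LEN + SKETCH_TCP_HDR_LEN + 1))

/*
 * Store 'requested' in '*segsz' and return 1 if forwarding is stopped
 * and the size is large enough; otherwise leave '*segsz' untouched and
 * return 0.
 */
static int
set_gso_segsz(uint16_t requested, int forwarding_stopped, uint16_t *segsz)
{
	assert(segsz != NULL);

	if (!forwarding_stopped)
		return 0;
	if (requested < SKETCH_GSO_SEG_SIZE_MIN)
		return 0;
	*segsz = requested;
	return 1;
}
```

Under these assumptions the smallest accepted segment size is 55 bytes,
and a request made while forwarding is running is rejected regardless
of its value.
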
 app/test-pmd/cmdline.c                      | 179 ++++++++++++++++++++++++++++
 app/test-pmd/config.c                       |  24 ++++
 app/test-pmd/csumonly.c                     |  42 ++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
 6 files changed, 310 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ccdf239..92e6171 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3967,6 +3978,171 @@ struct cmd_gro_set_result {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please first stop forwarding\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size < RTE_GSO_SEG_SIZE_MIN)
+			printf("gso_size should be at least %zu."
+					" Please input a legal value\n",
+					RTE_GSO_SEG_SIZE_MIN);
+		else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO'd packet size: %uB\n"
+					"Supported GSO types: TCP/IPv4, "
+					"VxLAN with inner TCP/IPv4 packet, "
+					"GRE with inner TCP/IPv4 packet\n",
+					gso_max_segment_size);
+		} else
+			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14255,6 +14431,9 @@ struct cmd_cmdfile_result {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..88d09d0 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ struct igb_ring_desc_16_bytes {
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..e2b18cb 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,8 @@
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
+
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -91,6 +93,7 @@
 /* structure that caches offload info for the current packet */
 struct testpmd_offload_info {
 	uint16_t ethertype;
+	uint8_t gso_enable;
 	uint16_t l2_len;
 	uint16_t l3_len;
 	uint16_t l4_len;
@@ -381,6 +384,8 @@ struct simple_gre_hdr {
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
 		}
+		if (info->gso_enable)
+			ol_flags |= PKT_TX_TCP_SEG;
 	} else if (info->l4_proto == IPPROTO_SCTP) {
 		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
 		sctp_hdr->cksum = 0;
@@ -627,6 +632,9 @@ struct simple_gre_hdr {
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -641,6 +649,8 @@ struct simple_gre_hdr {
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -674,6 +684,8 @@ struct simple_gre_hdr {
 	memset(&info, 0, sizeof(info));
 	info.tso_segsz = txp->tso_segsz;
 	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
+	if (gso_ports[fs->tx_port].enable)
+		info.gso_enable = 1;
 
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
@@ -851,13 +863,34 @@ struct simple_gre_hdr {
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
+					&gso_segments[nb_segments],
+					RTE_DIM(gso_segments) - nb_segments);
+			if (ret < 0)  {
+				RTE_LOG(DEBUG, USER1,
+						"Unable to segment packet\n");
+				rte_pktmbuf_free(pkts_burst[i]);
+			} else
+				nb_segments += ret;
+		}
+
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +901,7 @@ struct simple_gre_hdr {
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +914,10 @@ struct simple_gre_hdr {
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e097ee0..b9ee77c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ static int eth_event_callback(uint8_t port_id,
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,8 @@ static int eth_event_callback(uint8_t port_id,
 
 	init_port_config();
 
+	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+		DEV_TX_OFFLOAD_GRE_TNL_TSO;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +670,13 @@ static int eth_event_callback(uint8_t port_id,
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
+			ETHER_CRC_LEN;
+		fwd_lcores[lc_id]->gso_ctx.flag = 0;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..ff842a1 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -642,6 +651,7 @@ void port_rss_hash_key_update(portid_t port_id, char rss_type[],
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 2ed62f5..f9b5bda 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -932,6 +932,52 @@ number of packets a GRO table can store.
 If current packet number is greater than or equal to the max value, GRO
 will stop processing incoming packets.
 
+set port - gso
+~~~~~~~~~~~~~~
+
+Toggle per-port GSO support in ``csum`` forwarding engine::
+
+   testpmd> set port <port_id> gso on|off
+
+If enabled, the csum forwarding engine will perform GSO on supported IPv4
+packets, transmitted on the given port.
+
+If disabled, packets transmitted on the given port will not undergo GSO.
+By default, GSO is disabled for all ports.
+
+.. note::
+
+   When GSO is enabled on a port, supported IPv4 packets transmitted on that
+   port undergo GSO. Afterwards, the segmented packets are represented by
+   multi-segment mbufs; however, the csum forwarding engine doesn't calculate
+   checksums for GSO'd segments in SW. As a result, if users want correct
+   checksums in GSO segments, they should enable HW checksum calculation for
+   GSO-enabled ports.
+
+   For example, HW checksum calculation for VxLAN GSO'd packets may be enabled
+   by setting the following options in the csum forwarding engine::
+
+      testpmd> csum set outer_ip hw <port_id>
+
+      testpmd> csum set ip hw <port_id>
+
+      testpmd> csum set tcp hw <port_id>
+
+set gso segsz
+~~~~~~~~~~~~~
+
+Set the maximum GSO segment size (measured in bytes), which includes the
+packet header and the packet payload for GSO-enabled ports (global)::
+
+   testpmd> set gso segsz <length>
+
+show port - gso
+~~~~~~~~~~~~~~~
+
+Display the status of Generic Segmentation Offload for a given port::
+
+   testpmd> show port <port_id> gso
+
 mac_addr add
 ~~~~~~~~~~~~
 
-- 
1.9.3

* [PATCH v8 6/6] doc: add GSO programmer's guide
  2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
                               ` (12 preceding siblings ...)
  2017-10-05 15:44             ` [PATCH v8 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
@ 2017-10-05 15:44             ` Mark Kavanagh
  2017-10-05 17:57               ` Mcnamara, John
  13 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 15:44 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Add programmer's guide doc to explain the design and use of the
GSO library.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
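
Note (below the cut line, not part of the commit): the guide describes how
segsz bounds each output segment. As a rough illustration only, the sketch
below works through that arithmetic in plain C without any DPDK dependency.
It assumes that segsz counts packet headers plus payload (as the guide
states), that every output segment repeats the full header, and that a packet
no larger than segsz is left as a single packet. The helper names are
hypothetical and are not part of librte_gso.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a librte_gso API): estimate how many output
 * segments a packet of pkt_len bytes yields when hdr_len bytes of headers
 * are repeated in every segment and each segment holds at most segsz bytes.
 * Returns 0 when the inputs leave no room for payload.
 */
static uint32_t
gso_expected_nb_segs(uint32_t pkt_len, uint16_t hdr_len, uint16_t segsz)
{
	uint32_t payload_len, payload_per_seg;

	/* Malformed input: no payload room, or packet shorter than headers. */
	if (segsz <= hdr_len || pkt_len < hdr_len)
		return 0;
	/* Assumption: a packet that already fits is not split. */
	if (pkt_len <= segsz)
		return 1;

	payload_len = pkt_len - hdr_len;
	payload_per_seg = (uint32_t)segsz - hdr_len;
	/* Round up: the final segment may carry less payload. */
	return payload_len / payload_per_seg +
		(payload_len % payload_per_seg != 0);
}

/* Hypothetical helper: payload bytes carried by the final output segment. */
static uint32_t
gso_last_seg_payload(uint32_t pkt_len, uint16_t hdr_len, uint16_t segsz)
{
	uint32_t nb_segs = gso_expected_nb_segs(pkt_len, hdr_len, segsz);
	uint32_t payload_per_seg, last;

	if (nb_segs == 0)
		return 0;

	payload_per_seg = (uint32_t)segsz - hdr_len;
	last = (pkt_len - hdr_len) - (nb_segs - 1) * payload_per_seg;
	/* The final segment never carries more than a full segment. */
	assert(last <= payload_per_seg);
	return last;
}
```

Under these assumptions, a 9014-byte TCP/IPv4 frame with 54 bytes of headers
and a segsz of 1514 gives 8960 payload bytes at 1460 per segment: six full
segments plus a seventh carrying the remaining 200 bytes. An application could
use a bound of this kind when sizing the array it passes to the library for
the output segments.
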
 MAINTAINERS                                        |   6 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 5 files changed, 1053 insertions(+)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df2a7f..8f0a4bd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -644,6 +644,12 @@ M: Jiayu Hu <jiayu.hu@intel.com>
 F: lib/librte_gro/
 F: doc/guides/prog_guide/generic_receive_offload_lib.rst
 
+Generic Segmentation Offload
+M: Jiayu Hu <jiayu.hu@intel.com>
+M: Mark Kavanagh <mark.b.kavanagh@intel.com>
+F: lib/librte_gso/
+F: doc/guides/prog_guide/generic_segmentation_offload_lib.rst
+
 Distributor
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: David Hunt <david.hunt@intel.com>
diff --git a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
new file mode 100644
index 0000000..5e78f16
--- /dev/null
+++ b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
@@ -0,0 +1,256 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Generic Segmentation Offload Library
+====================================
+
+Overview
+--------
+Generic Segmentation Offload (GSO) is a widely used software implementation of
+TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
+Much like TSO, GSO gains performance by enabling upper layer applications to
+process a smaller number of large packets (e.g. MTU size of 64KB), instead of
+processing higher numbers of small packets (e.g. MTU size of 1500B), thus
+reducing per-packet overhead.
+
+For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
+that far exceed the kernel interface's MTU; this eliminates the need to segment
+packets within the guest, and improves the data-to-overhead ratio of both the
+guest-host link, and PCI bus. The expectation of the guest network stack in this
+scenario is that segmentation of egress frames will take place in the NIC HW
+or, where that hardware capability is unavailable, in the host application or
+network stack.
+
+Bearing that in mind, the GSO library enables DPDK applications to segment
+packets in software. Note however, that GSO is implemented as a standalone
+library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
+in the underlying hardware); that is, applications must explicitly invoke the
+GSO library to segment packets. The size of GSO segments ``(segsz)`` is
+configurable by the application.
+
+Limitations
+-----------
+
+#. The GSO library doesn't check if input packets have correct checksums.
+
+#. In addition, the GSO library doesn't re-calculate checksums for segmented
+   packets (that task is left to the application).
+
+#. IP fragments are unsupported by the GSO library.
+
+#. The egress interface's driver must support multi-segment packets.
+
+#. Currently, the GSO library supports the following IPv4 packet types:
+
+   - TCP
+   - VxLAN
+   - GRE
+
+   See `Supported GSO Packet Types`_ for further details.
+
+Packet Segmentation
+-------------------
+
+The ``rte_gso_segment()`` function is the GSO library's primary
+segmentation API.
+
+Before performing segmentation, an application must create a GSO context object
+``(struct rte_gso_ctx)``, which provides the library with some of the
+information required to understand how the packet should be segmented. Refer to
+`How to Segment a Packet`_ for additional details. Once the GSO context
+has been created, and populated, the application can then use the
+``rte_gso_segment()`` function to segment packets.
+
+The GSO library typically stores each segment that it creates in two parts: the
+first part contains a copy of the original packet's headers, while the second
+part contains a pointer to an offset within the original packet. This mechanism
+is explained in more detail in `GSO Output Segment Format`_.
+
+The GSO library supports both single- and multi-segment input mbufs.
+
+GSO Output Segment Format
+~~~~~~~~~~~~~~~~~~~~~~~~~
+To reduce the number of expensive memcpy operations required when segmenting a
+packet, the GSO library typically stores each segment that it creates as a
+two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
+the elements produced by the API are also called 'segments', for clarity the
+term 'part' is used here instead).
+
+The first part of each output segment is a direct mbuf and contains a copy of
+the original packet's headers, which must be prepended to each output segment.
+These headers are copied from the original packet into each output segment.
+
+The second part of each output segment represents a section of data from the
+original packet, i.e. a data segment. Rather than copy the data directly from
+the original packet into the output segment (which would impact performance
+considerably), the second part of each output segment is an indirect mbuf,
+which contains no actual data, but simply points to an offset within the
+original packet.
+
+The combination of the 'header' segment and the 'data' segment constitutes a
+single logical output GSO segment of the original packet. This is illustrated
+in :numref:`figure_gso-output-segment-format`.
+
+.. _figure_gso-output-segment-format:
+
+.. figure:: img/gso-output-segment-format.svg
+   :align: center
+
+   Two-part GSO output segment
+
+In one situation, the output segment may contain additional 'data' segments.
+This only occurs when:
+
+- the input packet on which GSO is to be performed is represented by a
+  multi-segment mbuf.
+
+- the output segment is required to contain data that spans the boundaries
+  between segments of the input multi-segment mbuf.
+
+The GSO library traverses each segment of the input packet, and produces
+numerous output segments; for optimal performance, the number of output
+segments is kept to a minimum. Consequently, the GSO library maximizes the
+amount of data contained within each output segment; i.e. each output segment
+contains ``segsz`` bytes of data. The only exception is the very final output
+segment; if ``pkt_len`` % ``segsz`` is non-zero, then the final segment is
+smaller than the rest.
+
+In order for an output segment to meet its MSS, it may need to include data from
+multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
+can point to only one direct mbuf), the solution here is to add another indirect
+mbuf to the output segment; this additional segment then points to the next
+input segment. If necessary, this chaining process is repeated, until the sum of
+all of the data 'contained' in the output segment reaches ``segsz``. This
+ensures that the amount of data contained within each output segment is uniform,
+with the possible exception of the last segment, as previously described.
+
+:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
+output segment. In this example, the output segment needs to include data from
+the end of one input segment, and the beginning of another. To achieve this,
+an additional indirect mbuf is chained to the second part of the output segment,
+and is attached to the next input segment (i.e. it points to the data in the
+next input segment).
+
+.. _figure_gso-three-seg-mbuf:
+
+.. figure:: img/gso-three-seg-mbuf.svg
+   :align: center
+
+   Three-part GSO output segment
+
+Supported GSO Packet Types
+--------------------------
+
+TCP/IPv4 GSO
+~~~~~~~~~~~~
+TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
+may also contain an optional VLAN tag.
+
+VxLAN GSO
+~~~~~~~~~
+VxLAN GSO supports segmentation of suitably large VxLAN packets,
+which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
+inner and/or outer VLAN tag(s).
+
+GRE GSO
+~~~~~~~
+GRE GSO supports segmentation of suitably large GRE packets, which contain
+an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.
+
+How to Segment a Packet
+-----------------------
+
+To segment an outgoing packet, an application must:
+
+#. First create a GSO context ``(struct rte_gso_ctx)``; this contains:
+
+   - a pointer to the mbuf pool for allocating the direct buffers, which are
+     used to store the GSO segments' packet headers.
+
+   - a pointer to the mbuf pool for allocating indirect buffers, which are
+     used to locate GSO segments' packet payloads.
+
+     .. note::
+
+       An application may use the same pool for both direct and indirect
+       buffers. However, since each indirect mbuf simply stores a pointer, the
+       application may reduce its memory consumption by creating a separate
+       memory pool, containing smaller elements, for the indirect pool.
+
+   - the size of each output segment, including packet headers and payload,
+     measured in bytes.
+
+   - the bit mask of required GSO types. The GSO library uses the same macros as
+     those that describe a physical device's TX offloading capabilities (i.e.
+     ``DEV_TX_OFFLOAD_*_TSO``) for gso_types. For example, if an application
+     wants to segment TCP/IPv4 packets, it should set gso_types to
+     ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values currently
+     supported for gso_types are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO``, and
+     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
+     allowed.
+
+   - a flag, that indicates whether the IPv4 headers of output segments should
+     contain fixed or incremental ID values.
+
+#. Set the appropriate ol_flags in the mbuf.
+
+   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute
+     to determine how a packet should be segmented. It is the application's
+     responsibility to ensure that these flags are set.
+
+   - For example, in order to segment TCP/IPv4 packets, the application should
+     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
+     ol_flags.
+
+   - If checksum calculation in hardware is required, the application should
+     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.
+
+#. Check if the packet should be processed. Packets with one of the
+   following properties are not processed and are returned immediately:
+
+   - Packet length is less than ``segsz`` (i.e. GSO is not required).
+
+   - Packet type is not supported by GSO library (see
+     `Supported GSO Packet Types`_).
+
+   - Application has not enabled GSO support for the packet type.
+
+   - Packet's ol_flags have been incorrectly set.
+
+#. Allocate space in which to store the output GSO segments. If the amount of
+   space allocated by the application is insufficient, segmentation will fail.
+
+#. Invoke the GSO segmentation API, ``rte_gso_segment()``.
+
+#. If required, update the L3 and L4 checksums of the newly-created segments.
+   For tunneled packets, the outer IPv4 headers' checksums should also be
+   updated. Alternatively, the application may offload checksum calculation
+   to HW.
+
diff --git a/doc/guides/prog_guide/img/gso-output-segment-format.svg b/doc/guides/prog_guide/img/gso-output-segment-format.svg
new file mode 100644
index 0000000..bdb5ec3
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-output-segment-format.svg
@@ -0,0 +1,313 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-output-segment-format.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="19.3975in" height="8.21796in"
+		viewBox="0 0 1396.62 591.693" xml:space="preserve" color-interpolation-filters="sRGB" class="st21">
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st2 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st3 {stroke:#c3d600;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.68828}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Intel Clear;font-size:1.99999em;font-weight:bold}
+		.st10 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st11 {fill:#ffffff;font-family:Intel Clear;font-size:2.44732em;font-weight:bold}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:5.52552}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.15291em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.8401em}
+		.st15 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st16 {fill:#c3d600;font-family:Intel Clear;font-size:2.44732em}
+		.st17 {fill:#ffc000;font-family:Intel Clear;font-size:2.44732em}
+		.st18 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.8401em}
+		.st20 {fill:#006fc5;font-family:Intel Clear;font-size:1.61927em}
+		.st21 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<g id="shape3-1" v:mID="3" v:groupContext="shape" transform="translate(577.244,-560.42)">
+			<title>Sheet.3</title>
+			<path d="M9.24 585.29 L16.32 585.29 L16.32 587.06 L9.24 587.06 L9.24 585.29 L9.24 585.29 ZM21.63 585.29 L23.4 585.29
+						 L23.4 587.06 L21.63 587.06 L21.63 585.29 L21.63 585.29 ZM28.7 585.29 L35.78 585.29 L35.78 587.06 L28.7 587.06
+						 L28.7 585.29 L28.7 585.29 ZM41.09 585.29 L42.86 585.29 L42.86 587.06 L41.09 587.06 L41.09 585.29 L41.09
+						 585.29 ZM48.17 585.29 L55.25 585.29 L55.25 587.06 L48.17 587.06 L48.17 585.29 L48.17 585.29 ZM60.56 585.29
+						 L62.33 585.29 L62.33 587.06 L60.56 587.06 L60.56 585.29 L60.56 585.29 ZM67.64 585.29 L74.72 585.29 L74.72
+						 587.06 L67.64 587.06 L67.64 585.29 L67.64 585.29 ZM80.03 585.29 L81.8 585.29 L81.8 587.06 L80.03 587.06
+						 L80.03 585.29 L80.03 585.29 ZM87.11 585.29 L94.19 585.29 L94.19 587.06 L87.11 587.06 L87.11 585.29 L87.11
+						 585.29 ZM99.5 585.29 L101.27 585.29 L101.27 587.06 L99.5 587.06 L99.5 585.29 L99.5 585.29 ZM106.58 585.29
+						 L113.66 585.29 L113.66 587.06 L106.58 587.06 L106.58 585.29 L106.58 585.29 ZM118.97 585.29 L120.74 585.29
+						 L120.74 587.06 L118.97 587.06 L118.97 585.29 L118.97 585.29 ZM126.05 585.29 L133.13 585.29 L133.13 587.06
+						 L126.05 587.06 L126.05 585.29 L126.05 585.29 ZM138.43 585.29 L140.2 585.29 L140.2 587.06 L138.43 587.06
+						 L138.43 585.29 L138.43 585.29 ZM145.51 585.29 L152.59 585.29 L152.59 587.06 L145.51 587.06 L145.51 585.29
+						 L145.51 585.29 ZM157.9 585.29 L159.67 585.29 L159.67 587.06 L157.9 587.06 L157.9 585.29 L157.9 585.29 ZM164.98
+						 585.29 L172.06 585.29 L172.06 587.06 L164.98 587.06 L164.98 585.29 L164.98 585.29 ZM177.37 585.29 L179.14
+						 585.29 L179.14 587.06 L177.37 587.06 L177.37 585.29 L177.37 585.29 ZM184.45 585.29 L191.53 585.29 L191.53
+						 587.06 L184.45 587.06 L184.45 585.29 L184.45 585.29 ZM196.84 585.29 L198.61 585.29 L198.61 587.06 L196.84
+						 587.06 L196.84 585.29 L196.84 585.29 ZM203.92 585.29 L211 585.29 L211 587.06 L203.92 587.06 L203.92 585.29
+						 L203.92 585.29 ZM216.31 585.29 L218.08 585.29 L218.08 587.06 L216.31 587.06 L216.31 585.29 L216.31 585.29
+						 ZM223.39 585.29 L230.47 585.29 L230.47 587.06 L223.39 587.06 L223.39 585.29 L223.39 585.29 ZM235.78 585.29
+						 L237.55 585.29 L237.55 587.06 L235.78 587.06 L235.78 585.29 L235.78 585.29 ZM242.86 585.29 L249.93 585.29
+						 L249.93 587.06 L242.86 587.06 L242.86 585.29 L242.86 585.29 ZM255.24 585.29 L257.01 585.29 L257.01 587.06
+						 L255.24 587.06 L255.24 585.29 L255.24 585.29 ZM262.32 585.29 L269.4 585.29 L269.4 587.06 L262.32 587.06
+						 L262.32 585.29 L262.32 585.29 ZM274.71 585.29 L276.48 585.29 L276.48 587.06 L274.71 587.06 L274.71 585.29
+						 L274.71 585.29 ZM281.79 585.29 L288.87 585.29 L288.87 587.06 L281.79 587.06 L281.79 585.29 L281.79 585.29
+						 ZM294.18 585.29 L295.95 585.29 L295.95 587.06 L294.18 587.06 L294.18 585.29 L294.18 585.29 ZM301.26 585.29
+						 L308.34 585.29 L308.34 587.06 L301.26 587.06 L301.26 585.29 L301.26 585.29 ZM313.65 585.29 L315.42 585.29
+						 L315.42 587.06 L313.65 587.06 L313.65 585.29 L313.65 585.29 ZM320.73 585.29 L324.99 585.29 L324.99 587.06
+						 L320.73 587.06 L320.73 585.29 L320.73 585.29 ZM11.06 591.69 L0 586.17 L11.06 580.65 L11.06 591.69 L11.06
+						 591.69 ZM323.16 580.65 L334.22 586.17 L323.16 591.69 L323.16 580.65 L323.16 580.65 Z" class="st1"/>
+		</g>
+		<g id="shape4-3" v:mID="4" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.4</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43
+						 Z" class="st2"/>
+		</g>
+		<g id="shape5-5" v:mID="5" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.5</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43"
+					class="st3"/>
+		</g>
+		<g id="shape6-8" v:mID="6" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.6</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape7-10" v:mID="7" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.7</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape10-13" v:mID="10" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.10</title>
+			<path d="M0 510.21 L0 591.69 L822.53 591.69 L822.53 510.21 L0 510.21 L0 510.21 Z" class="st6"/>
+		</g>
+		<g id="shape11-15" v:mID="11" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.11</title>
+			<path d="M0 510.21 L822.53 510.21 L822.53 591.69 L0 591.69 L0 510.21" class="st7"/>
+		</g>
+		<g id="shape12-18" v:mID="12" v:groupContext="shape" transform="translate(255.478,-470.123)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="157.315" cy="574.07" width="314.63" height="35.245"/>
+			<path d="M314.63 556.45 L0 556.45 L0 591.69 L314.63 591.69 L314.63 556.45" class="st8"/>
+			<text x="102.08" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-22" v:mID="13" v:groupContext="shape" transform="translate(577.354,-470.123)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="167.112" cy="574.07" width="334.23" height="35.245"/>
+			<path d="M334.22 556.45 L0 556.45 L0 591.69 L334.22 591.69 L334.22 556.45" class="st8"/>
+			<text x="111.88" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape14-26" v:mID="14" v:groupContext="shape" transform="translate(910.635,-470.956)">
+			<title>Sheet.14</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="26.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape15-30" v:mID="15" v:groupContext="shape" transform="translate(909.144,-453.824)">
+			<title>Sheet.15</title>
+			<path d="M1.16 453.85 L1.05 465.33 L3.93 465.39 L4.04 453.91 L1.16 453.85 L1.16 453.85 ZM1 473.95 L0.94 476.82 L3.82
+						 476.87 L3.87 474 L1 473.95 L1 473.95 ZM0.88 485.43 L0.77 496.91 L3.65 496.96 L3.76 485.48 L0.88 485.43 L0.88
+						 485.43 ZM0.72 505.52 L0.72 508.39 L3.59 508.45 L3.59 505.58 L0.72 505.52 L0.72 505.52 ZM0.61 517 L0.55 528.49
+						 L3.43 528.54 L3.48 517.06 L0.61 517 L0.61 517 ZM0.44 537.1 L0.44 539.97 L3.32 540.02 L3.32 537.15 L0.44
+						 537.1 L0.44 537.1 ZM0.39 548.58 L0.28 560.06 L3.15 560.12 L3.26 548.63 L0.39 548.58 L0.39 548.58 ZM0.22
+						 568.67 L0.17 571.54 L3.04 571.6 L3.1 568.73 L0.22 568.67 L0.22 568.67 ZM0.11 580.16 L0 591.64 L2.88 591.69
+						 L2.99 580.21 L0.11 580.16 L0.11 580.16 Z" class="st10"/>
+		</g>
+		<g id="shape16-32" v:mID="16" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.16</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape17-34" v:mID="17" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.17</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape18-37" v:mID="18" v:groupContext="shape" transform="translate(121.944,-471.034)">
+			<title>Sheet.18</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="20.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape19-41" v:mID="19" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.19</title>
+			<path d="M0 510.43 L0 591.69 L289.81 591.69 L289.81 510.43 L0 510.43 L0 510.43 Z" class="st4"/>
+		</g>
+		<g id="shape20-43" v:mID="20" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.20</title>
+			<path d="M0 510.43 L289.81 510.43 L289.81 591.69 L0 591.69 L0 510.43" class="st5"/>
+		</g>
+		<g id="shape21-46" v:mID="21" v:groupContext="shape" transform="translate(424.908,-21.567)">
+			<title>Sheet.21</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="11.55" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape22-50" v:mID="22" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.22</title>
+			<path d="M0 510.43 L0 591.69 L453.74 591.69 L453.74 510.43 L0 510.43 L0 510.43 Z" class="st6"/>
+		</g>
+		<g id="shape23-52" v:mID="23" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.23</title>
+			<path d="M0 510.43 L453.74 510.43 L453.74 591.69 L0 591.69 L0 510.43" class="st7"/>
+		</g>
+		<g id="shape24-55" v:mID="24" v:groupContext="shape" transform="translate(778.624,-21.5672)">
+			<title>Sheet.24</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="14.26" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape25-59" v:mID="25" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.25</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st6"/>
+		</g>
+		<g id="shape26-61" v:mID="26" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.26</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape27-63" v:mID="27" v:groupContext="shape" transform="translate(813.057,-150.108)">
+			<title>Sheet.27</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="94.1386" cy="576.19" width="188.28" height="31.0055"/>
+			<path d="M188.28 560.69 L0 560.69 L0 591.69 L188.28 591.69 L188.28 560.69" class="st8"/>
+			<text x="15.43" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape28-67" v:mID="28" v:groupContext="shape" transform="translate(810.845,-123.854)">
+			<title>Sheet.28</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="95.5065" cy="578.442" width="191.02" height="26.501"/>
+			<path d="M191.01 565.19 L0 565.19 L0 591.69 L191.01 591.69 L191.01 565.19" class="st8"/>
+			<text x="15.15" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape29-71" v:mID="29" v:groupContext="shape" transform="translate(573.151,-149.601)">
+			<title>Sheet.29</title>
+			<path d="M0 584.74 L127.76 584.74 L127.76 587.61 L0 587.61 L0 584.74 L0 584.74 ZM125.91 580.65 L136.97 586.17 L125.91
+						 591.69 L125.91 580.65 L125.91 580.65 Z" class="st15"/>
+		</g>
+		<g id="shape30-73" v:mID="30" v:groupContext="shape" transform="translate(0,-309.671)">
+			<title>Sheet.30</title>
+			<desc>Memory copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="108.076" cy="574.07" width="216.16" height="35.245"/>
+			<path d="M216.15 556.45 L0 556.45 L0 591.69 L216.15 591.69 L216.15 556.45" class="st8"/>
+			<text x="17.68" y="582.88" class="st16" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Memory copy</text>		</g>
+		<g id="shape31-77" v:mID="31" v:groupContext="shape" transform="translate(680.77,-305.707)">
+			<title>Sheet.31</title>
+			<desc>No Memory Copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="136.547" cy="574.07" width="273.1" height="35.245"/>
+			<path d="M273.09 556.45 L0 556.45 L0 591.69 L273.09 591.69 L273.09 556.45" class="st8"/>
+			<text x="21.4" y="582.88" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>No Memory Copy</text>		</g>
+		<g id="shape32-81" v:mID="32" v:groupContext="shape" transform="translate(1102.72,-26.7532)">
+			<title>Sheet.32</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="138.243" cy="578.442" width="276.49" height="26.501"/>
+			<path d="M276.49 565.19 L0 565.19 L0 591.69 L276.49 591.69 L276.49 565.19" class="st8"/>
+			<text x="20.73" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape36-85" v:mID="36" v:groupContext="shape" transform="translate(1106.81,-138.647)">
+			<title>Sheet.36</title>
+			<desc>Two-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="144.906" cy="578.442" width="289.82" height="26.501"/>
+			<path d="M289.81 565.19 L0 565.19 L0 591.69 L289.81 591.69 L289.81 565.19" class="st8"/>
+			<text x="16.56" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Two-part output segment</text>		</g>
+		<g id="shape37-89" v:mID="37" v:groupContext="shape" transform="translate(575.916,-453.879)">
+			<title>Sheet.37</title>
+			<path d="M2.88 453.91 L2.9 465.39 L0.03 465.39 L0 453.91 L2.88 453.91 L2.88 453.91 ZM2.9 474 L2.9 476.87 L0.03 476.87
+						 L0.03 474 L2.9 474 L2.9 474 ZM2.9 485.48 L2.9 496.96 L0.03 496.96 L0.03 485.48 L2.9 485.48 L2.9 485.48 ZM2.9
+						 505.58 L2.9 508.45 L0.03 508.45 L0.03 505.58 L2.9 505.58 L2.9 505.58 ZM2.9 517.06 L2.9 528.54 L0.03 528.54
+						 L0.03 517.06 L2.9 517.06 L2.9 517.06 ZM2.9 537.15 L2.9 540.02 L0.03 540.02 L0.03 537.15 L2.9 537.15 L2.9
+						 537.15 ZM2.9 548.63 L2.9 560.12 L0.03 560.12 L0.03 548.63 L2.9 548.63 L2.9 548.63 ZM2.9 568.73 L2.9 571.6
+						 L0.03 571.6 L0.03 568.73 L2.9 568.73 L2.9 568.73 ZM2.9 580.21 L2.9 591.69 L0.03 591.69 L0.03 580.21 L2.9
+						 580.21 L2.9 580.21 Z" class="st18"/>
+		</g>
+		<g id="shape38-91" v:mID="38" v:groupContext="shape" transform="translate(577.354,-193.764)">
+			<title>Sheet.38</title>
+			<path d="M5.59 347.01 L10.92 357.16 L8.38 358.52 L3.04 348.36 L5.59 347.01 L5.59 347.01 ZM14.96 364.78 L16.29 367.32
+						 L13.74 368.67 L12.42 366.13 L14.96 364.78 L14.96 364.78 ZM20.33 374.97 L25.66 385.12 L23.12 386.45 L17.78
+						 376.29 L20.33 374.97 L20.33 374.97 ZM29.7 392.74 L31.03 395.28 L28.48 396.61 L27.16 394.07 L29.7 392.74
+						 L29.7 392.74 ZM35.04 402.9 L40.4 413.06 L37.86 414.38 L32.49 404.22 L35.04 402.9 L35.04 402.9 ZM44.41 420.67
+						 L45.77 423.21 L43.22 424.57 L41.87 422.03 L44.41 420.67 L44.41 420.67 ZM49.78 430.83 L55.14 440.99 L52.6
+						 442.34 L47.23 432.18 L49.78 430.83 L49.78 430.83 ZM59.15 448.61 L60.51 451.15 L57.96 452.5 L56.61 449.96
+						 L59.15 448.61 L59.15 448.61 ZM64.52 458.79 L69.88 468.95 L67.34 470.27 L61.97 460.12 L64.52 458.79 L64.52
+						 458.79 ZM73.89 476.57 L75.25 479.11 L72.7 480.43 L71.35 477.89 L73.89 476.57 L73.89 476.57 ZM79.26 486.72
+						 L84.62 496.88 L82.08 498.21 L76.71 488.05 L79.26 486.72 L79.26 486.72 ZM88.63 504.5 L89.96 507.04 L87.41
+						 508.39 L86.09 505.85 L88.63 504.5 L88.63 504.5 ZM94 514.66 L99.33 524.81 L96.79 526.17 L91.45 516.01 L94
+						 514.66 L94 514.66 ZM103.37 532.43 L104.7 534.97 L102.15 536.32 L100.83 533.79 L103.37 532.43 L103.37 532.43
+						 ZM108.73 542.62 L114.07 552.77 L111.53 554.1 L106.19 543.94 L108.73 542.62 L108.73 542.62 ZM118.11 560.39
+						 L119.44 562.93 L116.89 564.26 L115.57 561.72 L118.11 560.39 L118.11 560.39 ZM123.45 570.55 L128.81 580.71
+						 L126.27 582.03 L120.9 571.87 L123.45 570.55 L123.45 570.55 ZM132.82 588.33 L133.9 590.37 L131.36 591.69
+						 L130.28 589.68 L132.82 588.33 L132.82 588.33 ZM0.28 351.89 L0 339.53 L10.07 346.73 L0.28 351.89 L0.28 351.89
+						 Z" class="st18"/>
+		</g>
+		<g id="shape39-93" v:mID="39" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.39</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st4"/>
+		</g>
+		<g id="shape40-95" v:mID="40" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.40</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape41-97" v:mID="41" v:groupContext="shape" transform="translate(368.774,-150.453)">
+			<title>Sheet.41</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="82.7002" cy="576.19" width="165.41" height="31.0055"/>
+			<path d="M165.4 560.69 L0 560.69 L0 591.69 L165.4 591.69 L165.4 560.69" class="st8"/>
+			<text x="13.94" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape42-101" v:mID="42" v:groupContext="shape" transform="translate(351.856,-123.854)">
+			<title>Sheet.42</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="102.121" cy="578.442" width="204.25" height="26.501"/>
+			<path d="M204.24 565.19 L0 565.19 L0 591.69 L204.24 591.69 L204.24 565.19" class="st8"/>
+			<text x="16.02" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape43-105" v:mID="43" v:groupContext="shape" transform="translate(619.797,-155.563)">
+			<title>Sheet.43</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="28.011" cy="578.442" width="56.03" height="26.501"/>
+			<path d="M56.02 565.19 L0 565.19 L0 591.69 L56.02 591.69 L56.02 565.19" class="st8"/>
+			<text x="6.35" y="585.07" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape44-109" v:mID="44" v:groupContext="shape" transform="translate(700.911,-551.367)">
+			<title>Sheet.44</title>
+			<path d="M0 559.23 L0 591.69 L84.29 591.69 L84.29 559.23 L0 559.23 L0 559.23 Z" class="st2"/>
+		</g>
+		<g id="shape45-111" v:mID="45" v:groupContext="shape" transform="translate(709.883,-555.163)">
+			<title>Sheet.45</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="30.7501" cy="580.032" width="61.51" height="23.3211"/>
+			<path d="M61.5 568.37 L0 568.37 L0 591.69 L61.5 591.69 L61.5 568.37" class="st8"/>
+			<text x="6.38" y="585.86" class="st20" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape46-115" v:mID="46" v:groupContext="shape" transform="translate(1111.54,-477.36)">
+			<title>Sheet.46</title>
+			<desc>Input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="74.9" cy="578.442" width="149.8" height="26.501"/>
+			<path d="M149.8 565.19 L0 565.19 L0 591.69 L149.8 591.69 L149.8 565.19" class="st8"/>
+			<text x="12.47" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Input packet</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
new file mode 100644
index 0000000..f18a327
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
@@ -0,0 +1,477 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-three-seg-mbuf.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="21.8589in" height="9.63966in"
+		viewBox="0 0 1573.84 694.055" xml:space="preserve" color-interpolation-filters="sRGB" class="st23">
+	<title>GSO three-part output segment</title>
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st2 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st3 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.42236}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Calibri;font-size:2.08333em;font-weight:bold}
+		.st10 {fill:#ffffff;font-family:Intel Clear;font-size:2.91502em;font-weight:bold}
+		.st11 {fill:#000000;font-family:Intel Clear;font-size:2.19175em}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:6.58146}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.50001em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.99999em}
+		.st15 {fill:#0070c0;font-family:Intel Clear;font-size:2.19175em}
+		.st16 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st17 {fill:#006fc5;font-family:Intel Clear;font-size:1.92874em}
+		.st18 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.5em}
+		.st20 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0658146}
+		.st21 {fill:#000000;font-family:Intel Clear;font-size:1.81915em}
+		.st22 {fill:#000000;font-family:Intel Clear;font-size:1.49785em}
+		.st23 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<v:layer v:name="top" v:index="0"/>
+		<v:layer v:name="middle" v:index="1"/>
+		<g id="shape111-1" v:mID="111" v:groupContext="shape" v:layerMember="0" transform="translate(787.208,-220.973)">
+			<title>Sheet.111</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape110-3" v:mID="110" v:groupContext="shape" v:layerMember="0" transform="translate(685.078,-560.166)">
+			<title>Sheet.110</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape4-5" v:mID="4" v:groupContext="shape" transform="translate(718.715,-469.955)">
+			<title>Sheet.4</title>
+			<path d="M0 655.13 L0 678.22 C0 686.97 11.69 694.06 26.05 694.06 C40.45 694.06 52.11 686.97 52.11 678.22 L52.11 673.91
+						 L59.55 673.91 L44.66 664.86 L29.78 673.91 L37.22 673.91 L37.22 678.22 C37.22 681.98 32.25 685 26.05 685
+						 C19.89 685 14.89 681.98 14.89 678.22 L14.89 655.13 L0 655.13 Z" class="st3"/>
+		</g>
+		<g id="shape5-7" v:mID="5" v:groupContext="shape" transform="translate(547.831,-656.823)">
+			<title>Sheet.5</title>
+			<path d="M11 686.43 L19.43 686.43 L19.43 688.53 L11 688.53 L11 686.43 L11 686.43 ZM25.76 686.43 L27.87 686.43 L27.87
+						 688.53 L25.76 688.53 L25.76 686.43 L25.76 686.43 ZM34.19 686.43 L42.62 686.43 L42.62 688.53 L34.19 688.53
+						 L34.19 686.43 L34.19 686.43 ZM48.95 686.43 L51.05 686.43 L51.05 688.53 L48.95 688.53 L48.95 686.43 L48.95
+						 686.43 ZM57.38 686.43 L65.81 686.43 L65.81 688.53 L57.38 688.53 L57.38 686.43 L57.38 686.43 ZM72.14 686.43
+						 L74.24 686.43 L74.24 688.53 L72.14 688.53 L72.14 686.43 L72.14 686.43 ZM80.57 686.43 L89 686.43 L89 688.53
+						 L80.57 688.53 L80.57 686.43 L80.57 686.43 ZM95.32 686.43 L97.43 686.43 L97.43 688.53 L95.32 688.53 L95.32
+						 686.43 L95.32 686.43 ZM103.76 686.43 L112.19 686.43 L112.19 688.53 L103.76 688.53 L103.76 686.43 L103.76
+						 686.43 ZM118.51 686.43 L120.62 686.43 L120.62 688.53 L118.51 688.53 L118.51 686.43 L118.51 686.43 ZM126.94
+						 686.43 L135.38 686.43 L135.38 688.53 L126.94 688.53 L126.94 686.43 L126.94 686.43 ZM141.7 686.43 L143.81
+						 686.43 L143.81 688.53 L141.7 688.53 L141.7 686.43 L141.7 686.43 ZM150.13 686.43 L158.57 686.43 L158.57 688.53
+						 L150.13 688.53 L150.13 686.43 L150.13 686.43 ZM164.89 686.43 L167 686.43 L167 688.53 L164.89 688.53 L164.89
+						 686.43 L164.89 686.43 ZM173.32 686.43 L181.75 686.43 L181.75 688.53 L173.32 688.53 L173.32 686.43 L173.32
+						 686.43 ZM188.08 686.43 L190.19 686.43 L190.19 688.53 L188.08 688.53 L188.08 686.43 L188.08 686.43 ZM196.51
+						 686.43 L204.94 686.43 L204.94 688.53 L196.51 688.53 L196.51 686.43 L196.51 686.43 ZM211.27 686.43 L213.38
+						 686.43 L213.38 688.53 L211.27 688.53 L211.27 686.43 L211.27 686.43 ZM219.7 686.43 L228.13 686.43 L228.13
+						 688.53 L219.7 688.53 L219.7 686.43 L219.7 686.43 ZM234.46 686.43 L236.56 686.43 L236.56 688.53 L234.46 688.53
+						 L234.46 686.43 L234.46 686.43 ZM242.89 686.43 L251.32 686.43 L251.32 688.53 L242.89 688.53 L242.89 686.43
+						 L242.89 686.43 ZM257.64 686.43 L259.75 686.43 L259.75 688.53 L257.64 688.53 L257.64 686.43 L257.64 686.43
+						 ZM266.08 686.43 L274.51 686.43 L274.51 688.53 L266.08 688.53 L266.08 686.43 L266.08 686.43 ZM280.83 686.43
+						 L282.94 686.43 L282.94 688.53 L280.83 688.53 L280.83 686.43 L280.83 686.43 ZM289.27 686.43 L297.7 686.43
+						 L297.7 688.53 L289.27 688.53 L289.27 686.43 L289.27 686.43 ZM304.02 686.43 L306.13 686.43 L306.13 688.53
+						 L304.02 688.53 L304.02 686.43 L304.02 686.43 ZM312.45 686.43 L320.89 686.43 L320.89 688.53 L312.45 688.53
+						 L312.45 686.43 L312.45 686.43 ZM327.21 686.43 L329.32 686.43 L329.32 688.53 L327.21 688.53 L327.21 686.43
+						 L327.21 686.43 ZM335.64 686.43 L344.08 686.43 L344.08 688.53 L335.64 688.53 L335.64 686.43 L335.64 686.43
+						 ZM350.4 686.43 L352.51 686.43 L352.51 688.53 L350.4 688.53 L350.4 686.43 L350.4 686.43 ZM358.83 686.43 L367.26
+						 686.43 L367.26 688.53 L358.83 688.53 L358.83 686.43 L358.83 686.43 ZM373.59 686.43 L375.7 686.43 L375.7
+						 688.53 L373.59 688.53 L373.59 686.43 L373.59 686.43 ZM382.02 686.43 L387.06 686.43 L387.06 688.53 L382.02
+						 688.53 L382.02 686.43 L382.02 686.43 ZM13.18 694.06 L0 687.48 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM384.89
+						 680.9 L398.06 687.48 L384.89 694.06 L384.89 680.9 L384.89 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape6-9" v:mID="6" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.6</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape7-11" v:mID="7" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.7</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape10-14" v:mID="10" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.10</title>
+			<path d="M0 597.01 L0 694.06 L563.73 694.06 L563.73 597.01 L0 597.01 L0 597.01 Z" class="st6"/>
+		</g>
+		<g id="shape11-16" v:mID="11" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.11</title>
+			<path d="M0 597.01 L563.73 597.01 L563.73 694.06 L0 694.06 L0 597.01" class="st7"/>
+		</g>
+		<g id="shape12-19" v:mID="12" v:groupContext="shape" transform="translate(262.039,-549.269)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-23" v:mID="13" v:groupContext="shape" transform="translate(547.615,-549.269)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="87.5716" cy="673.065" width="175.15" height="41.9798"/>
+			<path d="M175.14 652.08 L0 652.08 L0 694.06 L175.14 694.06 L175.14 652.08" class="st8"/>
+			<text x="37" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape15-27" v:mID="15" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.15</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape16-29" v:mID="16" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.16</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape17-32" v:mID="17" v:groupContext="shape" transform="translate(6.52106,-546.331)">
+			<title>Sheet.17</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="34.98" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape23-36" v:mID="23" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.23</title>
+			<path d="M0 597.27 L0 694.06 L345.2 694.06 L345.2 597.27 L0 597.27 L0 597.27 Z" class="st4"/>
+		</g>
+		<g id="shape24-38" v:mID="24" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.24</title>
+			<path d="M0 597.27 L345.2 597.27 L345.2 694.06 L0 694.06 L0 597.27" class="st5"/>
+		</g>
+		<g id="shape25-41" v:mID="25" v:groupContext="shape" transform="translate(399.834,-25.6887)">
+			<title>Sheet.25</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="13.76" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape31-45" v:mID="31" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.31</title>
+			<path d="M0 597.27 L0 694.06 L516.21 694.06 L516.21 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape32-47" v:mID="32" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.32</title>
+			<path d="M0 597.27 L516.21 597.27 L516.21 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape33-50" v:mID="33" v:groupContext="shape" transform="translate(809.035,-25.6889)">
+			<title>Sheet.33</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="16.99" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape35-54" v:mID="35" v:groupContext="shape" transform="translate(1199.29,-21.1708)">
+			<title>Sheet.35</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="164.662" cy="678.273" width="329.33" height="31.5648"/>
+			<path d="M329.32 662.49 L0 662.49 L0 694.06 L329.32 694.06 L329.32 662.49" class="st8"/>
+			<text x="24.69" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape38-58" v:mID="38" v:groupContext="shape" transform="translate(1204.65,-254.446)">
+			<title>Sheet.38</title>
+			<desc>Three-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="19.51" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Three-part output segment</text>		</g>
+		<g id="shape39-62" v:mID="39" v:groupContext="shape" transform="translate(546.25,-529.921)">
+			<title>Sheet.39</title>
+			<path d="M3.43 529.94 L3.46 543.61 L0.03 543.61 L0 529.94 L3.43 529.94 L3.43 529.94 ZM3.46 553.87 L3.46 557.29 L0.03
+						 557.29 L0.03 553.87 L3.46 553.87 L3.46 553.87 ZM3.46 567.55 L3.46 581.22 L0.03 581.22 L0.03 567.55 L3.46
+						 567.55 L3.46 567.55 ZM3.46 591.48 L3.46 594.9 L0.03 594.9 L0.03 591.48 L3.46 591.48 L3.46 591.48 ZM3.46
+						 605.16 L3.46 618.83 L0.03 618.83 L0.03 605.16 L3.46 605.16 L3.46 605.16 ZM3.46 629.09 L3.46 632.51 L0.03
+						 632.51 L0.03 629.09 L3.46 629.09 L3.46 629.09 ZM3.46 642.77 L3.46 656.45 L0.03 656.45 L0.03 642.77 L3.46
+						 642.77 L3.46 642.77 ZM3.46 666.7 L3.46 670.12 L0.03 670.12 L0.03 666.7 L3.46 666.7 L3.46 666.7 ZM3.46 680.38
+						 L3.46 694.06 L0.03 694.06 L0.03 680.38 L3.46 680.38 L3.46 680.38 Z" class="st1"/>
+		</g>
+		<g id="shape40-64" v:mID="40" v:groupContext="shape" transform="translate(549.097,-223.749)">
+			<title>Sheet.40</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape46-66" v:mID="46" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.46</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st4"/>
+		</g>
+		<g id="shape47-68" v:mID="47" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.47</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape48-70" v:mID="48" v:groupContext="shape" transform="translate(113.27,-263.667)">
+			<title>Sheet.48</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="98.5041" cy="675.59" width="197.01" height="36.9302"/>
+			<path d="M197.01 657.13 L0 657.13 L0 694.06 L197.01 694.06 L197.01 657.13" class="st8"/>
+			<text x="18.66" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape51-74" v:mID="51" v:groupContext="shape" transform="translate(85.817,-233.439)">
+			<title>Sheet.51</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="127.916" cy="678.273" width="255.84" height="31.5648"/>
+			<path d="M255.83 662.49 L0 662.49 L0 694.06 L255.83 694.06 L255.83 662.49" class="st8"/>
+			<text x="34.33" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape53-78" v:mID="53" v:groupContext="shape" transform="translate(371.944,-275.998)">
+			<title>Sheet.53</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape54-82" v:mID="54" v:groupContext="shape" transform="translate(695.132,-646.04)">
+			<title>Sheet.54</title>
+			<path d="M0 655.39 L0 694.06 L100.4 694.06 L100.4 655.39 L0 655.39 L0 655.39 Z" class="st16"/>
+		</g>
+		<g id="shape55-84" v:mID="55" v:groupContext="shape" transform="translate(709.033,-648.946)">
+			<title>Sheet.55</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="36.6265" cy="680.167" width="73.26" height="27.7775"/>
+			<path d="M73.25 666.28 L0 666.28 L0 694.06 L73.25 694.06 L73.25 666.28" class="st8"/>
+			<text x="7.6" y="687.11" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape56-88" v:mID="56" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.56</title>
+			<path d="M0 597.27 L0 694.06 L363.41 694.06 L363.41 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape57-90" v:mID="57" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.57</title>
+			<path d="M0 597.27 L363.41 597.27 L363.41 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape58-93" v:mID="58" v:groupContext="shape" v:layerMember="0" transform="translate(943.158,-529.889)">
+			<title>Sheet.58</title>
+			<path d="M1.35 529.91 L1.25 543.58 L4.68 543.61 L4.78 529.94 L1.35 529.91 L1.35 529.91 ZM1.15 553.84 L1.12 557.26 L4.55
+						 557.29 L4.58 553.87 L1.15 553.84 L1.15 553.84 ZM1.05 567.52 L0.92 581.19 L4.35 581.22 L4.48 567.55 L1.05
+						 567.52 L1.05 567.52 ZM0.86 591.45 L0.82 594.87 L4.25 594.9 L4.28 591.48 L0.86 591.45 L0.86 591.45 ZM0.72
+						 605.13 L0.63 618.8 L4.05 618.83 L4.15 605.16 L0.72 605.13 L0.72 605.13 ZM0.53 629.06 L0.53 632.48 L3.95
+						 632.51 L3.95 629.09 L0.53 629.06 L0.53 629.06 ZM0.43 642.74 L0.33 656.41 L3.75 656.45 L3.85 642.77 L0.43
+						 642.74 L0.43 642.74 ZM0.23 666.67 L0.2 670.09 L3.62 670.12 L3.66 666.7 L0.23 666.67 L0.23 666.67 ZM0.13
+						 680.35 L0 694.02 L3.43 694.06 L3.56 680.38 L0.13 680.35 L0.13 680.35 Z" class="st18"/>
+		</g>
+		<g id="shape59-95" v:mID="59" v:groupContext="shape" transform="translate(785.874,-549.473)">
+			<title>Sheet.59</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="77.3395" cy="673.065" width="154.68" height="41.9798"/>
+			<path d="M154.68 652.08 L0 652.08 L0 694.06 L154.68 694.06 L154.68 652.08" class="st8"/>
+			<text x="26.77" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape60-99" v:mID="60" v:groupContext="shape" transform="translate(952.97,-548.822)">
+			<title>Sheet.60</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape63-103" v:mID="63" v:groupContext="shape" transform="translate(1210.43,-551.684)">
+			<title>Sheet.63</title>
+			<desc>Multi-segment input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="17.75" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Multi-segment input packet</text>		</g>
+		<g id="shape70-107" v:mID="70" v:groupContext="shape" v:layerMember="1" transform="translate(455.049,-221.499)">
+			<title>Sheet.70</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape71-109" v:mID="71" v:groupContext="shape" transform="translate(455.049,-221.499)">
+			<title>Sheet.71</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape72-111" v:mID="72" v:groupContext="shape" transform="translate(489.065,-263.434)">
+			<title>Sheet.72</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape75-115" v:mID="75" v:groupContext="shape" transform="translate(849.065,-281.435)">
+			<title>Sheet.75</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="4.49" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape77-119" v:mID="77" v:groupContext="shape" transform="translate(717.742,-563.523)">
+			<title>Sheet.77</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="15.71" y="683.67" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape78-123" v:mID="78" v:groupContext="shape" transform="translate(1148.17,-529.067)">
+			<title>Sheet.78</title>
+			<path d="M1.38 529.87 L1.25 543.55 L4.68 543.61 L4.81 529.94 L1.38 529.87 L1.38 529.87 ZM1.19 553.81 L1.12 557.23 L4.55
+						 557.29 L4.61 553.87 L1.19 553.81 L1.19 553.81 ZM1.05 567.48 L0.92 581.16 L4.35 581.22 L4.48 567.55 L1.05
+						 567.48 L1.05 567.48 ZM0.86 591.42 L0.86 594.84 L4.28 594.9 L4.28 591.48 L0.86 591.42 L0.86 591.42 ZM0.72
+						 605.09 L0.66 618.77 L4.08 618.83 L4.15 605.16 L0.72 605.09 L0.72 605.09 ZM0.53 629.03 L0.53 632.45 L3.95
+						 632.51 L3.95 629.09 L0.53 629.03 L0.53 629.03 ZM0.46 642.7 L0.33 656.38 L3.75 656.45 L3.89 642.77 L0.46
+						 642.7 L0.46 642.7 ZM0.26 666.64 L0.2 670.06 L3.62 670.12 L3.69 666.7 L0.26 666.64 L0.26 666.64 ZM0.13 680.31
+						 L0 693.99 L3.43 694.06 L3.56 680.38 L0.13 680.31 L0.13 680.31 Z" class="st20"/>
+		</g>
+		<g id="shape79-125" v:mID="79" v:groupContext="shape" transform="translate(946.254,-657.81)">
+			<title>Sheet.79</title>
+			<path d="M11 686.69 L17.33 686.69 L17.33 688.27 L11 688.27 L11 686.69 L11 686.69 ZM22.07 686.69 L23.65 686.69 L23.65
+						 688.27 L22.07 688.27 L22.07 686.69 L22.07 686.69 ZM28.39 686.69 L34.72 686.69 L34.72 688.27 L28.39 688.27
+						 L28.39 686.69 L28.39 686.69 ZM39.46 686.69 L41.04 686.69 L41.04 688.27 L39.46 688.27 L39.46 686.69 L39.46
+						 686.69 ZM45.78 686.69 L52.11 686.69 L52.11 688.27 L45.78 688.27 L45.78 686.69 L45.78 686.69 ZM56.85 686.69
+						 L58.43 686.69 L58.43 688.27 L56.85 688.27 L56.85 686.69 L56.85 686.69 ZM63.18 686.69 L69.5 686.69 L69.5
+						 688.27 L63.18 688.27 L63.18 686.69 L63.18 686.69 ZM74.24 686.69 L75.82 686.69 L75.82 688.27 L74.24 688.27
+						 L74.24 686.69 L74.24 686.69 ZM80.57 686.69 L86.89 686.69 L86.89 688.27 L80.57 688.27 L80.57 686.69 L80.57
+						 686.69 ZM91.63 686.69 L93.22 686.69 L93.22 688.27 L91.63 688.27 L91.63 686.69 L91.63 686.69 ZM97.96 686.69
+						 L104.28 686.69 L104.28 688.27 L97.96 688.27 L97.96 686.69 L97.96 686.69 ZM109.03 686.69 L110.61 686.69 L110.61
+						 688.27 L109.03 688.27 L109.03 686.69 L109.03 686.69 ZM115.35 686.69 L121.67 686.69 L121.67 688.27 L115.35
+						 688.27 L115.35 686.69 L115.35 686.69 ZM126.42 686.69 L128 686.69 L128 688.27 L126.42 688.27 L126.42 686.69
+						 L126.42 686.69 ZM132.74 686.69 L139.07 686.69 L139.07 688.27 L132.74 688.27 L132.74 686.69 L132.74 686.69
+						 ZM143.81 686.69 L145.39 686.69 L145.39 688.27 L143.81 688.27 L143.81 686.69 L143.81 686.69 ZM150.13 686.69
+						 L156.46 686.69 L156.46 688.27 L150.13 688.27 L150.13 686.69 L150.13 686.69 ZM161.2 686.69 L162.78 686.69
+						 L162.78 688.27 L161.2 688.27 L161.2 686.69 L161.2 686.69 ZM167.53 686.69 L173.85 686.69 L173.85 688.27 L167.53
+						 688.27 L167.53 686.69 L167.53 686.69 ZM178.59 686.69 L180.17 686.69 L180.17 688.27 L178.59 688.27 L178.59
+						 686.69 L178.59 686.69 ZM184.92 686.69 L189.4 686.69 L189.4 688.27 L184.92 688.27 L184.92 686.69 L184.92
+						 686.69 ZM13.18 694.06 L0 687.41 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM187.22 680.9 L200.4 687.48 L187.22
+						 694.06 L187.22 680.9 L187.22 680.9 Z" class="st20"/>
+		</g>
+		<g id="shape80-127" v:mID="80" v:groupContext="shape" transform="translate(982.882,-643.673)">
+			<title>Sheet.80</title>
+			<path d="M0 655.13 L0 694.06 L127.01 694.06 L127.01 655.13 L0 655.13 L0 655.13 Z" class="st16"/>
+		</g>
+		<g id="shape81-129" v:mID="81" v:groupContext="shape" transform="translate(1003.39,-660.621)">
+			<title>Sheet.81</title>
+			<desc>pkt_len</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="48.6041" cy="680.956" width="97.21" height="26.1994"/>
+			<path d="M97.21 667.86 L0 667.86 L0 694.06 L97.21 694.06 L97.21 667.86" class="st8"/>
+			<text x="11.67" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>pkt_len  </text>		</g>
+		<g id="shape82-133" v:mID="82" v:groupContext="shape" transform="translate(1001.18,-634.321)">
+			<title>Sheet.82</title>
+			<desc>% segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="49.2945" cy="680.956" width="98.59" height="26.1994"/>
+			<path d="M98.59 667.86 L0 667.86 L0 694.06 L98.59 694.06 L98.59 667.86" class="st8"/>
+			<text x="9.09" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>% segsz</text>		</g>
+		<g id="shape34-137" v:mID="34" v:groupContext="shape" v:layerMember="0" transform="translate(356.703,-264.106)">
+			<title>Sheet.34</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape85-139" v:mID="85" v:groupContext="shape" v:layerMember="0" transform="translate(78.5359,-282.66)">
+			<title>Sheet.85</title>
+			<path d="M0 680.87 C-0 673.59 6.88 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.88 694.06 0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape87-141" v:mID="87" v:groupContext="shape" v:layerMember="0" transform="translate(85.4791,-284.062)">
+			<title>Sheet.87</title>
+			<desc>1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>1</text>		</g>
+		<g id="shape88-145" v:mID="88" v:groupContext="shape" v:layerMember="0" transform="translate(468.906,-282.66)">
+			<title>Sheet.88</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape90-147" v:mID="90" v:groupContext="shape" v:layerMember="0" transform="translate(474.575,-284.062)">
+			<title>Sheet.90</title>
+			<desc>2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>2</text>		</g>
+		<g id="shape95-151" v:mID="95" v:groupContext="shape" v:layerMember="0" transform="translate(764.026,-275.998)">
+			<title>Sheet.95</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape97-155" v:mID="97" v:groupContext="shape" v:layerMember="0" transform="translate(889.755,-220.915)">
+			<title>Sheet.97</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape100-157" v:mID="100" v:groupContext="shape" v:layerMember="0" transform="translate(751.857,-262.528)">
+			<title>Sheet.100</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape104-159" v:mID="104" v:groupContext="shape" v:layerMember="1" transform="translate(851.429,-218.08)">
+			<title>Sheet.104</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape105-161" v:mID="105" v:groupContext="shape" v:layerMember="0" transform="translate(885.444,-260.015)">
+			<title>Sheet.105</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape106-165" v:mID="106" v:groupContext="shape" v:layerMember="0" transform="translate(895.672,-229.419)">
+			<title>Sheet.106</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape107-169" v:mID="107" v:groupContext="shape" v:layerMember="0" transform="translate(863.297,-280.442)">
+			<title>Sheet.107</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape108-171" v:mID="108" v:groupContext="shape" v:layerMember="0" transform="translate(870.001,-281.547)">
+			<title>Sheet.108</title>
+			<desc>3</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>3</text>		</g>
+		<g id="shape109-175" v:mID="109" v:groupContext="shape" v:layerMember="0" transform="translate(500.959,-231.87)">
+			<title>Sheet.109</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 40f04a1..c7c8b17 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -56,6 +56,7 @@ Programmer's Guide
     reorder_lib
     ip_fragment_reassembly_lib
     generic_receive_offload_lib
+    generic_segmentation_offload_lib
     pdump_lib
     multi_proc_support
     kernel_nic_interface
-- 
1.9.3


* Re: [PATCH v8 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
@ 2017-10-05 17:12               ` Ananyev, Konstantin
  2017-10-05 20:16                 ` Kavanagh, Mark B
  2017-10-05 20:36               ` [PATCH v9 " Mark Kavanagh
                                 ` (6 subsequent siblings)
  7 siblings, 1 reply; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-05 17:12 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas

Hi Mark,

> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, October 5, 2017 4:44 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [PATCH v8 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
> 
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To enable more flexibility to applications, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. This patch adds GSO support to DPDK for specific
> packet types: specifically, TCP/IPv4, VxLAN, and GRE.
> 
> The first patch introduces the GSO API framework. The second patch
> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> tag). The third patch adds GSO support for VxLAN packets that contain
> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
> outer VLAN tags). The fourth patch adds GSO support for GRE packets
> that contain outer IPv4, and inner TCP/IPv4 headers (with optional
> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
> and GRE GSO in testpmd's checksum forwarding engine. The final patch
> in the series adds GSO documentation to the programmer's guide.
> 
> Performance Testing
> ===================
> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> iperf. Setup for the test is described as follows:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>    machine, together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum
>    forwarding engine with "retry".
> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>    checksum calculation for vhost-user port.
> d. Launch a VM with csum and tso offloading enabled.
> e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>    With csum and tso enabled, the VM can send large TCP/IPv4 packets
>    (mss is up to 64KB).
> f. P1 is assigned to the Linux kernel, with kernel GRO enabled. Run
>    iperf-server on P1.
> 
> We conduct three iperf tests:
> 
> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>     to 1518B. Run two iperf-client in the VM.
> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>     two iperf-client in the VM.
> test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.
> 
> Throughput of the above three tests:
> 
> test-1: 9.4Gbps
> test-2: 9.5Gbps
> test-3: 3Mbps
> 
> Functional Testing
> ==================
> Unlike plain TCP packets, VMs can't send large VxLAN or GRE packets: the
> max length of tunneled packets from VMs is 1514B. The current experiment
> method therefore can't be used to measure VxLAN and GRE GSO performance;
> we simply test the functionality by setting a small GSO segment length
> (e.g. 500B).
> 
> VxLAN
> -----
> To test VxLAN GSO functionality, we use the following setup:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>    machine, together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>    engine with "retry".
> c. Testpmd commands:
>     - csum parse_tunnel on "P0"
>     - csum parse_tunnel on "vhost-user port"
>     - csum set outer-ip hw "P0"
>     - csum set ip hw "P0"
>     - csum set tcp hw "P0"
>     - csum set tcp hw "vhost-user port"
>     - set port "P0" gso on
>     - set gso segsz 500
> d. Launch a VM with csum and tso offloading enabled.
> e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>    max packet length is 1514B.
> f. P1 is assigned to linux kernel and kernel GRO is disabled. Similarly,
>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
> 
> In testpmd, we can see that the length of every packet sent from P0 is
> smaller than or equal to 500B. Additionally, the packets arriving at P1 are
> encapsulated and are smaller than or equal to 500B.
> 
> GRE
> ---
> The same process may be used to test GRE functionality, with the exception that
> the tunnel type created for both the guest's virtio-net, and the host's kernel
> interfaces is GRE:
>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
> 
> As in the VxLAN testcase, the length of packets sent from P0, and received on
> P1, is smaller than or equal to 500B.
> 
> Change log
> ==========
> v8:
> - resolve coding style infractions (indentation).
> - centralize invalid parameter checking for rte_gso_segment() into a single
>   'if' statement.
> - don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
>   on account of invalid params.
> - allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
>   statement condition).

Last (hopefully :)) few nits from me:
1. [dpdk-dev,v8,5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
The i686 build fails for me; I think you need to:

--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -4052,7 +4052,7 @@ cmd_gso_size_parsed(void *parsed_result,
        if (!strcmp(res->cmd_keyword, "gso") &&
                        !strcmp(res->cmd_segsz, "segsz")) {
                if (res->cmd_size < RTE_GSO_SEG_SIZE_MIN)
-                       printf("gso_size should be larger than %lu."
+                       printf("gso_size should be larger than %zu."
                                        " Please input a legal value\n",
                                        RTE_GSO_SEG_SIZE_MIN);
                else
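
For reference, a minimal standalone sketch of why the conversion matters. It assumes the minimum is built from sizeof() terms and therefore has type size_t, which is what the %zu change suggests; the demo_* structs, the DEMO_GSO_SEG_SIZE_MIN macro and demo_format_min_warning() are hypothetical stand-ins, not the DPDK definitions:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the Ethernet, IPv4 and TCP header structs. */
struct demo_ether_hdr { unsigned char b[14]; };
struct demo_ipv4_hdr  { unsigned char b[20]; };
struct demo_tcp_hdr   { unsigned char b[20]; };

/* Assumed shape of the minimum: a sum of sizeof() terms, hence size_t. */
#define DEMO_GSO_SEG_SIZE_MIN (sizeof(struct demo_ether_hdr) + \
		sizeof(struct demo_ipv4_hdr) + \
		sizeof(struct demo_tcp_hdr) + 1)

/* Format the warning with %zu, the conversion that matches size_t. */
int
demo_format_min_warning(char *buf, size_t len)
{
	return snprintf(buf, len,
			"gso_size should be larger than %zu."
			" Please input a legal value\n",
			DEMO_GSO_SEG_SIZE_MIN);
}
```

On i686 Linux size_t is typically unsigned int rather than unsigned long, so passing it to %lu trips the format check even though both types are 32 bits wide there; %zu matches size_t on every target.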

2. [dpdk-dev,v8,2/6] gso: add TCP/IPv4 GSO support

int
 rte_gso_segment(struct rte_mbuf *pkt,
...
+	} else {
+		pkts_out[0] = pkt;
+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+

I still think that log level should be DEBUG here.
Konstantin

> 
> v7:
> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
> - rename 'ipid_flag' member of gso_ctx to 'flag'.
> - remove mention of VLAN tags in supported packet types.
> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
> - take all packet overhead into account when checking for empty packet.
> - ensure that only enabled GSO types are acted upon (i.e. no fall-through to
>   TCP/IPv4 case from tunneled case).
> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
> - simplify error-checking/handling for GSO failure case in testpmd csum engine.
> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.
> 
> v6:
> - rebase to HEAD of master (i5dce9fcA)
> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
> 
> v5:
> - add GSO section to the programmer's guide.
> - use MF or (previously 'and') offset to check if a packet is IP
>   fragmented.
> - move 'update_header' helper functions to gso_common.h.
> - move tcp/ipv4 'update_header' function to gso_tcp4.c.
> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
> - add offset parameter to 'update_header' functions.
> - combine GRE and VxLAN tunnel header update functions into a single
>   function.
> - correct typos and errors in comments/commit messages.
> 
> v4:
> - use ol_flags instead of packet_type to decide which segmentation
>   function to use.
> - use MF and offset to check if a packet is IP fragmented, instead of
>   using DF.
> - remove ETHER_CRC_LEN from gso segment payload length calculation.
> - refactor internal header update and other functions.
> - remove RTE_GSO_IPID_INCREASE.
> - add some GSO documentation.
> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>   packets sent from GSO-enabled ports in testpmd.
> v3:
> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
>   UNKNOWN.
> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
> - store the input packet into pkts_out inside gso_tcp4_segment() and
>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>   is performed.
> - add missing includes.
> - optimize file names, function names and function description.
> - fix one bug in testpmd.
> v2:
> - merge data segments whose data_len is less than mss into a large data
>   segment in gso_do_segment().
> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>   header in rte_gso_segment().
> - provide IP id macros for applications to select fixed or incremental IP
>   ids.
> 
> Jiayu Hu (3):
>   gso: add Generic Segmentation Offload API framework
>   gso: add TCP/IPv4 GSO support
>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> Mark Kavanagh (3):
>   gso: add VxLAN GSO support
>   gso: add GRE GSO support
>   doc: add GSO programmer's guide
> 
>  MAINTAINERS                                        |   6 +
>  app/test-pmd/cmdline.c                             | 179 ++++++++
>  app/test-pmd/config.c                              |  24 ++
>  app/test-pmd/csumonly.c                            |  42 +-
>  app/test-pmd/testpmd.c                             |  13 +
>  app/test-pmd/testpmd.h                             |  10 +
>  config/common_base                                 |   5 +
>  doc/api/doxy-api-index.md                          |   1 +
>  doc/api/doxy-api.conf                              |   1 +
>  .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
>  .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
>  doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
>  doc/guides/prog_guide/index.rst                    |   1 +
>  doc/guides/rel_notes/release_17_11.rst             |  17 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
>  lib/Makefile                                       |   2 +
>  lib/librte_eal/common/include/rte_log.h            |   1 +
>  lib/librte_gso/Makefile                            |  52 +++
>  lib/librte_gso/gso_common.c                        | 153 +++++++
>  lib/librte_gso/gso_common.h                        | 171 ++++++++
>  lib/librte_gso/gso_tcp4.c                          | 104 +++++
>  lib/librte_gso/gso_tcp4.h                          |  74 ++++
>  lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
>  lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
>  lib/librte_gso/rte_gso.c                           | 110 +++++
>  lib/librte_gso/rte_gso.h                           | 148 +++++++
>  lib/librte_gso/rte_gso_version.map                 |   7 +
>  mk/rte.app.mk                                      |   1 +
>  28 files changed, 2411 insertions(+), 4 deletions(-)
>  create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
>  create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
>  create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
>  create mode 100644 lib/librte_gso/Makefile
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tcp4.h
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>  create mode 100644 lib/librte_gso/rte_gso.c
>  create mode 100644 lib/librte_gso/rte_gso.h
>  create mode 100644 lib/librte_gso/rte_gso_version.map
> 
> --
> 1.9.3


* Re: [PATCH v8 6/6] doc: add GSO programmer's guide
  2017-10-05 15:44             ` [PATCH v8 6/6] doc: add GSO programmer's guide Mark Kavanagh
@ 2017-10-05 17:57               ` Mcnamara, John
  0 siblings, 0 replies; 157+ messages in thread
From: Mcnamara, John @ 2017-10-05 17:57 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev
  Cc: Hu, Jiayu, Tan, Jianfeng, Ananyev, Konstantin, Yigit, Ferruh,
	thomas, Kavanagh, Mark B



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Mark Kavanagh
> Sent: Thursday, October 5, 2017 4:44 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [dpdk-dev] [PATCH v8 6/6] doc: add GSO programmer's guide
> 
> Add programmer's guide doc to explain the design and use of the
> GSO library.
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>

Acked-by: John McNamara <john.mcnamara@intel.com>


* Re: [PATCH v8 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 17:12               ` Ananyev, Konstantin
@ 2017-10-05 20:16                 ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-05 20:16 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Thursday, October 5, 2017 6:12 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v8 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
>
>Hi Mark,
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Thursday, October 5, 2017 4:44 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
>> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>> Subject: [PATCH v8 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
>>
>> Generic Segmentation Offload (GSO) is a SW technique to split large
>> packets into small ones. Akin to TSO, GSO enables applications to
>> operate on large packets, thus reducing per-packet processing overhead.
>>
>> To give applications more flexibility, DPDK GSO is implemented
>> as a standalone library. Applications explicitly use the GSO library
>> to segment packets. This patch set adds GSO support to DPDK for specific
>> packet types: TCP/IPv4, VxLAN, and GRE.
>>
>> The first patch introduces the GSO API framework. The second patch
>> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
>> tag). The third patch adds GSO support for VxLAN packets that contain
>> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
>> outer VLAN tags). The fourth patch adds GSO support for GRE packets
>> that contain outer IPv4, and inner TCP/IPv4 headers (with optional
>> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
>> and GRE GSO in testpmd's checksum forwarding engine. The final patch
>> in the series adds GSO documentation to the programmer's guide.
>>
>> Performance Testing
>> ===================
>> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
>> iperf. Setup for the test is described as follows:
>>
>> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>>    machine, together physically.
>> b. Launch testpmd with P0 and a vhost-user port, and use csum
>>    forwarding engine with "retry".
>> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>>    checksum calculation for vhost-user port.
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>>    With csum and tso enabled, the VM can send large TCP/IPv4 packets
>>    (mss is up to 64KB).
>> f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
>>    iperf-server on P1.
>>
>> We conduct three iperf tests:
>>
>> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>>     to 1518B. Run two iperf-client in the VM.
>> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>>     two iperf-client in the VM.
>> test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.
>>
>> Throughput of the above three tests:
>>
>> test-1: 9.4Gbps
>> test-2: 9.5Gbps
>> test-3: 3Mbps
>>
>> Functional Testing
>> ==================
>> Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
>> length of tunneled packets from VMs is 1514B. The current experiment
>> method therefore can't measure VxLAN and GRE GSO performance; it can only
>> test functionality, by setting a small GSO segment length (e.g. 500B).
>>
>> VxLAN
>> -----
>> To test VxLAN GSO functionality, we use the following setup:
>>
>> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>>    machine, together physically.
>> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>>    engine with "retry".
>> c. Testpmd commands:
>>     - csum parse_tunnel on "P0"
>>     - csum parse_tunnel on "vhost-user port"
>>     - csum set outer-ip hw "P0"
>>     - csum set ip hw "P0"
>>     - csum set tcp hw "P0"
>>     - csum set tcp hw "vhost-user port"
>>     - set port "P0" gso on
>>     - set gso segsz 500
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
>>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>>    max packet length is 1514B.
>> f. Assign P1 to the Linux kernel with kernel GRO disabled. Similarly,
>>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
>>
>> In testpmd, we can see the length of all packets sent from P0 is smaller
>> than or equal to 500B. Additionally, the packets arriving at P1 are
>> encapsulated and are smaller than or equal to 500B.
>>
>> GRE
>> ---
>> The same process may be used to test GRE functionality, with the exception
>> that the tunnel type created for both the guest's virtio-net and the host's
>> kernel interfaces is GRE:
>>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
>>
>> As in the VxLAN testcase, the length of packets sent from P0, and received
>> on P1, is no more than 500B.
>>
>> Change log
>> ==========
>> v8:
>> - resolve coding style infractions (indentation).
>> - centralize invalid parameter checking for rte_gso_segment() into a single
>>   'if' statement.
>> - don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
>>   on account of invalid params.
>> - allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
>>   statement condition).
>

Hey Konstantin,

>Last (hopefully :)) few nits from me:

No worries! :)

>1. [dpdk-dev,v8,5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>I686 build fails for me, I think you need to:

Oh wow - sorry about that. I hadn't considered building for that target to be honest - thanks for both the catch, and the fix!
>
>--- a/app/test-pmd/cmdline.c
>+++ b/app/test-pmd/cmdline.c
>@@ -4052,7 +4052,7 @@ cmd_gso_size_parsed(void *parsed_result,
>        if (!strcmp(res->cmd_keyword, "gso") &&
>                        !strcmp(res->cmd_segsz, "segsz")) {
>                if (res->cmd_size < RTE_GSO_SEG_SIZE_MIN)
>-                       printf("gso_size should be larger than %lu."
>+                       printf("gso_size should be larger than %zu."
>                                        " Please input a legal value\n",
>                                        RTE_GSO_SEG_SIZE_MIN);
>                else
>
>2. [dpdk-dev,v8,2/6] gso: add TCP/IPv4 GSO support
>
>int
> rte_gso_segment(struct rte_mbuf *pkt,
>...
>+	} else {
>+		pkts_out[0] = pkt;
>+		RTE_LOG(WARNING, GSO, "Unsupported packet type\n");
>+		return 1;
>+	}
>+
>
>I still think that log level should be DEBUG here.
>Konstantin

No problem. If you pointed that out earlier I certainly missed it - apologies!

Many thanks again,
Mark
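
For anyone who hits the same i686 failure, here is a minimal standalone sketch of why `%lu` breaks there. It assumes the minimum segment size is built from `sizeof` expressions and therefore has type `size_t`, which is 32 bits wide on i686 while `unsigned long` is what `%lu` expects. `DEMO_SEG_SIZE_MIN`, the `demo_*` structs and `demo_format_warning()` are hypothetical stand-ins for illustration, not the definitions in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the Ethernet, IPv4 and TCP header structs. */
struct demo_eth_hdr  { unsigned char bytes[14]; };
struct demo_ipv4_hdr { unsigned char bytes[20]; };
struct demo_tcp_hdr  { unsigned char bytes[20]; };

/* A constant built from sizeof has type size_t, not unsigned long. */
#define DEMO_SEG_SIZE_MIN (sizeof(struct demo_eth_hdr) + \
		sizeof(struct demo_ipv4_hdr) + sizeof(struct demo_tcp_hdr) + 1)

/*
 * Format the warning into buf. %zu is the conversion for size_t, so the
 * format string matches the argument on both 32-bit and 64-bit targets.
 */
static int
demo_format_warning(char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "gso_size should be larger than %zu."
			" Please input a legal value\n", DEMO_SEG_SIZE_MIN);
}
```

With `%lu` the same call compiles cleanly on x86_64, where `size_t` and `unsigned long` happen to have the same width, which is presumably why the mismatch only showed up in the i686 build.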

>
>>
>> v7:
>> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
>> - rename 'ipid_flag' member of gso_ctx to 'flag'.
>> - remove mention of VLAN tags in supported packet types.
>> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
>> - take all packet overhead into account when checking for empty packet.
>> - ensure that only enabled GSO types are enacted upon (i.e. no fall-through
>>   to TCP/IPv4 case from tunneled case).
>> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in
>>   testpmd.
>> - simplify error-checking/handling for GSO failure case in testpmd csum
>>   engine.
>> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.
>>
>> v6:
>> - rebase to HEAD of master (i5dce9fcA)
>> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
>>
>> v5:
>> - add GSO section to the programmer's guide.
>> - use MF or (previously 'and') offset to check if a packet is IP
>>   fragmented.
>> - move 'update_header' helper functions to gso_common.h.
>> - move tcp/ipv4 'update_header' function to gso_tcp4.c.
>> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
>> - add offset parameter to 'update_header' functions.
>> - combine GRE and VxLAN tunnel header update functions into a single
>>   function.
>> - correct typos and errors in comments/commit messages.
>>
>> v4:
>> - use ol_flags instead of packet_type to decide which segmentation
>>   function to use.
>> - use MF and offset to check if a packet is IP fragmented, instead of
>>   using DF.
>> - remove ETHER_CRC_LEN from gso segment payload length calculation.
>> - refactor internal header update and other functions.
>> - remove RTE_GSO_IPID_INCREASE.
>> - add some GSO documentation.
>> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>>   packets sent from GSO-enabled ports in testpmd.
>> v3:
>> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
>>   UNKNOWN.
>> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
>> - store the input packet into pkts_out inside gso_tcp4_segment() and
>>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>>   is performed.
>> - add missing includes.
>> - optimize file names, function names and function description.
>> - fix one bug in testpmd.
>> v2:
>> - merge data segments whose data_len is less than mss into a large data
>>   segment in gso_do_segment().
>> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>>   header in rte_gso_segment().
>> - provide IP id macros for applications to select fixed or incremental IP
>>   ids.
>>
>> Jiayu Hu (3):
>>   gso: add Generic Segmentation Offload API framework
>>   gso: add TCP/IPv4 GSO support
>>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>>
>> Mark Kavanagh (3):
>>   gso: add VxLAN GSO support
>>   gso: add GRE GSO support
>>   doc: add GSO programmer's guide
>>
>>  MAINTAINERS                                        |   6 +
>>  app/test-pmd/cmdline.c                             | 179 ++++++++
>>  app/test-pmd/config.c                              |  24 ++
>>  app/test-pmd/csumonly.c                            |  42 +-
>>  app/test-pmd/testpmd.c                             |  13 +
>>  app/test-pmd/testpmd.h                             |  10 +
>>  config/common_base                                 |   5 +
>>  doc/api/doxy-api-index.md                          |   1 +
>>  doc/api/doxy-api.conf                              |   1 +
>>  .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
>>  .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
>>  doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
>>  doc/guides/prog_guide/index.rst                    |   1 +
>>  doc/guides/rel_notes/release_17_11.rst             |  17 +
>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
>>  lib/Makefile                                       |   2 +
>>  lib/librte_eal/common/include/rte_log.h            |   1 +
>>  lib/librte_gso/Makefile                            |  52 +++
>>  lib/librte_gso/gso_common.c                        | 153 +++++++
>>  lib/librte_gso/gso_common.h                        | 171 ++++++++
>>  lib/librte_gso/gso_tcp4.c                          | 104 +++++
>>  lib/librte_gso/gso_tcp4.h                          |  74 ++++
>>  lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
>>  lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
>>  lib/librte_gso/rte_gso.c                           | 110 +++++
>>  lib/librte_gso/rte_gso.h                           | 148 +++++++
>>  lib/librte_gso/rte_gso_version.map                 |   7 +
>>  mk/rte.app.mk                                      |   1 +
>>  28 files changed, 2411 insertions(+), 4 deletions(-)
>>  create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
>>  create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
>>  create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
>>  create mode 100644 lib/librte_gso/Makefile
>>  create mode 100644 lib/librte_gso/gso_common.c
>>  create mode 100644 lib/librte_gso/gso_common.h
>>  create mode 100644 lib/librte_gso/gso_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tcp4.h
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>>  create mode 100644 lib/librte_gso/rte_gso.c
>>  create mode 100644 lib/librte_gso/rte_gso.h
>>  create mode 100644 lib/librte_gso/rte_gso_version.map
>>
>> --
>> 1.9.3


* [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
  2017-10-05 17:12               ` Ananyev, Konstantin
@ 2017-10-05 20:36               ` Mark Kavanagh
  2017-10-05 22:24                 ` Ananyev, Konstantin
                                   ` (2 more replies)
  2017-10-05 20:36               ` [PATCH v9 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
                                 ` (5 subsequent siblings)
  7 siblings, 3 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 20:36 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch set adds GSO support to DPDK for specific
packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine. The final patch
in the series adds GSO documentation to the programmer's guide.

Performance Testing
===================
The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
    to 1518B. Run two iperf-client in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
    two iperf-client in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.

Throughput of the above three tests:

test-1: 9.4Gbps
test-2: 9.5Gbps
test-3: 3Mbps

Functional Testing
==================
Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
length of tunneled packets from VMs is 1514B. The current experiment
method therefore can't measure VxLAN and GRE GSO performance; it can only
test functionality, by setting a small GSO segment length (e.g. 500B).

VxLAN
-----
To test VxLAN GSO functionality, we use the following setup:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
   engine with "retry".
c. Testpmd commands:
    - csum parse_tunnel on "P0"
    - csum parse_tunnel on "vhost-user port"
    - csum set outer-ip hw "P0"
    - csum set ip hw "P0"
    - csum set tcp hw "P0"
    - csum set tcp hw "vhost-user port"
    - set port "P0" gso on
    - set gso segsz 500
d. Launch a VM with csum and tso offloading enabled.
e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
   on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
   max packet length is 1514B.
f. Assign P1 to the Linux kernel with kernel GRO disabled. Similarly,
   create a VxLAN port for P1, and run iperf-server on the VxLAN port.

In testpmd, we can see the length of all packets sent from P0 is smaller
than or equal to 500B. Additionally, the packets arriving at P1 are
encapsulated and are smaller than or equal to 500B.

GRE
---
The same process may be used to test GRE functionality, with the exception that
the tunnel type created for both the guest's virtio-net, and the host's kernel
interfaces is GRE:
   `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`

As in the VxLAN testcase, the length of packets sent from P0, and received on
P1, is no more than 500B.

Change log
==========
v9:
- fix testpmd build for i686 target
- change log level from WARNING to DEBUG in the case of unsupported packet
  (rte_gso_segment())

v8:
- resolve coding style infractions (indentation).
- centralize invalid parameter checking for rte_gso_segment() into a single
  'if' statement.
- don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
  on account of invalid params.
- allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
  statement condition).

v7:
- add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
- rename 'ipid_flag' member of gso_ctx to 'flag'.
- remove mention of VLAN tags in supported packet types.
- don't clear PKT_TX_TCP_SEG flag if GSO fails.
- take all packet overhead into account when checking for empty packet.
- ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
  TCP/IPv4 case from tunneled case).
- validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
- simplify error-checking/handling for GSO failure case in testpmd csum engine.
- use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.

v6:
- rebase to HEAD of master (i5dce9fcA)
- remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'

v5:
- add GSO section to the programmer's guide.
- use MF or (previously 'and') offset to check if a packet is IP
  fragmented.
- move 'update_header' helper functions to gso_common.h.
- move tcp/ipv4 'update_header' function to gso_tcp4.c.
- move tunnel 'update_header' function to gso_tunnel_tcp4.c.
- add offset parameter to 'update_header' functions.
- combine GRE and VxLAN tunnel header update functions into a single
  function.
- correct typos and errors in comments/commit messages.

v4:
- use ol_flags instead of packet_type to decide which segmentation
  function to use.
- use MF and offset to check if a packet is IP fragmented, instead of
  using DF.
- remove ETHER_CRC_LEN from gso segment payload length calculation.
- refactor internal header update and other functions.
- remove RTE_GSO_IPID_INCREASE.
- add some GSO documentation.
- set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
  packets sent from GSO-enabled ports in testpmd.
v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
  UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.
v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.
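
The v4/v5 entries on detecting IP fragments can be illustrated with a short standalone sketch. It checks the MF bit or a non-zero fragment offset, as the v5 entry describes, and ignores DF. The `DEMO_*` constants and `demo_is_ipv4_fragment()` are hypothetical names used only here; the bit positions follow the IPv4 header layout, and the field is assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IPV4_DF_FLAG     0x4000u /* "don't fragment" bit (not consulted) */
#define DEMO_IPV4_MF_FLAG     0x2000u /* "more fragments" bit */
#define DEMO_IPV4_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/*
 * frag_field is the 16-bit flags/fragment-offset field of the IPv4 header,
 * in host byte order. A packet is a fragment if MF is set (first or middle
 * fragment) OR the offset is non-zero (middle or last fragment).
 */
static bool
demo_is_ipv4_fragment(uint16_t frag_field)
{
	return (frag_field &
		(DEMO_IPV4_MF_FLAG | DEMO_IPV4_OFFSET_MASK)) != 0;
}
```

Requiring both MF and a non-zero offset would miss the first fragment (offset 0, MF set) and the last one (offset non-zero, MF clear), which is one way to read the switch from 'and' to 'or' in v5.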

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (3):
  gso: add VxLAN GSO support
  gso: add GRE GSO support
  doc: add GSO programmer's guide

 MAINTAINERS                                        |   6 +
 app/test-pmd/cmdline.c                             | 179 ++++++++
 app/test-pmd/config.c                              |  24 ++
 app/test-pmd/csumonly.c                            |  42 +-
 app/test-pmd/testpmd.c                             |  13 +
 app/test-pmd/testpmd.h                             |  10 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/rel_notes/release_17_11.rst             |  17 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
 lib/Makefile                                       |   2 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_gso/Makefile                            |  52 +++
 lib/librte_gso/gso_common.c                        | 153 +++++++
 lib/librte_gso/gso_common.h                        | 171 ++++++++
 lib/librte_gso/gso_tcp4.c                          | 104 +++++
 lib/librte_gso/gso_tcp4.h                          |  74 ++++
 lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
 lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
 lib/librte_gso/rte_gso.c                           | 110 +++++
 lib/librte_gso/rte_gso.h                           | 148 +++++++
 lib/librte_gso/rte_gso_version.map                 |   7 +
 mk/rte.app.mk                                      |   1 +
 28 files changed, 2411 insertions(+), 4 deletions(-)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
1.9.3


* [PATCH v9 1/6] gso: add Generic Segmentation Offload API framework
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
  2017-10-05 17:12               ` Ananyev, Konstantin
  2017-10-05 20:36               ` [PATCH v9 " Mark Kavanagh
@ 2017-10-05 20:36               ` Mark Kavanagh
  2017-10-05 20:36               ` [PATCH v9 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
                                 ` (4 subsequent siblings)
  7 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 20:36 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set the proper flags in mbuf->ol_flags; these are the same flags
as those used for TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmitting multi-segment packets.

The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.
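
As a rough illustration of the layout described above, the following standalone sketch estimates how many GSO segments and MBUFs one input packet produces. It is only a model under stated assumptions: the input packet is contiguous, every output segment carries a full copy of the headers, and the segment size counts headers plus payload. `demo_gso_nb_segments()` and `demo_gso_nb_mbufs()` are hypothetical helpers and are not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of output segments for one input packet of pkt_len bytes, of which
 * hdr_len bytes are headers. Returns 0 if seg_size leaves no room for
 * payload, and 1 if the packet already fits in a single segment.
 */
static uint32_t
demo_gso_nb_segments(uint32_t pkt_len, uint32_t hdr_len, uint32_t seg_size)
{
	uint32_t payload_len, per_seg;

	assert(pkt_len >= hdr_len);
	if (seg_size <= hdr_len)
		return 0;
	if (pkt_len <= seg_size)
		return 1; /* nothing to split */

	payload_len = pkt_len - hdr_len;
	per_seg = seg_size - hdr_len;
	/* Round up: the last segment may carry less than per_seg bytes. */
	return (payload_len + per_seg - 1) / per_seg;
}

/*
 * Under the two-MBUF layout, each segment is one direct MBUF holding the
 * header copy plus one indirect MBUF pointing into the input payload.
 */
static uint32_t
demo_gso_nb_mbufs(uint32_t nb_segments)
{
	return 2 * nb_segments;
}
```

For example, with 54 bytes of headers and a 1514-byte segment size, each segment carries up to 1460 bytes of payload, so a packet with 14600 bytes of payload yields 10 segments and, in this model, 20 MBUFs. This is the kind of estimate an application might use when sizing its output array and mempools.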

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 config/common_base                     |   5 ++
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf                  |   1 +
 doc/guides/rel_notes/release_17_11.rst |   1 +
 lib/Makefile                           |   2 +
 lib/librte_gso/Makefile                |  49 +++++++++++
 lib/librte_gso/rte_gso.c               |  52 ++++++++++++
 lib/librte_gso/rte_gso.h               | 143 +++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map     |   7 ++
 mk/rte.app.mk                          |   1 +
 10 files changed, 262 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index 12f6be9..58ca5c0 100644
--- a/config/common_base
+++ b/config/common_base
@@ -653,6 +653,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..6512918 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -101,6 +101,7 @@ The public API headers are grouped by topics:
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
   [GRO]                (@ref rte_gro.h),
+  [GSO]                (@ref rte_gso.h),
   [frag/reass]         (@ref rte_ip_frag.h),
   [LPM IPv4 route]     (@ref rte_lpm.h),
   [LPM IPv6 route]     (@ref rte_lpm6.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..408f2e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_ether \
                           lib/librte_eventdev \
                           lib/librte_gro \
+                          lib/librte_gso \
                           lib/librte_hash \
                           lib/librte_ip_frag \
                           lib/librte_jobstats \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index f6f9169..5bb36b7 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -174,6 +174,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_ethdev.so.7
      librte_eventdev.so.2
      librte_gro.so.1
+   + librte_gso.so.1
      librte_hash.so.2
      librte_ip_frag.so.1
      librte_jobstats.so.1
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b773636
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
+			nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..7d343d7
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,143 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO flags for rte_gso_ctx. */
+#define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
+/**< Use fixed IP ids for output GSO segments. Setting
+ * !RTE_GSO_FLAG_IPID_FIXED indicates using incremental IP ids.
+ */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t flag;
+	/**< flag that controls specific attributes of output segments,
+	 * such as the type of IP ID generated (i.e. fixed or incremental).
+	 */
+	uint32_t gso_types;
+	/**< the bit mask of required GSO types. The GSO library
+	 * uses the same macros as those that describe device TX
+	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
+	 * gso_types.
+	 *
+	 * For example, if applications want to segment TCP/IPv4
+	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi-MBUF packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
+ * input packet has correct checksums, and doesn't update checksums for
+ * output GSO segments. Additionally, it doesn't process IP fragment
+ * packets.
+ *
+ * Before calling rte_gso_segment(), applications must set proper ol_flags
+ * for the packet. The GSO library uses the same macros as TSO.
+ * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
+ * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
+ * flag is removed for all GSO segments and the input packet.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
+ * the GSO segments are sent to should support transmission of multi-segment
+ * packets.
+ *
+ * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object pointer.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..d4c9873 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
1.9.3

* [PATCH v9 2/6] gso: add TCP/IPv4 GSO support
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
                                 ` (2 preceding siblings ...)
  2017-10-05 20:36               ` [PATCH v9 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
@ 2017-10-05 20:36               ` Mark Kavanagh
  2017-10-05 20:36               ` [PATCH v9 3/6] gso: add VxLAN " Mark Kavanagh
                                 ` (3 subsequent siblings)
  7 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 20:36 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
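
As a rough illustration of the fragment check mentioned above, the sketch
below tests the IPv4 flags/fragment-offset field after it has been
converted to host byte order. The helper name ipv4_is_fragment() and the
FRAG_* constants are hypothetical and chosen for this example only; the
bit positions follow the IPv4 header layout (13-bit offset, MF flag in
bit 13).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of the 16-bit IPv4 flags/fragment-offset field (host order). */
#define FRAG_OFFSET_MASK 0x1fffu /* low 13 bits: offset in 8-byte units */
#define FRAG_MF_FLAG     0x2000u /* "more fragments" bit */

/*
 * Hypothetical helper: returns true if the packet is any part of a
 * fragmented datagram, i.e. it has a non-zero fragment offset or the
 * MF flag is set. The DF bit (0x4000) alone does not mark a fragment.
 */
static bool
ipv4_is_fragment(uint16_t frag_off_host)
{
	return (frag_off_host & FRAG_OFFSET_MASK) != 0 ||
		(frag_off_host & FRAG_MF_FLAG) != 0;
}
```

A packet for which this check is true would be passed through unsegmented
rather than split.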

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect mbuf simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSO'd segments are freed, the packet is freed
automatically.
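
The per-segment header bookkeeping can be sketched without any DPDK types.
The example below is a simplified model, not the code in this patch: the
struct seg_plan type and plan_segments() function are hypothetical, and
the guard against gso_size <= hdr_len is an addition made for the sketch.
It shows how the payload is cut into units of (gso_size - header length),
how the TCP sequence number advances by each segment's payload length,
and how the IP ID advances by ipid_delta (0 for fixed IDs, 1 for
incremental ones).

```c
#include <stdint.h>

/* Hypothetical record of the header values chosen for one output segment. */
struct seg_plan {
	uint16_t payload_len; /* TCP payload bytes carried by this segment */
	uint16_t ip_id;       /* IPv4 packet_id for this segment */
	uint32_t tcp_seq;     /* TCP sequence number for this segment */
	uint8_t  non_tail;    /* 1 for every segment except the last */
};

/*
 * Plan the output segments for a packet with payload_len bytes of TCP
 * payload and hdr_len bytes of L2+L3+L4 headers. Returns the number of
 * segments written to out, or -1 if gso_size leaves no room for payload
 * or out is too small.
 */
static int
plan_segments(uint32_t payload_len, uint16_t hdr_len, uint16_t gso_size,
		uint16_t first_id, uint32_t first_seq, uint8_t ipid_delta,
		struct seg_plan *out, int max_out)
{
	uint32_t unit, remaining = payload_len;
	int n = 0;

	if (gso_size <= hdr_len)
		return -1;
	unit = (uint32_t)gso_size - hdr_len;

	while (remaining > 0) {
		uint32_t len = remaining < unit ? remaining : unit;

		if (n >= max_out)
			return -1;
		out[n].payload_len = (uint16_t)len;
		out[n].ip_id = (uint16_t)(first_id + n * ipid_delta);
		out[n].tcp_seq = first_seq;
		remaining -= len;
		/* PSH and FIN would be cleared on non-tail segments. */
		out[n].non_tail = remaining > 0;
		first_seq += len;
		n++;
	}
	return n;
}
```

For instance, a 3000-byte payload with 54 bytes of headers and a gso_size
of 1514 yields a payload unit of 1460 bytes, so the model produces three
segments carrying 1460, 1460 and 80 bytes.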

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 +++
 lib/Makefile                            |   2 +-
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               | 104 ++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
 lib/librte_gso/rte_gso.c                |  53 ++++++++++-
 lib/librte_gso/rte_gso.h                |   7 +-
 10 files changed, 543 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 5bb36b7..dd37169 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,18 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. 64KB) into small
+  ones (e.g. 1500B, to fit the MTU). Supported packet types are:
+
+  * TCP/IPv4 packets.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/Makefile b/lib/Makefile
index 3d123f4..5ecd1b3 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -109,7 +109,7 @@ DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
 DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
-DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net librte_mempool
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ struct rte_logs {
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..ee75d4c
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,153 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..a8ad638
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates the TCP header of a packet, following
+ * segmentation. This is required to update the header's 'sent' sequence
+ * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
+ *
+ * @param pkt
+ *  The packet containing the TCP header.
+ * @param l4_offset
+ *  The offset of the TCP header from the start of the packet.
+ * @param sent_seq
+ *  The sent sequence number.
+ * @param non_tail
+ *  Non-zero if this is not the tail segment, zero if it is.
+ */
+static inline void
+update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
+		uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l4_offset);
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+/**
+ * Internal function which updates the IPv4 header of a packet, following
+ * segmentation. This is required to update the header's 'total_length' field,
+ * to reflect the reduced length of the now-segmented packet. Furthermore, the
+ * header's 'packet_id' field must be updated to reflect the new ID of the
+ * now-segmented packet.
+ *
+ * @param pkt
+ *  The packet containing the IPv4 header.
+ * @param l3_offset
+ *  The offset of the IPv4 header from the start of the packet.
+ * @param id
+ *  The new ID of the packet.
+ */
+static inline void
+update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l3_offset);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..d83e610
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,104 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+static void
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t id, tail_idx, i;
+	uint16_t l3_offset = pkt->l2_len;
+	uint16_t l4_offset = l3_offset + pkt->l3_len;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
+			l3_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], l3_offset, id);
+		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t tcp_dl;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret;
+
+	/* Don't process fragmented packets */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	tcp_dl = pkt->pkt_len - pkt->l2_len - pkt->l3_len - pkt->l4_len;
+	if (unlikely(tcp_dl == 0)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..1c57441
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount by which the IP ID increases from one segment to the next.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when the function succeeds. If the memory space in
+ *  pkts_out is insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..822693f 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,52 @@
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
-			nb_pkts_out < 1)
+			nb_pkts_out < 1 ||
+			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
+			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if (gso_ctx->gso_size >= pkt->pkt_len) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkts_out[0] = pkt;
+		RTE_LOG(DEBUG, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
index 7d343d7..7ca2d81 100644
--- a/lib/librte_gso/rte_gso.h
+++ b/lib/librte_gso/rte_gso.h
@@ -46,6 +46,10 @@
 #include <stdint.h>
 #include <rte_mbuf.h>
 
+/* Minimum GSO segment size. */
+#define RTE_GSO_SEG_SIZE_MIN (sizeof(struct ether_hdr) + \
+		sizeof(struct ipv4_hdr) + sizeof(struct tcp_hdr) + 1)
+
 /* GSO flags for rte_gso_ctx. */
 #define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
 /**< Use fixed IP ids for output GSO segments. Setting
@@ -81,7 +85,8 @@ struct rte_gso_ctx {
 	 */
 	uint16_t gso_size;
 	/**< maximum size of an output GSO segment, including packet
-	 * header and payload, measured in bytes.
+	 * header and payload, measured in bytes. Must be at least
+	 * RTE_GSO_SEG_SIZE_MIN.
 	 */
 };
 
-- 
1.9.3


* [PATCH v9 3/6] gso: add VxLAN GSO support
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
                                 ` (3 preceding siblings ...)
  2017-10-05 20:36               ` [PATCH v9 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
@ 2017-10-05 20:36               ` Mark Kavanagh
  2017-10-05 20:36               ` [PATCH v9 4/6] gso: add GRE " Mark Kavanagh
                                 ` (2 subsequent siblings)
  7 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 20:36 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds a framework that allows GSO on tunneled packets.
Furthermore, it leverages that framework to provide GSO support for
VxLAN-encapsulated packets.

Supported VxLAN packets must have an outer IPv4 header (prepended by an
optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
inner VLAN tag).
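
The supported-packet test reduces to a mask-and-compare on the mbuf's
ol_flags: every required flag must be set at once. A minimal sketch,
assuming single-bit stand-in flag values (the FLAG_* constants and
is_ipv4_vxlan_tcp4() are hypothetical, not the real PKT_TX_* bit
positions):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PKT_TX_* offload flags. */
#define FLAG_TCP_SEG      (1ULL << 0)
#define FLAG_IPV4         (1ULL << 1)
#define FLAG_OUTER_IPV4   (1ULL << 2)
#define FLAG_TUNNEL_VXLAN (1ULL << 3)

#define VXLAN_TCP4_MASK (FLAG_TCP_SEG | FLAG_IPV4 | \
		FLAG_OUTER_IPV4 | FLAG_TUNNEL_VXLAN)

/* True only when all four required flags are set together. */
static bool
is_ipv4_vxlan_tcp4(uint64_t ol_flags)
{
	return (ol_flags & VXLAN_TCP4_MASK) == VXLAN_TCP4_MASK;
}
```

Comparing against the full mask, rather than testing for a non-zero
result, keeps a plain TCP/IPv4 packet from matching the tunnel case.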

VxLAN GSO doesn't check if input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSO'd, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
are freed, the packet is freed automatically.
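
The reference counting described above can be modelled roughly as
follows. This is a simplified sketch, assuming a single-segment input
packet whose data is referenced by one indirect mbuf per output
segment; struct model_mbuf and the model_* helpers are hypothetical and
do not use the real rte_mbuf API:

```c
#include <stdint.h>

/* Hypothetical stand-in for an mbuf: only the reference count. */
struct model_mbuf {
	uint16_t refcnt;
};

/* Each output segment's indirect mbuf takes one reference. */
static void
model_attach(struct model_mbuf *pkt)
{
	pkt->refcnt++;
}

/* Drop one reference; returns 1 once the buffer would be released. */
static int
model_release(struct model_mbuf *pkt)
{
	return --pkt->refcnt == 0;
}

/*
 * Segment into nb_segs pieces, then drop the caller's own reference,
 * as the commit message describes. Returns the resulting refcnt.
 */
static uint16_t
model_gso(struct model_mbuf *pkt, uint16_t nb_segs)
{
	uint16_t i;

	for (i = 0; i < nb_segs; i++)
		model_attach(pkt);
	pkt->refcnt--;
	return pkt->refcnt;
}
```

In this model, the original buffer is released only when the last of
its output segments is released, with no separate free by the caller.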

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |   2 +
 lib/librte_gso/Makefile                |   1 +
 lib/librte_gso/gso_common.h            |  25 +++++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 120 +++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h       |  75 +++++++++++++++++++++
 lib/librte_gso/rte_gso.c               |  14 +++-
 6 files changed, 235 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index dd37169..c58eeb1 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -48,6 +48,8 @@ New Features
   ones (e.g. MTU is 1500B). Supported packet types are:
 
   * TCP/IPv4 packets.
+  * VxLAN packets, which must have an outer IPv4 header, and contain
+    an inner TCP/IPv4 packet.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index a8ad638..95d54e7 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -39,6 +39,7 @@
 #include <rte_mbuf.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
 		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
@@ -49,6 +50,30 @@
 #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
 
+#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_VXLAN))
+
+/**
+ * Internal function which updates the UDP header of a packet, following
+ * segmentation. This is required to update the header's datagram length field.
+ *
+ * @param pkt
+ *  The packet containing the UDP header.
+ * @param udp_offset
+ *  The offset of the UDP header from the start of the packet.
+ */
+static inline void
+update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
+{
+	struct udp_hdr *udp_hdr;
+
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			udp_offset);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
+}
+
 /**
  * Internal function which updates the TCP header of a packet, following
  * segmentation. This is required to update the header's 'sent' sequence
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..5e8c8e5
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,120 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+static void
+update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t outer_id, inner_id, tail_idx, i;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+
+	outer_ipv4_offset = pkt->outer_l2_len;
+	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	tcp_offset = inner_ipv4_offset + pkt->l3_len;
+
+	/* Outer IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			outer_ipv4_offset);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	/* Inner IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_ipv4_offset);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
+		update_udp_header(segs[i], udp_offset);
+		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
+		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
+		outer_id++;
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset, frag_off;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process the packet whose MF bit or offset in the inner
+	 * IPv4 header are non-zero.
+	 */
+	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	/* Don't process the packet without data */
+	if (hdr_offset >= pkt->pkt_len) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret <= 1)
+		return ret;
+
+	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..3c67f0c
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,75 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneling packet with inner TCP/IPv4 headers. This function
+ * doesn't check if the input packet has correct checksums, and doesn't
+ * update checksums for output GSO segments. Furthermore, it doesn't
+ * process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The amount by which the IP ID increases from one segment to the next.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when it succeeds. If the memory space in pkts_out is
+ *  insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 822693f..0a3ef11 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -39,6 +39,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -56,7 +57,8 @@
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1 ||
 			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
-			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
+			((gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
+			DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) == 0))
 		return -EINVAL;
 
 	if (gso_ctx->gso_size >= pkt->pkt_len) {
@@ -71,12 +73,20 @@
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_TCP(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
+		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
 	} else {
+		/* unsupported packet, skip */
 		pkts_out[0] = pkt;
 		RTE_LOG(DEBUG, GSO, "Unsupported packet type\n");
 		return 1;
-- 
1.9.3


* [PATCH v9 4/6] gso: add GRE GSO support
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
                                 ` (4 preceding siblings ...)
  2017-10-05 20:36               ` [PATCH v9 3/6] gso: add VxLAN " Mark Kavanagh
@ 2017-10-05 20:36               ` Mark Kavanagh
  2017-10-05 20:36               ` [PATCH v9 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
  2017-10-05 20:36               ` [PATCH v9 6/6] doc: add GSO programmer's guide Mark Kavanagh
  7 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 20:36 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.
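
The header span replicated at the front of each output segment, and
the number of segments that result, follow from the mbuf length
fields. A rough sketch, assuming l2_len spans everything between the
outer IPv4 header and the inner IPv4 header (which is what the offset
arithmetic in gso_tunnel_tcp4.c implies); struct pkt_lens and both
helpers are hypothetical, and the guard against a gso_size no larger
than the headers is an addition of this sketch:

```c
#include <stdint.h>

/* Hypothetical stand-in for the mbuf length fields used by GSO. */
struct pkt_lens {
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
	uint16_t l2_len; /* tunnel header(s) up to the inner IPv4 header */
	uint16_t l3_len;
	uint16_t l4_len;
	uint32_t pkt_len;
};

/* Bytes of headers replicated at the front of every segment. */
static uint16_t
tunnel_hdr_len(const struct pkt_lens *p)
{
	return p->outer_l2_len + p->outer_l3_len + p->l2_len +
		p->l3_len + p->l4_len;
}

/*
 * Number of output segments for a given gso_size, or 1 when the
 * packet has no payload or gso_size leaves no room for payload.
 */
static uint32_t
tunnel_nb_segs(const struct pkt_lens *p, uint16_t gso_size)
{
	uint16_t hdr = tunnel_hdr_len(p);
	uint32_t payload, unit;

	if (hdr >= p->pkt_len || gso_size <= hdr)
		return 1;
	payload = p->pkt_len - hdr;
	unit = gso_size - hdr;
	return (payload + unit - 1) / unit;
}
```

For example, a GRE packet with 78 bytes of headers and 3000 bytes of
payload, segmented with a gso_size of 1500, yields three segments of
at most 1422 payload bytes each.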

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |  2 ++
 lib/librte_gso/gso_common.h            |  5 +++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 14 ++++++++++----
 lib/librte_gso/rte_gso.c               |  9 ++++++---
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index c58eeb1..2faa630 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -50,6 +50,8 @@ New Features
   * TCP/IPv4 packets.
   * VxLAN packets, which must have an outer IPv4 header, and contain
     an inner TCP/IPv4 packet.
+  * GRE packets, which must contain an outer IPv4 header, and inner
+    TCP/IPv4 headers.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 95d54e7..145ea49 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -55,6 +55,11 @@
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
 		 PKT_TX_TUNNEL_VXLAN))
 
+#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_GRE))
+
 /**
  * Internal function which updates the UDP header of a packet, following
  * segmentation. This is required to update the header's datagram length field.
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
index 5e8c8e5..8d0cfd7 100644
--- a/lib/librte_gso/gso_tunnel_tcp4.c
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -42,11 +42,13 @@
 	struct tcp_hdr *tcp_hdr;
 	uint32_t sent_seq;
 	uint16_t outer_id, inner_id, tail_idx, i;
-	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset;
+	uint16_t udp_gre_offset, tcp_offset;
+	uint8_t update_udp_hdr;
 
 	outer_ipv4_offset = pkt->outer_l2_len;
-	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
-	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	udp_gre_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_gre_offset + pkt->l2_len;
 	tcp_offset = inner_ipv4_offset + pkt->l3_len;
 
 	/* Outer IPv4 header. */
@@ -63,9 +65,13 @@
 	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
 	tail_idx = nb_segs - 1;
 
+	/* Only update UDP header for VxLAN packets. */
+	update_udp_hdr = (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN) ? 1 : 0;
+
 	for (i = 0; i < nb_segs; i++) {
 		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
-		update_udp_header(segs[i], udp_offset);
+		if (update_udp_hdr)
+			update_udp_header(segs[i], udp_gre_offset);
 		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
 		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
 		outer_id++;
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 0a3ef11..f86e654 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -58,7 +58,8 @@
 			nb_pkts_out < 1 ||
 			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
 			((gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
-			DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) == 0))
+			DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+			DEV_TX_OFFLOAD_GRE_TNL_TSO)) == 0))
 		return -EINVAL;
 
 	if (gso_ctx->gso_size >= pkt->pkt_len) {
@@ -73,8 +74,10 @@
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
-		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+	if ((IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) ||
+			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
+			 (gso_ctx->gso_types & DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
1.9.3


* [PATCH v9 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
                                 ` (5 preceding siblings ...)
  2017-10-05 20:36               ` [PATCH v9 4/6] gso: add GRE " Mark Kavanagh
@ 2017-10-05 20:36               ` Mark Kavanagh
  2017-10-05 20:36               ` [PATCH v9 6/6] doc: add GSO programmer's guide Mark Kavanagh
  7 siblings, 0 replies; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 20:36 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

From: Jiayu Hu <jiayu.hu@intel.com>

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

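The lower bound on <length> accepted by "set gso segsz" comes from
RTE_GSO_SEG_SIZE_MIN: one byte of payload on top of the Ethernet, IPv4
and TCP headers. A small sketch of that check, assuming option-free
headers of 14, 20 and 20 bytes (the SK_* constants and
segsz_is_valid() are illustrative names, not testpmd code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes of option-free headers, in bytes. */
#define SK_ETHER_HDR_LEN 14
#define SK_IPV4_HDR_LEN  20
#define SK_TCP_HDR_LEN   20

/* Smallest useful segment: all three headers plus one payload byte. */
#define SK_GSO_SEG_SIZE_MIN (SK_ETHER_HDR_LEN + SK_IPV4_HDR_LEN + \
		SK_TCP_HDR_LEN + 1)

/* Accept a requested segsz only if it reaches the minimum. */
static bool
segsz_is_valid(uint16_t segsz)
{
	return segsz >= SK_GSO_SEG_SIZE_MIN;
}
```
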
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
---
 app/test-pmd/cmdline.c                      | 179 ++++++++++++++++++++++++++++
 app/test-pmd/config.c                       |  24 ++++
 app/test-pmd/csumonly.c                     |  42 ++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
 6 files changed, 310 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ccdf239..e92cd59 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -431,6 +431,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set max flow number and max packet number per-flow"
 			" for GRO.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -3967,6 +3978,171 @@ struct cmd_gro_set_result {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please first stop forwarding\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size < RTE_GSO_SEG_SIZE_MIN)
+			printf("gso_size should be at least %zu."
+					" Please input a legal value\n",
+					RTE_GSO_SEG_SIZE_MIN);
+		else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	uint8_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO'd packet size: %uB\n"
+					"Supported GSO types: TCP/IPv4, "
+					"VxLAN with inner TCP/IPv4 packet, "
+					"GRE with inner TCP/IPv4 packet\n",
+					gso_max_segment_size);
+		} else
+			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT8);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14255,6 +14431,9 @@ struct cmd_cmdfile_result {
 	(cmdline_parse_inst_t *)&cmd_tunnel_tso_show,
 	(cmdline_parse_inst_t *)&cmd_enable_gro,
 	(cmdline_parse_inst_t *)&cmd_gro_set,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3ae3e1c..88d09d0 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2454,6 +2454,30 @@ struct igb_ring_desc_16_bytes {
 	}
 }
 
+void
+setup_gso(const char *mode, uint8_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 90c8119..e2b18cb 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,8 @@
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
+
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -91,6 +93,7 @@
 /* structure that caches offload info for the current packet */
 struct testpmd_offload_info {
 	uint16_t ethertype;
+	uint8_t gso_enable;
 	uint16_t l2_len;
 	uint16_t l3_len;
 	uint16_t l4_len;
@@ -381,6 +384,8 @@ struct simple_gre_hdr {
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
 		}
+		if (info->gso_enable)
+			ol_flags |= PKT_TX_TCP_SEG;
 	} else if (info->l4_proto == IPPROTO_SCTP) {
 		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
 		sctp_hdr->cksum = 0;
@@ -627,6 +632,9 @@ struct simple_gre_hdr {
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -641,6 +649,8 @@ struct simple_gre_hdr {
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -674,6 +684,8 @@ struct simple_gre_hdr {
 	memset(&info, 0, sizeof(info));
 	info.tso_segsz = txp->tso_segsz;
 	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
+	if (gso_ports[fs->tx_port].enable)
+		info.gso_enable = 1;
 
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
@@ -851,13 +863,34 @@ struct simple_gre_hdr {
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
+					&gso_segments[nb_segments],
+					RTE_DIM(gso_segments) - nb_segments);
+			if (ret < 0)  {
+				RTE_LOG(DEBUG, USER1,
+						"Unable to segment packet\n");
+				rte_pktmbuf_free(pkts_burst[i]);
+			} else
+				nb_segments += ret;
+		}
+
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -868,7 +901,7 @@ struct simple_gre_hdr {
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -881,9 +914,10 @@ struct simple_gre_hdr {
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e097ee0..b9ee77c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(uint8_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -570,6 +573,7 @@ static int eth_event_callback(uint8_t port_id,
 	unsigned int nb_mbuf_per_pool;
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
+	uint32_t gso_types = 0;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -654,6 +658,8 @@ static int eth_event_callback(uint8_t port_id,
 
 	init_port_config();
 
+	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+		DEV_TX_OFFLOAD_GRE_TNL_TSO;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -664,6 +670,13 @@ static int eth_event_callback(uint8_t port_id,
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
+			ETHER_CRC_LEN;
+		fwd_lcores[lc_id]->gso_ctx.flag = 0;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..ff842a1 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -205,6 +206,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
 	streamid_t stream_nb;    /**< number of streams in "fwd_streams" */
@@ -442,6 +444,13 @@ struct gro_status {
 };
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -642,6 +651,7 @@ void port_rss_hash_key_update(portid_t port_id, char rss_type[],
 int rx_queue_id_is_invalid(queueid_t rxq_id);
 int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *mode, uint8_t port_id);
+void setup_gso(const char *mode, uint8_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 2ed62f5..f9b5bda 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -932,6 +932,52 @@ number of packets a GRO table can store.
 If current packet number is greater than or equal to the max value, GRO
 will stop processing incoming packets.
 
+set port - gso
+~~~~~~~~~~~~~~
+
+Toggle per-port GSO support in ``csum`` forwarding engine::
+
+   testpmd> set port <port_id> gso on|off
+
+If enabled, the csum forwarding engine will perform GSO on supported IPv4
+packets, transmitted on the given port.
+
+If disabled, packets transmitted on the given port will not undergo GSO.
+By default, GSO is disabled for all ports.
+
+.. note::
+
+   When GSO is enabled on a port, supported IPv4 packets transmitted on that
+   port undergo GSO. Afterwards, the segmented packets are represented by
+   multi-segment mbufs; however, the csum forwarding engine doesn't calculate
+   checksums for GSO'd segments in SW. As a result, if users want correct
+   checksums in GSO segments, they should enable HW checksum calculation for
+   GSO-enabled ports.
+
+   For example, HW checksum calculation for VxLAN GSO'd packets may be enabled
+   by setting the following options in the csum forwarding engine:
+
+   testpmd> csum set outer_ip hw <port_id>
+
+   testpmd> csum set ip hw <port_id>
+
+   testpmd> csum set tcp hw <port_id>
+
+set gso segsz
+~~~~~~~~~~~~~
+
+Set the maximum GSO segment size (measured in bytes), which includes the
+packet header and the packet payload for GSO-enabled ports (global)::
+
+   testpmd> set gso segsz <length>
+
+show port - gso
+~~~~~~~~~~~~~~~
+
+Display the status of Generic Segmentation Offload for a given port::
+
+   testpmd> show port <port_id> gso
+
 mac_addr add
 ~~~~~~~~~~~~
 
-- 
1.9.3


* [PATCH v9 6/6] doc: add GSO programmer's guide
  2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
                                 ` (6 preceding siblings ...)
  2017-10-05 20:36               ` [PATCH v9 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
@ 2017-10-05 20:36               ` Mark Kavanagh
  2017-10-06 13:34                 ` Mcnamara, John
  7 siblings, 1 reply; 157+ messages in thread
From: Mark Kavanagh @ 2017-10-05 20:36 UTC (permalink / raw)
  To: dev
  Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, ferruh.yigit, thomas,
	Mark Kavanagh

Add programmer's guide doc to explain the design and use of the
GSO library.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
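A note on segment sizing, for readers of the guide below. The guide
describes the context's segment size as covering both packet headers and
payload. Under that reading, the number of output segments and the length
of the final one can be estimated as in the sketch below. The helpers
gso_seg_count() and gso_last_seg_len() are hypothetical names used only
for illustration; they are not part of librte_gso, and the arithmetic is
an assumption drawn from the guide's wording rather than from the
library source.

```c
#include <stdint.h>

/*
 * Illustrative only: these helpers are hypothetical and are not part of
 * librte_gso. They assume gso_size counts headers plus payload, and that
 * hdr_len bytes of headers are replicated into every output segment.
 */

/* Return the expected number of output segments, or 0 on invalid input. */
static uint32_t
gso_seg_count(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t pyld_unit, pyld_len;

	if (hdr_len >= gso_size || pkt_len < hdr_len)
		return 0;
	if (pkt_len <= gso_size)
		return 1;	/* packet already fits: nothing to segment */

	pyld_unit = (uint32_t)gso_size - hdr_len;	/* payload per segment */
	pyld_len = pkt_len - hdr_len;			/* payload to split */
	return (pyld_len + pyld_unit - 1) / pyld_unit;	/* round up */
}

/* Return the expected length of the final segment, headers included. */
static uint32_t
gso_last_seg_len(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t n = gso_seg_count(pkt_len, hdr_len, gso_size);
	uint32_t pyld_unit;

	if (n == 0)
		return 0;
	if (n == 1)
		return pkt_len;

	pyld_unit = (uint32_t)gso_size - hdr_len;
	return hdr_len + (pkt_len - hdr_len) - (n - 1) * pyld_unit;
}
```

For instance, with 54 bytes of Ethernet/IPv4/TCP headers and a segment
size of 1514 bytes, a 3054-byte packet would yield three segments, the
last one carrying 80 bytes of payload. An application could use such an
estimate to size the output array it passes to rte_gso_segment().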
 MAINTAINERS                                        |   6 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 5 files changed, 1053 insertions(+)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df2a7f..8f0a4bd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -644,6 +644,12 @@ M: Jiayu Hu <jiayu.hu@intel.com>
 F: lib/librte_gro/
 F: doc/guides/prog_guide/generic_receive_offload_lib.rst
 
+Generic Segmentation Offload
+M: Jiayu Hu <jiayu.hu@intel.com>
+M: Mark Kavanagh <mark.b.kavanagh@intel.com>
+F: lib/librte_gso/
+F: doc/guides/prog_guide/generic_segmentation_offload_lib.rst
+
 Distributor
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: David Hunt <david.hunt@intel.com>
diff --git a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
new file mode 100644
index 0000000..5e78f16
--- /dev/null
+++ b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
@@ -0,0 +1,256 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Generic Segmentation Offload Library
+====================================
+
+Overview
+--------
+Generic Segmentation Offload (GSO) is a widely used software implementation of
+TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
+Much like TSO, GSO gains performance by enabling upper layer applications to
+process a smaller number of large packets (e.g. MTU size of 64KB), instead of
+processing higher numbers of small packets (e.g. MTU size of 1500B), thus
+reducing per-packet overhead.
+
+For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
+that far exceed the kernel interface's MTU; this eliminates the need to segment
+packets within the guest, and improves the data-to-overhead ratio of both the
+guest-host link, and PCI bus. The expectation of the guest network stack in this
+scenario is that segmentation of egress frames will take place either in the NIC
+HW or, where that hardware capability is unavailable, in the host application
+or network stack.
+
+Bearing that in mind, the GSO library enables DPDK applications to segment
+packets in software. Note however, that GSO is implemented as a standalone
+library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
+in the underlying hardware); that is, applications must explicitly invoke the
+GSO library to segment packets. The size of GSO segments ``(segsz)`` is
+configurable by the application.
+
+Limitations
+-----------
+
+#. The GSO library doesn't check if input packets have correct checksums.
+
+#. In addition, the GSO library doesn't re-calculate checksums for segmented
+   packets (that task is left to the application).
+
+#. IP fragments are unsupported by the GSO library.
+
+#. The egress interface's driver must support multi-segment packets.
+
+#. Currently, the GSO library supports the following IPv4 packet types:
+
+   - TCP
+   - VxLAN
+   - GRE
+
+   See `Supported GSO Packet Types`_ for further details.
+
+Packet Segmentation
+-------------------
+
+The ``rte_gso_segment()`` function is the GSO library's primary
+segmentation API.
+
+Before performing segmentation, an application must create a GSO context object
+``(struct rte_gso_ctx)``, which provides the library with some of the
+information required to understand how the packet should be segmented. Refer to
+`How to Segment a Packet`_ for additional details. Once the GSO context
+has been created and populated, the application can then use the
+``rte_gso_segment()`` function to segment packets.
+
+The GSO library typically stores each segment that it creates in two parts: the
+first part contains a copy of the original packet's headers, while the second
+part contains a pointer to an offset within the original packet. This mechanism
+is explained in more detail in `GSO Output Segment Format`_.
+
+The GSO library supports both single- and multi-segment input mbufs.
+
+GSO Output Segment Format
+~~~~~~~~~~~~~~~~~~~~~~~~~
+To reduce the number of expensive memcpy operations required when segmenting a
+packet, the GSO library typically stores each segment that it creates as a
+two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
+the elements produced by the API are also called 'segments', for clarity the
+term 'part' is used here instead).
+
+The first part of each output segment is a direct mbuf and contains a copy of
+the original packet's headers, which must be prepended to each output segment.
+These headers are copied from the original packet into each output segment.
+
+The second part of each output segment represents a section of data from the
+original packet, i.e. a data segment. Rather than copy the data directly from
+the original packet into the output segment (which would impact performance
+considerably), the second part of each output segment is an indirect mbuf,
+which contains no actual data, but simply points to an offset within the
+original packet.
+
+The combination of the 'header' segment and the 'data' segment constitutes a
+single logical output GSO segment of the original packet. This is illustrated
+in :numref:`figure_gso-output-segment-format`.
+
+.. _figure_gso-output-segment-format:
+
+.. figure:: img/gso-output-segment-format.svg
+   :align: center
+
+   Two-part GSO output segment
+
+In one situation, the output segment may contain additional 'data' segments.
+This only occurs when:
+
+- the input packet on which GSO is to be performed is represented by a
+  multi-segment mbuf.
+
+- the output segment is required to contain data that spans the boundaries
+  between segments of the input multi-segment mbuf.
+
+The GSO library traverses each segment of the input packet, and produces
+numerous output segments; for optimal performance, the number of output
+segments is kept to a minimum. Consequently, the GSO library maximizes the
+amount of data contained within each output segment; i.e. each output segment
+contains ``segsz`` bytes of data. The only exception to this is in the case of
+the very final output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the
+final segment is smaller than the rest.
+
+In order for an output segment to meet its MSS, it may need to include data from
+multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
+can point to only one direct mbuf), the solution here is to add another indirect
+mbuf to the output segment; this additional segment then points to the next
+input segment. If necessary, this chaining process is repeated, until the sum of
+all of the data 'contained' in the output segment reaches ``segsz``. This
+ensures that the amount of data contained within each output segment is uniform,
+with the possible exception of the last segment, as previously described.
+
+:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
+output segment. In this example, the output segment needs to include data from
+the end of one input segment, and the beginning of another. To achieve this,
+an additional indirect mbuf is chained to the second part of the output segment,
+and is attached to the next input segment (i.e. it points to the data in the
+next input segment).
+
+.. _figure_gso-three-seg-mbuf:
+
+.. figure:: img/gso-three-seg-mbuf.svg
+   :align: center
+
+   Three-part GSO output segment
+
+Supported GSO Packet Types
+--------------------------
+
+TCP/IPv4 GSO
+~~~~~~~~~~~~
+TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
+may also contain an optional VLAN tag.
+
+VxLAN GSO
+~~~~~~~~~
+VxLAN GSO supports segmentation of suitably large VxLAN packets,
+which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
+inner and/or outer VLAN tag(s).
+
+GRE GSO
+~~~~~~~
+GRE GSO supports segmentation of suitably large GRE packets, which contain
+an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.
+
+How to Segment a Packet
+-----------------------
+
+To segment an outgoing packet, an application must:
+
+#. First create a GSO context ``(struct rte_gso_ctx)``; this contains:
+
+   - a pointer to the mbuf pool for allocating the direct buffers, which are
+     used to store the GSO segments' packet headers.
+
+   - a pointer to the mbuf pool for allocating indirect buffers, which are
+     used to locate GSO segments' packet payloads.
+
+   .. note::
+
+      An application may use the same pool for both direct and indirect
+      buffers. However, since each indirect mbuf simply stores a pointer, the
+      application may reduce its memory consumption by creating a separate
+      memory pool, containing smaller elements, for the indirect pool.
+
+   - the size of each output segment, including packet headers and payload,
+     measured in bytes.
+
+   - the bit mask of required GSO types. The GSO library uses the same macros as
+     those that describe a physical device's TX offloading capabilities (i.e.
+     ``DEV_TX_OFFLOAD_*_TSO``) for gso_types. For example, if an application
+     wants to segment TCP/IPv4 packets, it should set gso_types to
+     ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values currently
+     supported for gso_types are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO``, and
+     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
+     allowed.
+
+   - a flag, that indicates whether the IPv4 headers of output segments should
+     contain fixed or incremental ID values.
+
+#. Set the appropriate ol_flags in the mbuf.
+
+   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute
+     to determine how a packet should be segmented. It is the application's
+     responsibility to ensure that these flags are set.
+
+   - For example, in order to segment TCP/IPv4 packets, the application should
+     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
+     ol_flags.
+
+   - If checksum calculation in hardware is required, the application should
+     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.
+
+#. Check if the packet should be processed. Packets with one of the
+   following properties are not processed and are returned immediately:
+
+   - Packet length is less than ``segsz`` (i.e. GSO is not required).
+
+   - Packet type is not supported by GSO library (see
+     `Supported GSO Packet Types`_).
+
+   - Application has not enabled GSO support for the packet type.
+
+   - Packet's ol_flags have been incorrectly set.
+
+#. Allocate space in which to store the output GSO segments. If the amount of
+   space allocated by the application is insufficient, segmentation will fail.
+
+#. Invoke the GSO segmentation API, ``rte_gso_segment()``.
+
+#. If required, update the L3 and L4 checksums of the newly-created segments.
+   For tunneled packets, the outer IPv4 headers' checksums should also be
+   updated. Alternatively, the application may offload checksum calculation
+   to HW.
+
diff --git a/doc/guides/prog_guide/img/gso-output-segment-format.svg b/doc/guides/prog_guide/img/gso-output-segment-format.svg
new file mode 100644
index 0000000..bdb5ec3
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-output-segment-format.svg
@@ -0,0 +1,313 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-output-segment-format.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="19.3975in" height="8.21796in"
+		viewBox="0 0 1396.62 591.693" xml:space="preserve" color-interpolation-filters="sRGB" class="st21">
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st2 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st3 {stroke:#c3d600;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.68828}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Intel Clear;font-size:1.99999em;font-weight:bold}
+		.st10 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st11 {fill:#ffffff;font-family:Intel Clear;font-size:2.44732em;font-weight:bold}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:5.52552}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.15291em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.8401em}
+		.st15 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st16 {fill:#c3d600;font-family:Intel Clear;font-size:2.44732em}
+		.st17 {fill:#ffc000;font-family:Intel Clear;font-size:2.44732em}
+		.st18 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.8401em}
+		.st20 {fill:#006fc5;font-family:Intel Clear;font-size:1.61927em}
+		.st21 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<g id="shape3-1" v:mID="3" v:groupContext="shape" transform="translate(577.244,-560.42)">
+			<title>Sheet.3</title>
+			<path d="M9.24 585.29 L16.32 585.29 L16.32 587.06 L9.24 587.06 L9.24 585.29 L9.24 585.29 ZM21.63 585.29 L23.4 585.29
+						 L23.4 587.06 L21.63 587.06 L21.63 585.29 L21.63 585.29 ZM28.7 585.29 L35.78 585.29 L35.78 587.06 L28.7 587.06
+						 L28.7 585.29 L28.7 585.29 ZM41.09 585.29 L42.86 585.29 L42.86 587.06 L41.09 587.06 L41.09 585.29 L41.09
+						 585.29 ZM48.17 585.29 L55.25 585.29 L55.25 587.06 L48.17 587.06 L48.17 585.29 L48.17 585.29 ZM60.56 585.29
+						 L62.33 585.29 L62.33 587.06 L60.56 587.06 L60.56 585.29 L60.56 585.29 ZM67.64 585.29 L74.72 585.29 L74.72
+						 587.06 L67.64 587.06 L67.64 585.29 L67.64 585.29 ZM80.03 585.29 L81.8 585.29 L81.8 587.06 L80.03 587.06
+						 L80.03 585.29 L80.03 585.29 ZM87.11 585.29 L94.19 585.29 L94.19 587.06 L87.11 587.06 L87.11 585.29 L87.11
+						 585.29 ZM99.5 585.29 L101.27 585.29 L101.27 587.06 L99.5 587.06 L99.5 585.29 L99.5 585.29 ZM106.58 585.29
+						 L113.66 585.29 L113.66 587.06 L106.58 587.06 L106.58 585.29 L106.58 585.29 ZM118.97 585.29 L120.74 585.29
+						 L120.74 587.06 L118.97 587.06 L118.97 585.29 L118.97 585.29 ZM126.05 585.29 L133.13 585.29 L133.13 587.06
+						 L126.05 587.06 L126.05 585.29 L126.05 585.29 ZM138.43 585.29 L140.2 585.29 L140.2 587.06 L138.43 587.06
+						 L138.43 585.29 L138.43 585.29 ZM145.51 585.29 L152.59 585.29 L152.59 587.06 L145.51 587.06 L145.51 585.29
+						 L145.51 585.29 ZM157.9 585.29 L159.67 585.29 L159.67 587.06 L157.9 587.06 L157.9 585.29 L157.9 585.29 ZM164.98
+						 585.29 L172.06 585.29 L172.06 587.06 L164.98 587.06 L164.98 585.29 L164.98 585.29 ZM177.37 585.29 L179.14
+						 585.29 L179.14 587.06 L177.37 587.06 L177.37 585.29 L177.37 585.29 ZM184.45 585.29 L191.53 585.29 L191.53
+						 587.06 L184.45 587.06 L184.45 585.29 L184.45 585.29 ZM196.84 585.29 L198.61 585.29 L198.61 587.06 L196.84
+						 587.06 L196.84 585.29 L196.84 585.29 ZM203.92 585.29 L211 585.29 L211 587.06 L203.92 587.06 L203.92 585.29
+						 L203.92 585.29 ZM216.31 585.29 L218.08 585.29 L218.08 587.06 L216.31 587.06 L216.31 585.29 L216.31 585.29
+						 ZM223.39 585.29 L230.47 585.29 L230.47 587.06 L223.39 587.06 L223.39 585.29 L223.39 585.29 ZM235.78 585.29
+						 L237.55 585.29 L237.55 587.06 L235.78 587.06 L235.78 585.29 L235.78 585.29 ZM242.86 585.29 L249.93 585.29
+						 L249.93 587.06 L242.86 587.06 L242.86 585.29 L242.86 585.29 ZM255.24 585.29 L257.01 585.29 L257.01 587.06
+						 L255.24 587.06 L255.24 585.29 L255.24 585.29 ZM262.32 585.29 L269.4 585.29 L269.4 587.06 L262.32 587.06
+						 L262.32 585.29 L262.32 585.29 ZM274.71 585.29 L276.48 585.29 L276.48 587.06 L274.71 587.06 L274.71 585.29
+						 L274.71 585.29 ZM281.79 585.29 L288.87 585.29 L288.87 587.06 L281.79 587.06 L281.79 585.29 L281.79 585.29
+						 ZM294.18 585.29 L295.95 585.29 L295.95 587.06 L294.18 587.06 L294.18 585.29 L294.18 585.29 ZM301.26 585.29
+						 L308.34 585.29 L308.34 587.06 L301.26 587.06 L301.26 585.29 L301.26 585.29 ZM313.65 585.29 L315.42 585.29
+						 L315.42 587.06 L313.65 587.06 L313.65 585.29 L313.65 585.29 ZM320.73 585.29 L324.99 585.29 L324.99 587.06
+						 L320.73 587.06 L320.73 585.29 L320.73 585.29 ZM11.06 591.69 L0 586.17 L11.06 580.65 L11.06 591.69 L11.06
+						 591.69 ZM323.16 580.65 L334.22 586.17 L323.16 591.69 L323.16 580.65 L323.16 580.65 Z" class="st1"/>
+		</g>
+		<g id="shape4-3" v:mID="4" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.4</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43
+						 Z" class="st2"/>
+		</g>
+		<g id="shape5-5" v:mID="5" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.5</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43"
+					class="st3"/>
+		</g>
+		<g id="shape6-8" v:mID="6" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.6</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape7-10" v:mID="7" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.7</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape10-13" v:mID="10" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.10</title>
+			<path d="M0 510.21 L0 591.69 L822.53 591.69 L822.53 510.21 L0 510.21 L0 510.21 Z" class="st6"/>
+		</g>
+		<g id="shape11-15" v:mID="11" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.11</title>
+			<path d="M0 510.21 L822.53 510.21 L822.53 591.69 L0 591.69 L0 510.21" class="st7"/>
+		</g>
+		<g id="shape12-18" v:mID="12" v:groupContext="shape" transform="translate(255.478,-470.123)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="157.315" cy="574.07" width="314.63" height="35.245"/>
+			<path d="M314.63 556.45 L0 556.45 L0 591.69 L314.63 591.69 L314.63 556.45" class="st8"/>
+			<text x="102.08" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-22" v:mID="13" v:groupContext="shape" transform="translate(577.354,-470.123)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="167.112" cy="574.07" width="334.23" height="35.245"/>
+			<path d="M334.22 556.45 L0 556.45 L0 591.69 L334.22 591.69 L334.22 556.45" class="st8"/>
+			<text x="111.88" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape14-26" v:mID="14" v:groupContext="shape" transform="translate(910.635,-470.956)">
+			<title>Sheet.14</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="26.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape15-30" v:mID="15" v:groupContext="shape" transform="translate(909.144,-453.824)">
+			<title>Sheet.15</title>
+			<path d="M1.16 453.85 L1.05 465.33 L3.93 465.39 L4.04 453.91 L1.16 453.85 L1.16 453.85 ZM1 473.95 L0.94 476.82 L3.82
+						 476.87 L3.87 474 L1 473.95 L1 473.95 ZM0.88 485.43 L0.77 496.91 L3.65 496.96 L3.76 485.48 L0.88 485.43 L0.88
+						 485.43 ZM0.72 505.52 L0.72 508.39 L3.59 508.45 L3.59 505.58 L0.72 505.52 L0.72 505.52 ZM0.61 517 L0.55 528.49
+						 L3.43 528.54 L3.48 517.06 L0.61 517 L0.61 517 ZM0.44 537.1 L0.44 539.97 L3.32 540.02 L3.32 537.15 L0.44
+						 537.1 L0.44 537.1 ZM0.39 548.58 L0.28 560.06 L3.15 560.12 L3.26 548.63 L0.39 548.58 L0.39 548.58 ZM0.22
+						 568.67 L0.17 571.54 L3.04 571.6 L3.1 568.73 L0.22 568.67 L0.22 568.67 ZM0.11 580.16 L0 591.64 L2.88 591.69
+						 L2.99 580.21 L0.11 580.16 L0.11 580.16 Z" class="st10"/>
+		</g>
+		<g id="shape16-32" v:mID="16" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.16</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape17-34" v:mID="17" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.17</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape18-37" v:mID="18" v:groupContext="shape" transform="translate(121.944,-471.034)">
+			<title>Sheet.18</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="20.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape19-41" v:mID="19" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.19</title>
+			<path d="M0 510.43 L0 591.69 L289.81 591.69 L289.81 510.43 L0 510.43 L0 510.43 Z" class="st4"/>
+		</g>
+		<g id="shape20-43" v:mID="20" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.20</title>
+			<path d="M0 510.43 L289.81 510.43 L289.81 591.69 L0 591.69 L0 510.43" class="st5"/>
+		</g>
+		<g id="shape21-46" v:mID="21" v:groupContext="shape" transform="translate(424.908,-21.567)">
+			<title>Sheet.21</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="11.55" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape22-50" v:mID="22" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.22</title>
+			<path d="M0 510.43 L0 591.69 L453.74 591.69 L453.74 510.43 L0 510.43 L0 510.43 Z" class="st6"/>
+		</g>
+		<g id="shape23-52" v:mID="23" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.23</title>
+			<path d="M0 510.43 L453.74 510.43 L453.74 591.69 L0 591.69 L0 510.43" class="st7"/>
+		</g>
+		<g id="shape24-55" v:mID="24" v:groupContext="shape" transform="translate(778.624,-21.5672)">
+			<title>Sheet.24</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="14.26" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape25-59" v:mID="25" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.25</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st6"/>
+		</g>
+		<g id="shape26-61" v:mID="26" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.26</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape27-63" v:mID="27" v:groupContext="shape" transform="translate(813.057,-150.108)">
+			<title>Sheet.27</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="94.1386" cy="576.19" width="188.28" height="31.0055"/>
+			<path d="M188.28 560.69 L0 560.69 L0 591.69 L188.28 591.69 L188.28 560.69" class="st8"/>
+			<text x="15.43" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape28-67" v:mID="28" v:groupContext="shape" transform="translate(810.845,-123.854)">
+			<title>Sheet.28</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="95.5065" cy="578.442" width="191.02" height="26.501"/>
+			<path d="M191.01 565.19 L0 565.19 L0 591.69 L191.01 591.69 L191.01 565.19" class="st8"/>
+			<text x="15.15" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape29-71" v:mID="29" v:groupContext="shape" transform="translate(573.151,-149.601)">
+			<title>Sheet.29</title>
+			<path d="M0 584.74 L127.76 584.74 L127.76 587.61 L0 587.61 L0 584.74 L0 584.74 ZM125.91 580.65 L136.97 586.17 L125.91
+						 591.69 L125.91 580.65 L125.91 580.65 Z" class="st15"/>
+		</g>
+		<g id="shape30-73" v:mID="30" v:groupContext="shape" transform="translate(0,-309.671)">
+			<title>Sheet.30</title>
+			<desc>Memory copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="108.076" cy="574.07" width="216.16" height="35.245"/>
+			<path d="M216.15 556.45 L0 556.45 L0 591.69 L216.15 591.69 L216.15 556.45" class="st8"/>
+			<text x="17.68" y="582.88" class="st16" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Memory copy</text>		</g>
+		<g id="shape31-77" v:mID="31" v:groupContext="shape" transform="translate(680.77,-305.707)">
+			<title>Sheet.31</title>
+			<desc>No Memory Copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="136.547" cy="574.07" width="273.1" height="35.245"/>
+			<path d="M273.09 556.45 L0 556.45 L0 591.69 L273.09 591.69 L273.09 556.45" class="st8"/>
+			<text x="21.4" y="582.88" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>No Memory Copy</text>		</g>
+		<g id="shape32-81" v:mID="32" v:groupContext="shape" transform="translate(1102.72,-26.7532)">
+			<title>Sheet.32</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="138.243" cy="578.442" width="276.49" height="26.501"/>
+			<path d="M276.49 565.19 L0 565.19 L0 591.69 L276.49 591.69 L276.49 565.19" class="st8"/>
+			<text x="20.73" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape36-85" v:mID="36" v:groupContext="shape" transform="translate(1106.81,-138.647)">
+			<title>Sheet.36</title>
+			<desc>Two-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="144.906" cy="578.442" width="289.82" height="26.501"/>
+			<path d="M289.81 565.19 L0 565.19 L0 591.69 L289.81 591.69 L289.81 565.19" class="st8"/>
+			<text x="16.56" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Two-part output segment</text>		</g>
+		<g id="shape37-89" v:mID="37" v:groupContext="shape" transform="translate(575.916,-453.879)">
+			<title>Sheet.37</title>
+			<path d="M2.88 453.91 L2.9 465.39 L0.03 465.39 L0 453.91 L2.88 453.91 L2.88 453.91 ZM2.9 474 L2.9 476.87 L0.03 476.87
+						 L0.03 474 L2.9 474 L2.9 474 ZM2.9 485.48 L2.9 496.96 L0.03 496.96 L0.03 485.48 L2.9 485.48 L2.9 485.48 ZM2.9
+						 505.58 L2.9 508.45 L0.03 508.45 L0.03 505.58 L2.9 505.58 L2.9 505.58 ZM2.9 517.06 L2.9 528.54 L0.03 528.54
+						 L0.03 517.06 L2.9 517.06 L2.9 517.06 ZM2.9 537.15 L2.9 540.02 L0.03 540.02 L0.03 537.15 L2.9 537.15 L2.9
+						 537.15 ZM2.9 548.63 L2.9 560.12 L0.03 560.12 L0.03 548.63 L2.9 548.63 L2.9 548.63 ZM2.9 568.73 L2.9 571.6
+						 L0.03 571.6 L0.03 568.73 L2.9 568.73 L2.9 568.73 ZM2.9 580.21 L2.9 591.69 L0.03 591.69 L0.03 580.21 L2.9
+						 580.21 L2.9 580.21 Z" class="st18"/>
+		</g>
+		<g id="shape38-91" v:mID="38" v:groupContext="shape" transform="translate(577.354,-193.764)">
+			<title>Sheet.38</title>
+			<path d="M5.59 347.01 L10.92 357.16 L8.38 358.52 L3.04 348.36 L5.59 347.01 L5.59 347.01 ZM14.96 364.78 L16.29 367.32
+						 L13.74 368.67 L12.42 366.13 L14.96 364.78 L14.96 364.78 ZM20.33 374.97 L25.66 385.12 L23.12 386.45 L17.78
+						 376.29 L20.33 374.97 L20.33 374.97 ZM29.7 392.74 L31.03 395.28 L28.48 396.61 L27.16 394.07 L29.7 392.74
+						 L29.7 392.74 ZM35.04 402.9 L40.4 413.06 L37.86 414.38 L32.49 404.22 L35.04 402.9 L35.04 402.9 ZM44.41 420.67
+						 L45.77 423.21 L43.22 424.57 L41.87 422.03 L44.41 420.67 L44.41 420.67 ZM49.78 430.83 L55.14 440.99 L52.6
+						 442.34 L47.23 432.18 L49.78 430.83 L49.78 430.83 ZM59.15 448.61 L60.51 451.15 L57.96 452.5 L56.61 449.96
+						 L59.15 448.61 L59.15 448.61 ZM64.52 458.79 L69.88 468.95 L67.34 470.27 L61.97 460.12 L64.52 458.79 L64.52
+						 458.79 ZM73.89 476.57 L75.25 479.11 L72.7 480.43 L71.35 477.89 L73.89 476.57 L73.89 476.57 ZM79.26 486.72
+						 L84.62 496.88 L82.08 498.21 L76.71 488.05 L79.26 486.72 L79.26 486.72 ZM88.63 504.5 L89.96 507.04 L87.41
+						 508.39 L86.09 505.85 L88.63 504.5 L88.63 504.5 ZM94 514.66 L99.33 524.81 L96.79 526.17 L91.45 516.01 L94
+						 514.66 L94 514.66 ZM103.37 532.43 L104.7 534.97 L102.15 536.32 L100.83 533.79 L103.37 532.43 L103.37 532.43
+						 ZM108.73 542.62 L114.07 552.77 L111.53 554.1 L106.19 543.94 L108.73 542.62 L108.73 542.62 ZM118.11 560.39
+						 L119.44 562.93 L116.89 564.26 L115.57 561.72 L118.11 560.39 L118.11 560.39 ZM123.45 570.55 L128.81 580.71
+						 L126.27 582.03 L120.9 571.87 L123.45 570.55 L123.45 570.55 ZM132.82 588.33 L133.9 590.37 L131.36 591.69
+						 L130.28 589.68 L132.82 588.33 L132.82 588.33 ZM0.28 351.89 L0 339.53 L10.07 346.73 L0.28 351.89 L0.28 351.89
+						 Z" class="st18"/>
+		</g>
+		<g id="shape39-93" v:mID="39" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.39</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st4"/>
+		</g>
+		<g id="shape40-95" v:mID="40" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.40</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape41-97" v:mID="41" v:groupContext="shape" transform="translate(368.774,-150.453)">
+			<title>Sheet.41</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="82.7002" cy="576.19" width="165.41" height="31.0055"/>
+			<path d="M165.4 560.69 L0 560.69 L0 591.69 L165.4 591.69 L165.4 560.69" class="st8"/>
+			<text x="13.94" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape42-101" v:mID="42" v:groupContext="shape" transform="translate(351.856,-123.854)">
+			<title>Sheet.42</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="102.121" cy="578.442" width="204.25" height="26.501"/>
+			<path d="M204.24 565.19 L0 565.19 L0 591.69 L204.24 591.69 L204.24 565.19" class="st8"/>
+			<text x="16.02" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape43-105" v:mID="43" v:groupContext="shape" transform="translate(619.797,-155.563)">
+			<title>Sheet.43</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="28.011" cy="578.442" width="56.03" height="26.501"/>
+			<path d="M56.02 565.19 L0 565.19 L0 591.69 L56.02 591.69 L56.02 565.19" class="st8"/>
+			<text x="6.35" y="585.07" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape44-109" v:mID="44" v:groupContext="shape" transform="translate(700.911,-551.367)">
+			<title>Sheet.44</title>
+			<path d="M0 559.23 L0 591.69 L84.29 591.69 L84.29 559.23 L0 559.23 L0 559.23 Z" class="st2"/>
+		</g>
+		<g id="shape45-111" v:mID="45" v:groupContext="shape" transform="translate(709.883,-555.163)">
+			<title>Sheet.45</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="30.7501" cy="580.032" width="61.51" height="23.3211"/>
+			<path d="M61.5 568.37 L0 568.37 L0 591.69 L61.5 591.69 L61.5 568.37" class="st8"/>
+			<text x="6.38" y="585.86" class="st20" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape46-115" v:mID="46" v:groupContext="shape" transform="translate(1111.54,-477.36)">
+			<title>Sheet.46</title>
+			<desc>Input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="74.9" cy="578.442" width="149.8" height="26.501"/>
+			<path d="M149.8 565.19 L0 565.19 L0 591.69 L149.8 591.69 L149.8 565.19" class="st8"/>
+			<text x="12.47" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Input packet</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
new file mode 100644
index 0000000..f18a327
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
@@ -0,0 +1,477 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-three-seg-mbuf.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="21.8589in" height="9.63966in"
+		viewBox="0 0 1573.84 694.055" xml:space="preserve" color-interpolation-filters="sRGB" class="st23">
+	<title>GSO three-part output segment</title>
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st2 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st3 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.42236}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Calibri;font-size:2.08333em;font-weight:bold}
+		.st10 {fill:#ffffff;font-family:Intel Clear;font-size:2.91502em;font-weight:bold}
+		.st11 {fill:#000000;font-family:Intel Clear;font-size:2.19175em}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:6.58146}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.50001em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.99999em}
+		.st15 {fill:#0070c0;font-family:Intel Clear;font-size:2.19175em}
+		.st16 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st17 {fill:#006fc5;font-family:Intel Clear;font-size:1.92874em}
+		.st18 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.5em}
+		.st20 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0658146}
+		.st21 {fill:#000000;font-family:Intel Clear;font-size:1.81915em}
+		.st22 {fill:#000000;font-family:Intel Clear;font-size:1.49785em}
+		.st23 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<v:layer v:name="top" v:index="0"/>
+		<v:layer v:name="middle" v:index="1"/>
+		<g id="shape111-1" v:mID="111" v:groupContext="shape" v:layerMember="0" transform="translate(787.208,-220.973)">
+			<title>Sheet.111</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape110-3" v:mID="110" v:groupContext="shape" v:layerMember="0" transform="translate(685.078,-560.166)">
+			<title>Sheet.110</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape4-5" v:mID="4" v:groupContext="shape" transform="translate(718.715,-469.955)">
+			<title>Sheet.4</title>
+			<path d="M0 655.13 L0 678.22 C0 686.97 11.69 694.06 26.05 694.06 C40.45 694.06 52.11 686.97 52.11 678.22 L52.11 673.91
+						 L59.55 673.91 L44.66 664.86 L29.78 673.91 L37.22 673.91 L37.22 678.22 C37.22 681.98 32.25 685 26.05 685
+						 C19.89 685 14.89 681.98 14.89 678.22 L14.89 655.13 L0 655.13 Z" class="st3"/>
+		</g>
+		<g id="shape5-7" v:mID="5" v:groupContext="shape" transform="translate(547.831,-656.823)">
+			<title>Sheet.5</title>
+			<path d="M11 686.43 L19.43 686.43 L19.43 688.53 L11 688.53 L11 686.43 L11 686.43 ZM25.76 686.43 L27.87 686.43 L27.87
+						 688.53 L25.76 688.53 L25.76 686.43 L25.76 686.43 ZM34.19 686.43 L42.62 686.43 L42.62 688.53 L34.19 688.53
+						 L34.19 686.43 L34.19 686.43 ZM48.95 686.43 L51.05 686.43 L51.05 688.53 L48.95 688.53 L48.95 686.43 L48.95
+						 686.43 ZM57.38 686.43 L65.81 686.43 L65.81 688.53 L57.38 688.53 L57.38 686.43 L57.38 686.43 ZM72.14 686.43
+						 L74.24 686.43 L74.24 688.53 L72.14 688.53 L72.14 686.43 L72.14 686.43 ZM80.57 686.43 L89 686.43 L89 688.53
+						 L80.57 688.53 L80.57 686.43 L80.57 686.43 ZM95.32 686.43 L97.43 686.43 L97.43 688.53 L95.32 688.53 L95.32
+						 686.43 L95.32 686.43 ZM103.76 686.43 L112.19 686.43 L112.19 688.53 L103.76 688.53 L103.76 686.43 L103.76
+						 686.43 ZM118.51 686.43 L120.62 686.43 L120.62 688.53 L118.51 688.53 L118.51 686.43 L118.51 686.43 ZM126.94
+						 686.43 L135.38 686.43 L135.38 688.53 L126.94 688.53 L126.94 686.43 L126.94 686.43 ZM141.7 686.43 L143.81
+						 686.43 L143.81 688.53 L141.7 688.53 L141.7 686.43 L141.7 686.43 ZM150.13 686.43 L158.57 686.43 L158.57 688.53
+						 L150.13 688.53 L150.13 686.43 L150.13 686.43 ZM164.89 686.43 L167 686.43 L167 688.53 L164.89 688.53 L164.89
+						 686.43 L164.89 686.43 ZM173.32 686.43 L181.75 686.43 L181.75 688.53 L173.32 688.53 L173.32 686.43 L173.32
+						 686.43 ZM188.08 686.43 L190.19 686.43 L190.19 688.53 L188.08 688.53 L188.08 686.43 L188.08 686.43 ZM196.51
+						 686.43 L204.94 686.43 L204.94 688.53 L196.51 688.53 L196.51 686.43 L196.51 686.43 ZM211.27 686.43 L213.38
+						 686.43 L213.38 688.53 L211.27 688.53 L211.27 686.43 L211.27 686.43 ZM219.7 686.43 L228.13 686.43 L228.13
+						 688.53 L219.7 688.53 L219.7 686.43 L219.7 686.43 ZM234.46 686.43 L236.56 686.43 L236.56 688.53 L234.46 688.53
+						 L234.46 686.43 L234.46 686.43 ZM242.89 686.43 L251.32 686.43 L251.32 688.53 L242.89 688.53 L242.89 686.43
+						 L242.89 686.43 ZM257.64 686.43 L259.75 686.43 L259.75 688.53 L257.64 688.53 L257.64 686.43 L257.64 686.43
+						 ZM266.08 686.43 L274.51 686.43 L274.51 688.53 L266.08 688.53 L266.08 686.43 L266.08 686.43 ZM280.83 686.43
+						 L282.94 686.43 L282.94 688.53 L280.83 688.53 L280.83 686.43 L280.83 686.43 ZM289.27 686.43 L297.7 686.43
+						 L297.7 688.53 L289.27 688.53 L289.27 686.43 L289.27 686.43 ZM304.02 686.43 L306.13 686.43 L306.13 688.53
+						 L304.02 688.53 L304.02 686.43 L304.02 686.43 ZM312.45 686.43 L320.89 686.43 L320.89 688.53 L312.45 688.53
+						 L312.45 686.43 L312.45 686.43 ZM327.21 686.43 L329.32 686.43 L329.32 688.53 L327.21 688.53 L327.21 686.43
+						 L327.21 686.43 ZM335.64 686.43 L344.08 686.43 L344.08 688.53 L335.64 688.53 L335.64 686.43 L335.64 686.43
+						 ZM350.4 686.43 L352.51 686.43 L352.51 688.53 L350.4 688.53 L350.4 686.43 L350.4 686.43 ZM358.83 686.43 L367.26
+						 686.43 L367.26 688.53 L358.83 688.53 L358.83 686.43 L358.83 686.43 ZM373.59 686.43 L375.7 686.43 L375.7
+						 688.53 L373.59 688.53 L373.59 686.43 L373.59 686.43 ZM382.02 686.43 L387.06 686.43 L387.06 688.53 L382.02
+						 688.53 L382.02 686.43 L382.02 686.43 ZM13.18 694.06 L0 687.48 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM384.89
+						 680.9 L398.06 687.48 L384.89 694.06 L384.89 680.9 L384.89 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape6-9" v:mID="6" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.6</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape7-11" v:mID="7" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.7</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape10-14" v:mID="10" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.10</title>
+			<path d="M0 597.01 L0 694.06 L563.73 694.06 L563.73 597.01 L0 597.01 L0 597.01 Z" class="st6"/>
+		</g>
+		<g id="shape11-16" v:mID="11" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.11</title>
+			<path d="M0 597.01 L563.73 597.01 L563.73 694.06 L0 694.06 L0 597.01" class="st7"/>
+		</g>
+		<g id="shape12-19" v:mID="12" v:groupContext="shape" transform="translate(262.039,-549.269)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-23" v:mID="13" v:groupContext="shape" transform="translate(547.615,-549.269)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="87.5716" cy="673.065" width="175.15" height="41.9798"/>
+			<path d="M175.14 652.08 L0 652.08 L0 694.06 L175.14 694.06 L175.14 652.08" class="st8"/>
+			<text x="37" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape15-27" v:mID="15" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.15</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape16-29" v:mID="16" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.16</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape17-32" v:mID="17" v:groupContext="shape" transform="translate(6.52106,-546.331)">
+			<title>Sheet.17</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="34.98" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape23-36" v:mID="23" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.23</title>
+			<path d="M0 597.27 L0 694.06 L345.2 694.06 L345.2 597.27 L0 597.27 L0 597.27 Z" class="st4"/>
+		</g>
+		<g id="shape24-38" v:mID="24" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.24</title>
+			<path d="M0 597.27 L345.2 597.27 L345.2 694.06 L0 694.06 L0 597.27" class="st5"/>
+		</g>
+		<g id="shape25-41" v:mID="25" v:groupContext="shape" transform="translate(399.834,-25.6887)">
+			<title>Sheet.25</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="13.76" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape31-45" v:mID="31" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.31</title>
+			<path d="M0 597.27 L0 694.06 L516.21 694.06 L516.21 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape32-47" v:mID="32" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.32</title>
+			<path d="M0 597.27 L516.21 597.27 L516.21 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape33-50" v:mID="33" v:groupContext="shape" transform="translate(809.035,-25.6889)">
+			<title>Sheet.33</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="16.99" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape35-54" v:mID="35" v:groupContext="shape" transform="translate(1199.29,-21.1708)">
+			<title>Sheet.35</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="164.662" cy="678.273" width="329.33" height="31.5648"/>
+			<path d="M329.32 662.49 L0 662.49 L0 694.06 L329.32 694.06 L329.32 662.49" class="st8"/>
+			<text x="24.69" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape38-58" v:mID="38" v:groupContext="shape" transform="translate(1204.65,-254.446)">
+			<title>Sheet.38</title>
+			<desc>Three-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="19.51" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Three-part output segment</text>		</g>
+		<g id="shape39-62" v:mID="39" v:groupContext="shape" transform="translate(546.25,-529.921)">
+			<title>Sheet.39</title>
+			<path d="M3.43 529.94 L3.46 543.61 L0.03 543.61 L0 529.94 L3.43 529.94 L3.43 529.94 ZM3.46 553.87 L3.46 557.29 L0.03
+						 557.29 L0.03 553.87 L3.46 553.87 L3.46 553.87 ZM3.46 567.55 L3.46 581.22 L0.03 581.22 L0.03 567.55 L3.46
+						 567.55 L3.46 567.55 ZM3.46 591.48 L3.46 594.9 L0.03 594.9 L0.03 591.48 L3.46 591.48 L3.46 591.48 ZM3.46
+						 605.16 L3.46 618.83 L0.03 618.83 L0.03 605.16 L3.46 605.16 L3.46 605.16 ZM3.46 629.09 L3.46 632.51 L0.03
+						 632.51 L0.03 629.09 L3.46 629.09 L3.46 629.09 ZM3.46 642.77 L3.46 656.45 L0.03 656.45 L0.03 642.77 L3.46
+						 642.77 L3.46 642.77 ZM3.46 666.7 L3.46 670.12 L0.03 670.12 L0.03 666.7 L3.46 666.7 L3.46 666.7 ZM3.46 680.38
+						 L3.46 694.06 L0.03 694.06 L0.03 680.38 L3.46 680.38 L3.46 680.38 Z" class="st1"/>
+		</g>
+		<g id="shape40-64" v:mID="40" v:groupContext="shape" transform="translate(549.097,-223.749)">
+			<title>Sheet.40</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape46-66" v:mID="46" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.46</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st4"/>
+		</g>
+		<g id="shape47-68" v:mID="47" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.47</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape48-70" v:mID="48" v:groupContext="shape" transform="translate(113.27,-263.667)">
+			<title>Sheet.48</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="98.5041" cy="675.59" width="197.01" height="36.9302"/>
+			<path d="M197.01 657.13 L0 657.13 L0 694.06 L197.01 694.06 L197.01 657.13" class="st8"/>
+			<text x="18.66" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape51-74" v:mID="51" v:groupContext="shape" transform="translate(85.817,-233.439)">
+			<title>Sheet.51</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="127.916" cy="678.273" width="255.84" height="31.5648"/>
+			<path d="M255.83 662.49 L0 662.49 L0 694.06 L255.83 694.06 L255.83 662.49" class="st8"/>
+			<text x="34.33" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape53-78" v:mID="53" v:groupContext="shape" transform="translate(371.944,-275.998)">
+			<title>Sheet.53</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape54-82" v:mID="54" v:groupContext="shape" transform="translate(695.132,-646.04)">
+			<title>Sheet.54</title>
+			<path d="M0 655.39 L0 694.06 L100.4 694.06 L100.4 655.39 L0 655.39 L0 655.39 Z" class="st16"/>
+		</g>
+		<g id="shape55-84" v:mID="55" v:groupContext="shape" transform="translate(709.033,-648.946)">
+			<title>Sheet.55</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="36.6265" cy="680.167" width="73.26" height="27.7775"/>
+			<path d="M73.25 666.28 L0 666.28 L0 694.06 L73.25 694.06 L73.25 666.28" class="st8"/>
+			<text x="7.6" y="687.11" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape56-88" v:mID="56" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.56</title>
+			<path d="M0 597.27 L0 694.06 L363.41 694.06 L363.41 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape57-90" v:mID="57" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.57</title>
+			<path d="M0 597.27 L363.41 597.27 L363.41 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape58-93" v:mID="58" v:groupContext="shape" v:layerMember="0" transform="translate(943.158,-529.889)">
+			<title>Sheet.58</title>
+			<path d="M1.35 529.91 L1.25 543.58 L4.68 543.61 L4.78 529.94 L1.35 529.91 L1.35 529.91 ZM1.15 553.84 L1.12 557.26 L4.55
+						 557.29 L4.58 553.87 L1.15 553.84 L1.15 553.84 ZM1.05 567.52 L0.92 581.19 L4.35 581.22 L4.48 567.55 L1.05
+						 567.52 L1.05 567.52 ZM0.86 591.45 L0.82 594.87 L4.25 594.9 L4.28 591.48 L0.86 591.45 L0.86 591.45 ZM0.72
+						 605.13 L0.63 618.8 L4.05 618.83 L4.15 605.16 L0.72 605.13 L0.72 605.13 ZM0.53 629.06 L0.53 632.48 L3.95
+						 632.51 L3.95 629.09 L0.53 629.06 L0.53 629.06 ZM0.43 642.74 L0.33 656.41 L3.75 656.45 L3.85 642.77 L0.43
+						 642.74 L0.43 642.74 ZM0.23 666.67 L0.2 670.09 L3.62 670.12 L3.66 666.7 L0.23 666.67 L0.23 666.67 ZM0.13
+						 680.35 L0 694.02 L3.43 694.06 L3.56 680.38 L0.13 680.35 L0.13 680.35 Z" class="st18"/>
+		</g>
+		<g id="shape59-95" v:mID="59" v:groupContext="shape" transform="translate(785.874,-549.473)">
+			<title>Sheet.59</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="77.3395" cy="673.065" width="154.68" height="41.9798"/>
+			<path d="M154.68 652.08 L0 652.08 L0 694.06 L154.68 694.06 L154.68 652.08" class="st8"/>
+			<text x="26.77" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape60-99" v:mID="60" v:groupContext="shape" transform="translate(952.97,-548.822)">
+			<title>Sheet.60</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape63-103" v:mID="63" v:groupContext="shape" transform="translate(1210.43,-551.684)">
+			<title>Sheet.63</title>
+			<desc>Multi-segment input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="17.75" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Multi-segment input packet</text>		</g>
+		<g id="shape70-107" v:mID="70" v:groupContext="shape" v:layerMember="1" transform="translate(455.049,-221.499)">
+			<title>Sheet.70</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape71-109" v:mID="71" v:groupContext="shape" transform="translate(455.049,-221.499)">
+			<title>Sheet.71</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape72-111" v:mID="72" v:groupContext="shape" transform="translate(489.065,-263.434)">
+			<title>Sheet.72</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape75-115" v:mID="75" v:groupContext="shape" transform="translate(849.065,-281.435)">
+			<title>Sheet.75</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="4.49" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape77-119" v:mID="77" v:groupContext="shape" transform="translate(717.742,-563.523)">
+			<title>Sheet.77</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="15.71" y="683.67" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape78-123" v:mID="78" v:groupContext="shape" transform="translate(1148.17,-529.067)">
+			<title>Sheet.78</title>
+			<path d="M1.38 529.87 L1.25 543.55 L4.68 543.61 L4.81 529.94 L1.38 529.87 L1.38 529.87 ZM1.19 553.81 L1.12 557.23 L4.55
+						 557.29 L4.61 553.87 L1.19 553.81 L1.19 553.81 ZM1.05 567.48 L0.92 581.16 L4.35 581.22 L4.48 567.55 L1.05
+						 567.48 L1.05 567.48 ZM0.86 591.42 L0.86 594.84 L4.28 594.9 L4.28 591.48 L0.86 591.42 L0.86 591.42 ZM0.72
+						 605.09 L0.66 618.77 L4.08 618.83 L4.15 605.16 L0.72 605.09 L0.72 605.09 ZM0.53 629.03 L0.53 632.45 L3.95
+						 632.51 L3.95 629.09 L0.53 629.03 L0.53 629.03 ZM0.46 642.7 L0.33 656.38 L3.75 656.45 L3.89 642.77 L0.46
+						 642.7 L0.46 642.7 ZM0.26 666.64 L0.2 670.06 L3.62 670.12 L3.69 666.7 L0.26 666.64 L0.26 666.64 ZM0.13 680.31
+						 L0 693.99 L3.43 694.06 L3.56 680.38 L0.13 680.31 L0.13 680.31 Z" class="st20"/>
+		</g>
+		<g id="shape79-125" v:mID="79" v:groupContext="shape" transform="translate(946.254,-657.81)">
+			<title>Sheet.79</title>
+			<path d="M11 686.69 L17.33 686.69 L17.33 688.27 L11 688.27 L11 686.69 L11 686.69 ZM22.07 686.69 L23.65 686.69 L23.65
+						 688.27 L22.07 688.27 L22.07 686.69 L22.07 686.69 ZM28.39 686.69 L34.72 686.69 L34.72 688.27 L28.39 688.27
+						 L28.39 686.69 L28.39 686.69 ZM39.46 686.69 L41.04 686.69 L41.04 688.27 L39.46 688.27 L39.46 686.69 L39.46
+						 686.69 ZM45.78 686.69 L52.11 686.69 L52.11 688.27 L45.78 688.27 L45.78 686.69 L45.78 686.69 ZM56.85 686.69
+						 L58.43 686.69 L58.43 688.27 L56.85 688.27 L56.85 686.69 L56.85 686.69 ZM63.18 686.69 L69.5 686.69 L69.5
+						 688.27 L63.18 688.27 L63.18 686.69 L63.18 686.69 ZM74.24 686.69 L75.82 686.69 L75.82 688.27 L74.24 688.27
+						 L74.24 686.69 L74.24 686.69 ZM80.57 686.69 L86.89 686.69 L86.89 688.27 L80.57 688.27 L80.57 686.69 L80.57
+						 686.69 ZM91.63 686.69 L93.22 686.69 L93.22 688.27 L91.63 688.27 L91.63 686.69 L91.63 686.69 ZM97.96 686.69
+						 L104.28 686.69 L104.28 688.27 L97.96 688.27 L97.96 686.69 L97.96 686.69 ZM109.03 686.69 L110.61 686.69 L110.61
+						 688.27 L109.03 688.27 L109.03 686.69 L109.03 686.69 ZM115.35 686.69 L121.67 686.69 L121.67 688.27 L115.35
+						 688.27 L115.35 686.69 L115.35 686.69 ZM126.42 686.69 L128 686.69 L128 688.27 L126.42 688.27 L126.42 686.69
+						 L126.42 686.69 ZM132.74 686.69 L139.07 686.69 L139.07 688.27 L132.74 688.27 L132.74 686.69 L132.74 686.69
+						 ZM143.81 686.69 L145.39 686.69 L145.39 688.27 L143.81 688.27 L143.81 686.69 L143.81 686.69 ZM150.13 686.69
+						 L156.46 686.69 L156.46 688.27 L150.13 688.27 L150.13 686.69 L150.13 686.69 ZM161.2 686.69 L162.78 686.69
+						 L162.78 688.27 L161.2 688.27 L161.2 686.69 L161.2 686.69 ZM167.53 686.69 L173.85 686.69 L173.85 688.27 L167.53
+						 688.27 L167.53 686.69 L167.53 686.69 ZM178.59 686.69 L180.17 686.69 L180.17 688.27 L178.59 688.27 L178.59
+						 686.69 L178.59 686.69 ZM184.92 686.69 L189.4 686.69 L189.4 688.27 L184.92 688.27 L184.92 686.69 L184.92
+						 686.69 ZM13.18 694.06 L0 687.41 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM187.22 680.9 L200.4 687.48 L187.22
+						 694.06 L187.22 680.9 L187.22 680.9 Z" class="st20"/>
+		</g>
+		<g id="shape80-127" v:mID="80" v:groupContext="shape" transform="translate(982.882,-643.673)">
+			<title>Sheet.80</title>
+			<path d="M0 655.13 L0 694.06 L127.01 694.06 L127.01 655.13 L0 655.13 L0 655.13 Z" class="st16"/>
+		</g>
+		<g id="shape81-129" v:mID="81" v:groupContext="shape" transform="translate(1003.39,-660.621)">
+			<title>Sheet.81</title>
+			<desc>pkt_len</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="48.6041" cy="680.956" width="97.21" height="26.1994"/>
+			<path d="M97.21 667.86 L0 667.86 L0 694.06 L97.21 694.06 L97.21 667.86" class="st8"/>
+			<text x="11.67" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>pkt_len  </text>		</g>
+		<g id="shape82-133" v:mID="82" v:groupContext="shape" transform="translate(1001.18,-634.321)">
+			<title>Sheet.82</title>
+			<desc>% segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="49.2945" cy="680.956" width="98.59" height="26.1994"/>
+			<path d="M98.59 667.86 L0 667.86 L0 694.06 L98.59 694.06 L98.59 667.86" class="st8"/>
+			<text x="9.09" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>% segsz</text>		</g>
+		<g id="shape34-137" v:mID="34" v:groupContext="shape" v:layerMember="0" transform="translate(356.703,-264.106)">
+			<title>Sheet.34</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape85-139" v:mID="85" v:groupContext="shape" v:layerMember="0" transform="translate(78.5359,-282.66)">
+			<title>Sheet.85</title>
+			<path d="M0 680.87 C-0 673.59 6.88 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.88 694.06 0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape87-141" v:mID="87" v:groupContext="shape" v:layerMember="0" transform="translate(85.4791,-284.062)">
+			<title>Sheet.87</title>
+			<desc>1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>1</text>		</g>
+		<g id="shape88-145" v:mID="88" v:groupContext="shape" v:layerMember="0" transform="translate(468.906,-282.66)">
+			<title>Sheet.88</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape90-147" v:mID="90" v:groupContext="shape" v:layerMember="0" transform="translate(474.575,-284.062)">
+			<title>Sheet.90</title>
+			<desc>2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>2</text>		</g>
+		<g id="shape95-151" v:mID="95" v:groupContext="shape" v:layerMember="0" transform="translate(764.026,-275.998)">
+			<title>Sheet.95</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape97-155" v:mID="97" v:groupContext="shape" v:layerMember="0" transform="translate(889.755,-220.915)">
+			<title>Sheet.97</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape100-157" v:mID="100" v:groupContext="shape" v:layerMember="0" transform="translate(751.857,-262.528)">
+			<title>Sheet.100</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape104-159" v:mID="104" v:groupContext="shape" v:layerMember="1" transform="translate(851.429,-218.08)">
+			<title>Sheet.104</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape105-161" v:mID="105" v:groupContext="shape" v:layerMember="0" transform="translate(885.444,-260.015)">
+			<title>Sheet.105</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape106-165" v:mID="106" v:groupContext="shape" v:layerMember="0" transform="translate(895.672,-229.419)">
+			<title>Sheet.106</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape107-169" v:mID="107" v:groupContext="shape" v:layerMember="0" transform="translate(863.297,-280.442)">
+			<title>Sheet.107</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape108-171" v:mID="108" v:groupContext="shape" v:layerMember="0" transform="translate(870.001,-281.547)">
+			<title>Sheet.108</title>
+			<desc>3</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>3</text>		</g>
+		<g id="shape109-175" v:mID="109" v:groupContext="shape" v:layerMember="0" transform="translate(500.959,-231.87)">
+			<title>Sheet.109</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 40f04a1..c7c8b17 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -56,6 +56,7 @@ Programmer's Guide
     reorder_lib
     ip_fragment_reassembly_lib
     generic_receive_offload_lib
+    generic_segmentation_offload_lib
     pdump_lib
     multi_proc_support
     kernel_nic_interface
-- 
1.9.3

* Re: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 20:36               ` [PATCH v9 " Mark Kavanagh
@ 2017-10-05 22:24                 ` Ananyev, Konstantin
  2017-10-06  8:24                   ` FW: " Kavanagh, Mark B
  2017-10-06 10:35                   ` Kavanagh, Mark B
  2017-10-06 23:32                 ` Ferruh Yigit
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
  2 siblings, 2 replies; 157+ messages in thread
From: Ananyev, Konstantin @ 2017-10-05 22:24 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, October 5, 2017 9:37 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
> 
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To give applications more flexibility, DPDK GSO is implemented as a
> standalone library. Applications explicitly use the GSO library to
> segment packets. This patch series adds GSO support to DPDK for specific
> packet types: TCP/IPv4, VxLAN, and GRE.
> 
> The first patch introduces the GSO API framework. The second patch
> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> tag). The third patch adds GSO support for VxLAN packets that contain
> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
> outer VLAN tags). The fourth patch adds GSO support for GRE packets
> that contain outer IPv4, and inner TCP/IPv4 headers (with optional
> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
> and GRE GSO in testpmd's checksum forwarding engine. The final patch
> in the series adds GSO documentation to the programmer's guide.
> 
> Performance Testing
> ===================
> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> iperf. Setup for the test is described as follows:
> 
> a. Physically connect two 10Gbps ports (P0, P1) that are in the same
>    machine.
> b. Launch testpmd with P0 and a vhost-user port, and use csum
>    forwarding engine with "retry".
> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>    checksum calculation for vhost-user port.
> d. Launch a VM with csum and tso offloading enabled.
> e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>    With csum and tso enabled, the VM can send large TCP/IPv4 packets
>    (mss is up to 64KB).
> f. Assign P1 to the Linux kernel and enable kernel GRO. Run
>    iperf-server on P1.
> 
> We conduct three iperf tests:
> 
> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>     to 1518B. Run two iperf-client instances in the VM.
> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>     two iperf-client instances in the VM.
> test-3: disable GSO and TSO in testpmd. Run two iperf-client instances
>     in the VM.
> 
> Throughput of the above three tests:
> 
> test-1: 9.4Gbps
> test-2: 9.5Gbps
> test-3: 3Mbps
> 
> Functional Testing
> ==================
> Unlike large TCP packets, large VxLAN or GRE packets can't be sent by
> VMs: the max length of tunneled packets from VMs is 1514B. The current
> experiment method therefore can't be used to measure VxLAN and GRE GSO
> performance; instead, we simply test the functionality by setting a small
> GSO segment length (e.g. 500B).
> 
> VxLAN
> -----
> To test VxLAN GSO functionality, we use the following setup:
> 
> a. Physically connect two 10Gbps ports (P0, P1) that are in the same
>    machine.
> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>    engine with "retry".
> c. Testpmd commands:
>     - csum parse_tunnel on "P0"
>     - csum parse_tunnel on "vhost-user port"
>     - csum set outer-ip hw "P0"
>     - csum set ip hw "P0"
>     - csum set tcp hw "P0"
>     - csum set tcp hw "vhost-user port"
>     - set port "P0" gso on
>     - set gso segsz 500
> d. Launch a VM with csum and tso offloading enabled.
> e. Create a VxLAN port for the virtio-net port in the VM. Run iperf-client
>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>    max packet length is 1514B.
> f. Assign P1 to the Linux kernel and disable kernel GRO. Similarly,
>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
> 
> In testpmd, we can see that the length of all packets sent from P0 is
> smaller than or equal to 500B. Additionally, the packets arriving at P1
> are encapsulated and are smaller than or equal to 500B.
> 
> GRE
> ---
> The same process may be used to test GRE functionality, except that the
> tunnel type created for both the guest's virtio-net interface and the host's
> kernel interface is GRE:
>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local IP>`
> 
> As in the VxLAN testcase, the length of packets sent from P0, and received on
> P1, is less than or equal to 500B.
> 
> Change log
> ==========
> v9:
> - fix testpmd build for i686 target
> - change log level from WARNING to DEBUG in the case of unsupported packet
>   (rte_gso_segment())
> 
> v8:
> - resolve coding style infractions (indentation).
> - centralize invalid parameter checking for rte_gso_segment() into a single
>   'if' statement.
> - don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
>   on account of invalid params.
> - allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
>   statement condition).
> 
> v7:
> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
> - rename 'ipid_flag' member of gso_ctx to 'flag'.
> - remove mention of VLAN tags in supported packet types.
> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
> - take all packet overhead into account when checking for empty packet.
> - ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
>   TCP/IPv4 case from tunneled case).
> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
> - simplify error-checking/handling for GSO failure case in testpmd csum engine.
> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.
> 
> v6:
> - rebase to HEAD of master (i5dce9fcA)
> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
> 
> v5:
> - add GSO section to the programmer's guide.
> - use MF or (previously 'and') offset to check if a packet is IP
>   fragmented.
> - move 'update_header' helper functions to gso_common.h.
> - move tcp/ipv4 'update_header' function to gso_tcp4.c.
> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
> - add offset parameter to 'update_header' functions.
> - combine GRE and VxLAN tunnel header update functions into a single
>   function.
> - correct typos and errors in comments/commit messages.
> 
> v4:
> - use ol_flags instead of packet_type to decide which segmentation
>   function to use.
> - use MF and offset to check if a packet is IP fragmented, instead of
>   using DF.
> - remove ETHER_CRC_LEN from gso segment payload length calculation.
> - refactor internal header update and other functions.
> - remove RTE_GSO_IPID_INCREASE.
> - add some GSO documentation.
> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>   packets sent from GSO-enabled ports in testpmd.
> 
> v3:
> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and
>   RTE_PTYPE_(INNER_)L3_IPV4_EXT_UNKNOWN.
> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
> - store the input packet into pkts_out inside gso_tcp4_segment() and
>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>   is performed.
> - add missing includes.
> - optimize file names, function names and function descriptions.
> - fix one bug in testpmd.
> 
> v2:
> - merge data segments whose data_len is less than mss into a large data
>   segment in gso_do_segment().
> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>   header in rte_gso_segment().
> - provide IP id macros for applications to select fixed or incremental IP
>   ids.
> 
> Jiayu Hu (3):
>   gso: add Generic Segmentation Offload API framework
>   gso: add TCP/IPv4 GSO support
>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> Mark Kavanagh (3):
>   gso: add VxLAN GSO support
>   gso: add GRE GSO support
>   doc: add GSO programmer's guide
> 
>  MAINTAINERS                                        |   6 +
>  app/test-pmd/cmdline.c                             | 179 ++++++++
>  app/test-pmd/config.c                              |  24 ++
>  app/test-pmd/csumonly.c                            |  42 +-
>  app/test-pmd/testpmd.c                             |  13 +
>  app/test-pmd/testpmd.h                             |  10 +
>  config/common_base                                 |   5 +
>  doc/api/doxy-api-index.md                          |   1 +
>  doc/api/doxy-api.conf                              |   1 +
>  .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
>  .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
>  doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
>  doc/guides/prog_guide/index.rst                    |   1 +
>  doc/guides/rel_notes/release_17_11.rst             |  17 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
>  lib/Makefile                                       |   2 +
>  lib/librte_eal/common/include/rte_log.h            |   1 +
>  lib/librte_gso/Makefile                            |  52 +++
>  lib/librte_gso/gso_common.c                        | 153 +++++++
>  lib/librte_gso/gso_common.h                        | 171 ++++++++
>  lib/librte_gso/gso_tcp4.c                          | 104 +++++
>  lib/librte_gso/gso_tcp4.h                          |  74 ++++
>  lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
>  lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
>  lib/librte_gso/rte_gso.c                           | 110 +++++
>  lib/librte_gso/rte_gso.h                           | 148 +++++++
>  lib/librte_gso/rte_gso_version.map                 |   7 +
>  mk/rte.app.mk                                      |   1 +
>  28 files changed, 2411 insertions(+), 4 deletions(-)
>  create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
>  create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
>  create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
>  create mode 100644 lib/librte_gso/Makefile
>  create mode 100644 lib/librte_gso/gso_common.c
>  create mode 100644 lib/librte_gso/gso_common.h
>  create mode 100644 lib/librte_gso/gso_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tcp4.h
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>  create mode 100644 lib/librte_gso/rte_gso.c
>  create mode 100644 lib/librte_gso/rte_gso.h
>  create mode 100644 lib/librte_gso/rte_gso_version.map
> 
> --

Series-Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* FW: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 22:24                 ` Ananyev, Konstantin
@ 2017-10-06  8:24                   ` Kavanagh, Mark B
  2017-10-06 10:35                   ` Kavanagh, Mark B
  1 sibling, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-06  8:24 UTC (permalink / raw)
  To: Loftus, Ciara, Gray, Mark D, dev; +Cc: Keane, Lorna

FYI - GSO series has been acked, so it will be included in DPDK v17.11 release.
-Mark

>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Thursday, October 5, 2017 11:24 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Thursday, October 5, 2017 9:37 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>> Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit, Ferruh
>> <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
>> <mark.b.kavanagh@intel.com>
>> Subject: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
>>
>> Generic Segmentation Offload (GSO) is a SW technique to split large
>> packets into small ones. Akin to TSO, GSO enables applications to
>> operate on large packets, thus reducing per-packet processing overhead.
>>
>> To give applications more flexibility, DPDK GSO is implemented
>> as a standalone library. Applications explicitly use the GSO library
>> to segment packets. This patch series adds GSO support to DPDK for
>> specific packet types: TCP/IPv4, VxLAN, and GRE.
>>
>> The first patch introduces the GSO API framework. The second patch
>> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
>> tag). The third patch adds GSO support for VxLAN packets that contain
>> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or
>> outer VLAN tags). The fourth patch adds GSO support for GRE packets
>> that contain outer IPv4, and inner TCP/IPv4 headers (with optional
>> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
>> and GRE GSO in testpmd's checksum forwarding engine. The final patch
>> in the series adds GSO documentation to the programmer's guide.
>>
>> Performance Testing
>> ===================
>> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
>> iperf. Setup for the test is described as follows:
>>
>> a. Physically connect two 10Gbps ports (P0, P1) that are in the same
>>    machine.
>> b. Launch testpmd with P0 and a vhost-user port, and use the csum
>>    forwarding engine with "retry".
>> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>>    checksum calculation for the vhost-user port.
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Run iperf-client on the virtio-net port in the VM to send TCP packets.
>>    With csum and tso enabled, the VM can send large TCP/IPv4 packets
>>    (mss is up to 64KB).
>> f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
>>    iperf-server on P1.
>>
>> We conduct three iperf tests:
>>
>> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>>     to 1518B. Run two iperf clients in the VM.
>> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>>     two iperf clients in the VM.
>> test-3: disable GSO and TSO in testpmd. Run two iperf clients in the VM.
>>
>> Throughput of the above three tests:
>>
>> test-1: 9.4Gbps
>> test-2: 9.5Gbps
>> test-3: 3Mbps
>>
>> Functional Testing
>> ==================
>> Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
>> length of tunneled packets from VMs is 1514B. The current experimental
>> setup therefore can't measure VxLAN and GRE GSO performance; it can only
>> test functionality by setting a small GSO segment length (e.g. 500B).
>>
>> VxLAN
>> -----
>> To test VxLAN GSO functionality, we use the following setup:
>>
>> a. Physically connect two 10Gbps ports (P0, P1) that are in the same
>>    machine.
>> b. Launch testpmd with P0 and a vhost-user port, and use the csum
>>    forwarding engine with "retry".
>> c. Testpmd commands:
>>     - csum parse_tunnel on "P0"
>>     - csum parse_tunnel on "vhost-user port"
>>     - csum set outer-ip hw "P0"
>>     - csum set ip hw "P0"
>>     - csum set tcp hw "P0"
>>     - csum set tcp hw "vhost-user port"
>>     - set port "P0" gso on
>>     - set gso segsz 500
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Create a VxLAN port for the virtio-net port in the VM. Run iperf-client
>>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>>    max packet length is 1514B.
>> f. Assign P1 to the Linux kernel with kernel GRO disabled. Similarly,
>>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
>>
>> In testpmd, we can see that the length of every packet sent from P0 is
>> smaller than or equal to 500B. Additionally, the packets arriving at P1
>> are encapsulated and are smaller than or equal to 500B.
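
For reference, the 500B limit in the VxLAN test can be turned into an
expected segment count. The following is a minimal sketch and not code from
the patch set: the helper name gso_nb_segs() is hypothetical, and it assumes
the segment size limit covers headers plus payload and that every output
segment carries a full copy of the headers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of output segments for a packet of pkt_len
 * bytes whose headers occupy hdr_len bytes, when each output segment
 * (headers included) may be at most segsz bytes long.
 * Returns 0 when segsz leaves no room for payload or the input is invalid.
 */
static uint32_t
gso_nb_segs(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
	uint32_t pyld_unit, pyld_len;

	if (segsz <= hdr_len || pkt_len < hdr_len)
		return 0;
	if (pkt_len <= segsz)
		return 1; /* already small enough, nothing to split */

	pyld_unit = segsz - hdr_len;  /* payload bytes per output segment */
	pyld_len = pkt_len - hdr_len; /* payload bytes in the input packet */

	/* Round up: the last segment may carry less than pyld_unit bytes. */
	return (pyld_len + pyld_unit - 1) / pyld_unit;
}
```

Assuming no VLAN tags and no IP or TCP options, the VxLAN headers add up to
14 + 20 + 8 + 8 + 14 + 20 + 20 = 104 bytes, so gso_nb_segs(1514, 104, 500)
evaluates to 4: three segments of 500B and one shorter final segment.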
>>
>> GRE
>> ---
>> The same process may be used to test GRE functionality, with the exception
>> that the tunnel type created for both the guest's virtio-net and the host's
>> kernel interfaces is GRE:
>>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
>>
>> As in the VxLAN testcase, the length of packets sent from P0, and received
>> on P1, is at most 500B.
>>
>> Change log
>> ==========
>> v9:
>> - fix testpmd build for i686 target
>> - change log level from WARNING to DEBUG in the case of unsupported packet
>>   (rte_gso_segment())
>>
>> v8:
>> - resolve coding style infractions (indentation).
>> - centralize invalid parameter checking for rte_gso_segment() into a single
>>   'if' statement.
>> - don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
>>   on account of invalid params.
>> - allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
>>   statement condition).
>>
>> v7:
>> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
>> - rename 'ipid_flag' member of gso_ctx to 'flag'.
>> - remove mention of VLAN tags in supported packet types.
>> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
>> - take all packet overhead into account when checking for empty packet.
>> - ensure that only enabled GSO types are enacted upon (i.e. no fall-through
>>   to TCP/IPv4 case from tunneled case).
>> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in
>>   testpmd.
>> - simplify error-checking/handling for GSO failure case in testpmd csum
>>   engine.
>> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.
>>
>> v6:
>> - rebase to HEAD of master (i5dce9fcA)
>> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
>>
>> v5:
>> - add GSO section to the programmer's guide.
>> - use MF or (previously 'and') offset to check if a packet is IP
>>   fragmented.
>> - move 'update_header' helper functions to gso_common.h.
>> - move tcp/ipv4 'update_header' function to gso_tcp4.c.
>> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
>> - add offset parameter to 'update_header' functions.
>> - combine GRE and VxLAN tunnel header update functions into a single
>>   function.
>> - correct typos and errors in comments/commit messages.
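
The v5 entry on fragment detection (MF or offset, rather than MF and offset)
can be illustrated with a short sketch. This is an assumption-labelled
example, not the code from the patch: the macro and function names are
hypothetical, the mask values follow the IPv4 header layout, and the
flags/fragment-offset field is assumed to be already converted to host byte
order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions are those of the IPv4 header field. */
#define IPV4_MF_FLAG     0x2000u /* "more fragments" bit */
#define IPV4_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/*
 * A packet is a fragment if MF is set (first or middle fragment) OR the
 * offset is non-zero (middle or last fragment). Requiring both would miss
 * the first and the last fragment. The DF bit (0x4000) plays no role.
 */
static bool
ipv4_is_fragment(uint16_t frag_off_host)
{
	return (frag_off_host & (IPV4_MF_FLAG | IPV4_OFFSET_MASK)) != 0;
}
```

With this check, a first fragment (MF set, offset 0) and a last fragment
(MF clear, offset non-zero) are both recognised, while a packet with only
DF set is not treated as fragmented.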
>>
>> v4:
>> - use ol_flags instead of packet_type to decide which segmentation
>>   function to use.
>> - use MF and offset to check if a packet is IP fragmented, instead of
>>   using DF.
>> - remove ETHER_CRC_LEN from gso segment payload length calculation.
>> - refactor internal header update and other functions.
>> - remove RTE_GSO_IPID_INCREASE.
>> - add some GSO documentation.
>> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>>   packets sent from GSO-enabled ports in testpmd.
>> v3:
>> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
>>   UNKNOWN.
>> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
>> - store the input packet into pkts_out inside gso_tcp4_segment() and
>>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>>   is performed.
>> - add missing includes.
>> - optimize file names, function names and function description.
>> - fix one bug in testpmd.
>> v2:
>> - merge data segments whose data_len is less than mss into a large data
>>   segment in gso_do_segment().
>> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>>   header in rte_gso_segment().
>> - provide IP id macros for applications to select fixed or incremental IP
>>   ids.
>>
>> Jiayu Hu (3):
>>   gso: add Generic Segmentation Offload API framework
>>   gso: add TCP/IPv4 GSO support
>>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>>
>> Mark Kavanagh (3):
>>   gso: add VxLAN GSO support
>>   gso: add GRE GSO support
>>   doc: add GSO programmer's guide
>>
>>  MAINTAINERS                                        |   6 +
>>  app/test-pmd/cmdline.c                             | 179 ++++++++
>>  app/test-pmd/config.c                              |  24 ++
>>  app/test-pmd/csumonly.c                            |  42 +-
>>  app/test-pmd/testpmd.c                             |  13 +
>>  app/test-pmd/testpmd.h                             |  10 +
>>  config/common_base                                 |   5 +
>>  doc/api/doxy-api-index.md                          |   1 +
>>  doc/api/doxy-api.conf                              |   1 +
>>  .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
>>  .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
>>  doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
>>  doc/guides/prog_guide/index.rst                    |   1 +
>>  doc/guides/rel_notes/release_17_11.rst             |  17 +
>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
>>  lib/Makefile                                       |   2 +
>>  lib/librte_eal/common/include/rte_log.h            |   1 +
>>  lib/librte_gso/Makefile                            |  52 +++
>>  lib/librte_gso/gso_common.c                        | 153 +++++++
>>  lib/librte_gso/gso_common.h                        | 171 ++++++++
>>  lib/librte_gso/gso_tcp4.c                          | 104 +++++
>>  lib/librte_gso/gso_tcp4.h                          |  74 ++++
>>  lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
>>  lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
>>  lib/librte_gso/rte_gso.c                           | 110 +++++
>>  lib/librte_gso/rte_gso.h                           | 148 +++++++
>>  lib/librte_gso/rte_gso_version.map                 |   7 +
>>  mk/rte.app.mk                                      |   1 +
>>  28 files changed, 2411 insertions(+), 4 deletions(-)
>>  create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
>>  create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
>>  create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
>>  create mode 100644 lib/librte_gso/Makefile
>>  create mode 100644 lib/librte_gso/gso_common.c
>>  create mode 100644 lib/librte_gso/gso_common.h
>>  create mode 100644 lib/librte_gso/gso_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tcp4.h
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
>>  create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
>>  create mode 100644 lib/librte_gso/rte_gso.c
>>  create mode 100644 lib/librte_gso/rte_gso.h
>>  create mode 100644 lib/librte_gso/rte_gso_version.map
>>
>> --
>
>Series-Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
>
>> 1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 22:24                 ` Ananyev, Konstantin
  2017-10-06  8:24                   ` FW: " Kavanagh, Mark B
@ 2017-10-06 10:35                   ` Kavanagh, Mark B
  1 sibling, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-06 10:35 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev; +Cc: Hu, Jiayu, Tan, Jianfeng, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Ananyev, Konstantin
>Sent: Thursday, October 5, 2017 11:24 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Yigit, Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
>Subject: RE: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Thursday, October 5, 2017 9:37 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>> Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit, Ferruh
>> <ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
>> <mark.b.kavanagh@intel.com>
>> Subject: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
>>
>> --
>
>Series-Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Many thanks once again for your review comments and help Konstantin!
-Mark 

>
>> 1.9.3

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v9 6/6] doc: add GSO programmer's guide
  2017-10-05 20:36               ` [PATCH v9 6/6] doc: add GSO programmer's guide Mark Kavanagh
@ 2017-10-06 13:34                 ` Mcnamara, John
  2017-10-06 13:41                   ` Kavanagh, Mark B
  0 siblings, 1 reply; 157+ messages in thread
From: Mcnamara, John @ 2017-10-06 13:34 UTC (permalink / raw)
  To: Kavanagh, Mark B, dev
  Cc: Hu, Jiayu, Tan, Jianfeng, Ananyev, Konstantin, Yigit, Ferruh,
	thomas, Kavanagh, Mark B



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Mark Kavanagh
> Sent: Thursday, October 5, 2017 9:37 PM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> Subject: [dpdk-dev] [PATCH v9 6/6] doc: add GSO programmer's guide
> 
> Add programmer's guide doc to explain the design and use of the
> GSO library.
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>


Hi Mark,

If the docs (or another part of the patchset) were previously acked but
not changed in a new patchset then it is okay to include the previous
ack line.

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v9 6/6] doc: add GSO programmer's guide
  2017-10-06 13:34                 ` Mcnamara, John
@ 2017-10-06 13:41                   ` Kavanagh, Mark B
  0 siblings, 0 replies; 157+ messages in thread
From: Kavanagh, Mark B @ 2017-10-06 13:41 UTC (permalink / raw)
  To: Mcnamara, John, dev
  Cc: Hu, Jiayu, Tan, Jianfeng, Ananyev, Konstantin, Yigit, Ferruh, thomas



>-----Original Message-----
>From: Mcnamara, John
>Sent: Friday, October 6, 2017 2:35 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@dpdk.org
>Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
>Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit, Ferruh
><ferruh.yigit@intel.com>; thomas@monjalon.net; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>Subject: RE: [dpdk-dev] [PATCH v9 6/6] doc: add GSO programmer's guide
>
>
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Mark Kavanagh
>> Sent: Thursday, October 5, 2017 9:37 PM
>> To: dev@dpdk.org
>> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Tan, Jianfeng
>> <jianfeng.tan@intel.com>; Ananyev, Konstantin
>> <konstantin.ananyev@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>;
>> thomas@monjalon.net; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
>> Subject: [dpdk-dev] [PATCH v9 6/6] doc: add GSO programmer's guide
>>
>> Add programmer's guide doc to explain the design and use of the
>> GSO library.
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
>
>
>Hi Mark,
>
>If the docs (or another part of the patchset) were previously acked but
>not changed in a new patchset then it is okay to include the previous
>ack line.

Of course - apologies John.

>
>Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 20:36               ` [PATCH v9 " Mark Kavanagh
  2017-10-05 22:24                 ` Ananyev, Konstantin
@ 2017-10-06 23:32                 ` Ferruh Yigit
  2017-10-06 23:34                   ` Ferruh Yigit
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
  2 siblings, 1 reply; 157+ messages in thread
From: Ferruh Yigit @ 2017-10-06 23:32 UTC (permalink / raw)
  To: Mark Kavanagh, dev; +Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, thomas

On 10/5/2017 9:36 PM, Mark Kavanagh wrote:
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To give applications more flexibility, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. This patch series adds GSO support to DPDK for
> specific packet types: TCP/IPv4, VxLAN, and GRE.
> 
> The first patch introduces the GSO API framework. The second patch
> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> tag). The third patch adds GSO support for VxLAN packets that contain
> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
> outer VLAN tags). The fourth patch adds GSO support for GRE packets
> that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
> and GRE GSO in testpmd's checksum forwarding engine. The final patch
> in the series adds GSO documentation to the programmer's guide.
> 
> Performance Testing
> ===================
> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> iperf. Setup for the test is described as follows:
> 
> a. Physically connect two 10Gbps ports (P0, P1) that are in the same
>    machine.
> b. Launch testpmd with P0 and a vhost-user port, and use the csum
>    forwarding engine with "retry".
> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>    checksum calculation for the vhost-user port.
> d. Launch a VM with csum and tso offloading enabled.
> e. Run iperf-client on the virtio-net port in the VM to send TCP packets.
>    With csum and tso enabled, the VM can send large TCP/IPv4 packets
>    (mss is up to 64KB).
> f. Assign P1 to the Linux kernel with kernel GRO enabled. Run
>    iperf-server on P1.
> 
> We conduct three iperf tests:
> 
> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>     to 1518B. Run two iperf-client in the VM.
> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>     two iperf-client in the VM.
> test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.
> 
> Throughput of the above three tests:
> 
> test-1: 9.4Gbps
> test-2: 9.5Gbps
> test-3: 3Mbps
> 
> Functional Testing
> ==================
> Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
> length of tunneled packets from VMs is 1514B. The current experimental
> setup therefore can't measure VxLAN and GRE GSO performance; it can only
> test functionality by setting a small GSO segment length (e.g. 500B).
> 
> VxLAN
> -----
> To test VxLAN GSO functionality, we use the following setup:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>    machine, together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>    engine with "retry".
> c. Testpmd commands:
>     - csum parse_tunnel on "P0"
>     - csum parse_tunnel on "vhost-user port"
>     - csum set outer-ip hw "P0"
>     - csum set ip hw "P0"
>     - csum set tcp hw "P0"
>     - csum set tcp hw "vhost-user port"
>     - set port "P0" gso on
>     - set gso segsz 500
> d. Launch a VM with csum and tso offloading enabled.
> e. Create a VxLAN port for the virtio-net port in the VM. Run iperf-client
>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>    max packet length is 1514B.
> f. Assign P1 to the Linux kernel with kernel GRO disabled. Similarly,
>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
> 
> In testpmd, we can see that the length of every packet sent from P0 is
> smaller than or equal to 500B. Additionally, the packets arriving at P1
> are encapsulated and are smaller than or equal to 500B.
> 
> GRE
> ---
> The same process may be used to test GRE functionality, with the exception that
> the tunnel type created for both the guest's virtio-net, and the host's kernel
> interfaces is GRE:
>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
> 
> As in the VxLAN testcase, the length of packets sent from P0, and received on
> P1, is less than 500B.
> 
> Change log
> ==========
> v9:
> - fix testpmd build for i686 target
> - change log level from WARNING to DEBUG in the case of unsupported packet
>   (rte_gso_segment())
> 
> v8:
> - resolve coding style infractions (indentation).
> - centralize invalid parameter checking for rte_gso_segment() into a single
>   'if' statement.
> - don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
>   on account of invalid params.
> - allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
>   statement condition).
> 
> v7:
> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
> - rename 'ipid_flag' member of gso_ctx to 'flag'.
> - remove mention of VLAN tags in supported packet types.
> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
> - take all packet overhead into account when checking for empty packet.
> - ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
>   TCP/IPv4 case from tunneled case).
> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
> - simplify error-checking/handling for GSO failure case in testpmd csum engine.
> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.
> 
> v6:
> - rebase to HEAD of master (i5dce9fcA)
> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
> 
> v5:
> - add GSO section to the programmer's guide.
> - use MF or (previously 'and') offset to check if a packet is IP
>   fragmented.
> - move 'update_header' helper functions to gso_common.h.
> - move txp/ipv4 'update_header' function to gso_tcp4.c.
> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
> - add offset parameter to 'update_header' functions.
> - combine GRE and VxLAN tunnel header update functions into a single
>   function.
> - correct typos and errors in comments/commit messages.
> 
> v4:
> - use ol_flags instead of packet_type to decide which segmentation
>   function to use.
> - use MF and offset to check if a packet is IP fragmented, instead of
>   using DF.
> - remove ETHER_CRC_LEN from gso segment payload length calculation.
> - refactor internal header update and other functions.
> - remove RTE_GSO_IPID_INCREASE.
> - add some of GSO documents.
> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>   packets sent from GSO-enabled ports in testpmd.
> v3:
> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
>   UNKNOWN.
> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
> - store the input packet into pkts_out inside gso_tcp4_segment() and
>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>   is performed.
> - add missing incldues.
> - optimize file names, function names and function description.
> - fix one bug in testpmd.
> v2:
> - merge data segments whose data_len is less than mss into a large data
>   segment in gso_do_segment().
> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>   header in rte_gso_segment().
> - provide IP id macros for applications to select fixed or incremental IP
>   ids.
> 
> Jiayu Hu (3):
>   gso: add Generic Segmentation Offload API framework
>   gso: add TCP/IPv4 GSO support
>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> Mark Kavanagh (3):
>   gso: add VxLAN GSO support
>   gso: add GRE GSO support
>   doc: add GSO programmer's guide

Hi Mark, Jiayu,

I was about to get this into next-net, but noticed the same problem as with
the GRO patch: port_id is stored as uint8_t in several places. Port id is
now 16 bits. For testpmd you can use the "portid_t" storage type as well.
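
To illustrate the width issue, here is a minimal, self-contained sketch in
plain C. It is not testpmd code; the helper names are made up for
illustration. It shows what an 8-bit field holds after a 16-bit port id is
stored in it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value read back after storing a 16-bit
 * port id in an 8-bit field. Only the low 8 bits survive. */
uint16_t store_in_u8(uint16_t port_id)
{
	uint8_t narrow = (uint8_t)port_id;

	return narrow;
}

/* Hypothetical helper: same round trip through a 16-bit field,
 * which preserves every valid port id. */
uint16_t store_in_u16(uint16_t port_id)
{
	uint16_t wide = port_id;

	return wide;
}
```

With these helpers, port id 300 comes back as 44 from the 8-bit field and as
300 from the 16-bit one, so any port id above 255 would silently alias a
lower-numbered port.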

Another thing: this patch and the GRO patch conflict, because both
touch the same parts. Since Jiayu is the author of the GRO patch, is it
possible to define an order between these two patches so that they don't
conflict while applying? The order doesn't matter, as long as the dependency
is stated in the cover letter of the patch.

Can you please send a new version with the above two items addressed?

Thanks,
ferruh


* Re: [PATCH v9 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-06 23:32                 ` Ferruh Yigit
@ 2017-10-06 23:34                   ` Ferruh Yigit
  0 siblings, 0 replies; 157+ messages in thread
From: Ferruh Yigit @ 2017-10-06 23:34 UTC (permalink / raw)
  To: Mark Kavanagh, dev; +Cc: jiayu.hu, jianfeng.tan, konstantin.ananyev, thomas

On 10/7/2017 12:32 AM, Ferruh Yigit wrote:
> On 10/5/2017 9:36 PM, Mark Kavanagh wrote:
>> Generic Segmentation Offload (GSO) is a SW technique to split large
>> packets into small ones. Akin to TSO, GSO enables applications to
>> operate on large packets, thus reducing per-packet processing overhead.
>>
>> To enable more flexibility to applications, DPDK GSO is implemented
>> as a standalone library. Applications explicitly use the GSO library
>> to segment packets. This patch adds GSO support to DPDK for specific
>> packet types: specifically, TCP/IPv4, VxLAN, and GRE.
>>
>> The first patch introduces the GSO API framework. The second patch
>> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
>> tag). The third patch adds GSO support for VxLAN packets that contain
>> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
>> outer VLAN tags). The fourth patch adds GSO support for GRE packets
>> that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
>> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
>> and GRE GSO in testpmd's checksum forwarding engine. The final patch
>> in the series adds GSO documentation to the programmer's guide.
>>
>> Performance Testing
>> ===================
>> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
>> iperf. Setup for the test is described as follows:
>>
>> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>>    machine, together physically.
>> b. Launch testpmd with P0 and a vhost-user port, and use csum
>>    forwarding engine with "retry".
>> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>>    checksum calculation for vhost-user port.
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>>    With enabling csum and tso, the VM can send large TCP/IPv4 packets
>>    (mss is up to 64KB).
>> f. P1 is assigned to linux kernel and enabled kernel GRO. Run
>>    iperf-server on P1.
>>
>> We conduct three iperf tests:
>>
>> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>>     to 1518B. Run two iperf-client in the VM.
>> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1518B. Run
>>     two iperf-client in the VM.
>> test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.
>>
>> Throughput of the above three tests:
>>
>> test-1: 9.4Gbps
>> test-2: 9.5Gbps
>> test-3: 3Mbps
>>
>> Functional Testing
>> ==================
>> Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
>> length of tunneled packets from VMs is 1514B. So current experiment
>> method can't be used to measure VxLAN and GRE GSO performance, but simply
>> test the functionality via setting small GSO segment length (e.g. 500B).
>>
>> VxLAN
>> -----
>> To test VxLAN GSO functionality, we use the following setup:
>>
>> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>>    machine, together physically.
>> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>>    engine with "retry".
>> c. Testpmd commands:
>>     - csum parse_tunnel on "P0"
>>     - csum parse_tunnel on "vhost-user port"
>>     - csum set outer-ip hw "P0"
>>     - csum set ip hw "P0"
>>     - csum set tcp hw "P0"
>>     - csum set tcp hw "vhost-user port"
>>     - set port "P0" gso on
>>     - set gso segsz 500
>> d. Launch a VM with csum and tso offloading enabled.
>> e. Create a vxlan port for the virtio-net port in the VM. Run iperf-client
>>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>>    max packet length is 1514B.
>> f. P1 is assigned to linux kernel and kernel GRO is disabled. Similarly,
>>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
>>
>> In testpmd, we can see the length of all packets sent from P0 is smaller
>> than or equal to 500B. Additionally, the packets arriving in P1 is
>> encapsulated and is smaller than or equal to 500B.
>>
>> GRE
>> ---
>> The same process may be used to test GRE functionality, with the exception that
>> the tunnel type created for both the guest's virtio-net, and the host's kernel
>> interfaces is GRE:
>>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
>>
>> As in the VxLAN testcase, the length of packets sent from P0, and received on
>> P1, is less than 500B.
>>
>> Change log
>> ==========
>> v9:
>> - fix testpmd build for i686 target
>> - change log level from WARNING to DEBUG in the case of unsupported packet
>>   (rte_gso_segment())
>>
>> v8:
>> - resolve coding style infractions (indentation).
>> - centralize invalid parameter checking for rte_gso_segment() into a single
>>   'if' statement.
>> - don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
>>   on account of invalid params.
>> - allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
>>   statement condition).
>>
>> v7:
>> - add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
>> - rename 'ipid_flag' member of gso_ctx to 'flag'.
>> - remove mention of VLAN tags in supported packet types.
>> - don't clear PKT_TX_TCP_SEG flag if GSO fails.
>> - take all packet overhead into account when checking for empty packet.
>> - ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
>>   TCP/IPv4 case from tunneled case).
>> - validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
>> - simplify error-checking/handling for GSO failure case in testpmd csum engine.
>> - use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.
>>
>> v6:
>> - rebase to HEAD of master (i5dce9fcA)
>> - remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'
>>
>> v5:
>> - add GSO section to the programmer's guide.
>> - use MF or (previously 'and') offset to check if a packet is IP
>>   fragmented.
>> - move 'update_header' helper functions to gso_common.h.
>> - move txp/ipv4 'update_header' function to gso_tcp4.c.
>> - move tunnel 'update_header' function to gso_tunnel_tcp4.c.
>> - add offset parameter to 'update_header' functions.
>> - combine GRE and VxLAN tunnel header update functions into a single
>>   function.
>> - correct typos and errors in comments/commit messages.
>>
>> v4:
>> - use ol_flags instead of packet_type to decide which segmentation
>>   function to use.
>> - use MF and offset to check if a packet is IP fragmented, instead of
>>   using DF.
>> - remove ETHER_CRC_LEN from gso segment payload length calculation.
>> - refactor internal header update and other functions.
>> - remove RTE_GSO_IPID_INCREASE.
>> - add some of GSO documents.
>> - set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
>>   packets sent from GSO-enabled ports in testpmd.
>> v3:
>> - support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
>>   RTE_PTYPE_(INNER_)L3_IPV4_EXT and RTE_PTYPE_(INNER_)L3_IPV4_EXT_
>>   UNKNOWN.
>> - fill mbuf->packet_type instead of using rte_net_get_ptype() in
>>   csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
>> - store the input packet into pkts_out inside gso_tcp4_segment() and
>>   gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
>>   is performed.
>> - add missing incldues.
>> - optimize file names, function names and function description.
>> - fix one bug in testpmd.
>> v2:
>> - merge data segments whose data_len is less than mss into a large data
>>   segment in gso_do_segment().
>> - use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
>>   header in rte_gso_segment().
>> - provide IP id macros for applications to select fixed or incremental IP
>>   ids.
>>
>> Jiayu Hu (3):
>>   gso: add Generic Segmentation Offload API framework
>>   gso: add TCP/IPv4 GSO support
>>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
>>
>> Mark Kavanagh (3):
>>   gso: add VxLAN GSO support
>>   gso: add GRE GSO support
>>   doc: add GSO programmer's guide
> 
> Hi Mark, Jiayu,
> 
> I was about the get this to next-net, but recognized same problem with
> gro patch. There are uint8_t storage type usage for port_id. Port id is
> now 16bits. And for testpmd you can prefer "portid_t" storage type as well.
> 
> Another thing is this patch and GRO patch conflicts, because both
> touches same parts. Since Jiayu is the author of the gro patch, is it
> possible to define order between these two patches and it doesn't
> conflict while applying. Order doesn't matter as long as dependency
> defined in cover letter of the patch.
> 
> Can you please send a new version with above two items addressed?

btw, please keep Konstantin's and others' Acks in the next version of the
patches.

> 
> Thanks,
> ferruh
> 


* [PATCH v10 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-05 20:36               ` [PATCH v9 " Mark Kavanagh
  2017-10-05 22:24                 ` Ananyev, Konstantin
  2017-10-06 23:32                 ` Ferruh Yigit
@ 2017-10-07 14:56                 ` Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 1/6] gso: add Generic Segmentation Offload API framework Jiayu Hu
                                     ` (6 more replies)
  2 siblings, 7 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-10-07 14:56 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, ferruh.yigit, konstantin.ananyev, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. This patch set adds GSO support to DPDK for specific
packet types: TCP/IPv4, VxLAN, and GRE.

The first patch introduces the GSO API framework. The second patch
adds GSO support for TCP/IPv4 packets (containing an optional VLAN
tag). The third patch adds GSO support for VxLAN packets that contain
outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
outer VLAN tags). The fourth patch adds GSO support for GRE packets
that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
and GRE GSO in testpmd's checksum forwarding engine. The final patch
in the series adds GSO documentation to the programmer's guide.

Note that this patch set depends on the patch "app/testpmd: enable
the heavyweight mode TCP/IPv4 GRO":
http://dpdk.org/dev/patchwork/patch/29867/

Performance Testing
===================
The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
iperf. Setup for the test is described as follows:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum
   forwarding engine with "retry".
c. Select IP and TCP HW checksum calculation for P0; select TCP HW
   checksum calculation for vhost-user port.
d. Launch a VM with csum and tso offloading enabled.
e. Run iperf-client on virtio-net port in the VM to send TCP packets.
   With csum and tso enabled, the VM can send large TCP/IPv4 packets
   (mss is up to 64KB).
f. P1 is assigned to the Linux kernel, with kernel GRO enabled. Run
   iperf-server on P1.

We conduct three iperf tests:

test-1: enable GSO for P0 in testpmd, and set max GSO segment length
    to 1514B. Run four iperf clients in the VM.
test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1514B. Run
    four iperf clients in the VM.
test-3: disable GSO and TSO in testpmd. Run two iperf clients in the VM.

Throughput of the above three tests:

test-1: 9Gbps
test-2: 9.5Gbps
test-3: 3Mbps

Functional Testing
==================
Unlike TCP packets, VMs can't send large VxLAN or GRE packets: the max
length of tunneled packets from VMs is 1514B. The current experiment
method therefore can't be used to measure VxLAN and GRE GSO performance; it
can only test functionality, by setting a small GSO segment length (e.g. 500B).

VxLAN
-----
To test VxLAN GSO functionality, we use the following setup:

a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
   machine, together physically.
b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
   engine with "retry".
c. Testpmd commands:
    - csum parse_tunnel on "P0"
    - csum parse_tunnel on "vhost-user port"
    - csum set outer-ip hw "P0"
    - csum set ip hw "P0"
    - csum set tcp hw "P0"
    - csum set tcp hw "vhost-user port"
    - set port "P0" gso on
    - set gso segsz 500
d. Launch a VM with csum and tso offloading enabled.
e. Create a VxLAN port for the virtio-net port in the VM. Run iperf-client
   on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
   max packet length is 1514B.
f. P1 is assigned to the Linux kernel, with kernel GRO disabled. Similarly,
   create a VxLAN port for P1, and run iperf-server on the VxLAN port.

In testpmd, we can see that the length of every packet sent from P0 is smaller
than or equal to 500B. Additionally, the packets arriving at P1 are
encapsulated and are smaller than or equal to 500B.
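
As a rough cross-check of what to expect on the wire, the arithmetic can be
sketched as follows. This is an illustration only, not the library code: the
function names are hypothetical, and the header sizes assume no VLAN tags and
no IPv4 or TCP options (real iperf traffic often carries TCP options, which
would make the headers larger). It assumes the GSO segment size bounds the
whole output packet, headers included, as the 500B observation above suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed header sizes in bytes (no VLAN tags, no IP/TCP options). */
enum {
	ETH_HDR = 14,
	IPV4_HDR = 20,
	UDP_HDR = 8,
	VXLAN_HDR = 8,
	TCP_HDR = 20,
	/* outer Ether/IPv4/UDP/VxLAN + inner Ether/IPv4/TCP = 104 */
	VXLAN_TCP4_HDRS = ETH_HDR + IPV4_HDR + UDP_HDR + VXLAN_HDR +
			ETH_HDR + IPV4_HDR + TCP_HDR
};

/* Number of output packets when each one, headers included, must be
 * no longer than segsz. Returns 0 if segsz leaves no room for payload. */
uint32_t gso_nb_segments(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
	if (pkt_len <= segsz)
		return 1; /* already small enough, nothing to split */
	if (segsz <= hdr_len)
		return 0;

	uint32_t payload = pkt_len - hdr_len;
	uint32_t per_seg = segsz - hdr_len;

	return (payload + per_seg - 1) / per_seg; /* round up */
}

/* Length of the last output packet, headers included. */
uint32_t gso_last_segment_len(uint32_t pkt_len, uint32_t hdr_len,
		uint32_t segsz)
{
	uint32_t nb = gso_nb_segments(pkt_len, hdr_len, segsz);

	if (nb == 0)
		return 0;
	if (nb == 1)
		return pkt_len;
	return hdr_len + (pkt_len - hdr_len) - (nb - 1) * (segsz - hdr_len);
}
```

Under these assumptions, a 1514B VxLAN packet with a 500B segment size carries
1410B of payload at 396B per segment, giving three 500B packets and a final
326B packet.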

GRE
---
The same process may be used to test GRE functionality, with the exception that
the tunnel type created for both the guest's virtio-net, and the host's kernel
interfaces is GRE:
   `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`

As in the VxLAN testcase, the length of packets sent from P0, and received on
P1, is less than or equal to 500B.

Change log
==========
v10:
- fix portid type conflict (uint8_t -> uint16_t) in testpmd.
- correct the RTE_GSO_FLAG_IPID_FIXED description: use 0 rather than 
  !RTE_GSO_FLAG_IPID_FIXED to indicate using incremental IP ids.
- rebase the GSO code onto the patch "app/testpmd: enable the heavyweight
  mode TCP/IPv4 GRO" to fix the conflict issue in testpmd.

v9:
- fix testpmd build for i686 target
- change log level from WARNING to DEBUG in the case of unsupported packet
  (rte_gso_segment())

v8:
- resolve coding style infractions (indentation).
- centralize invalid parameter checking for rte_gso_segment() into a single
  'if' statement.
- don't clear PKT_TX_TCP_SEG flag for packets that don't qualify for GSO
  on account of invalid params.
- allow GSO for tunneled packets only via gso_ctx (by correcting 'if'
  statement condition).

v7:
- add RTE_GSO_SEG_SIZE_MIN macro; use this to validate gso_ctx.gso_segsz.
- rename 'ipid_flag' member of gso_ctx to 'flag'.
- remove mention of VLAN tags in supported packet types.
- don't clear PKT_TX_TCP_SEG flag if GSO fails.
- take all packet overhead into account when checking for empty packet.
- ensure that only enabled GSO types are enacted upon (i.e. no fall-through to
  TCP/IPv4 case from tunneled case).
- validate user-supplied gso segsz arg against RTE_GSO_SEG_SIZE_MIN in testpmd.
- simplify error-checking/handling for GSO failure case in testpmd csum engine.
- use 0 instead of !RTE_GSO_IPID_FIXED in testpmd.

v6:
- rebase to HEAD of master (i5dce9fcA)
- remove 'l3_offset' parameter from 'update_ipv4_tcp_headers'

v5:
- add GSO section to the programmer's guide.
- use MF or (previously 'and') offset to check if a packet is IP
  fragmented.
- move 'update_header' helper functions to gso_common.h.
- move tcp/ipv4 'update_header' function to gso_tcp4.c.
- move tunnel 'update_header' function to gso_tunnel_tcp4.c.
- add offset parameter to 'update_header' functions.
- combine GRE and VxLAN tunnel header update functions into a single
  function.
- correct typos and errors in comments/commit messages.

v4:
- use ol_flags instead of packet_type to decide which segmentation
  function to use.
- use MF and offset to check if a packet is IP fragmented, instead of
  using DF.
- remove ETHER_CRC_LEN from gso segment payload length calculation.
- refactor internal header update and other functions.
- remove RTE_GSO_IPID_INCREASE.
- add some GSO documentation.
- set the default GSO length to 1514 and fill PKT_TX_TCP_SEG for the
  packets sent from GSO-enabled ports in testpmd.

v3:
- support all IPv4 header flags, including RTE_PTYPE_(INNER_)L3_IPV4,
  RTE_PTYPE_(INNER_)L3_IPV4_EXT and
  RTE_PTYPE_(INNER_)L3_IPV4_EXT_UNKNOWN.
- fill mbuf->packet_type instead of using rte_net_get_ptype() in
  csumonly.c, since rte_net_get_ptype() doesn't support vxlan.
- store the input packet into pkts_out inside gso_tcp4_segment() and
  gso_tunnel_tcp4_segment() instead of rte_gso_segment(), when no GSO
  is performed.
- add missing includes.
- optimize file names, function names and function description.
- fix one bug in testpmd.

v2:
- merge data segments whose data_len is less than mss into a large data
  segment in gso_do_segment().
- use mbuf->packet_type/l2_len/l3_len etc. instead of parsing the packet
  header in rte_gso_segment().
- provide IP id macros for applications to select fixed or incremental IP
  ids.
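
For readers following the v4 and v5 entries on fragment detection, the check
can be sketched as below. This is an illustrative stand-alone version, not the
code from the patches: the macro and function names are hypothetical, and the
fragment field is assumed to have been converted to host byte order already.
The first fragment has MF set with offset 0, and the last fragment has MF
clear with a non-zero offset, so the two conditions must be combined with
"or". The DF bit says nothing about whether a packet is a fragment.

```c
#include <assert.h>
#include <stdint.h>

/* IPv4 flags/fragment-offset field layout (host byte order). */
#define TOY_IPV4_DF_FLAG     0x4000u /* don't fragment */
#define TOY_IPV4_MF_FLAG     0x2000u /* more fragments */
#define TOY_IPV4_OFFSET_MASK 0x1fffu /* fragment offset, 8-byte units */

/* Returns non-zero if the packet is any fragment of a larger datagram:
 * either more fragments follow, or this one starts at a non-zero offset. */
int toy_is_ipv4_fragment(uint16_t frag_off_host)
{
	return (frag_off_host &
		(TOY_IPV4_MF_FLAG | TOY_IPV4_OFFSET_MASK)) != 0;
}
```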

Jiayu Hu (3):
  gso: add Generic Segmentation Offload API framework
  gso: add TCP/IPv4 GSO support
  app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO

Mark Kavanagh (3):
  gso: add VxLAN GSO support
  gso: add GRE GSO support
  doc: add GSO programmer's guide

 MAINTAINERS                                        |   6 +
 app/test-pmd/cmdline.c                             | 180 ++++++++
 app/test-pmd/config.c                              |  24 ++
 app/test-pmd/csumonly.c                            |  43 +-
 app/test-pmd/testpmd.c                             |  13 +
 app/test-pmd/testpmd.h                             |  10 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/rel_notes/release_17_11.rst             |  17 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  46 ++
 lib/Makefile                                       |   2 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_gso/Makefile                            |  52 +++
 lib/librte_gso/gso_common.c                        | 153 +++++++
 lib/librte_gso/gso_common.h                        | 171 ++++++++
 lib/librte_gso/gso_tcp4.c                          | 102 +++++
 lib/librte_gso/gso_tcp4.h                          |  74 ++++
 lib/librte_gso/gso_tunnel_tcp4.c                   | 126 ++++++
 lib/librte_gso/gso_tunnel_tcp4.h                   |  75 ++++
 lib/librte_gso/rte_gso.c                           | 110 +++++
 lib/librte_gso/rte_gso.h                           | 148 +++++++
 lib/librte_gso/rte_gso_version.map                 |   7 +
 mk/rte.app.mk                                      |   1 +
 28 files changed, 2411 insertions(+), 4 deletions(-)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

-- 
2.7.4


* [PATCH v10 1/6] gso: add Generic Segmentation Offload API framework
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
@ 2017-10-07 14:56                   ` Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 2/6] gso: add TCP/IPv4 GSO support Jiayu Hu
                                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-10-07 14:56 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, ferruh.yigit, konstantin.ananyev, Jiayu Hu

Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To give applications more flexibility, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set the proper flags in mbuf->ol_flags; these are the same flags
as those used for TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment consists of multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface to which the GSO segments are sent must
support transmitting multi-segment packets.

The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.
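
The reference-count behaviour described above can be pictured with a toy
model. This is a simplified sketch, not the mbuf API: all type and function
names are hypothetical, and it assumes each GSO segment's indirect MBUF holds
exactly one reference on the input packet.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the input packet: a reference count and a flag
 * recording whether it has been returned to its pool. */
struct toy_pkt {
	uint16_t refcnt;
	int freed;
};

/* An indirect buffer pointing into pkt's data takes one reference. */
void toy_attach(struct toy_pkt *pkt)
{
	pkt->refcnt++;
}

/* Drop one reference; the packet is freed when none remain. */
void toy_release(struct toy_pkt *pkt)
{
	assert(pkt->refcnt > 0);
	if (--pkt->refcnt == 0)
		pkt->freed = 1;
}

/* Model of a successful segmentation: every output segment attaches
 * to the input packet, then the caller's own reference is dropped. */
void toy_gso_segment(struct toy_pkt *in, unsigned int nb_segs)
{
	unsigned int i;

	for (i = 0; i < nb_segs; i++)
		toy_attach(in);
	toy_release(in);
}

/* Freeing one GSO segment releases the reference held by its
 * indirect part. */
void toy_free_segment(struct toy_pkt *in)
{
	toy_release(in);
}
```

In this model, an input packet that starts with refcnt 1 and is split into
three segments ends up with refcnt 3, and it is freed only when the third
segment is released.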

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 config/common_base                     |   5 ++
 doc/api/doxy-api-index.md              |   1 +
 doc/api/doxy-api.conf                  |   1 +
 doc/guides/rel_notes/release_17_11.rst |   1 +
 lib/Makefile                           |   2 +
 lib/librte_gso/Makefile                |  49 +++++++++++
 lib/librte_gso/rte_gso.c               |  52 ++++++++++++
 lib/librte_gso/rte_gso.h               | 143 +++++++++++++++++++++++++++++++++
 lib/librte_gso/rte_gso_version.map     |   7 ++
 mk/rte.app.mk                          |   1 +
 10 files changed, 262 insertions(+)
 create mode 100644 lib/librte_gso/Makefile
 create mode 100644 lib/librte_gso/rte_gso.c
 create mode 100644 lib/librte_gso/rte_gso.h
 create mode 100644 lib/librte_gso/rte_gso_version.map

diff --git a/config/common_base b/config/common_base
index ca47615..65c5e75 100644
--- a/config/common_base
+++ b/config/common_base
@@ -655,6 +655,11 @@ CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
 CONFIG_RTE_LIBRTE_GRO=y
 
 #
+# Compile GSO library
+#
+CONFIG_RTE_LIBRTE_GSO=y
+
+#
 # Compile librte_meter
 #
 CONFIG_RTE_LIBRTE_METER=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..6512918 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -101,6 +101,7 @@ The public API headers are grouped by topics:
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
   [GRO]                (@ref rte_gro.h),
+  [GSO]                (@ref rte_gso.h),
   [frag/reass]         (@ref rte_ip_frag.h),
   [LPM IPv4 route]     (@ref rte_lpm.h),
   [LPM IPv6 route]     (@ref rte_lpm6.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..408f2e6 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -47,6 +47,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_ether \
                           lib/librte_eventdev \
                           lib/librte_gro \
+                          lib/librte_gso \
                           lib/librte_hash \
                           lib/librte_ip_frag \
                           lib/librte_jobstats \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 4f92912..d75fec2 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -208,6 +208,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_ethdev.so.8
      librte_eventdev.so.2
      librte_gro.so.1
+     librte_gso.so.1
      librte_hash.so.2
      librte_ip_frag.so.1
      librte_jobstats.so.1
diff --git a/lib/Makefile b/lib/Makefile
index 86caba1..3d123f4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -108,6 +108,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
new file mode 100644
index 0000000..aeaacbc
--- /dev/null
+++ b/lib/librte_gso/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gso.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+
+EXPORT_MAP := rte_gso_version.map
+
+LIBABIVER := 1
+
+#source files
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
new file mode 100644
index 0000000..b773636
--- /dev/null
+++ b/lib/librte_gso/rte_gso.c
@@ -0,0 +1,52 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <errno.h>
+
+#include "rte_gso.h"
+
+int
+rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *gso_ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
+			nb_pkts_out < 1)
+		return -EINVAL;
+
+	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+	pkts_out[0] = pkt;
+
+	return 1;
+}
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
new file mode 100644
index 0000000..9d3b4fc
--- /dev/null
+++ b/lib/librte_gso/rte_gso.h
@@ -0,0 +1,143 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GSO_H_
+#define _RTE_GSO_H_
+
+/**
+ * @file
+ * Interface to GSO library
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/* GSO flags for rte_gso_ctx. */
+#define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
+/**< Use fixed IP ids for output GSO segments. Setting
+ * 0 indicates using incremental IP ids.
+ */
+
+/**
+ * GSO context structure.
+ */
+struct rte_gso_ctx {
+	struct rte_mempool *direct_pool;
+	/**< MBUF pool for allocating direct buffers, which are used
+	 * to store packet headers for GSO segments.
+	 */
+	struct rte_mempool *indirect_pool;
+	/**< MBUF pool for allocating indirect buffers, which are used
+	 * to locate packet payloads for GSO segments. The indirect
+	 * buffer doesn't contain any data, but simply points to an
+	 * offset within the packet to segment.
+	 */
+	uint64_t flag;
+	/**< flag that controls specific attributes of output segments,
+	 * such as the type of IP ID generated (i.e. fixed or incremental).
+	 */
+	uint32_t gso_types;
+	/**< the bit mask of required GSO types. The GSO library
+	 * uses the same macros as those used to describe device TX
+	 * offloading capabilities (i.e. DEV_TX_OFFLOAD_*_TSO) for
+	 * gso_types.
+	 *
+	 * For example, if applications want to segment TCP/IPv4
+	 * packets, set DEV_TX_OFFLOAD_TCP_TSO in gso_types.
+	 */
+	uint16_t gso_size;
+	/**< maximum size of an output GSO segment, including packet
+	 * header and payload, measured in bytes.
+	 */
+};
+
+/**
+ * Segmentation function, which supports processing of both single- and
+ * multi- MBUF packets.
+ *
+ * Note that we refer to the packets that are segmented from the input
+ * packet as 'GSO segments'. rte_gso_segment() doesn't check if the
+ * input packet has correct checksums, and doesn't update checksums for
+ * output GSO segments. Additionally, it doesn't process IP fragment
+ * packets.
+ *
+ * Before calling rte_gso_segment(), applications must set proper ol_flags
+ * for the packet. The GSO library uses the same macros as TSO.
+ * For example, set PKT_TX_TCP_SEG and PKT_TX_IPV4 in ol_flags to segment
+ * a TCP/IPv4 packet. If rte_gso_segment() succeeds, the PKT_TX_TCP_SEG
+ * flag is removed for all GSO segments and the input packet.
+ *
+ * Each of the newly-created GSO segments is organized as a two-segment
+ * MBUF, where the first segment is a standard MBUF, which stores a copy
+ * of packet header, and the second is an indirect MBUF which points to
+ * a section of data in the input packet. Since each GSO segment has
+ * multiple MBUFs (i.e. typically 2 MBUFs), the driver of the interface which
+ * the GSO segments are sent to should support transmission of multi-segment
+ * packets.
+ *
+ * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
+ * when all GSO segments are freed, the input packet is freed automatically.
+ *
+ * If the memory space in pkts_out or MBUF pools is insufficient, this
+ * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
+ * and this function returns the number of output GSO segments filled in
+ * pkts_out.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param ctx
+ *  GSO context object pointer.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when rte_gso_segment() succeeds.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of GSO segments filled in pkts_out on success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int rte_gso_segment(struct rte_mbuf *pkt,
+		const struct rte_gso_ctx *ctx,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GSO_H_ */
diff --git a/lib/librte_gso/rte_gso_version.map b/lib/librte_gso/rte_gso_version.map
new file mode 100644
index 0000000..e1fd453
--- /dev/null
+++ b/lib/librte_gso/rte_gso_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+	global:
+
+	rte_gso_segment;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 29507dc..6df402c 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,6 +66,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
-- 
2.7.4


* [PATCH v10 2/6] gso: add TCP/IPv4 GSO support
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 1/6] gso: add Generic Segmentation Offload API framework Jiayu Hu
@ 2017-10-07 14:56                   ` Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 3/6] gso: add VxLAN " Jiayu Hu
                                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-10-07 14:56 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, ferruh.yigit, konstantin.ananyev, Jiayu Hu

This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect MBUF simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.
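
As a rough illustration of the split described above, the sketch below
plans the output segments for one packet: each segment carries a copy of
the header plus at most (gso_size - header length) bytes of payload, the
TCP sequence number advances by the payload carried, and the IP ID
advances by ipid_delta. The struct seg_plan type and plan_segments()
function are hypothetical names used only for this example and are not
part of the patch; the code uses no DPDK types and returns -1 on error
rather than the -EINVAL/-ENOMEM codes used by the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one output segment (not a DPDK type). */
struct seg_plan {
	uint16_t payload_len; /* payload bytes carried by this segment */
	uint32_t tcp_seq;     /* TCP sequence number of this segment */
	uint16_t ip_id;       /* IPv4 ID of this segment */
};

/*
 * Plan the segmentation of a packet of pkt_len bytes whose headers
 * occupy the first hdr_len bytes. Returns the number of entries written
 * to out, or -1 if out is too small or the sizes are inconsistent.
 */
int
plan_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size,
		uint32_t first_seq, uint16_t first_id, uint8_t ipid_delta,
		struct seg_plan *out, size_t out_cap)
{
	uint32_t remaining;
	uint16_t unit;
	size_t n = 0;

	if (out == NULL || gso_size <= hdr_len || pkt_len <= hdr_len)
		return -1;

	/* Max payload per segment: segment size minus the copied header. */
	unit = (uint16_t)(gso_size - hdr_len);
	remaining = pkt_len - hdr_len;

	while (remaining > 0) {
		uint16_t len = remaining < unit ? (uint16_t)remaining : unit;

		if (n == out_cap)
			return -1; /* not enough room in the output array */

		out[n].payload_len = len;
		out[n].tcp_seq = first_seq;
		out[n].ip_id = first_id;

		first_seq += len;
		first_id = (uint16_t)(first_id + ipid_delta);
		remaining -= len;
		n++;
	}
	return (int)n;
}
```

For instance, a 3054-byte packet with 54 bytes of headers and a gso_size
of 1514 yields three segments carrying 1460, 1460 and 80 payload bytes.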

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSO'd segments are freed, the packet is freed
automatically.
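
The reference-counting argument above can be modelled with a toy
counter. The sketch assumes a single-MBUF input packet that starts with
a refcnt of 1, and that each output segment holds exactly one indirect
MBUF attached to it; struct toy_buf and the toy_*() helpers are
hypothetical stand-ins, not DPDK APIs.

```c
#include <stdint.h>

/* Toy stand-in for an mbuf's reference counter; not the DPDK struct. */
struct toy_buf {
	uint16_t refcnt;
	int freed;
};

/* Attaching an indirect buffer takes one reference on the data owner. */
void
toy_attach(struct toy_buf *owner)
{
	owner->refcnt++;
}

/* Dropping a reference frees the owner once the count reaches zero. */
void
toy_release(struct toy_buf *owner)
{
	if (owner->refcnt > 0 && --owner->refcnt == 0)
		owner->freed = 1;
}

/*
 * Model the life cycle: attach nb_segs indirect buffers, drop the
 * original reference (the "refcnt reduced by 1" step), then free
 * nb_freed of the segments. Returns 1 if the input ended up freed.
 */
int
toy_gso_lifecycle(unsigned int nb_segs, unsigned int nb_freed)
{
	struct toy_buf in = { .refcnt = 1, .freed = 0 };
	unsigned int i;

	for (i = 0; i < nb_segs; i++)
		toy_attach(&in);

	toy_release(&in);

	for (i = 0; i < nb_freed && i < nb_segs; i++)
		toy_release(&in);

	return in.freed;
}
```

In this model the input packet is released only when the last of its
segments is freed; while any segment is still in flight, its payload
stays valid.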

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst  |  12 +++
 lib/Makefile                            |   2 +-
 lib/librte_eal/common/include/rte_log.h |   1 +
 lib/librte_gso/Makefile                 |   2 +
 lib/librte_gso/gso_common.c             | 153 ++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_common.h             | 141 +++++++++++++++++++++++++++++
 lib/librte_gso/gso_tcp4.c               | 102 +++++++++++++++++++++
 lib/librte_gso/gso_tcp4.h               |  74 +++++++++++++++
 lib/librte_gso/rte_gso.c                |  53 ++++++++++-
 lib/librte_gso/rte_gso.h                |   7 +-
 10 files changed, 541 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_gso/gso_common.c
 create mode 100644 lib/librte_gso/gso_common.h
 create mode 100644 lib/librte_gso/gso_tcp4.c
 create mode 100644 lib/librte_gso/gso_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index d75fec2..8f4a1e0 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -64,6 +64,18 @@ New Features
    * Support for Flow API
    * Support for Tx and Rx descriptor status functions
 
+* **Added the Generic Segmentation Offload Library.**
+
+  Added the Generic Segmentation Offload (GSO) library to enable
+  applications to split large packets (e.g. MTU is 64KB) into small
+  ones (e.g. MTU is 1500B). Supported packet types are:
+
+  * TCP/IPv4 packets.
+
+  The GSO library doesn't check if the input packets have correct
+  checksums, and doesn't update checksums for output packets.
+  Additionally, the GSO library doesn't process IP fragmented packets.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/Makefile b/lib/Makefile
index 3d123f4..5ecd1b3 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -109,7 +109,7 @@ DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
 DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
-DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
+DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net librte_mempool
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index ec8dba7..2fa1199 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -87,6 +87,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */
 #define RTE_LOGTYPE_EFD       18 /**< Log related to EFD. */
 #define RTE_LOGTYPE_EVENTDEV  19 /**< Log related to eventdev. */
+#define RTE_LOGTYPE_GSO       20 /**< Log related to GSO. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1     24 /**< User-defined log type 1. */
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index aeaacbc..2be64d1 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 1
 
 #source files
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.c b/lib/librte_gso/gso_common.c
new file mode 100644
index 0000000..ee75d4c
--- /dev/null
+++ b/lib/librte_gso/gso_common.c
@@ -0,0 +1,153 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+
+#include "gso_common.h"
+
+static inline void
+hdr_segment_init(struct rte_mbuf *hdr_segment, struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset)
+{
+	/* Copy MBUF metadata */
+	hdr_segment->nb_segs = 1;
+	hdr_segment->port = pkt->port;
+	hdr_segment->ol_flags = pkt->ol_flags;
+	hdr_segment->packet_type = pkt->packet_type;
+	hdr_segment->pkt_len = pkt_hdr_offset;
+	hdr_segment->data_len = pkt_hdr_offset;
+	hdr_segment->tx_offload = pkt->tx_offload;
+
+	/* Copy the packet header */
+	rte_memcpy(rte_pktmbuf_mtod(hdr_segment, char *),
+			rte_pktmbuf_mtod(pkt, char *),
+			pkt_hdr_offset);
+}
+
+static inline void
+free_gso_segment(struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	uint16_t i;
+
+	for (i = 0; i < nb_pkts; i++)
+		rte_pktmbuf_free(pkts[i]);
+}
+
+int
+gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct rte_mbuf *pkt_in;
+	struct rte_mbuf *hdr_segment, *pyld_segment, *prev_segment;
+	uint16_t pkt_in_data_pos, segment_bytes_remaining;
+	uint16_t pyld_len, nb_segs;
+	bool more_in_pkt, more_out_segs;
+
+	pkt_in = pkt;
+	nb_segs = 0;
+	more_in_pkt = 1;
+	pkt_in_data_pos = pkt_hdr_offset;
+
+	while (more_in_pkt) {
+		if (unlikely(nb_segs >= nb_pkts_out)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -EINVAL;
+		}
+
+		/* Allocate a direct MBUF */
+		hdr_segment = rte_pktmbuf_alloc(direct_pool);
+		if (unlikely(hdr_segment == NULL)) {
+			free_gso_segment(pkts_out, nb_segs);
+			return -ENOMEM;
+		}
+		/* Fill the packet header */
+		hdr_segment_init(hdr_segment, pkt, pkt_hdr_offset);
+
+		prev_segment = hdr_segment;
+		segment_bytes_remaining = pyld_unit_size;
+		more_out_segs = 1;
+
+		while (more_out_segs && more_in_pkt) {
+			/* Allocate an indirect MBUF */
+			pyld_segment = rte_pktmbuf_alloc(indirect_pool);
+			if (unlikely(pyld_segment == NULL)) {
+				rte_pktmbuf_free(hdr_segment);
+				free_gso_segment(pkts_out, nb_segs);
+				return -ENOMEM;
+			}
+			/* Attach to current MBUF segment of pkt */
+			rte_pktmbuf_attach(pyld_segment, pkt_in);
+
+			prev_segment->next = pyld_segment;
+			prev_segment = pyld_segment;
+
+			pyld_len = segment_bytes_remaining;
+			if (pyld_len + pkt_in_data_pos > pkt_in->data_len)
+				pyld_len = pkt_in->data_len - pkt_in_data_pos;
+
+			pyld_segment->data_off = pkt_in_data_pos +
+				pkt_in->data_off;
+			pyld_segment->data_len = pyld_len;
+
+			/* Update header segment */
+			hdr_segment->pkt_len += pyld_len;
+			hdr_segment->nb_segs++;
+
+			pkt_in_data_pos += pyld_len;
+			segment_bytes_remaining -= pyld_len;
+
+			/* Finish processing a MBUF segment of pkt */
+			if (pkt_in_data_pos == pkt_in->data_len) {
+				pkt_in = pkt_in->next;
+				pkt_in_data_pos = 0;
+				if (pkt_in == NULL)
+					more_in_pkt = 0;
+			}
+
+			/* Finish generating a GSO segment */
+			if (segment_bytes_remaining == 0)
+				more_out_segs = 0;
+		}
+		pkts_out[nb_segs++] = hdr_segment;
+	}
+	return nb_segs;
+}
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
new file mode 100644
index 0000000..a8ad638
--- /dev/null
+++ b/lib/librte_gso/gso_common.h
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_COMMON_H_
+#define _GSO_COMMON_H_
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
+		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
+
+#define TCP_HDR_PSH_MASK ((uint8_t)0x08)
+#define TCP_HDR_FIN_MASK ((uint8_t)0x01)
+
+#define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
+
+/**
+ * Internal function which updates the TCP header of a packet, following
+ * segmentation. This is required to update the header's 'sent' sequence
+ * number, and also to clear 'PSH' and 'FIN' flags for non-tail segments.
+ *
+ * @param pkt
+ *  The packet containing the TCP header.
+ * @param l4_offset
+ *  The offset of the TCP header from the start of the packet.
+ * @param sent_seq
+ *  The sent sequence number.
+ * @param non_tail
+ *  Non-zero if this is not the tail segment.
+ */
+static inline void
+update_tcp_header(struct rte_mbuf *pkt, uint16_t l4_offset, uint32_t sent_seq,
+		uint8_t non_tail)
+{
+	struct tcp_hdr *tcp_hdr;
+
+	tcp_hdr = (struct tcp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l4_offset);
+	tcp_hdr->sent_seq = rte_cpu_to_be_32(sent_seq);
+	if (likely(non_tail))
+		tcp_hdr->tcp_flags &= (~(TCP_HDR_PSH_MASK |
+					TCP_HDR_FIN_MASK));
+}
+
+/**
+ * Internal function which updates the IPv4 header of a packet, following
+ * segmentation. This is required to update the header's 'total_length' field,
+ * to reflect the reduced length of the now-segmented packet. Furthermore, the
+ * header's 'packet_id' field must be updated to reflect the new ID of the
+ * now-segmented packet.
+ *
+ * @param pkt
+ *  The packet containing the IPv4 header.
+ * @param l3_offset
+ *  The offset of the IPv4 header from the start of the packet.
+ * @param id
+ *  The new ID of the packet.
+ */
+static inline void
+update_ipv4_header(struct rte_mbuf *pkt, uint16_t l3_offset, uint16_t id)
+{
+	struct ipv4_hdr *ipv4_hdr;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			l3_offset);
+	ipv4_hdr->total_length = rte_cpu_to_be_16(pkt->pkt_len - l3_offset);
+	ipv4_hdr->packet_id = rte_cpu_to_be_16(id);
+}
+
+/**
+ * Internal function which divides the input packet into small segments.
+ * Each of the newly-created segments is organized as a two-segment MBUF,
+ * where the first segment is a standard mbuf, which stores a copy of
+ * packet header, and the second is an indirect mbuf which points to a
+ * section of data in the input packet.
+ *
+ * @param pkt
+ *  Packet to segment.
+ * @param pkt_hdr_offset
+ *  Packet header offset, measured in bytes.
+ * @param pyld_unit_size
+ *  The max payload length of a GSO segment.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to keep the mbuf addresses of output segments. If
+ *  the memory space in pkts_out is insufficient, gso_do_segment() fails
+ *  and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that pkts_out can keep.
+ *
+ * @return
+ *  - The number of segments created in the event of success.
+ *  - Return -ENOMEM if run out of memory in MBUF pools.
+ *  - Return -EINVAL for invalid parameters.
+ */
+int gso_do_segment(struct rte_mbuf *pkt,
+		uint16_t pkt_hdr_offset,
+		uint16_t pyld_unit_size,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
new file mode 100644
index 0000000..0c628cb
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.c
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tcp4.h"
+
+static void
+update_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t id, tail_idx, i;
+	uint16_t l3_offset = pkt->l2_len;
+	uint16_t l4_offset = l3_offset + pkt->l3_len;
+
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char*) +
+			l3_offset);
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], l3_offset, id);
+		update_tcp_header(segs[i], l4_offset, sent_seq, i < tail_idx);
+		id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset;
+	uint16_t frag_off;
+	int ret;
+
+	/* Don't process the fragmented packet */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			pkt->l2_len);
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	/* Don't process the packet without data */
+	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
+	if (unlikely(hdr_offset >= pkt->pkt_len)) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret > 1)
+		update_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tcp4.h b/lib/librte_gso/gso_tcp4.h
new file mode 100644
index 0000000..1c57441
--- /dev/null
+++ b/lib/librte_gso/gso_tcp4.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TCP4_H_
+#define _GSO_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment an IPv4/TCP packet. This function doesn't check if the input
+ * packet has correct checksums, and doesn't update checksums for output
+ * GSO segments. Furthermore, it doesn't process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increasing unit of IP ids.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when the function succeeds. If the memory space in
+ *  pkts_out is insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if run out of memory in MBUF pools.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index b773636..822693f 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -33,7 +33,12 @@
 
 #include <errno.h>
 
+#include <rte_log.h>
+#include <rte_ethdev.h>
+
 #include "rte_gso.h"
+#include "gso_common.h"
+#include "gso_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -41,12 +46,52 @@ rte_gso_segment(struct rte_mbuf *pkt,
 		struct rte_mbuf **pkts_out,
 		uint16_t nb_pkts_out)
 {
+	struct rte_mempool *direct_pool, *indirect_pool;
+	struct rte_mbuf *pkt_seg;
+	uint64_t ol_flags;
+	uint16_t gso_size;
+	uint8_t ipid_delta;
+	int ret = 1;
+
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
-			nb_pkts_out < 1)
+			nb_pkts_out < 1 ||
+			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
+			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
 		return -EINVAL;
 
-	pkt->ol_flags &= (~PKT_TX_TCP_SEG);
-	pkts_out[0] = pkt;
+	if (gso_ctx->gso_size >= pkt->pkt_len) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	direct_pool = gso_ctx->direct_pool;
+	indirect_pool = gso_ctx->indirect_pool;
+	gso_size = gso_ctx->gso_size;
+	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
+	ol_flags = pkt->ol_flags;
+
+	if (IS_IPV4_TCP(pkt->ol_flags)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else {
+		pkts_out[0] = pkt;
+		RTE_LOG(DEBUG, GSO, "Unsupported packet type\n");
+		return 1;
+	}
+
+	if (ret > 1) {
+		pkt_seg = pkt;
+		while (pkt_seg) {
+			rte_mbuf_refcnt_update(pkt_seg, -1);
+			pkt_seg = pkt_seg->next;
+		}
+	} else if (ret < 0) {
+		/* Revert the ol_flags in the event of failure. */
+		pkt->ol_flags = ol_flags;
+	}
 
-	return 1;
+	return ret;
 }
diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
index 9d3b4fc..4b77176 100644
--- a/lib/librte_gso/rte_gso.h
+++ b/lib/librte_gso/rte_gso.h
@@ -46,6 +46,10 @@ extern "C" {
 #include <stdint.h>
 #include <rte_mbuf.h>
 
+/* Minimum GSO segment size. */
+#define RTE_GSO_SEG_SIZE_MIN (sizeof(struct ether_hdr) + \
+		sizeof(struct ipv4_hdr) + sizeof(struct tcp_hdr) + 1)
+
 /* GSO flags for rte_gso_ctx. */
 #define RTE_GSO_FLAG_IPID_FIXED (1ULL << 0)
 /**< Use fixed IP ids for output GSO segments. Setting
@@ -81,7 +85,8 @@ struct rte_gso_ctx {
 	 */
 	uint16_t gso_size;
 	/**< maximum size of an output GSO segment, including packet
-	 * header and payload, measured in bytes.
+	 * header and payload, measured in bytes. Must be at least
+	 * RTE_GSO_SEG_SIZE_MIN.
 	 */
 };
 
-- 
2.7.4


* [PATCH v10 3/6] gso: add VxLAN GSO support
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 1/6] gso: add Generic Segmentation Offload API framework Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 2/6] gso: add TCP/IPv4 GSO support Jiayu Hu
@ 2017-10-07 14:56                   ` Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 4/6] gso: add GRE " Jiayu Hu
                                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-10-07 14:56 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, ferruh.yigit, konstantin.ananyev, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds a framework that allows GSO on tunneled packets.
Furthermore, it leverages that framework to provide GSO support for
VxLAN-encapsulated packets.

Supported VxLAN packets must have an outer IPv4 header (prepended by an
optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
inner VLAN tag).
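
For illustration, the header layout described above can be turned into
byte offsets as in the sketch below. It is not part of the patch: the
two structs and the function name are hypothetical stand-ins, and it
assumes (as update_tunnel_ipv4_tcp_headers() in this patch appears to)
that l2_len of a tunneled packet spans the UDP, VxLAN and inner
Ethernet headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the header-length fields of struct rte_mbuf. */
struct tunnel_lens {
	uint16_t outer_l2_len; /* outer Ethernet, plus VLAN tag if present */
	uint16_t outer_l3_len; /* outer IPv4 */
	uint16_t l2_len;       /* UDP + VxLAN + inner Ethernet (+ VLAN) */
	uint16_t l3_len;       /* inner IPv4 */
	uint16_t l4_len;       /* inner TCP */
};

/* Byte offsets of each header from the start of the packet. */
struct tunnel_offsets {
	uint16_t outer_ipv4;
	uint16_t udp;
	uint16_t inner_ipv4;
	uint16_t tcp;
	uint16_t payload;
};

static struct tunnel_offsets
tunnel_offsets_from_lens(const struct tunnel_lens *l)
{
	struct tunnel_offsets o;

	assert(l != NULL);
	o.outer_ipv4 = l->outer_l2_len;
	o.udp = o.outer_ipv4 + l->outer_l3_len;
	o.inner_ipv4 = o.udp + l->l2_len;
	o.tcp = o.inner_ipv4 + l->l3_len;
	o.payload = o.tcp + l->l4_len;
	return o;
}
```

With no VLAN tags and no IP or TCP options, the lengths would be 14, 20,
30 (8 + 8 + 14), 20 and 20 bytes, which puts the TCP payload at byte 104.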

VxLAN GSO doesn't check if input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSO'd, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
are freed, the packet is freed automatically.
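
The reference counting described above can be pictured with the toy
model below. It is only a sketch: toy_pkt and the toy_* functions are
made up for this illustration and model nothing but the refcnt, under
the assumption that each output packet's indirect buffer takes one
reference on a single-segment input packet.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an mbuf: only the reference count is modelled. */
struct toy_pkt {
	int refcnt;
	int freed;
};

/* An indirect buffer attaching to the packet takes one reference. */
static void
toy_attach(struct toy_pkt *p)
{
	p->refcnt++;
}

/* Drop one reference; the packet is freed when none remain. */
static void
toy_release(struct toy_pkt *p)
{
	assert(p->refcnt > 0);
	if (--p->refcnt == 0)
		p->freed = 1;
}

/*
 * Model segmentation into nb_segs output packets: each output packet's
 * indirect buffer references the input, then the library drops the
 * caller's original reference.
 */
static void
toy_gso(struct toy_pkt *p, int nb_segs)
{
	int i;

	assert(p != NULL && nb_segs > 1);
	for (i = 0; i < nb_segs; i++)
		toy_attach(p);
	toy_release(p);
}
```

In this model the input packet is left with exactly one reference per
output packet, so the caller only frees the output packets and never the
input packet itself.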

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |   2 +
 lib/librte_gso/Makefile                |   1 +
 lib/librte_gso/gso_common.h            |  25 +++++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 120 +++++++++++++++++++++++++++++++++
 lib/librte_gso/gso_tunnel_tcp4.h       |  75 +++++++++++++++++++++
 lib/librte_gso/rte_gso.c               |  14 +++-
 6 files changed, 235 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.c
 create mode 100644 lib/librte_gso/gso_tunnel_tcp4.h

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 8f4a1e0..4c17207 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -71,6 +71,8 @@ New Features
   ones (e.g. MTU is 1500B). Supported packet types are:
 
   * TCP/IPv4 packets.
+  * VxLAN packets, which must have an outer IPv4 header, and contain
+    an inner TCP/IPv4 packet.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
index 2be64d1..e6d41df 100644
--- a/lib/librte_gso/Makefile
+++ b/lib/librte_gso/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
+SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index a8ad638..95d54e7 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -39,6 +39,7 @@
 #include <rte_mbuf.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
+#include <rte_udp.h>
 
 #define IS_FRAGMENTED(frag_off) (((frag_off) & IPV4_HDR_OFFSET_MASK) != 0 \
 		|| ((frag_off) & IPV4_HDR_MF_FLAG) == IPV4_HDR_MF_FLAG)
@@ -49,6 +50,30 @@
 #define IS_IPV4_TCP(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4)) == \
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4))
 
+#define IS_IPV4_VXLAN_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_VXLAN)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_VXLAN))
+
+/**
+ * Internal function which updates the UDP header of a packet, following
+ * segmentation. This is required to update the header's datagram length field.
+ *
+ * @param pkt
+ *  The packet containing the UDP header.
+ * @param udp_offset
+ *  The offset of the UDP header from the start of the packet.
+ */
+static inline void
+update_udp_header(struct rte_mbuf *pkt, uint16_t udp_offset)
+{
+	struct udp_hdr *udp_hdr;
+
+	udp_hdr = (struct udp_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			udp_offset);
+	udp_hdr->dgram_len = rte_cpu_to_be_16(pkt->pkt_len - udp_offset);
+}
+
 /**
  * Internal function which updates the TCP header of a packet, following
  * segmentation. This is required to update the header's 'sent' sequence
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
new file mode 100644
index 0000000..5e8c8e5
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -0,0 +1,120 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "gso_common.h"
+#include "gso_tunnel_tcp4.h"
+
+static void
+update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
+		struct rte_mbuf **segs, uint16_t nb_segs)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct tcp_hdr *tcp_hdr;
+	uint32_t sent_seq;
+	uint16_t outer_id, inner_id, tail_idx, i;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+
+	outer_ipv4_offset = pkt->outer_l2_len;
+	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	tcp_offset = inner_ipv4_offset + pkt->l3_len;
+
+	/* Outer IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			outer_ipv4_offset);
+	outer_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	/* Inner IPv4 header. */
+	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			inner_ipv4_offset);
+	inner_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+
+	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
+	tail_idx = nb_segs - 1;
+
+	for (i = 0; i < nb_segs; i++) {
+		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
+		update_udp_header(segs[i], udp_offset);
+		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
+		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
+		outer_id++;
+		inner_id += ipid_delta;
+		sent_seq += (segs[i]->pkt_len - segs[i]->data_len);
+	}
+}
+
+int
+gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out)
+{
+	struct ipv4_hdr *inner_ipv4_hdr;
+	uint16_t pyld_unit_size, hdr_offset, frag_off;
+	int ret = 1;
+
+	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
+	inner_ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
+			hdr_offset);
+	/*
+	 * Don't process a packet whose MF bit or fragment offset in the
+	 * inner IPv4 header is non-zero.
+	 */
+	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
+	if (unlikely(IS_FRAGMENTED(frag_off))) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+
+	hdr_offset += pkt->l3_len + pkt->l4_len;
+	/* Don't process the packet without data */
+	if (hdr_offset >= pkt->pkt_len) {
+		pkts_out[0] = pkt;
+		return 1;
+	}
+	pyld_unit_size = gso_size - hdr_offset;
+
+	/* Segment the payload */
+	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
+			indirect_pool, pkts_out, nb_pkts_out);
+	if (ret <= 1)
+		return ret;
+
+	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
+
+	return ret;
+}
diff --git a/lib/librte_gso/gso_tunnel_tcp4.h b/lib/librte_gso/gso_tunnel_tcp4.h
new file mode 100644
index 0000000..3c67f0c
--- /dev/null
+++ b/lib/librte_gso/gso_tunnel_tcp4.h
@@ -0,0 +1,75 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _GSO_TUNNEL_TCP4_H_
+#define _GSO_TUNNEL_TCP4_H_
+
+#include <stdint.h>
+#include <rte_mbuf.h>
+
+/**
+ * Segment a tunneling packet with inner TCP/IPv4 headers. This function
+ * doesn't check if the input packet has correct checksums, and doesn't
+ * update checksums for output GSO segments. Furthermore, it doesn't
+ * process IP fragment packets.
+ *
+ * @param pkt
+ *  The packet mbuf to segment.
+ * @param gso_size
+ *  The max length of a GSO segment, measured in bytes.
+ * @param ipid_delta
+ *  The increment applied to the IP ID between consecutive segments.
+ * @param direct_pool
+ *  MBUF pool used for allocating direct buffers for output segments.
+ * @param indirect_pool
+ *  MBUF pool used for allocating indirect buffers for output segments.
+ * @param pkts_out
+ *  Pointer array used to store the MBUF addresses of output GSO
+ *  segments, when it succeeds. If the memory space in pkts_out is
+ *  insufficient, it fails and returns -EINVAL.
+ * @param nb_pkts_out
+ *  The max number of items that 'pkts_out' can keep.
+ *
+ * @return
+ *   - The number of GSO segments filled in pkts_out on success.
+ *   - Return -ENOMEM if the MBUF pools run out of memory.
+ *   - Return -EINVAL for invalid parameters.
+ */
+int gso_tunnel_tcp4_segment(struct rte_mbuf *pkt,
+		uint16_t gso_size,
+		uint8_t ipid_delta,
+		struct rte_mempool *direct_pool,
+		struct rte_mempool *indirect_pool,
+		struct rte_mbuf **pkts_out,
+		uint16_t nb_pkts_out);
+#endif
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 822693f..0a3ef11 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -39,6 +39,7 @@
 #include "rte_gso.h"
 #include "gso_common.h"
 #include "gso_tcp4.h"
+#include "gso_tunnel_tcp4.h"
 
 int
 rte_gso_segment(struct rte_mbuf *pkt,
@@ -56,7 +57,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	if (pkt == NULL || pkts_out == NULL || gso_ctx == NULL ||
 			nb_pkts_out < 1 ||
 			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
-			gso_ctx->gso_types != DEV_TX_OFFLOAD_TCP_TSO)
+			((gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
+			DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) == 0))
 		return -EINVAL;
 
 	if (gso_ctx->gso_size >= pkt->pkt_len) {
@@ -71,12 +73,20 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_TCP(pkt->ol_flags)) {
+	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
+		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
+				direct_pool, indirect_pool,
+				pkts_out, nb_pkts_out);
+	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
 	} else {
+		/* unsupported packet, skip */
 		pkts_out[0] = pkt;
 		RTE_LOG(DEBUG, GSO, "Unsupported packet type\n");
 		return 1;
-- 
2.7.4


* [PATCH v10 4/6] gso: add GRE GSO support
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
                                     ` (2 preceding siblings ...)
  2017-10-07 14:56                   ` [PATCH v10 3/6] gso: add VxLAN " Jiayu Hu
@ 2017-10-07 14:56                   ` Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
                                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-10-07 14:56 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, ferruh.yigit, konstantin.ananyev, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.
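
As a rough sketch of how rte_gso_segment() chooses between the plain
TCP/IPv4 path and the shared tunnel path once GRE is added, consider the
code below. The F_* and T_* bit values are placeholders invented for
this example, not the real PKT_TX_* or DEV_TX_OFFLOAD_* values, and
select_gso_path() is a hypothetical helper rather than a library
function.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; NOT the DPDK values. */
#define F_TCP_SEG      (1ULL << 0)
#define F_IPV4         (1ULL << 1)
#define F_OUTER_IPV4   (1ULL << 2)
#define F_TUNNEL_VXLAN (1ULL << 3)
#define F_TUNNEL_GRE   (1ULL << 4)

#define T_TCP_TSO      (1U << 0)
#define T_VXLAN_TSO    (1U << 1)
#define T_GRE_TSO      (1U << 2)

enum gso_path { GSO_PATH_NONE, GSO_PATH_TCP4, GSO_PATH_TUNNEL_TCP4 };

/* True when every bit in 'mask' is set in 'flags'. */
static int
has_all(uint64_t flags, uint64_t mask)
{
	return (flags & mask) == mask;
}

static enum gso_path
select_gso_path(uint64_t ol_flags, uint32_t gso_types)
{
	const uint64_t tcp4 = F_TCP_SEG | F_IPV4;
	const uint64_t tnl4 = tcp4 | F_OUTER_IPV4;

	/* Tunnel checks go first: a tunneled packet also matches tcp4. */
	if ((has_all(ol_flags, tnl4 | F_TUNNEL_VXLAN) &&
			(gso_types & T_VXLAN_TSO)) ||
			(has_all(ol_flags, tnl4 | F_TUNNEL_GRE) &&
			(gso_types & T_GRE_TSO)))
		return GSO_PATH_TUNNEL_TCP4;
	if (has_all(ol_flags, tcp4) && (gso_types & T_TCP_TSO))
		return GSO_PATH_TCP4;
	return GSO_PATH_NONE;
}
```

Note that, in this model as in the patch, a tunneled packet whose tunnel
type is not enabled in gso_types still falls through to the plain
TCP/IPv4 path when TCP TSO is enabled.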

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.
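
Since every output packet repeats the full header stack, the number of
output packets follows from the payload length and the room left after
the headers. The helper below is a hypothetical sketch of that
arithmetic (it is not in the patch) and assumes the payload is split
into equal units of gso_size minus the total header length, as
gso_tunnel_tcp4_segment() does when it computes pyld_unit_size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of output packets for a packet of pkt_len bytes whose headers
 * occupy hdr_len bytes, when no output packet may exceed gso_size bytes.
 * Returns 0 when gso_size leaves no room for payload.
 */
static uint32_t
gso_nb_output_pkts(uint32_t pkt_len, uint16_t hdr_len, uint16_t gso_size)
{
	uint32_t payload_len, unit;

	assert(pkt_len >= hdr_len);
	if (gso_size <= hdr_len)
		return 0;
	payload_len = pkt_len - hdr_len;
	unit = (uint32_t)gso_size - hdr_len;    /* payload bytes per packet */
	return (payload_len + unit - 1) / unit; /* round up */
}
```

For example, a 3000-byte tunneled packet with 104 bytes of headers and a
gso_size of 1500 carries 2896 payload bytes in units of 1396, giving 3
output packets. The explicit gso_size <= hdr_len guard is an addition of
this sketch.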

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 doc/guides/rel_notes/release_17_11.rst |  2 ++
 lib/librte_gso/gso_common.h            |  5 +++++
 lib/librte_gso/gso_tunnel_tcp4.c       | 14 ++++++++++----
 lib/librte_gso/rte_gso.c               |  9 ++++++---
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 4c17207..6ab725f 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -73,6 +73,8 @@ New Features
   * TCP/IPv4 packets.
   * VxLAN packets, which must have an outer IPv4 header, and contain
     an inner TCP/IPv4 packet.
+  * GRE packets, which must contain an outer IPv4 header, and inner
+    TCP/IPv4 headers.
 
   The GSO library doesn't check if the input packets have correct
   checksums, and doesn't update checksums for output packets.
diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
index 95d54e7..145ea49 100644
--- a/lib/librte_gso/gso_common.h
+++ b/lib/librte_gso/gso_common.h
@@ -55,6 +55,11 @@
 		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
 		 PKT_TX_TUNNEL_VXLAN))
 
+#define IS_IPV4_GRE_TCP4(flag) (((flag) & (PKT_TX_TCP_SEG | PKT_TX_IPV4 | \
+				PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE)) == \
+		(PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
+		 PKT_TX_TUNNEL_GRE))
+
 /**
  * Internal function which updates the UDP header of a packet, following
  * segmentation. This is required to update the header's datagram length field.
diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
index 5e8c8e5..8d0cfd7 100644
--- a/lib/librte_gso/gso_tunnel_tcp4.c
+++ b/lib/librte_gso/gso_tunnel_tcp4.c
@@ -42,11 +42,13 @@ update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
 	struct tcp_hdr *tcp_hdr;
 	uint32_t sent_seq;
 	uint16_t outer_id, inner_id, tail_idx, i;
-	uint16_t outer_ipv4_offset, inner_ipv4_offset, udp_offset, tcp_offset;
+	uint16_t outer_ipv4_offset, inner_ipv4_offset;
+	uint16_t udp_gre_offset, tcp_offset;
+	uint8_t update_udp_hdr;
 
 	outer_ipv4_offset = pkt->outer_l2_len;
-	udp_offset = outer_ipv4_offset + pkt->outer_l3_len;
-	inner_ipv4_offset = udp_offset + pkt->l2_len;
+	udp_gre_offset = outer_ipv4_offset + pkt->outer_l3_len;
+	inner_ipv4_offset = udp_gre_offset + pkt->l2_len;
 	tcp_offset = inner_ipv4_offset + pkt->l3_len;
 
 	/* Outer IPv4 header. */
@@ -63,9 +65,13 @@ update_tunnel_ipv4_tcp_headers(struct rte_mbuf *pkt, uint8_t ipid_delta,
 	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
 	tail_idx = nb_segs - 1;
 
+	/* Only update UDP header for VxLAN packets. */
+	update_udp_hdr = (pkt->ol_flags & PKT_TX_TUNNEL_VXLAN) ? 1 : 0;
+
 	for (i = 0; i < nb_segs; i++) {
 		update_ipv4_header(segs[i], outer_ipv4_offset, outer_id);
-		update_udp_header(segs[i], udp_offset);
+		if (update_udp_hdr)
+			update_udp_header(segs[i], udp_gre_offset);
 		update_ipv4_header(segs[i], inner_ipv4_offset, inner_id);
 		update_tcp_header(segs[i], tcp_offset, sent_seq, i < tail_idx);
 		outer_id++;
diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index 0a3ef11..f86e654 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -58,7 +58,8 @@ rte_gso_segment(struct rte_mbuf *pkt,
 			nb_pkts_out < 1 ||
 			gso_ctx->gso_size < RTE_GSO_SEG_SIZE_MIN ||
 			((gso_ctx->gso_types & (DEV_TX_OFFLOAD_TCP_TSO |
-			DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) == 0))
+			DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+			DEV_TX_OFFLOAD_GRE_TNL_TSO)) == 0))
 		return -EINVAL;
 
 	if (gso_ctx->gso_size >= pkt->pkt_len) {
@@ -73,8 +74,10 @@ rte_gso_segment(struct rte_mbuf *pkt,
 	ipid_delta = (gso_ctx->flag != RTE_GSO_FLAG_IPID_FIXED);
 	ol_flags = pkt->ol_flags;
 
-	if (IS_IPV4_VXLAN_TCP4(pkt->ol_flags)
-		&& (gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) {
+	if ((IS_IPV4_VXLAN_TCP4(pkt->ol_flags) &&
+			(gso_ctx->gso_types & DEV_TX_OFFLOAD_VXLAN_TNL_TSO)) ||
+			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
+			 (gso_ctx->gso_types & DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
-- 
2.7.4


* [PATCH v10 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
                                     ` (3 preceding siblings ...)
  2017-10-07 14:56                   ` [PATCH v10 4/6] gso: add GRE " Jiayu Hu
@ 2017-10-07 14:56                   ` Jiayu Hu
  2017-10-07 14:56                   ` [PATCH v10 6/6] doc: add GSO programmer's guide Jiayu Hu
  2017-10-08  3:40                   ` [PATCH v10 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Ferruh Yigit
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-10-07 14:56 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, ferruh.yigit, konstantin.ananyev, Jiayu Hu

This patch adds GSO support to the csum forwarding engine. Oversized
packets transmitted over a GSO-enabled port will undergo segmentation
(with the exception of packet-types unsupported by the GSO library).
GSO support is disabled by default.

GSO support may be toggled on a per-port basis, using the command:

        "set port <port_id> gso on|off"

The maximum packet length (including the packet header and payload) for
GSO segments may be set with the command:

        "set gso segsz <length>"

Show GSO configuration for a given port with the command:

	"show port <port_id> gso"

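The length accepted by "set gso segsz" is bounded below by
RTE_GSO_SEG_SIZE_MIN. A minimal sketch of that check follows; the
SKETCH_* constants and gso_segsz_is_valid() are invented for this
example, and the header sizes assume an untagged Ethernet header and
IPv4 and TCP headers without options.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed header sizes without options: Ethernet 14, IPv4 20, TCP 20. */
#define SKETCH_ETHER_HDR_LEN 14
#define SKETCH_IPV4_HDR_LEN  20
#define SKETCH_TCP_HDR_LEN   20

/* Headers plus at least one payload byte, mirroring RTE_GSO_SEG_SIZE_MIN. */
#define SKETCH_GSO_SEG_SIZE_MIN \
	(SKETCH_ETHER_HDR_LEN + SKETCH_IPV4_HDR_LEN + SKETCH_TCP_HDR_LEN + 1)

/* Returns 1 if segsz is acceptable, 0 otherwise. */
static int
gso_segsz_is_valid(uint16_t segsz)
{
	return segsz >= SKETCH_GSO_SEG_SIZE_MIN;
}
```

Under these assumptions the smallest accepted value is 55 bytes, so a
command such as "set gso segsz 1500" would pass the check.
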
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test-pmd/cmdline.c                      | 180 ++++++++++++++++++++++++++++
 app/test-pmd/config.c                       |  24 ++++
 app/test-pmd/csumonly.c                     |  43 ++++++-
 app/test-pmd/testpmd.c                      |  13 ++
 app/test-pmd/testpmd.h                      |  10 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  46 +++++++
 6 files changed, 312 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 516fc89..b2d5284 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -438,6 +438,17 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"    Set the cycle to flush GROed packets from"
 			" reassembly tables.\n\n"
 
+			"set port (port_id) gso (on|off)\n"
+			"    Enable or disable Generic Segmentation Offload in"
+			" csum forwarding engine.\n\n"
+
+			"set gso segsz (length)\n"
+			"    Set max packet length for output GSO segments,"
+			" including packet header and payload.\n\n"
+
+			"show port (port_id) gso\n"
+			"    Show GSO configuration.\n\n"
+
 			"set fwd (%s)\n"
 			"    Set packet forwarding mode.\n\n"
 
@@ -4014,6 +4025,172 @@ cmdline_parse_inst_t cmd_gro_flush = {
 	},
 };
 
+/* *** ENABLE/DISABLE GSO *** */
+struct cmd_gso_enable_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_mode;
+	portid_t cmd_pid;
+};
+
+static void
+cmd_gso_enable_parsed(void *parsed_result,
+		__attribute__((unused)) struct cmdline *cl,
+		__attribute__((unused)) void *data)
+{
+	struct cmd_gso_enable_result *res;
+
+	res = parsed_result;
+	if (!strcmp(res->cmd_keyword, "gso"))
+		setup_gso(res->cmd_mode, res->cmd_pid);
+}
+
+cmdline_parse_token_string_t cmd_gso_enable_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_enable_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_enable_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_enable_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_mode, "on#off");
+cmdline_parse_token_num_t cmd_gso_enable_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_enable_result,
+			cmd_pid, UINT16);
+
+cmdline_parse_inst_t cmd_gso_enable = {
+	.f = cmd_gso_enable_parsed,
+	.data = NULL,
+	.help_str = "set port <port_id> gso on|off",
+	.tokens = {
+		(void *)&cmd_gso_enable_set,
+		(void *)&cmd_gso_enable_port,
+		(void *)&cmd_gso_enable_pid,
+		(void *)&cmd_gso_enable_keyword,
+		(void *)&cmd_gso_enable_mode,
+		NULL,
+	},
+};
+
+/* *** SET MAX PACKET LENGTH FOR GSO SEGMENTS *** */
+struct cmd_gso_size_result {
+	cmdline_fixed_string_t cmd_set;
+	cmdline_fixed_string_t cmd_keyword;
+	cmdline_fixed_string_t cmd_segsz;
+	uint16_t cmd_size;
+};
+
+static void
+cmd_gso_size_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_size_result *res = parsed_result;
+
+	if (test_done == 0) {
+		printf("Before setting GSO segsz, please first"
+				" stop forwarding\n");
+		return;
+	}
+
+	if (!strcmp(res->cmd_keyword, "gso") &&
+			!strcmp(res->cmd_segsz, "segsz")) {
+		if (res->cmd_size < RTE_GSO_SEG_SIZE_MIN)
+			printf("gso_size should be at least %zu."
+					" Please input a legal value\n",
+					RTE_GSO_SEG_SIZE_MIN);
+		else
+			gso_max_segment_size = res->cmd_size;
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_size_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_set, "set");
+cmdline_parse_token_string_t cmd_gso_size_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_string_t cmd_gso_size_segsz =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_size_result,
+				cmd_segsz, "segsz");
+cmdline_parse_token_num_t cmd_gso_size_size =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_size_result,
+				cmd_size, UINT16);
+
+cmdline_parse_inst_t cmd_gso_size = {
+	.f = cmd_gso_size_parsed,
+	.data = NULL,
+	.help_str = "set gso segsz <length>",
+	.tokens = {
+		(void *)&cmd_gso_size_set,
+		(void *)&cmd_gso_size_keyword,
+		(void *)&cmd_gso_size_segsz,
+		(void *)&cmd_gso_size_size,
+		NULL,
+	},
+};
+
+/* *** SHOW GSO CONFIGURATION *** */
+struct cmd_gso_show_result {
+	cmdline_fixed_string_t cmd_show;
+	cmdline_fixed_string_t cmd_port;
+	cmdline_fixed_string_t cmd_keyword;
+	portid_t cmd_pid;
+};
+
+static void
+cmd_gso_show_parsed(void *parsed_result,
+		       __attribute__((unused)) struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_gso_show_result *res = parsed_result;
+
+	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+		printf("invalid port id %u\n", res->cmd_pid);
+		return;
+	}
+	if (!strcmp(res->cmd_keyword, "gso")) {
+		if (gso_ports[res->cmd_pid].enable) {
+			printf("Max GSO'd packet size: %uB\n"
+					"Supported GSO types: TCP/IPv4, "
+					"VxLAN with inner TCP/IPv4 packet, "
+					"GRE with inner TCP/IPv4 packet\n",
+					gso_max_segment_size);
+		} else
+			printf("GSO is not enabled on Port %u\n", res->cmd_pid);
+	}
+}
+
+cmdline_parse_token_string_t cmd_gso_show_show =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_show, "show");
+cmdline_parse_token_string_t cmd_gso_show_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_port, "port");
+cmdline_parse_token_string_t cmd_gso_show_keyword =
+	TOKEN_STRING_INITIALIZER(struct cmd_gso_show_result,
+				cmd_keyword, "gso");
+cmdline_parse_token_num_t cmd_gso_show_pid =
+	TOKEN_NUM_INITIALIZER(struct cmd_gso_show_result,
+				cmd_pid, UINT16);
+
+cmdline_parse_inst_t cmd_gso_show = {
+	.f = cmd_gso_show_parsed,
+	.data = NULL,
+	.help_str = "show port <port_id> gso",
+	.tokens = {
+		(void *)&cmd_gso_show_show,
+		(void *)&cmd_gso_show_port,
+		(void *)&cmd_gso_show_pid,
+		(void *)&cmd_gso_show_keyword,
+		NULL,
+	},
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
 	cmdline_fixed_string_t set;
@@ -14723,6 +14900,9 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_gro_enable,
 	(cmdline_parse_inst_t *)&cmd_gro_flush,
 	(cmdline_parse_inst_t *)&cmd_gro_show,
+	(cmdline_parse_inst_t *)&cmd_gso_enable,
+	(cmdline_parse_inst_t *)&cmd_gso_size,
+	(cmdline_parse_inst_t *)&cmd_gso_show,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
 	(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 90e4f19..d04940c 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2510,6 +2510,30 @@ show_gro(portid_t port_id)
 		printf("Port %u doesn't enable GRO.\n", port_id);
 }
 
+void
+setup_gso(const char *mode, portid_t port_id)
+{
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		printf("invalid port id %u\n", port_id);
+		return;
+	}
+	if (strcmp(mode, "on") == 0) {
+		if (test_done == 0) {
+			printf("before enabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 1;
+	} else if (strcmp(mode, "off") == 0) {
+		if (test_done == 0) {
+			printf("before disabling GSO,"
+					" please stop forwarding first\n");
+			return;
+		}
+		gso_ports[port_id].enable = 0;
+	}
+}
+
 char*
 list_pkt_forwarding_modes(void)
 {
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index ca50ab7..34fe8cc 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,8 @@
 #include <rte_string_fns.h>
 #include <rte_flow.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
+
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
@@ -91,6 +93,7 @@
 /* structure that caches offload info for the current packet */
 struct testpmd_offload_info {
 	uint16_t ethertype;
+	uint8_t gso_enable;
 	uint16_t l2_len;
 	uint16_t l3_len;
 	uint16_t l4_len;
@@ -381,6 +384,8 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
 		}
+		if (info->gso_enable)
+			ol_flags |= PKT_TX_TCP_SEG;
 	} else if (info->l4_proto == IPPROTO_SCTP) {
 		sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
 		sctp_hdr->cksum = 0;
@@ -627,6 +632,9 @@ static void
 pkt_burst_checksum_forward(struct fwd_stream *fs)
 {
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	struct rte_mbuf *gso_segments[GSO_MAX_PKT_BURST];
+	struct rte_gso_ctx *gso_ctx;
+	struct rte_mbuf **tx_pkts_burst;
 	struct rte_port *txp;
 	struct rte_mbuf *m, *p;
 	struct ether_hdr *eth_hdr;
@@ -644,6 +652,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	struct testpmd_offload_info info;
+	uint16_t nb_segments = 0;
+	int ret;
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -673,6 +683,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	memset(&info, 0, sizeof(info));
 	info.tso_segsz = txp->tso_segsz;
 	info.tunnel_tso_segsz = txp->tunnel_tso_segsz;
+	if (gso_ports[fs->tx_port].enable)
+		info.gso_enable = 1;
 
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
@@ -872,13 +884,35 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		}
 	}
 
+	if (gso_ports[fs->tx_port].enable == 0)
+		tx_pkts_burst = pkts_burst;
+	else {
+		gso_ctx = &(current_fwd_lcore()->gso_ctx);
+		gso_ctx->gso_size = gso_max_segment_size;
+		for (i = 0; i < nb_rx; i++) {
+			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
+					&gso_segments[nb_segments],
+					GSO_MAX_PKT_BURST - nb_segments);
+			if (ret >= 0)
+				nb_segments += ret;
+			else {
+				RTE_LOG(DEBUG, USER1,
+						"Unable to segment packet\n");
+				rte_pktmbuf_free(pkts_burst[i]);
+			}
+		}
+
+		tx_pkts_burst = gso_segments;
+		nb_rx = nb_segments;
+	}
+
 	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
-			pkts_burst, nb_rx);
+			tx_pkts_burst, nb_rx);
 	if (nb_prep != nb_rx)
 		printf("Preparing packet burst to transmit failed: %s\n",
 				rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
 			nb_prep);
 
 	/*
@@ -889,7 +923,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
 		}
 	}
 	fs->tx_packets += nb_tx;
@@ -902,9 +936,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 	if (unlikely(nb_tx < nb_rx)) {
 		fs->fwd_dropped += (nb_rx - nb_tx);
 		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
+			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_rx);
 	}
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	end_tsc = rte_rdtsc();
 	core_cycles = (end_tsc - start_tsc);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 408db9f..037eb8e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -400,6 +400,9 @@ static int eth_event_callback(portid_t port_id,
  */
 static int all_ports_started(void);
 
+struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+uint16_t gso_max_segment_size = ETHER_MAX_LEN - ETHER_CRC_LEN;
+
 /*
  * Helper function to check if socket is already discovered.
  * If yes, return positive value. If not, return zero.
@@ -571,6 +574,7 @@ init_config(void)
 	lcoreid_t  lc_id;
 	uint8_t port_per_socket[RTE_MAX_NUMA_NODES];
 	struct rte_gro_param gro_param;
+	uint32_t gso_types;
 
 	memset(port_per_socket,0,RTE_MAX_NUMA_NODES);
 
@@ -655,6 +659,8 @@ init_config(void)
 
 	init_port_config();
 
+	gso_types = DEV_TX_OFFLOAD_TCP_TSO | DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+		DEV_TX_OFFLOAD_GRE_TNL_TSO;
 	/*
 	 * Records which Mbuf pool to use by each logical core, if needed.
 	 */
@@ -665,6 +671,13 @@ init_config(void)
 		if (mbp == NULL)
 			mbp = mbuf_pool_find(0);
 		fwd_lcores[lc_id]->mbp = mbp;
+		/* initialize GSO context */
+		fwd_lcores[lc_id]->gso_ctx.direct_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.indirect_pool = mbp;
+		fwd_lcores[lc_id]->gso_ctx.gso_types = gso_types;
+		fwd_lcores[lc_id]->gso_ctx.gso_size = ETHER_MAX_LEN -
+			ETHER_CRC_LEN;
+		fwd_lcores[lc_id]->gso_ctx.flag = 0;
 	}
 
 	/* Configuration of packet forwarding streams. */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 2dc3b74..e2d9e34 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -36,6 +36,7 @@
 
 #include <rte_pci.h>
 #include <rte_gro.h>
+#include <rte_gso.h>
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -206,6 +207,7 @@ struct rte_port {
  * CPU id. configuration table.
  */
 struct fwd_lcore {
+	struct rte_gso_ctx gso_ctx;     /**< GSO context */
 	struct rte_mempool *mbp; /**< The mbuf pool to use by this core */
 	void *gro_ctx;		/**< GRO context */
 	streamid_t stream_idx;   /**< index of 1st stream in "fwd_streams" */
@@ -450,6 +452,13 @@ struct gro_status {
 extern struct gro_status gro_ports[RTE_MAX_ETHPORTS];
 extern uint8_t gro_flush_cycles;
 
+#define GSO_MAX_PKT_BURST 2048
+struct gso_status {
+	uint8_t enable;
+};
+extern struct gso_status gso_ports[RTE_MAX_ETHPORTS];
+extern uint16_t gso_max_segment_size;
+
 static inline unsigned int
 lcore_num(void)
 {
@@ -652,6 +661,7 @@ int tx_queue_id_is_invalid(queueid_t txq_id);
 void setup_gro(const char *onoff, portid_t port_id);
 void setup_gro_flush_cycles(uint8_t cycles);
 void show_gro(portid_t port_id);
+void setup_gso(const char *mode, portid_t port_id);
 
 /* Functions to manage the set of filtered Multicast MAC addresses */
 void mcast_addr_add(uint8_t port_id, struct ether_addr *mc_addr);
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 0f45344..eb3cc66 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -960,6 +960,52 @@ Please note that the large value of ``cycles`` may cause the poor TCP/IP
 stack performance. Because the GROed packets are delayed to arrive the
 stack, thus causing more duplicated ACKs and TCP retransmissions.
 
+set port - gso
+~~~~~~~~~~~~~~
+
+Toggle per-port GSO support in ``csum`` forwarding engine::
+
+   testpmd> set port <port_id> gso on|off
+
+If enabled, the csum forwarding engine will perform GSO on supported IPv4
+packets transmitted on the given port.
+
+If disabled, packets transmitted on the given port will not undergo GSO.
+By default, GSO is disabled for all ports.
+
+.. note::
+
+   When GSO is enabled on a port, supported IPv4 packets transmitted on that
+   port undergo GSO. Afterwards, the segmented packets are represented by
+   multi-segment mbufs; however, the csum forwarding engine doesn't calculate
+   checksums for GSO'd segments in SW. As a result, if users want correct
+   checksums in GSO segments, they should enable HW checksum calculation for
+   GSO-enabled ports.
+
+   For example, HW checksum calculation for VxLAN GSO'd packets may be enabled
+   by setting the following options in the csum forwarding engine::
+
+      testpmd> csum set outer_ip hw <port_id>
+
+      testpmd> csum set ip hw <port_id>
+
+      testpmd> csum set tcp hw <port_id>
+
+set gso segsz
+~~~~~~~~~~~~~
+
+Set the maximum GSO segment size (measured in bytes), which includes the
+packet header and the packet payload for GSO-enabled ports (global)::
+
+   testpmd> set gso segsz <length>
+
+show port - gso
+~~~~~~~~~~~~~~~
+
+Display the status of Generic Segmentation Offload for a given port::
+
+   testpmd> show port <port_id> gso
+
 mac_addr add
 ~~~~~~~~~~~~
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 157+ messages in thread

* [PATCH v10 6/6] doc: add GSO programmer's guide
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
                                     ` (4 preceding siblings ...)
  2017-10-07 14:56                   ` [PATCH v10 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
@ 2017-10-07 14:56                   ` Jiayu Hu
  2017-10-08  3:40                   ` [PATCH v10 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Ferruh Yigit
  6 siblings, 0 replies; 157+ messages in thread
From: Jiayu Hu @ 2017-10-07 14:56 UTC (permalink / raw)
  To: dev; +Cc: mark.b.kavanagh, ferruh.yigit, konstantin.ananyev, Jiayu Hu

From: Mark Kavanagh <mark.b.kavanagh@intel.com>

Add programmer's guide doc to explain the design and use of the
GSO library.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
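
As a rough illustration of the output layout described in the guide's
"GSO Output Segment Format" section, the sketch below plans how the payload
of a multi-segment input packet is divided among output segments, and which
input segments each output segment has to reference. It is a plain C model,
not the rte_gso API: plan_gso(), struct part and struct out_seg are
hypothetical names introduced for this note only. It also assumes that the
payload carried by each output segment is the configured segment size minus
the header length, with headers copied separately into a direct mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PARTS 8	/* hypothetical cap on payload parts per output segment */

/* One payload 'part': the window an indirect mbuf would point at. */
struct part {
	size_t in_seg;	/* index of the referenced input segment */
	uint32_t off;	/* byte offset into that segment's payload */
	uint32_t len;	/* number of payload bytes referenced */
};

/* One output segment: a copied header (not modelled) plus payload parts. */
struct out_seg {
	uint32_t payload_len;
	size_t nb_parts;
	struct part parts[MAX_PARTS];
};

/*
 * in_len[i] is the payload length of input segment i (headers excluded).
 * Returns the number of output segments planned, or -1 if out[] or a
 * segment's parts[] array is too small.
 */
static int
plan_gso(const uint32_t *in_len, size_t nb_in, uint32_t payload_per_seg,
	 struct out_seg *out, size_t nb_out)
{
	size_t in = 0;		/* current input segment */
	uint32_t off = 0;	/* bytes of it already consumed */
	size_t n = 0;		/* output segments produced so far */

	assert(in_len != NULL && out != NULL);
	if (payload_per_seg == 0)
		return -1;

	for (;;) {
		/* Skip input segments that are empty or fully consumed. */
		while (in < nb_in && off == in_len[in]) {
			in++;
			off = 0;
		}
		if (in == nb_in)
			break;
		if (n == nb_out)
			return -1;

		struct out_seg *s = &out[n++];
		s->payload_len = 0;
		s->nb_parts = 0;

		/* Chain parts until the segment is full or input runs out. */
		while (s->payload_len < payload_per_seg && in < nb_in) {
			uint32_t avail = in_len[in] - off;
			uint32_t want = payload_per_seg - s->payload_len;
			uint32_t take = avail < want ? avail : want;

			if (take > 0) {
				if (s->nb_parts == MAX_PARTS)
					return -1;
				s->parts[s->nb_parts++] =
					(struct part){ in, off, take };
				s->payload_len += take;
				off += take;
			}
			if (off == in_len[in]) {
				in++;
				off = 0;
			}
		}
	}
	return (int)n;
}
```

For example, with 54 bytes of headers and a 1514-byte segment size (1460
payload bytes per output segment under the assumption above), an input chain
carrying 2000 and 1500 payload bytes gives three output segments. The second
one references bytes 1460..1999 of the first input segment and bytes 0..919
of the second, i.e. it is a three-part segment once the header part is
counted; the last one carries the remaining 580 bytes.
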
 MAINTAINERS                                        |   6 +
 .../generic_segmentation_offload_lib.rst           | 256 +++++++++++
 .../prog_guide/img/gso-output-segment-format.svg   | 313 ++++++++++++++
 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg   | 477 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                    |   1 +
 5 files changed, 1053 insertions(+)
 create mode 100644 doc/guides/prog_guide/generic_segmentation_offload_lib.rst
 create mode 100644 doc/guides/prog_guide/img/gso-output-segment-format.svg
 create mode 100644 doc/guides/prog_guide/img/gso-three-seg-mbuf.svg

diff --git a/MAINTAINERS b/MAINTAINERS
index cd0d6bc..950ef5c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -654,6 +654,12 @@ M: Jiayu Hu <jiayu.hu@intel.com>
 F: lib/librte_gro/
 F: doc/guides/prog_guide/generic_receive_offload_lib.rst
 
+Generic Segmentation Offload
+M: Jiayu Hu <jiayu.hu@intel.com>
+M: Mark Kavanagh <mark.b.kavanagh@intel.com>
+F: lib/librte_gso/
+F: doc/guides/prog_guide/generic_segmentation_offload_lib.rst
+
 Distributor
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: David Hunt <david.hunt@intel.com>
diff --git a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
new file mode 100644
index 0000000..5e78f16
--- /dev/null
+++ b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
@@ -0,0 +1,256 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Generic Segmentation Offload Library
+====================================
+
+Overview
+--------
+Generic Segmentation Offload (GSO) is a widely used software implementation of
+TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
+Much like TSO, GSO gains performance by enabling upper layer applications to
+process a smaller number of large packets (e.g. MTU size of 64KB), instead of
+processing higher numbers of small packets (e.g. MTU size of 1500B), thus
+reducing per-packet overhead.
+
+For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
+that far exceed the kernel interface's MTU; this eliminates the need to segment
+packets within the guest, and improves the data-to-overhead ratio of both the
+guest-host link, and PCI bus. The expectation of the guest network stack in this
+scenario is that segmentation of egress frames will take place either in the NIC
+HW or, where that hardware capability is unavailable, in the host application
+or network stack.
+
+Bearing that in mind, the GSO library enables DPDK applications to segment
+packets in software. Note however, that GSO is implemented as a standalone
+library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
+in the underlying hardware); that is, applications must explicitly invoke the
+GSO library to segment packets. The size of GSO segments ``(segsz)`` is
+configurable by the application.
+
+Limitations
+-----------
+
+#. The GSO library doesn't check if input packets have correct checksums.
+
+#. In addition, the GSO library doesn't re-calculate checksums for segmented
+   packets (that task is left to the application).
+
+#. IP fragments are unsupported by the GSO library.
+
+#. The egress interface's driver must support multi-segment packets.
+
+#. Currently, the GSO library supports the following IPv4 packet types:
+
+ - TCP
+ - VxLAN
+ - GRE
+
+  See `Supported GSO Packet Types`_ for further details.
+
+Packet Segmentation
+-------------------
+
+The ``rte_gso_segment()`` function is the GSO library's primary
+segmentation API.
+
+Before performing segmentation, an application must create a GSO context object
+``(struct rte_gso_ctx)``, which provides the library with some of the
+information required to understand how the packet should be segmented. Refer to
+`How to Segment a Packet`_ for additional details on same. Once the GSO context
+has been created, and populated, the application can then use the
+``rte_gso_segment()`` function to segment packets.
+
+The GSO library typically stores each segment that it creates in two parts: the
+first part contains a copy of the original packet's headers, while the second
+part contains a pointer to an offset within the original packet. This mechanism
+is explained in more detail in `GSO Output Segment Format`_.
+
+The GSO library supports both single- and multi-segment input mbufs.
+
+GSO Output Segment Format
+~~~~~~~~~~~~~~~~~~~~~~~~~
+To reduce the number of expensive memcpy operations required when segmenting a
+packet, the GSO library typically stores each segment that it creates as a
+two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
+the elements produced by the API are also called 'segments', for clarity the
+term 'part' is used here instead).
+
+The first part of each output segment is a direct mbuf and contains a copy of
+the original packet's headers, which must be prepended to each output segment.
+These headers are copied from the original packet into each output segment.
+
+The second part of each output segment represents a section of data from the
+original packet, i.e. a data segment. Rather than copy the data directly from
+the original packet into the output segment (which would impact performance
+considerably), the second part of each output segment is an indirect mbuf,
+which contains no actual data, but simply points to an offset within the
+original packet.
+
+The combination of the 'header' segment and the 'data' segment constitutes a
+single logical output GSO segment of the original packet. This is illustrated
+in :numref:`figure_gso-output-segment-format`.
+
+.. _figure_gso-output-segment-format:
+
+.. figure:: img/gso-output-segment-format.svg
+   :align: center
+
+   Two-part GSO output segment
+
+In one situation, the output segment may contain additional 'data' segments.
+This only occurs when:
+
+- the input packet on which GSO is to be performed is represented by a
+  multi-segment mbuf.
+
+- the output segment is required to contain data that spans the boundaries
+  between segments of the input multi-segment mbuf.
+
+The GSO library traverses each segment of the input packet, and produces
+numerous output segments; for optimal performance, the number of output
+segments is kept to a minimum. Consequently, the GSO library maximizes the
+amount of data contained within each output segment; i.e. each output segment
+contains ``segsz`` bytes of data. The only exception to this is the very final
+output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the final segment
+is smaller than the rest.
+
+In order for an output segment to meet its MSS, it may need to include data from
+multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
+can point to only one direct mbuf), the solution here is to add another indirect
+mbuf to the output segment; this additional segment then points to the next
+input segment. If necessary, this chaining process is repeated, until the sum of
+all of the data 'contained' in the output segment reaches ``segsz``. This
+ensures that the amount of data contained within each output segment is uniform,
+with the possible exception of the last segment, as previously described.
+
+:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
+output segment. In this example, the output segment needs to include data from
+the end of one input segment, and the beginning of another. To achieve this,
+an additional indirect mbuf is chained to the second part of the output segment,
+and is attached to the next input segment (i.e. it points to the data in the
+next input segment).
+
+.. _figure_gso-three-seg-mbuf:
+
+.. figure:: img/gso-three-seg-mbuf.svg
+   :align: center
+
+   Three-part GSO output segment
+
+Supported GSO Packet Types
+--------------------------
+
+TCP/IPv4 GSO
+~~~~~~~~~~~~
+TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
+may also contain an optional VLAN tag.
+
+VxLAN GSO
+~~~~~~~~~
+VxLAN GSO supports segmentation of suitably large VxLAN packets,
+which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
+inner and/or outer VLAN tag(s).
+
+GRE GSO
+~~~~~~~
+GRE GSO supports segmentation of suitably large GRE packets, which contain
+an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.
+
+How to Segment a Packet
+-----------------------
+
+To segment an outgoing packet, an application must:
+
+#. First create a GSO context ``(struct rte_gso_ctx)``; this contains:
+
+   - a pointer to the mbuf pool for allocating the direct buffers, which are
+     used to store the GSO segments' packet headers.
+
+   - a pointer to the mbuf pool for allocating indirect buffers, which are
+     used to locate GSO segments' packet payloads.
+
+.. note::
+
+     An application may use the same pool for both direct and indirect
+     buffers. However, since each indirect mbuf simply stores a pointer, the
+     application may reduce its memory consumption by creating a separate memory
+     pool, containing smaller elements, for the indirect pool.
+
+   - the size of each output segment, including packet headers and payload,
+     measured in bytes.
+
+   - the bit mask of required GSO types. The GSO library uses the same macros as
+     those that describe a physical device's TX offloading capabilities (i.e.
+     ``DEV_TX_OFFLOAD_*_TSO``) for gso_types. For example, if an application
+     wants to segment TCP/IPv4 packets, it should set gso_types to
+     ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values currently supported
+     for gso_types are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO`` and
+     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
+     allowed.
+
+   - a flag, that indicates whether the IPv4 headers of output segments should
+     contain fixed or incremental ID values.
+
+2. Set the appropriate ol_flags in the mbuf.
+
+   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute
+     to determine how a packet should be segmented. It is the application's
+     responsibility to ensure that these flags are set.
+
+   - For example, in order to segment TCP/IPv4 packets, the application should
+     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
+     ol_flags.
+
+   - If checksum calculation in hardware is required, the application should
+     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.
+
+#. Check if the packet should be processed. Packets with one of the
+   following properties are not processed and are returned immediately:
+
+   - Packet length is less than ``segsz`` (i.e. GSO is not required).
+
+   - Packet type is not supported by GSO library (see
+     `Supported GSO Packet Types`_).
+
+   - Application has not enabled GSO support for the packet type.
+
+   - Packet's ol_flags have been incorrectly set.
+
+#. Allocate space in which to store the output GSO segments. If the amount of
+   space allocated by the application is insufficient, segmentation will fail.
+
+#. Invoke the GSO segmentation API, ``rte_gso_segment()``.
+
+#. If required, update the L3 and L4 checksums of the newly-created segments.
+   For tunneled packets, the outer IPv4 headers' checksums should also be
+   updated. Alternatively, the application may offload checksum calculation
+   to HW.
+
diff --git a/doc/guides/prog_guide/img/gso-output-segment-format.svg b/doc/guides/prog_guide/img/gso-output-segment-format.svg
new file mode 100644
index 0000000..bdb5ec3
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-output-segment-format.svg
@@ -0,0 +1,313 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-output-segment-format.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="19.3975in" height="8.21796in"
+		viewBox="0 0 1396.62 591.693" xml:space="preserve" color-interpolation-filters="sRGB" class="st21">
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st2 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st3 {stroke:#c3d600;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.68828}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.75735}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Intel Clear;font-size:1.99999em;font-weight:bold}
+		.st10 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0552552}
+		.st11 {fill:#ffffff;font-family:Intel Clear;font-size:2.44732em;font-weight:bold}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:5.52552}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.15291em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.8401em}
+		.st15 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st16 {fill:#c3d600;font-family:Intel Clear;font-size:2.44732em}
+		.st17 {fill:#ffc000;font-family:Intel Clear;font-size:2.44732em}
+		.st18 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0276276}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.8401em}
+		.st20 {fill:#006fc5;font-family:Intel Clear;font-size:1.61927em}
+		.st21 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<g id="shape3-1" v:mID="3" v:groupContext="shape" transform="translate(577.244,-560.42)">
+			<title>Sheet.3</title>
+			<path d="M9.24 585.29 L16.32 585.29 L16.32 587.06 L9.24 587.06 L9.24 585.29 L9.24 585.29 ZM21.63 585.29 L23.4 585.29
+						 L23.4 587.06 L21.63 587.06 L21.63 585.29 L21.63 585.29 ZM28.7 585.29 L35.78 585.29 L35.78 587.06 L28.7 587.06
+						 L28.7 585.29 L28.7 585.29 ZM41.09 585.29 L42.86 585.29 L42.86 587.06 L41.09 587.06 L41.09 585.29 L41.09
+						 585.29 ZM48.17 585.29 L55.25 585.29 L55.25 587.06 L48.17 587.06 L48.17 585.29 L48.17 585.29 ZM60.56 585.29
+						 L62.33 585.29 L62.33 587.06 L60.56 587.06 L60.56 585.29 L60.56 585.29 ZM67.64 585.29 L74.72 585.29 L74.72
+						 587.06 L67.64 587.06 L67.64 585.29 L67.64 585.29 ZM80.03 585.29 L81.8 585.29 L81.8 587.06 L80.03 587.06
+						 L80.03 585.29 L80.03 585.29 ZM87.11 585.29 L94.19 585.29 L94.19 587.06 L87.11 587.06 L87.11 585.29 L87.11
+						 585.29 ZM99.5 585.29 L101.27 585.29 L101.27 587.06 L99.5 587.06 L99.5 585.29 L99.5 585.29 ZM106.58 585.29
+						 L113.66 585.29 L113.66 587.06 L106.58 587.06 L106.58 585.29 L106.58 585.29 ZM118.97 585.29 L120.74 585.29
+						 L120.74 587.06 L118.97 587.06 L118.97 585.29 L118.97 585.29 ZM126.05 585.29 L133.13 585.29 L133.13 587.06
+						 L126.05 587.06 L126.05 585.29 L126.05 585.29 ZM138.43 585.29 L140.2 585.29 L140.2 587.06 L138.43 587.06
+						 L138.43 585.29 L138.43 585.29 ZM145.51 585.29 L152.59 585.29 L152.59 587.06 L145.51 587.06 L145.51 585.29
+						 L145.51 585.29 ZM157.9 585.29 L159.67 585.29 L159.67 587.06 L157.9 587.06 L157.9 585.29 L157.9 585.29 ZM164.98
+						 585.29 L172.06 585.29 L172.06 587.06 L164.98 587.06 L164.98 585.29 L164.98 585.29 ZM177.37 585.29 L179.14
+						 585.29 L179.14 587.06 L177.37 587.06 L177.37 585.29 L177.37 585.29 ZM184.45 585.29 L191.53 585.29 L191.53
+						 587.06 L184.45 587.06 L184.45 585.29 L184.45 585.29 ZM196.84 585.29 L198.61 585.29 L198.61 587.06 L196.84
+						 587.06 L196.84 585.29 L196.84 585.29 ZM203.92 585.29 L211 585.29 L211 587.06 L203.92 587.06 L203.92 585.29
+						 L203.92 585.29 ZM216.31 585.29 L218.08 585.29 L218.08 587.06 L216.31 587.06 L216.31 585.29 L216.31 585.29
+						 ZM223.39 585.29 L230.47 585.29 L230.47 587.06 L223.39 587.06 L223.39 585.29 L223.39 585.29 ZM235.78 585.29
+						 L237.55 585.29 L237.55 587.06 L235.78 587.06 L235.78 585.29 L235.78 585.29 ZM242.86 585.29 L249.93 585.29
+						 L249.93 587.06 L242.86 587.06 L242.86 585.29 L242.86 585.29 ZM255.24 585.29 L257.01 585.29 L257.01 587.06
+						 L255.24 587.06 L255.24 585.29 L255.24 585.29 ZM262.32 585.29 L269.4 585.29 L269.4 587.06 L262.32 587.06
+						 L262.32 585.29 L262.32 585.29 ZM274.71 585.29 L276.48 585.29 L276.48 587.06 L274.71 587.06 L274.71 585.29
+						 L274.71 585.29 ZM281.79 585.29 L288.87 585.29 L288.87 587.06 L281.79 587.06 L281.79 585.29 L281.79 585.29
+						 ZM294.18 585.29 L295.95 585.29 L295.95 587.06 L294.18 587.06 L294.18 585.29 L294.18 585.29 ZM301.26 585.29
+						 L308.34 585.29 L308.34 587.06 L301.26 587.06 L301.26 585.29 L301.26 585.29 ZM313.65 585.29 L315.42 585.29
+						 L315.42 587.06 L313.65 587.06 L313.65 585.29 L313.65 585.29 ZM320.73 585.29 L324.99 585.29 L324.99 587.06
+						 L320.73 587.06 L320.73 585.29 L320.73 585.29 ZM11.06 591.69 L0 586.17 L11.06 580.65 L11.06 591.69 L11.06
+						 591.69 ZM323.16 580.65 L334.22 586.17 L323.16 591.69 L323.16 580.65 L323.16 580.65 Z" class="st1"/>
+		</g>
+		<g id="shape4-3" v:mID="4" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.4</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43
+						 Z" class="st2"/>
+		</g>
+		<g id="shape5-5" v:mID="5" v:groupContext="shape" transform="translate(184.298,-201.906)">
+			<title>Sheet.5</title>
+			<path d="M94.04 570.43 L117.87 557.26 L0 344.58 L47.68 318.26 L165.55 530.94 L189.39 517.79 L168.08 591.69 L94.04 570.43"
+					class="st3"/>
+		</g>
+		<g id="shape6-8" v:mID="6" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.6</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape7-10" v:mID="7" v:groupContext="shape" transform="translate(119.408,-447.917)">
+			<title>Sheet.7</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape10-13" v:mID="10" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.10</title>
+			<path d="M0 510.21 L0 591.69 L822.53 591.69 L822.53 510.21 L0 510.21 L0 510.21 Z" class="st6"/>
+		</g>
+		<g id="shape11-15" v:mID="11" v:groupContext="shape" transform="translate(250.819,-447.917)">
+			<title>Sheet.11</title>
+			<path d="M0 510.21 L822.53 510.21 L822.53 591.69 L0 591.69 L0 510.21" class="st7"/>
+		</g>
+		<g id="shape12-18" v:mID="12" v:groupContext="shape" transform="translate(255.478,-470.123)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="157.315" cy="574.07" width="314.63" height="35.245"/>
+			<path d="M314.63 556.45 L0 556.45 L0 591.69 L314.63 591.69 L314.63 556.45" class="st8"/>
+			<text x="102.08" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-22" v:mID="13" v:groupContext="shape" transform="translate(577.354,-470.123)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="167.112" cy="574.07" width="334.23" height="35.245"/>
+			<path d="M334.22 556.45 L0 556.45 L0 591.69 L334.22 591.69 L334.22 556.45" class="st8"/>
+			<text x="111.88" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape14-26" v:mID="14" v:groupContext="shape" transform="translate(910.635,-470.956)">
+			<title>Sheet.14</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="26.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape15-30" v:mID="15" v:groupContext="shape" transform="translate(909.144,-453.824)">
+			<title>Sheet.15</title>
+			<path d="M1.16 453.85 L1.05 465.33 L3.93 465.39 L4.04 453.91 L1.16 453.85 L1.16 453.85 ZM1 473.95 L0.94 476.82 L3.82
+						 476.87 L3.87 474 L1 473.95 L1 473.95 ZM0.88 485.43 L0.77 496.91 L3.65 496.96 L3.76 485.48 L0.88 485.43 L0.88
+						 485.43 ZM0.72 505.52 L0.72 508.39 L3.59 508.45 L3.59 505.58 L0.72 505.52 L0.72 505.52 ZM0.61 517 L0.55 528.49
+						 L3.43 528.54 L3.48 517.06 L0.61 517 L0.61 517 ZM0.44 537.1 L0.44 539.97 L3.32 540.02 L3.32 537.15 L0.44
+						 537.1 L0.44 537.1 ZM0.39 548.58 L0.28 560.06 L3.15 560.12 L3.26 548.63 L0.39 548.58 L0.39 548.58 ZM0.22
+						 568.67 L0.17 571.54 L3.04 571.6 L3.1 568.73 L0.22 568.67 L0.22 568.67 ZM0.11 580.16 L0 591.64 L2.88 591.69
+						 L2.99 580.21 L0.11 580.16 L0.11 580.16 Z" class="st10"/>
+		</g>
+		<g id="shape16-32" v:mID="16" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.16</title>
+			<path d="M0 510.21 L0 591.69 L129.86 591.69 L129.86 510.21 L0 510.21 L0 510.21 Z" class="st4"/>
+		</g>
+		<g id="shape17-34" v:mID="17" v:groupContext="shape" transform="translate(119.187,-447.917)">
+			<title>Sheet.17</title>
+			<path d="M0 510.21 L129.86 510.21 L129.86 591.69 L0 591.69 L0 510.21" class="st5"/>
+		</g>
+		<g id="shape18-37" v:mID="18" v:groupContext="shape" transform="translate(121.944,-471.034)">
+			<title>Sheet.18</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="20.61" y="581.27" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape19-41" v:mID="19" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.19</title>
+			<path d="M0 510.43 L0 591.69 L289.81 591.69 L289.81 510.43 L0 510.43 L0 510.43 Z" class="st4"/>
+		</g>
+		<g id="shape20-43" v:mID="20" v:groupContext="shape" transform="translate(329.798,-1.87868)">
+			<title>Sheet.20</title>
+			<path d="M0 510.43 L289.81 510.43 L289.81 591.69 L0 591.69 L0 510.43" class="st5"/>
+		</g>
+		<g id="shape21-46" v:mID="21" v:groupContext="shape" transform="translate(424.908,-21.567)">
+			<title>Sheet.21</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="61.0973" cy="574.07" width="122.2" height="35.245"/>
+			<path d="M122.19 556.45 L0 556.45 L0 591.69 L122.19 591.69 L122.19 556.45" class="st8"/>
+			<text x="11.55" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape22-50" v:mID="22" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.22</title>
+			<path d="M0 510.43 L0 591.69 L453.74 591.69 L453.74 510.43 L0 510.43 L0 510.43 Z" class="st6"/>
+		</g>
+		<g id="shape23-52" v:mID="23" v:groupContext="shape" transform="translate(619.609,-1.87868)">
+			<title>Sheet.23</title>
+			<path d="M0 510.43 L453.74 510.43 L453.74 591.69 L0 591.69 L0 510.43" class="st7"/>
+		</g>
+		<g id="shape24-55" v:mID="24" v:groupContext="shape" transform="translate(778.624,-21.5672)">
+			<title>Sheet.24</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="81.8509" cy="574.07" width="163.71" height="35.245"/>
+			<path d="M163.7 556.45 L0 556.45 L0 591.69 L163.7 591.69 L163.7 556.45" class="st8"/>
+			<text x="14.26" y="582.88" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape25-59" v:mID="25" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.25</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st6"/>
+		</g>
+		<g id="shape26-61" v:mID="26" v:groupContext="shape" transform="translate(710.092,-113.83)">
+			<title>Sheet.26</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L349.43 508.89 C357.12 508.89 363.26 515.07 363.26 522.69 L363.26
+						 577.89 C363.26 585.57 357.12 591.69 349.43 591.69 L13.83 591.69 C6.19 591.69 0 585.57 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape27-63" v:mID="27" v:groupContext="shape" transform="translate(813.057,-150.108)">
+			<title>Sheet.27</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="94.1386" cy="576.19" width="188.28" height="31.0055"/>
+			<path d="M188.28 560.69 L0 560.69 L0 591.69 L188.28 591.69 L188.28 560.69" class="st8"/>
+			<text x="15.43" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape28-67" v:mID="28" v:groupContext="shape" transform="translate(810.845,-123.854)">
+			<title>Sheet.28</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="95.5065" cy="578.442" width="191.02" height="26.501"/>
+			<path d="M191.01 565.19 L0 565.19 L0 591.69 L191.01 591.69 L191.01 565.19" class="st8"/>
+			<text x="15.15" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape29-71" v:mID="29" v:groupContext="shape" transform="translate(573.151,-149.601)">
+			<title>Sheet.29</title>
+			<path d="M0 584.74 L127.76 584.74 L127.76 587.61 L0 587.61 L0 584.74 L0 584.74 ZM125.91 580.65 L136.97 586.17 L125.91
+						 591.69 L125.91 580.65 L125.91 580.65 Z" class="st15"/>
+		</g>
+		<g id="shape30-73" v:mID="30" v:groupContext="shape" transform="translate(0,-309.671)">
+			<title>Sheet.30</title>
+			<desc>Memory copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="108.076" cy="574.07" width="216.16" height="35.245"/>
+			<path d="M216.15 556.45 L0 556.45 L0 591.69 L216.15 591.69 L216.15 556.45" class="st8"/>
+			<text x="17.68" y="582.88" class="st16" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Memory copy</text>		</g>
+		<g id="shape31-77" v:mID="31" v:groupContext="shape" transform="translate(680.77,-305.707)">
+			<title>Sheet.31</title>
+			<desc>No Memory Copy</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="136.547" cy="574.07" width="273.1" height="35.245"/>
+			<path d="M273.09 556.45 L0 556.45 L0 591.69 L273.09 591.69 L273.09 556.45" class="st8"/>
+			<text x="21.4" y="582.88" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>No Memory Copy</text>		</g>
+		<g id="shape32-81" v:mID="32" v:groupContext="shape" transform="translate(1102.72,-26.7532)">
+			<title>Sheet.32</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="138.243" cy="578.442" width="276.49" height="26.501"/>
+			<path d="M276.49 565.19 L0 565.19 L0 591.69 L276.49 591.69 L276.49 565.19" class="st8"/>
+			<text x="20.73" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape36-85" v:mID="36" v:groupContext="shape" transform="translate(1106.81,-138.647)">
+			<title>Sheet.36</title>
+			<desc>Two-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="144.906" cy="578.442" width="289.82" height="26.501"/>
+			<path d="M289.81 565.19 L0 565.19 L0 591.69 L289.81 591.69 L289.81 565.19" class="st8"/>
+			<text x="16.56" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Two-part output segment</text>		</g>
+		<g id="shape37-89" v:mID="37" v:groupContext="shape" transform="translate(575.916,-453.879)">
+			<title>Sheet.37</title>
+			<path d="M2.88 453.91 L2.9 465.39 L0.03 465.39 L0 453.91 L2.88 453.91 L2.88 453.91 ZM2.9 474 L2.9 476.87 L0.03 476.87
+						 L0.03 474 L2.9 474 L2.9 474 ZM2.9 485.48 L2.9 496.96 L0.03 496.96 L0.03 485.48 L2.9 485.48 L2.9 485.48 ZM2.9
+						 505.58 L2.9 508.45 L0.03 508.45 L0.03 505.58 L2.9 505.58 L2.9 505.58 ZM2.9 517.06 L2.9 528.54 L0.03 528.54
+						 L0.03 517.06 L2.9 517.06 L2.9 517.06 ZM2.9 537.15 L2.9 540.02 L0.03 540.02 L0.03 537.15 L2.9 537.15 L2.9
+						 537.15 ZM2.9 548.63 L2.9 560.12 L0.03 560.12 L0.03 548.63 L2.9 548.63 L2.9 548.63 ZM2.9 568.73 L2.9 571.6
+						 L0.03 571.6 L0.03 568.73 L2.9 568.73 L2.9 568.73 ZM2.9 580.21 L2.9 591.69 L0.03 591.69 L0.03 580.21 L2.9
+						 580.21 L2.9 580.21 Z" class="st18"/>
+		</g>
+		<g id="shape38-91" v:mID="38" v:groupContext="shape" transform="translate(577.354,-193.764)">
+			<title>Sheet.38</title>
+			<path d="M5.59 347.01 L10.92 357.16 L8.38 358.52 L3.04 348.36 L5.59 347.01 L5.59 347.01 ZM14.96 364.78 L16.29 367.32
+						 L13.74 368.67 L12.42 366.13 L14.96 364.78 L14.96 364.78 ZM20.33 374.97 L25.66 385.12 L23.12 386.45 L17.78
+						 376.29 L20.33 374.97 L20.33 374.97 ZM29.7 392.74 L31.03 395.28 L28.48 396.61 L27.16 394.07 L29.7 392.74
+						 L29.7 392.74 ZM35.04 402.9 L40.4 413.06 L37.86 414.38 L32.49 404.22 L35.04 402.9 L35.04 402.9 ZM44.41 420.67
+						 L45.77 423.21 L43.22 424.57 L41.87 422.03 L44.41 420.67 L44.41 420.67 ZM49.78 430.83 L55.14 440.99 L52.6
+						 442.34 L47.23 432.18 L49.78 430.83 L49.78 430.83 ZM59.15 448.61 L60.51 451.15 L57.96 452.5 L56.61 449.96
+						 L59.15 448.61 L59.15 448.61 ZM64.52 458.79 L69.88 468.95 L67.34 470.27 L61.97 460.12 L64.52 458.79 L64.52
+						 458.79 ZM73.89 476.57 L75.25 479.11 L72.7 480.43 L71.35 477.89 L73.89 476.57 L73.89 476.57 ZM79.26 486.72
+						 L84.62 496.88 L82.08 498.21 L76.71 488.05 L79.26 486.72 L79.26 486.72 ZM88.63 504.5 L89.96 507.04 L87.41
+						 508.39 L86.09 505.85 L88.63 504.5 L88.63 504.5 ZM94 514.66 L99.33 524.81 L96.79 526.17 L91.45 516.01 L94
+						 514.66 L94 514.66 ZM103.37 532.43 L104.7 534.97 L102.15 536.32 L100.83 533.79 L103.37 532.43 L103.37 532.43
+						 ZM108.73 542.62 L114.07 552.77 L111.53 554.1 L106.19 543.94 L108.73 542.62 L108.73 542.62 ZM118.11 560.39
+						 L119.44 562.93 L116.89 564.26 L115.57 561.72 L118.11 560.39 L118.11 560.39 ZM123.45 570.55 L128.81 580.71
+						 L126.27 582.03 L120.9 571.87 L123.45 570.55 L123.45 570.55 ZM132.82 588.33 L133.9 590.37 L131.36 591.69
+						 L130.28 589.68 L132.82 588.33 L132.82 588.33 ZM0.28 351.89 L0 339.53 L10.07 346.73 L0.28 351.89 L0.28 351.89
+						 Z" class="st18"/>
+		</g>
+		<g id="shape39-93" v:mID="39" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.39</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st4"/>
+		</g>
+		<g id="shape40-95" v:mID="40" v:groupContext="shape" transform="translate(329.798,-113.83)">
+			<title>Sheet.40</title>
+			<path d="M0 522.69 C0 515.07 6.19 508.89 13.83 508.89 L229.53 508.89 C237.19 508.89 243.35 515.07 243.35 522.69 L243.35
+						 577.89 C243.35 585.54 237.19 591.69 229.53 591.69 L13.83 591.69 C6.19 591.69 0 585.54 0 577.89 L0 522.69
+						 Z" class="st12"/>
+		</g>
+		<g id="shape41-97" v:mID="41" v:groupContext="shape" transform="translate(368.774,-150.453)">
+			<title>Sheet.41</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="82.7002" cy="576.19" width="165.41" height="31.0055"/>
+			<path d="M165.4 560.69 L0 560.69 L0 591.69 L165.4 591.69 L165.4 560.69" class="st8"/>
+			<text x="13.94" y="583.94" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape42-101" v:mID="42" v:groupContext="shape" transform="translate(351.856,-123.854)">
+			<title>Sheet.42</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="102.121" cy="578.442" width="204.25" height="26.501"/>
+			<path d="M204.24 565.19 L0 565.19 L0 591.69 L204.24 591.69 L204.24 565.19" class="st8"/>
+			<text x="16.02" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape43-105" v:mID="43" v:groupContext="shape" transform="translate(619.797,-155.563)">
+			<title>Sheet.43</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="28.011" cy="578.442" width="56.03" height="26.501"/>
+			<path d="M56.02 565.19 L0 565.19 L0 591.69 L56.02 591.69 L56.02 565.19" class="st8"/>
+			<text x="6.35" y="585.07" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape44-109" v:mID="44" v:groupContext="shape" transform="translate(700.911,-551.367)">
+			<title>Sheet.44</title>
+			<path d="M0 559.23 L0 591.69 L84.29 591.69 L84.29 559.23 L0 559.23 L0 559.23 Z" class="st2"/>
+		</g>
+		<g id="shape45-111" v:mID="45" v:groupContext="shape" transform="translate(709.883,-555.163)">
+			<title>Sheet.45</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="30.7501" cy="580.032" width="61.51" height="23.3211"/>
+			<path d="M61.5 568.37 L0 568.37 L0 591.69 L61.5 591.69 L61.5 568.37" class="st8"/>
+			<text x="6.38" y="585.86" class="st20" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape46-115" v:mID="46" v:groupContext="shape" transform="translate(1111.54,-477.36)">
+			<title>Sheet.46</title>
+			<desc>Input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="74.9" cy="578.442" width="149.8" height="26.501"/>
+			<path d="M149.8 565.19 L0 565.19 L0 591.69 L149.8 591.69 L149.8 565.19" class="st8"/>
+			<text x="12.47" y="585.07" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Input packet</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
new file mode 100644
index 0000000..f18a327
--- /dev/null
+++ b/doc/guides/prog_guide/img/gso-three-seg-mbuf.svg
@@ -0,0 +1,477 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by Microsoft Visio, SVG Export gso-three-seg-mbuf.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="21.8589in" height="9.63966in"
+		viewBox="0 0 1573.84 694.055" xml:space="preserve" color-interpolation-filters="sRGB" class="st23">
+	<title>GSO three-part output segment</title>
+	<v:documentProperties v:langID="1033" v:metric="true" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:#ffc000;stroke:#ffc000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st2 {fill:#006fc5;stroke:#006fc5;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st3 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:3.42236}
+		.st4 {fill:#c3d600;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st5 {stroke:#8f9d00;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st6 {fill:#00aeef;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st7 {stroke:#007fb0;stroke-linecap:round;stroke-linejoin:round;stroke-width:4.47539}
+		.st8 {stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st9 {fill:#ffffff;font-family:Calibri;font-size:2.08333em;font-weight:bold}
+		.st10 {fill:#ffffff;font-family:Intel Clear;font-size:2.91502em;font-weight:bold}
+		.st11 {fill:#000000;font-family:Intel Clear;font-size:2.19175em}
+		.st12 {fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round;stroke-width:6.58146}
+		.st13 {fill:#000000;font-family:Intel Clear;font-size:2.50001em}
+		.st14 {fill:#000000;font-family:Intel Clear;font-size:1.99999em}
+		.st15 {fill:#0070c0;font-family:Intel Clear;font-size:2.19175em}
+		.st16 {fill:#ffffff;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.75}
+		.st17 {fill:#006fc5;font-family:Intel Clear;font-size:1.92874em}
+		.st18 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0329073}
+		.st19 {fill:#0070c0;font-family:Intel Clear;font-size:1.5em}
+		.st20 {fill:#000000;stroke:#000000;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.0658146}
+		.st21 {fill:#000000;font-family:Intel Clear;font-size:1.81915em}
+		.st22 {fill:#000000;font-family:Intel Clear;font-size:1.49785em}
+		.st23 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="0.0393701" v:pageScale="0.0393701" v:drawingUnits="24" v:shadowOffsetX="8.50394"
+				v:shadowOffsetY="-8.50394"/>
+		<v:layer v:name="top" v:index="0"/>
+		<v:layer v:name="middle" v:index="1"/>
+		<g id="shape111-1" v:mID="111" v:groupContext="shape" v:layerMember="0" transform="translate(787.208,-220.973)">
+			<title>Sheet.111</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape110-3" v:mID="110" v:groupContext="shape" v:layerMember="0" transform="translate(685.078,-560.166)">
+			<title>Sheet.110</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape4-5" v:mID="4" v:groupContext="shape" transform="translate(718.715,-469.955)">
+			<title>Sheet.4</title>
+			<path d="M0 655.13 L0 678.22 C0 686.97 11.69 694.06 26.05 694.06 C40.45 694.06 52.11 686.97 52.11 678.22 L52.11 673.91
+						 L59.55 673.91 L44.66 664.86 L29.78 673.91 L37.22 673.91 L37.22 678.22 C37.22 681.98 32.25 685 26.05 685
+						 C19.89 685 14.89 681.98 14.89 678.22 L14.89 655.13 L0 655.13 Z" class="st3"/>
+		</g>
+		<g id="shape5-7" v:mID="5" v:groupContext="shape" transform="translate(547.831,-656.823)">
+			<title>Sheet.5</title>
+			<path d="M11 686.43 L19.43 686.43 L19.43 688.53 L11 688.53 L11 686.43 L11 686.43 ZM25.76 686.43 L27.87 686.43 L27.87
+						 688.53 L25.76 688.53 L25.76 686.43 L25.76 686.43 ZM34.19 686.43 L42.62 686.43 L42.62 688.53 L34.19 688.53
+						 L34.19 686.43 L34.19 686.43 ZM48.95 686.43 L51.05 686.43 L51.05 688.53 L48.95 688.53 L48.95 686.43 L48.95
+						 686.43 ZM57.38 686.43 L65.81 686.43 L65.81 688.53 L57.38 688.53 L57.38 686.43 L57.38 686.43 ZM72.14 686.43
+						 L74.24 686.43 L74.24 688.53 L72.14 688.53 L72.14 686.43 L72.14 686.43 ZM80.57 686.43 L89 686.43 L89 688.53
+						 L80.57 688.53 L80.57 686.43 L80.57 686.43 ZM95.32 686.43 L97.43 686.43 L97.43 688.53 L95.32 688.53 L95.32
+						 686.43 L95.32 686.43 ZM103.76 686.43 L112.19 686.43 L112.19 688.53 L103.76 688.53 L103.76 686.43 L103.76
+						 686.43 ZM118.51 686.43 L120.62 686.43 L120.62 688.53 L118.51 688.53 L118.51 686.43 L118.51 686.43 ZM126.94
+						 686.43 L135.38 686.43 L135.38 688.53 L126.94 688.53 L126.94 686.43 L126.94 686.43 ZM141.7 686.43 L143.81
+						 686.43 L143.81 688.53 L141.7 688.53 L141.7 686.43 L141.7 686.43 ZM150.13 686.43 L158.57 686.43 L158.57 688.53
+						 L150.13 688.53 L150.13 686.43 L150.13 686.43 ZM164.89 686.43 L167 686.43 L167 688.53 L164.89 688.53 L164.89
+						 686.43 L164.89 686.43 ZM173.32 686.43 L181.75 686.43 L181.75 688.53 L173.32 688.53 L173.32 686.43 L173.32
+						 686.43 ZM188.08 686.43 L190.19 686.43 L190.19 688.53 L188.08 688.53 L188.08 686.43 L188.08 686.43 ZM196.51
+						 686.43 L204.94 686.43 L204.94 688.53 L196.51 688.53 L196.51 686.43 L196.51 686.43 ZM211.27 686.43 L213.38
+						 686.43 L213.38 688.53 L211.27 688.53 L211.27 686.43 L211.27 686.43 ZM219.7 686.43 L228.13 686.43 L228.13
+						 688.53 L219.7 688.53 L219.7 686.43 L219.7 686.43 ZM234.46 686.43 L236.56 686.43 L236.56 688.53 L234.46 688.53
+						 L234.46 686.43 L234.46 686.43 ZM242.89 686.43 L251.32 686.43 L251.32 688.53 L242.89 688.53 L242.89 686.43
+						 L242.89 686.43 ZM257.64 686.43 L259.75 686.43 L259.75 688.53 L257.64 688.53 L257.64 686.43 L257.64 686.43
+						 ZM266.08 686.43 L274.51 686.43 L274.51 688.53 L266.08 688.53 L266.08 686.43 L266.08 686.43 ZM280.83 686.43
+						 L282.94 686.43 L282.94 688.53 L280.83 688.53 L280.83 686.43 L280.83 686.43 ZM289.27 686.43 L297.7 686.43
+						 L297.7 688.53 L289.27 688.53 L289.27 686.43 L289.27 686.43 ZM304.02 686.43 L306.13 686.43 L306.13 688.53
+						 L304.02 688.53 L304.02 686.43 L304.02 686.43 ZM312.45 686.43 L320.89 686.43 L320.89 688.53 L312.45 688.53
+						 L312.45 686.43 L312.45 686.43 ZM327.21 686.43 L329.32 686.43 L329.32 688.53 L327.21 688.53 L327.21 686.43
+						 L327.21 686.43 ZM335.64 686.43 L344.08 686.43 L344.08 688.53 L335.64 688.53 L335.64 686.43 L335.64 686.43
+						 ZM350.4 686.43 L352.51 686.43 L352.51 688.53 L350.4 688.53 L350.4 686.43 L350.4 686.43 ZM358.83 686.43 L367.26
+						 686.43 L367.26 688.53 L358.83 688.53 L358.83 686.43 L358.83 686.43 ZM373.59 686.43 L375.7 686.43 L375.7
+						 688.53 L373.59 688.53 L373.59 686.43 L373.59 686.43 ZM382.02 686.43 L387.06 686.43 L387.06 688.53 L382.02
+						 688.53 L382.02 686.43 L382.02 686.43 ZM13.18 694.06 L0 687.48 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM384.89
+						 680.9 L398.06 687.48 L384.89 694.06 L384.89 680.9 L384.89 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape6-9" v:mID="6" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.6</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape7-11" v:mID="7" v:groupContext="shape" transform="translate(2.5012,-522.82)">
+			<title>Sheet.7</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape10-14" v:mID="10" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.10</title>
+			<path d="M0 597.01 L0 694.06 L563.73 694.06 L563.73 597.01 L0 597.01 L0 597.01 Z" class="st6"/>
+		</g>
+		<g id="shape11-16" v:mID="11" v:groupContext="shape" transform="translate(159.025,-522.82)">
+			<title>Sheet.11</title>
+			<path d="M0 597.01 L563.73 597.01 L563.73 694.06 L0 694.06 L0 597.01" class="st7"/>
+		</g>
+		<g id="shape12-19" v:mID="12" v:groupContext="shape" transform="translate(262.039,-549.269)">
+			<title>Sheet.12</title>
+			<desc>Payload 0</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 0</text>		</g>
+		<g id="shape13-23" v:mID="13" v:groupContext="shape" transform="translate(547.615,-549.269)">
+			<title>Sheet.13</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="87.5716" cy="673.065" width="175.15" height="41.9798"/>
+			<path d="M175.14 652.08 L0 652.08 L0 694.06 L175.14 694.06 L175.14 652.08" class="st8"/>
+			<text x="37" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape15-27" v:mID="15" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.15</title>
+			<path d="M0 597.01 L0 694.06 L154.68 694.06 L154.68 597.01 L0 597.01 L0 597.01 Z" class="st4"/>
+		</g>
+		<g id="shape16-29" v:mID="16" v:groupContext="shape" transform="translate(2.2377,-522.82)">
+			<title>Sheet.16</title>
+			<path d="M0 597.01 L154.68 597.01 L154.68 694.06 L0 694.06 L0 597.01" class="st5"/>
+		</g>
+		<g id="shape17-32" v:mID="17" v:groupContext="shape" transform="translate(6.52106,-546.331)">
+			<title>Sheet.17</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="34.98" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape23-36" v:mID="23" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.23</title>
+			<path d="M0 597.27 L0 694.06 L345.2 694.06 L345.2 597.27 L0 597.27 L0 597.27 Z" class="st4"/>
+		</g>
+		<g id="shape24-38" v:mID="24" v:groupContext="shape" transform="translate(286.548,-2.2377)">
+			<title>Sheet.24</title>
+			<path d="M0 597.27 L345.2 597.27 L345.2 694.06 L0 694.06 L0 597.27" class="st5"/>
+		</g>
+		<g id="shape25-41" v:mID="25" v:groupContext="shape" transform="translate(399.834,-25.6887)">
+			<title>Sheet.25</title>
+			<desc>Header</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="72.773" cy="673.065" width="145.55" height="41.9798"/>
+			<path d="M145.55 652.08 L0 652.08 L0 694.06 L145.55 694.06 L145.55 652.08" class="st8"/>
+			<text x="13.76" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Header</text>		</g>
+		<g id="shape31-45" v:mID="31" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.31</title>
+			<path d="M0 597.27 L0 694.06 L516.21 694.06 L516.21 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape32-47" v:mID="32" v:groupContext="shape" transform="translate(631.744,-2.2377)">
+			<title>Sheet.32</title>
+			<path d="M0 597.27 L516.21 597.27 L516.21 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape33-50" v:mID="33" v:groupContext="shape" transform="translate(809.035,-25.6889)">
+			<title>Sheet.33</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="16.99" y="683.56" class="st10" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape35-54" v:mID="35" v:groupContext="shape" transform="translate(1199.29,-21.1708)">
+			<title>Sheet.35</title>
+			<desc>Logical output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="164.662" cy="678.273" width="329.33" height="31.5648"/>
+			<path d="M329.32 662.49 L0 662.49 L0 694.06 L329.32 694.06 L329.32 662.49" class="st8"/>
+			<text x="24.69" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Logical output segment</text>		</g>
+		<g id="shape38-58" v:mID="38" v:groupContext="shape" transform="translate(1204.65,-254.446)">
+			<title>Sheet.38</title>
+			<desc>Three-part output segment</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="19.51" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Three-part output segment</text>		</g>
+		<g id="shape39-62" v:mID="39" v:groupContext="shape" transform="translate(546.25,-529.921)">
+			<title>Sheet.39</title>
+			<path d="M3.43 529.94 L3.46 543.61 L0.03 543.61 L0 529.94 L3.43 529.94 L3.43 529.94 ZM3.46 553.87 L3.46 557.29 L0.03
+						 557.29 L0.03 553.87 L3.46 553.87 L3.46 553.87 ZM3.46 567.55 L3.46 581.22 L0.03 581.22 L0.03 567.55 L3.46
+						 567.55 L3.46 567.55 ZM3.46 591.48 L3.46 594.9 L0.03 594.9 L0.03 591.48 L3.46 591.48 L3.46 591.48 ZM3.46
+						 605.16 L3.46 618.83 L0.03 618.83 L0.03 605.16 L3.46 605.16 L3.46 605.16 ZM3.46 629.09 L3.46 632.51 L0.03
+						 632.51 L0.03 629.09 L3.46 629.09 L3.46 629.09 ZM3.46 642.77 L3.46 656.45 L0.03 656.45 L0.03 642.77 L3.46
+						 642.77 L3.46 642.77 ZM3.46 666.7 L3.46 670.12 L0.03 670.12 L0.03 666.7 L3.46 666.7 L3.46 666.7 ZM3.46 680.38
+						 L3.46 694.06 L0.03 694.06 L0.03 680.38 L3.46 680.38 L3.46 680.38 Z" class="st1"/>
+		</g>
+		<g id="shape40-64" v:mID="40" v:groupContext="shape" transform="translate(549.097,-223.749)">
+			<title>Sheet.40</title>
+			<path d="M6.65 402.61 L13.01 414.71 L9.98 416.32 L3.62 404.22 L6.65 402.61 L6.65 402.61 ZM17.82 423.78 L19.4 426.81 L16.37
+						 428.42 L14.79 425.39 L17.82 423.78 L17.82 423.78 ZM24.21 435.91 L30.57 448.01 L27.54 449.59 L21.18 437.49
+						 L24.21 435.91 L24.21 435.91 ZM35.38 457.08 L36.96 460.11 L33.93 461.69 L32.35 458.66 L35.38 457.08 L35.38
+						 457.08 ZM41.73 469.18 L48.12 481.28 L45.09 482.86 L38.7 470.76 L41.73 469.18 L41.73 469.18 ZM52.9 490.36
+						 L54.51 493.38 L51.48 494.99 L49.87 491.97 L52.9 490.36 L52.9 490.36 ZM59.29 502.45 L65.68 514.55 L62.65
+						 516.16 L56.26 504.06 L59.29 502.45 L59.29 502.45 ZM70.46 523.63 L72.07 526.65 L69.04 528.26 L67.43 525.24
+						 L70.46 523.63 L70.46 523.63 ZM76.85 535.76 L83.24 547.86 L80.21 549.43 L73.82 537.34 L76.85 535.76 L76.85
+						 535.76 ZM88.01 556.93 L89.63 559.95 L86.6 561.53 L84.98 558.51 L88.01 556.93 L88.01 556.93 ZM94.4 569.03
+						 L100.79 581.13 L97.76 582.7 L91.37 570.61 L94.4 569.03 L94.4 569.03 ZM105.57 590.2 L107.15 593.22 L104.12
+						 594.84 L102.54 591.81 L105.57 590.2 L105.57 590.2 ZM111.96 602.3 L118.32 614.4 L115.28 616.01 L108.93 603.91
+						 L111.96 602.3 L111.96 602.3 ZM123.12 623.47 L124.71 626.5 L121.67 628.11 L120.09 625.08 L123.12 623.47 L123.12
+						 623.47 ZM129.51 635.6 L135.87 647.7 L132.84 649.28 L126.48 637.18 L129.51 635.6 L129.51 635.6 ZM140.68 656.77
+						 L142.26 659.8 L139.23 661.38 L137.65 658.35 L140.68 656.77 L140.68 656.77 ZM147.04 668.87 L153.43 680.97
+						 L150.4 682.55 L144.01 670.45 L147.04 668.87 L147.04 668.87 ZM158.2 690.04 L159.49 692.48 L156.46 694.06
+						 L155.17 691.66 L158.2 690.04 L158.2 690.04 ZM0.33 408.43 L0 393.7 L11.99 402.28 L0.33 408.43 L0.33 408.43
+						 Z" class="st1"/>
+		</g>
+		<g id="shape46-66" v:mID="46" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.46</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st4"/>
+		</g>
+		<g id="shape47-68" v:mID="47" v:groupContext="shape" transform="translate(66.8445,-221.499)">
+			<title>Sheet.47</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L273.39 595.43 C282.51 595.43 289.86 602.79 289.86 611.87 L289.86
+						 677.62 C289.86 686.72 282.51 694.06 273.39 694.06 L16.47 694.06 C7.38 694.06 -0 686.72 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape48-70" v:mID="48" v:groupContext="shape" transform="translate(113.27,-263.667)">
+			<title>Sheet.48</title>
+			<desc>Direct mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="98.5041" cy="675.59" width="197.01" height="36.9302"/>
+			<path d="M197.01 657.13 L0 657.13 L0 694.06 L197.01 694.06 L197.01 657.13" class="st8"/>
+			<text x="18.66" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Direct mbuf</text>		</g>
+		<g id="shape51-74" v:mID="51" v:groupContext="shape" transform="translate(85.817,-233.439)">
+			<title>Sheet.51</title>
+			<desc>(copy of headers)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="127.916" cy="678.273" width="255.84" height="31.5648"/>
+			<path d="M255.83 662.49 L0 662.49 L0 694.06 L255.83 694.06 L255.83 662.49" class="st8"/>
+			<text x="34.33" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(copy of headers)</text>		</g>
+		<g id="shape53-78" v:mID="53" v:groupContext="shape" transform="translate(371.944,-275.998)">
+			<title>Sheet.53</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape54-82" v:mID="54" v:groupContext="shape" transform="translate(695.132,-646.04)">
+			<title>Sheet.54</title>
+			<path d="M0 655.39 L0 694.06 L100.4 694.06 L100.4 655.39 L0 655.39 L0 655.39 Z" class="st16"/>
+		</g>
+		<g id="shape55-84" v:mID="55" v:groupContext="shape" transform="translate(709.033,-648.946)">
+			<title>Sheet.55</title>
+			<desc>segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="36.6265" cy="680.167" width="73.26" height="27.7775"/>
+			<path d="M73.25 666.28 L0 666.28 L0 694.06 L73.25 694.06 L73.25 666.28" class="st8"/>
+			<text x="7.6" y="687.11" class="st17" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>segsz</text>		</g>
+		<g id="shape56-88" v:mID="56" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.56</title>
+			<path d="M0 597.27 L0 694.06 L363.41 694.06 L363.41 597.27 L0 597.27 L0 597.27 Z" class="st6"/>
+		</g>
+		<g id="shape57-90" v:mID="57" v:groupContext="shape" transform="translate(785.874,-521.182)">
+			<title>Sheet.57</title>
+			<path d="M0 597.27 L363.41 597.27 L363.41 694.06 L0 694.06 L0 597.27" class="st7"/>
+		</g>
+		<g id="shape58-93" v:mID="58" v:groupContext="shape" v:layerMember="0" transform="translate(943.158,-529.889)">
+			<title>Sheet.58</title>
+			<path d="M1.35 529.91 L1.25 543.58 L4.68 543.61 L4.78 529.94 L1.35 529.91 L1.35 529.91 ZM1.15 553.84 L1.12 557.26 L4.55
+						 557.29 L4.58 553.87 L1.15 553.84 L1.15 553.84 ZM1.05 567.52 L0.92 581.19 L4.35 581.22 L4.48 567.55 L1.05
+						 567.52 L1.05 567.52 ZM0.86 591.45 L0.82 594.87 L4.25 594.9 L4.28 591.48 L0.86 591.45 L0.86 591.45 ZM0.72
+						 605.13 L0.63 618.8 L4.05 618.83 L4.15 605.16 L0.72 605.13 L0.72 605.13 ZM0.53 629.06 L0.53 632.48 L3.95
+						 632.51 L3.95 629.09 L0.53 629.06 L0.53 629.06 ZM0.43 642.74 L0.33 656.41 L3.75 656.45 L3.85 642.77 L0.43
+						 642.74 L0.43 642.74 ZM0.23 666.67 L0.2 670.09 L3.62 670.12 L3.66 666.7 L0.23 666.67 L0.23 666.67 ZM0.13
+						 680.35 L0 694.02 L3.43 694.06 L3.56 680.38 L0.13 680.35 L0.13 680.35 Z" class="st18"/>
+		</g>
+		<g id="shape59-95" v:mID="59" v:groupContext="shape" transform="translate(785.874,-549.473)">
+			<title>Sheet.59</title>
+			<desc>Payload 1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="77.3395" cy="673.065" width="154.68" height="41.9798"/>
+			<path d="M154.68 652.08 L0 652.08 L0 694.06 L154.68 694.06 L154.68 652.08" class="st8"/>
+			<text x="26.77" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 1</text>		</g>
+		<g id="shape60-99" v:mID="60" v:groupContext="shape" transform="translate(952.97,-548.822)">
+			<title>Sheet.60</title>
+			<desc>Payload 2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="97.4929" cy="673.065" width="194.99" height="41.9798"/>
+			<path d="M194.99 652.08 L0 652.08 L0 694.06 L194.99 694.06 L194.99 652.08" class="st8"/>
+			<text x="46.92" y="680.57" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Payload 2</text>		</g>
+		<g id="shape63-103" v:mID="63" v:groupContext="shape" transform="translate(1210.43,-551.684)">
+			<title>Sheet.63</title>
+			<desc>Multi-segment input packet</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="181.707" cy="678.273" width="363.42" height="31.5648"/>
+			<path d="M363.41 662.49 L0 662.49 L0 694.06 L363.41 694.06 L363.41 662.49" class="st8"/>
+			<text x="17.75" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Multi-segment input packet</text>		</g>
+		<g id="shape70-107" v:mID="70" v:groupContext="shape" v:layerMember="1" transform="translate(455.049,-221.499)">
+			<title>Sheet.70</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape71-109" v:mID="71" v:groupContext="shape" transform="translate(455.049,-221.499)">
+			<title>Sheet.71</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape72-111" v:mID="72" v:groupContext="shape" transform="translate(489.065,-263.434)">
+			<title>Sheet.72</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape75-115" v:mID="75" v:groupContext="shape" transform="translate(849.065,-281.435)">
+			<title>Sheet.75</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="4.49" y="686.16" class="st11" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape77-119" v:mID="77" v:groupContext="shape" transform="translate(717.742,-563.523)">
+			<title>Sheet.77</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="15.71" y="683.67" class="st19" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape78-123" v:mID="78" v:groupContext="shape" transform="translate(1148.17,-529.067)">
+			<title>Sheet.78</title>
+			<path d="M1.38 529.87 L1.25 543.55 L4.68 543.61 L4.81 529.94 L1.38 529.87 L1.38 529.87 ZM1.19 553.81 L1.12 557.23 L4.55
+						 557.29 L4.61 553.87 L1.19 553.81 L1.19 553.81 ZM1.05 567.48 L0.92 581.16 L4.35 581.22 L4.48 567.55 L1.05
+						 567.48 L1.05 567.48 ZM0.86 591.42 L0.86 594.84 L4.28 594.9 L4.28 591.48 L0.86 591.42 L0.86 591.42 ZM0.72
+						 605.09 L0.66 618.77 L4.08 618.83 L4.15 605.16 L0.72 605.09 L0.72 605.09 ZM0.53 629.03 L0.53 632.45 L3.95
+						 632.51 L3.95 629.09 L0.53 629.03 L0.53 629.03 ZM0.46 642.7 L0.33 656.38 L3.75 656.45 L3.89 642.77 L0.46
+						 642.7 L0.46 642.7 ZM0.26 666.64 L0.2 670.06 L3.62 670.12 L3.69 666.7 L0.26 666.64 L0.26 666.64 ZM0.13 680.31
+						 L0 693.99 L3.43 694.06 L3.56 680.38 L0.13 680.31 L0.13 680.31 Z" class="st20"/>
+		</g>
+		<g id="shape79-125" v:mID="79" v:groupContext="shape" transform="translate(946.254,-657.81)">
+			<title>Sheet.79</title>
+			<path d="M11 686.69 L17.33 686.69 L17.33 688.27 L11 688.27 L11 686.69 L11 686.69 ZM22.07 686.69 L23.65 686.69 L23.65
+						 688.27 L22.07 688.27 L22.07 686.69 L22.07 686.69 ZM28.39 686.69 L34.72 686.69 L34.72 688.27 L28.39 688.27
+						 L28.39 686.69 L28.39 686.69 ZM39.46 686.69 L41.04 686.69 L41.04 688.27 L39.46 688.27 L39.46 686.69 L39.46
+						 686.69 ZM45.78 686.69 L52.11 686.69 L52.11 688.27 L45.78 688.27 L45.78 686.69 L45.78 686.69 ZM56.85 686.69
+						 L58.43 686.69 L58.43 688.27 L56.85 688.27 L56.85 686.69 L56.85 686.69 ZM63.18 686.69 L69.5 686.69 L69.5
+						 688.27 L63.18 688.27 L63.18 686.69 L63.18 686.69 ZM74.24 686.69 L75.82 686.69 L75.82 688.27 L74.24 688.27
+						 L74.24 686.69 L74.24 686.69 ZM80.57 686.69 L86.89 686.69 L86.89 688.27 L80.57 688.27 L80.57 686.69 L80.57
+						 686.69 ZM91.63 686.69 L93.22 686.69 L93.22 688.27 L91.63 688.27 L91.63 686.69 L91.63 686.69 ZM97.96 686.69
+						 L104.28 686.69 L104.28 688.27 L97.96 688.27 L97.96 686.69 L97.96 686.69 ZM109.03 686.69 L110.61 686.69 L110.61
+						 688.27 L109.03 688.27 L109.03 686.69 L109.03 686.69 ZM115.35 686.69 L121.67 686.69 L121.67 688.27 L115.35
+						 688.27 L115.35 686.69 L115.35 686.69 ZM126.42 686.69 L128 686.69 L128 688.27 L126.42 688.27 L126.42 686.69
+						 L126.42 686.69 ZM132.74 686.69 L139.07 686.69 L139.07 688.27 L132.74 688.27 L132.74 686.69 L132.74 686.69
+						 ZM143.81 686.69 L145.39 686.69 L145.39 688.27 L143.81 688.27 L143.81 686.69 L143.81 686.69 ZM150.13 686.69
+						 L156.46 686.69 L156.46 688.27 L150.13 688.27 L150.13 686.69 L150.13 686.69 ZM161.2 686.69 L162.78 686.69
+						 L162.78 688.27 L161.2 688.27 L161.2 686.69 L161.2 686.69 ZM167.53 686.69 L173.85 686.69 L173.85 688.27 L167.53
+						 688.27 L167.53 686.69 L167.53 686.69 ZM178.59 686.69 L180.17 686.69 L180.17 688.27 L178.59 688.27 L178.59
+						 686.69 L178.59 686.69 ZM184.92 686.69 L189.4 686.69 L189.4 688.27 L184.92 688.27 L184.92 686.69 L184.92
+						 686.69 ZM13.18 694.06 L0 687.41 L13.18 680.9 L13.18 694.06 L13.18 694.06 ZM187.22 680.9 L200.4 687.48 L187.22
+						 694.06 L187.22 680.9 L187.22 680.9 Z" class="st20"/>
+		</g>
+		<g id="shape80-127" v:mID="80" v:groupContext="shape" transform="translate(982.882,-643.673)">
+			<title>Sheet.80</title>
+			<path d="M0 655.13 L0 694.06 L127.01 694.06 L127.01 655.13 L0 655.13 L0 655.13 Z" class="st16"/>
+		</g>
+		<g id="shape81-129" v:mID="81" v:groupContext="shape" transform="translate(1003.39,-660.621)">
+			<title>Sheet.81</title>
+			<desc>pkt_len</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="48.6041" cy="680.956" width="97.21" height="26.1994"/>
+			<path d="M97.21 667.86 L0 667.86 L0 694.06 L97.21 694.06 L97.21 667.86" class="st8"/>
+			<text x="11.67" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>pkt_len  </text>		</g>
+		<g id="shape82-133" v:mID="82" v:groupContext="shape" transform="translate(1001.18,-634.321)">
+			<title>Sheet.82</title>
+			<desc>% segsz</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="49.2945" cy="680.956" width="98.59" height="26.1994"/>
+			<path d="M98.59 667.86 L0 667.86 L0 694.06 L98.59 694.06 L98.59 667.86" class="st8"/>
+			<text x="9.09" y="687.5" class="st21" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>% segsz</text>		</g>
+		<g id="shape34-137" v:mID="34" v:groupContext="shape" v:layerMember="0" transform="translate(356.703,-264.106)">
+			<title>Sheet.34</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape85-139" v:mID="85" v:groupContext="shape" v:layerMember="0" transform="translate(78.5359,-282.66)">
+			<title>Sheet.85</title>
+			<path d="M0 680.87 C-0 673.59 6.88 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.88 694.06 0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape87-141" v:mID="87" v:groupContext="shape" v:layerMember="0" transform="translate(85.4791,-284.062)">
+			<title>Sheet.87</title>
+			<desc>1</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>1</text>		</g>
+		<g id="shape88-145" v:mID="88" v:groupContext="shape" v:layerMember="0" transform="translate(468.906,-282.66)">
+			<title>Sheet.88</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape90-147" v:mID="90" v:groupContext="shape" v:layerMember="0" transform="translate(474.575,-284.062)">
+			<title>Sheet.90</title>
+			<desc>2</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>2</text>		</g>
+		<g id="shape95-151" v:mID="95" v:groupContext="shape" v:layerMember="0" transform="translate(764.026,-275.998)">
+			<title>Sheet.95</title>
+			<desc>next</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="33.3635" cy="678.273" width="66.73" height="31.5648"/>
+			<path d="M66.73 662.49 L0 662.49 L0 694.06 L66.73 694.06 L66.73 662.49" class="st8"/>
+			<text x="7.56" y="686.16" class="st15" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>next</text>		</g>
+		<g id="shape97-155" v:mID="97" v:groupContext="shape" v:layerMember="0" transform="translate(889.755,-220.915)">
+			<title>Sheet.97</title>
+			<path d="M0 611.87 C0 602.79 7.38 595.43 16.47 595.43 L391.97 595.43 C401.12 595.43 408.44 602.79 408.44 611.87 L408.44
+						 677.62 C408.44 686.76 401.12 694.06 391.97 694.06 L16.47 694.06 C7.38 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st12"/>
+		</g>
+		<g id="shape100-157" v:mID="100" v:groupContext="shape" v:layerMember="0" transform="translate(751.857,-262.528)">
+			<title>Sheet.100</title>
+			<path d="M0 685.77 L90.67 685.77 L90.67 689.19 L0 689.19 L0 685.77 L0 685.77 ZM89.36 680.9 L97.21 687.48 L89.36 694.06
+						 L89.36 680.9 L89.36 680.9 Z" class="st2"/>
+		</g>
+		<g id="shape104-159" v:mID="104" v:groupContext="shape" v:layerMember="1" transform="translate(851.429,-218.08)">
+			<title>Sheet.104</title>
+			<path d="M0 611.87 C0 602.79 5.33 595.43 11.89 595.43 L282.92 595.43 C289.53 595.43 294.8 602.79 294.8 611.87 L294.8
+						 677.62 C294.8 686.76 289.53 694.06 282.92 694.06 L11.89 694.06 C5.33 694.06 0 686.76 0 677.62 L0 611.87
+						 Z" class="st6"/>
+		</g>
+		<g id="shape105-161" v:mID="105" v:groupContext="shape" v:layerMember="0" transform="translate(885.444,-260.015)">
+			<title>Sheet.105</title>
+			<desc>Indirect mbuf</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="112.128" cy="675.59" width="224.26" height="36.9302"/>
+			<path d="M224.26 657.13 L0 657.13 L0 694.06 L224.26 694.06 L224.26 657.13" class="st8"/>
+			<text x="20.73" y="684.59" class="st13" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Indirect mbuf</text>		</g>
+		<g id="shape106-165" v:mID="106" v:groupContext="shape" v:layerMember="0" transform="translate(895.672,-229.419)">
+			<title>Sheet.106</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+		<g id="shape107-169" v:mID="107" v:groupContext="shape" v:layerMember="0" transform="translate(863.297,-280.442)">
+			<title>Sheet.107</title>
+			<path d="M0 680.87 C-0 673.59 6.89 667.69 15.37 667.69 C23.86 667.69 30.73 673.59 30.73 680.87 C30.73 688.15 23.86 694.06
+						 15.37 694.06 C6.89 694.06 -0 688.15 0 680.87 Z" class="st16"/>
+		</g>
+		<g id="shape108-171" v:mID="108" v:groupContext="shape" v:layerMember="0" transform="translate(870.001,-281.547)">
+			<title>Sheet.108</title>
+			<desc>3</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="8.66303" cy="683.269" width="17.33" height="21.5726"/>
+			<path d="M17.33 672.48 L0 672.48 L0 694.06 L17.33 694.06 L17.33 672.48" class="st8"/>
+			<text x="3.32" y="688.66" class="st22" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>3</text>		</g>
+		<g id="shape109-175" v:mID="109" v:groupContext="shape" v:layerMember="0" transform="translate(500.959,-231.87)">
+			<title>Sheet.109</title>
+			<desc>(pointer to data)</desc>
+			<v:textBlock v:margins="rect(0,0,0,0)" v:tabSpace="42.5197"/>
+			<v:textRect cx="100.199" cy="678.273" width="200.4" height="31.5648"/>
+			<path d="M200.4 662.49 L0 662.49 L0 694.06 L200.4 694.06 L200.4 662.49" class="st8"/>
+			<text x="12.86" y="685.47" class="st14" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>(pointer to data)</text>		</g>
+	</g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 40f04a1..c7c8b17 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -56,6 +56,7 @@ Programmer's Guide
     reorder_lib
     ip_fragment_reassembly_lib
     generic_receive_offload_lib
+    generic_segmentation_offload_lib
     pdump_lib
     multi_proc_support
     kernel_nic_interface
-- 
2.7.4


* Re: [PATCH v10 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK
  2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
                                     ` (5 preceding siblings ...)
  2017-10-07 14:56                   ` [PATCH v10 6/6] doc: add GSO programmer's guide Jiayu Hu
@ 2017-10-08  3:40                   ` Ferruh Yigit
  6 siblings, 0 replies; 157+ messages in thread
From: Ferruh Yigit @ 2017-10-08  3:40 UTC (permalink / raw)
  To: Jiayu Hu, dev; +Cc: mark.b.kavanagh, konstantin.ananyev

On 10/7/2017 3:56 PM, Jiayu Hu wrote:
> Generic Segmentation Offload (GSO) is a SW technique to split large
> packets into small ones. Akin to TSO, GSO enables applications to
> operate on large packets, thus reducing per-packet processing overhead.
> 
> To enable more flexibility to applications, DPDK GSO is implemented
> as a standalone library. Applications explicitly use the GSO library
> to segment packets. This patch adds GSO support to DPDK for specific
> packet types: specifically, TCP/IPv4, VxLAN, and GRE.
> 
> The first patch introduces the GSO API framework. The second patch
> adds GSO support for TCP/IPv4 packets (containing an optional VLAN
> tag). The third patch adds GSO support for VxLAN packets that contain
> outer IPv4, and inner TCP/IPv4 headers (plus optional inner and/or 
> outer VLAN tags). The fourth patch adds GSO support for GRE packets
> that contain outer IPv4, and inner TCP/IPv4 headers (with optional 
> outer VLAN tag). The fifth patch in the series enables TCP/IPv4, VxLAN,
> and GRE GSO in testpmd's checksum forwarding engine. The final patch
> in the series adds GSO documentation to the programmer's guide.
> 
> Note that this patch set has dependency on the patch "app/testpmd: enable
> the heavyweight mode TCP/IPv4 GRO".
> http://dpdk.org/dev/patchwork/patch/29867/
> 
> Performance Testing
> ===================
> The performance of TCP/IPv4 GSO on a 10Gbps link is demonstrated using
> iperf. Setup for the test is described as follows:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>    machine, together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum
>    forwarding engine with "retry".
> c. Select IP and TCP HW checksum calculation for P0; select TCP HW
>    checksum calculation for vhost-user port.
> d. Launch a VM with csum and tso offloading enabled.
> e. Run iperf-client on virtio-net port in the VM to send TCP packets.
>    With csum and tso enabled, the VM can send large TCP/IPv4 packets
>    (mss is up to 64KB).
> f. P1 is assigned to the Linux kernel with kernel GRO enabled. Run
>    iperf-server on P1.
> 
> We conduct three iperf tests:
> 
> test-1: enable GSO for P0 in testpmd, and set max GSO segment length
>     to 1514B. Run four iperf-client in the VM.
> test-2: enable TSO for P0 in testpmd, and set TSO segsz to 1514B. Run
>     four iperf-client in the VM.
> test-3: disable GSO and TSO in testpmd. Run two iperf-client in the VM.
> 
> Throughput of the above three tests:
> 
> test-1: 9Gbps
> test-2: 9.5Gbps
> test-3: 3Mbps
> 
> Functional Testing
> ==================
> Unlike TCP packets, VMs can't send large VxLAN or GRE packets. The max
> length of tunneled packets from VMs is 1514B. So the current experiment
> method can't be used to measure VxLAN and GRE GSO performance; instead, we
> simply test the functionality by setting a small GSO segment length (e.g. 500B).
> 
> VxLAN
> -----
> To test VxLAN GSO functionality, we use the following setup:
> 
> a. Connect 2 x 10Gbps physical ports (P0, P1), which are in the same
>    machine, together physically.
> b. Launch testpmd with P0 and a vhost-user port, and use csum forwarding
>    engine with "retry".
> c. Testpmd commands:
>     - csum parse_tunnel on "P0"
>     - csum parse_tunnel on "vhost-user port"
>     - csum set outer-ip hw "P0"
>     - csum set ip hw "P0"
>     - csum set tcp hw "P0"
>     - csum set tcp hw "vhost-user port"
>     - set port "P0" gso on
>     - set gso segsz 500
> d. Launch a VM with csum and tso offloading enabled.
> e. Create a VxLAN port for the virtio-net port in the VM. Run iperf-client
>    on the VxLAN port, so TCP packets are VxLAN encapsulated. However, the
>    max packet length is 1514B.
> f. P1 is assigned to the Linux kernel and kernel GRO is disabled. Similarly,
>    create a VxLAN port for P1, and run iperf-server on the VxLAN port.
> 
> In testpmd, we can see the length of all packets sent from P0 is smaller
> than or equal to 500B. Additionally, the packets arriving in P1 are
> encapsulated and are smaller than or equal to 500B.
> 
> GRE
> ---
> The same process may be used to test GRE functionality, with the exception that
> the tunnel type created for both the guest's virtio-net, and the host's kernel
> interfaces is GRE:
>    `ip tunnel add <gre tunnel> mode gre remote <remote IP> local <local_ip>`
> 
> As in the VxLAN testcase, the length of packets sent from P0, and received on
> P1, is less than or equal to 500B.
> 
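
For reference, the segment counts expected in the functional tests above can
be estimated with a small sketch. This is not code from the patch set: the
helper names are hypothetical, and it assumes that the GSO segment length
bounds the whole output packet (headers included), that every output segment
carries a full copy of the headers, and that there are no VLAN tags or TCP
options.

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; they are not part of the
 * GSO library. Assumption: segsz is the maximum length of an output
 * packet, headers included, and each segment repeats all headers.
 */

/* Number of output segments for one input packet (0 if segsz is too small). */
uint32_t
gso_segment_count(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
	uint32_t payload_len, payload_per_seg;

	if (pkt_len <= segsz)
		return 1;	/* already fits, no segmentation needed */
	if (segsz <= hdr_len)
		return 0;	/* no room for payload in a segment */

	payload_len = pkt_len - hdr_len;
	payload_per_seg = segsz - hdr_len;

	/* round up: the last segment may be partially filled */
	return (payload_len + payload_per_seg - 1) / payload_per_seg;
}

/* Length, headers included, of the last output segment (0 on error). */
uint32_t
gso_last_segment_len(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
	uint32_t n = gso_segment_count(pkt_len, hdr_len, segsz);

	if (n == 0)
		return 0;
	if (n == 1)
		return pkt_len;

	return hdr_len + (pkt_len - hdr_len) - (n - 1) * (segsz - hdr_len);
}
```

Under these assumptions, a 1514B VxLAN packet has 104B of headers (outer
Ethernet 14 + outer IPv4 20 + UDP 8 + VxLAN 8 + inner Ethernet 14 + inner
IPv4 20 + TCP 20), so with segsz 500 the sketch gives four segments, the last
one 326B long. All of them stay within the 500B limit observed on P0 and P1.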

<...>

> Jiayu Hu (3):
>   gso: add Generic Segmentation Offload API framework
>   gso: add TCP/IPv4 GSO support
>   app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO
> 
> Mark Kavanagh (3):
>   gso: add VxLAN GSO support
>   gso: add GRE GSO support
>   doc: add GSO programmer's guide

Series applied to dpdk-next-net/master, thanks.


Thread overview: 157+ messages
2017-08-24 14:15 [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
2017-08-24 14:15 ` [PATCH 1/5] lib: add Generic Segmentation Offload API framework Jiayu Hu
2017-08-30  1:38   ` Ananyev, Konstantin
2017-08-30  7:57     ` Jiayu Hu
2017-08-24 14:15 ` [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support Jiayu Hu
2017-08-30  1:38   ` Ananyev, Konstantin
2017-08-30  2:55     ` Jiayu Hu
2017-08-30  9:25       ` Kavanagh, Mark B
2017-08-30  9:39         ` Ananyev, Konstantin
2017-08-30  9:59           ` Ananyev, Konstantin
2017-08-30 13:27             ` Kavanagh, Mark B
2017-08-30  9:03     ` Jiayu Hu
2017-09-04  3:31     ` Jiayu Hu
2017-09-04  9:54       ` Ananyev, Konstantin
2017-09-05  1:09         ` Hu, Jiayu
2017-09-11 13:04           ` Ananyev, Konstantin
2017-08-24 14:15 ` [PATCH 3/5] lib/gso: add VxLAN " Jiayu Hu
2017-08-24 14:15 ` [PATCH 4/5] lib/gso: add GRE " Jiayu Hu
2017-08-24 14:15 ` [PATCH 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
2017-08-30  1:37 ` [PATCH 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Ananyev, Konstantin
2017-08-30  7:36   ` Jiayu Hu
2017-08-30 10:49     ` Ananyev, Konstantin
2017-08-30 13:32       ` Kavanagh, Mark B
2017-09-05  7:57 ` [PATCH v2 " Jiayu Hu
2017-09-05  7:57   ` [PATCH v2 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
2017-09-05  7:57   ` [PATCH v2 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
2017-09-05  7:57   ` [PATCH v2 3/5] gso: add VxLAN " Jiayu Hu
2017-09-05  7:57   ` [PATCH v2 4/5] gso: add GRE " Jiayu Hu
2017-09-05  7:57   ` [PATCH v2 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
2017-09-12  2:43   ` [PATCH v3 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
2017-09-12  2:43     ` [PATCH v3 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
2017-09-12 10:36       ` Ananyev, Konstantin
2017-09-13  2:11         ` Jiayu Hu
2017-09-14 18:33       ` Ferruh Yigit
2017-09-15  1:12         ` Hu, Jiayu
2017-09-12  2:43     ` [PATCH v3 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
2017-09-12 11:17       ` Ananyev, Konstantin
2017-09-13  2:48         ` Jiayu Hu
2017-09-13  9:38           ` Ananyev, Konstantin
2017-09-13 10:23             ` Hu, Jiayu
2017-09-13 14:52             ` Kavanagh, Mark B
2017-09-13 15:13               ` Ananyev, Konstantin
2017-09-14  0:59                 ` Hu, Jiayu
2017-09-14  8:35                   ` Kavanagh, Mark B
2017-09-14  8:39                     ` Ananyev, Konstantin
2017-09-14  9:00                       ` Kavanagh, Mark B
2017-09-14  9:10                         ` Ananyev, Konstantin
2017-09-14  9:35                           ` Kavanagh, Mark B
2017-09-12 14:17       ` Ananyev, Konstantin
2017-09-13 10:44         ` Jiayu Hu
2017-09-13 22:10           ` Ananyev, Konstantin
2017-09-14  6:07             ` Jiayu Hu
2017-09-14  8:47               ` Ananyev, Konstantin
2017-09-14  9:29                 ` Hu, Jiayu
2017-09-14  9:35                   ` Ananyev, Konstantin
2017-09-14 10:01                     ` Hu, Jiayu
2017-09-14 15:42                       ` Kavanagh, Mark B
2017-09-14 18:38                         ` Ananyev, Konstantin
2017-09-15  7:54                           ` Hu, Jiayu
2017-09-15  8:15                             ` Ananyev, Konstantin
2017-09-15  8:17                             ` Ananyev, Konstantin
2017-09-15  8:38                               ` Hu, Jiayu
2017-09-14  8:51               ` Kavanagh, Mark B
2017-09-14  9:45                 ` Hu, Jiayu
2017-09-12  2:43     ` [PATCH v3 3/5] gso: add VxLAN " Jiayu Hu
2017-09-12  2:43     ` [PATCH v3 4/5] gso: add GRE " Jiayu Hu
2017-09-12  2:43     ` [PATCH v3 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
2017-09-14 18:33       ` Ferruh Yigit
2017-09-15  1:13         ` Hu, Jiayu
2017-09-19  7:32     ` [PATCH v4 0/5] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Jiayu Hu
2017-09-19  7:32       ` [PATCH v4 1/5] gso: add Generic Segmentation Offload API framework Jiayu Hu
2017-09-19  7:32       ` [PATCH v4 2/5] gso: add TCP/IPv4 GSO support Jiayu Hu
2017-09-20  7:03         ` Yao, Lei A
2017-09-19  7:32       ` [PATCH v4 3/5] gso: add VxLAN " Jiayu Hu
2017-09-20  3:11         ` Tan, Jianfeng
2017-09-20  3:17           ` Hu, Jiayu
2017-09-19  7:32       ` [PATCH v4 4/5] gso: add GRE " Jiayu Hu
2017-09-20  2:53         ` Tan, Jianfeng
2017-09-20  6:01           ` Hu, Jiayu
2017-09-19  7:32       ` [PATCH v4 5/5] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
2017-09-28 22:13       ` [PATCH v5 0/6] Support TCP/IPv4, VxLAN and GRE GSO in DPDK Mark Kavanagh
2017-10-02 16:45         ` [PATCH v6 0/6] Support TCP/IPv4, VxLAN, " Mark Kavanagh
2017-10-02 16:45           ` [PATCH v6 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
2017-10-04 13:11             ` Ananyev, Konstantin
2017-10-04 13:21               ` Kavanagh, Mark B
2017-10-02 16:45           ` [PATCH v6 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
2017-10-04 13:32             ` Ananyev, Konstantin
2017-10-04 14:30               ` Kavanagh, Mark B
2017-10-04 14:49                 ` Ananyev, Konstantin
2017-10-04 14:59                   ` Kavanagh, Mark B
2017-10-04 13:35             ` Ananyev, Konstantin
2017-10-04 14:22               ` Kavanagh, Mark B
2017-10-02 16:45           ` [PATCH v6 3/6] gso: add VxLAN " Mark Kavanagh
2017-10-04 14:12             ` Ananyev, Konstantin
2017-10-04 14:35               ` Kavanagh, Mark B
2017-10-04 16:13               ` Kavanagh, Mark B
2017-10-04 16:17                 ` Ananyev, Konstantin
2017-10-02 16:45           ` [PATCH v6 4/6] gso: add GRE " Mark Kavanagh
2017-10-04 14:15             ` Ananyev, Konstantin
2017-10-04 14:36               ` Kavanagh, Mark B
2017-10-02 16:45           ` [PATCH v6 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
2017-10-04 15:08             ` Ananyev, Konstantin
2017-10-04 16:23               ` Kavanagh, Mark B
2017-10-04 16:26                 ` Ananyev, Konstantin
2017-10-04 16:51                   ` Kavanagh, Mark B
2017-10-02 16:45           ` [PATCH v6 6/6] doc: add GSO programmer's guide Mark Kavanagh
2017-10-04 13:51             ` Mcnamara, John
2017-10-05 11:02           ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Mark Kavanagh
2017-10-05 11:02             ` [PATCH v7 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
2017-10-05 11:02             ` [PATCH v7 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
2017-10-05 11:02             ` [PATCH v7 3/6] gso: add VxLAN " Mark Kavanagh
2017-10-05 11:02             ` [PATCH v7 4/6] gso: add GRE " Mark Kavanagh
2017-10-05 11:02             ` [PATCH v7 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
2017-10-05 11:02             ` [PATCH v7 6/6] doc: add GSO programmer's guide Mark Kavanagh
2017-10-05 13:22             ` [PATCH v7 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Ananyev, Konstantin
2017-10-05 14:39               ` Kavanagh, Mark B
2017-10-05 15:43             ` [PATCH v8 " Mark Kavanagh
2017-10-05 17:12               ` Ananyev, Konstantin
2017-10-05 20:16                 ` Kavanagh, Mark B
2017-10-05 20:36               ` [PATCH v9 " Mark Kavanagh
2017-10-05 22:24                 ` Ananyev, Konstantin
2017-10-06  8:24                   ` FW: " Kavanagh, Mark B
2017-10-06 10:35                   ` Kavanagh, Mark B
2017-10-06 23:32                 ` Ferruh Yigit
2017-10-06 23:34                   ` Ferruh Yigit
2017-10-07 14:56                 ` [PATCH v10 " Jiayu Hu
2017-10-07 14:56                   ` [PATCH v10 1/6] gso: add Generic Segmentation Offload API framework Jiayu Hu
2017-10-07 14:56                   ` [PATCH v10 2/6] gso: add TCP/IPv4 GSO support Jiayu Hu
2017-10-07 14:56                   ` [PATCH v10 3/6] gso: add VxLAN " Jiayu Hu
2017-10-07 14:56                   ` [PATCH v10 4/6] gso: add GRE " Jiayu Hu
2017-10-07 14:56                   ` [PATCH v10 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Jiayu Hu
2017-10-07 14:56                   ` [PATCH v10 6/6] doc: add GSO programmer's guide Jiayu Hu
2017-10-08  3:40                   ` [PATCH v10 0/6] Support TCP/IPv4, VxLAN, and GRE GSO in DPDK Ferruh Yigit
2017-10-05 20:36               ` [PATCH v9 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
2017-10-05 20:36               ` [PATCH v9 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
2017-10-05 20:36               ` [PATCH v9 3/6] gso: add VxLAN " Mark Kavanagh
2017-10-05 20:36               ` [PATCH v9 4/6] gso: add GRE " Mark Kavanagh
2017-10-05 20:36               ` [PATCH v9 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
2017-10-05 20:36               ` [PATCH v9 6/6] doc: add GSO programmer's guide Mark Kavanagh
2017-10-06 13:34                 ` Mcnamara, John
2017-10-06 13:41                   ` Kavanagh, Mark B
2017-10-05 15:43             ` [PATCH v8 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
2017-10-05 15:44             ` [PATCH v8 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
2017-10-05 15:44             ` [PATCH v8 3/6] gso: add VxLAN " Mark Kavanagh
2017-10-05 15:44             ` [PATCH v8 4/6] gso: add GRE " Mark Kavanagh
2017-10-05 15:44             ` [PATCH v8 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
2017-10-05 15:44             ` [PATCH v8 6/6] doc: add GSO programmer's guide Mark Kavanagh
2017-10-05 17:57               ` Mcnamara, John
2017-09-28 22:13       ` [PATCH v5 1/6] gso: add Generic Segmentation Offload API framework Mark Kavanagh
2017-09-28 22:13       ` [PATCH v5 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
2017-09-29  3:12         ` Jiayu Hu
2017-09-29  9:05           ` Kavanagh, Mark B
2017-09-28 22:13       ` [PATCH v5 3/6] gso: add VxLAN " Mark Kavanagh
2017-09-28 22:13       ` [PATCH v5 4/6] gso: add GRE " Mark Kavanagh
2017-09-28 22:13       ` [PATCH v5 5/6] app/testpmd: enable TCP/IPv4, VxLAN and GRE GSO Mark Kavanagh
2017-09-28 22:13       ` [PATCH v5 6/6] doc: add GSO programmer's guide Mark Kavanagh
2017-09-28 22:18       ` [PATCH v5 2/6] gso: add TCP/IPv4 GSO support Mark Kavanagh
