From: Huichao Cai <chcchc88@163.com>
To: dev@dpdk.org
Cc: konstantin.v.ananyev@yandex.ru
Subject: [PATCH v6] ip_frag: add IPv4 fragment copy packet API
Date: Sun, 24 Jul 2022 16:10:03 +0800
Message-ID: <1658650203-7831-1-git-send-email-chcchc88@163.com>
In-Reply-To: <1658638211-6661-1-git-send-email-chcchc88@163.com>

Some NIC drivers support the MBUF_FAST_FREE offload (the device supports
optimization for fast release of mbufs; when set, the application must
guarantee that, per queue, all mbufs come from the same mempool, have
refcnt = 1, and are direct and non-segmented). In order to adapt to this
offload, add this API. Add some test data for this API.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 app/test/test_ipfrag.c               |  13 ++++++++++---
 lib/ip_frag/rte_ip_frag.h            |  34 +++++++
 lib/ip_frag/rte_ipv4_fragmentation.c | 175 +++++++++++++++++++++++++++++++++++
 lib/ip_frag/version.map              |   1 +
 4 files changed, 220 insertions(+), 3 deletions(-)
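
Not part of the patch: for reviewers, a minimal usage sketch of how an
application might call the new API so that the resulting fragments stay
eligible for RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE. The helper name and its
parameters are illustrative only.

#include <rte_ip_frag.h>
#include <rte_mbuf.h>

static int
fragment_for_fast_free(struct rte_mbuf *pkt, struct rte_mempool *direct_pool,
		       struct rte_mbuf *frags[], uint16_t nb_frags_max,
		       uint16_t mtu)
{
	int32_t nb_frags;

	/*
	 * "pkt" must start at the IPv4 header (L2 header already stripped),
	 * as for rte_ipv4_fragment_packet(). Each output fragment is a fresh
	 * copy allocated from direct_pool: direct, non-segmented and with
	 * refcnt == 1, so the MBUF_FAST_FREE constraints still hold.
	 */
	nb_frags = rte_ipv4_fragment_copy_nonseg_packet(pkt, frags,
			nb_frags_max, mtu, direct_pool);
	if (nb_frags < 0)
		return nb_frags;	/* e.g. -EINVAL, -ENOTSUP or -ENOMEM */

	/* The input mbuf is not modified or freed by the API. */
	rte_pktmbuf_free(pkt);

	/*
	 * The caller still prepends its L2 header to each fragment before
	 * passing them to rte_eth_tx_burst(), as with the existing API.
	 */
	return nb_frags;
}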

diff --git a/app/test/test_ipfrag.c b/app/test/test_ipfrag.c
index ba0ffd0..88cc4cd 100644
--- a/app/test/test_ipfrag.c
+++ b/app/test/test_ipfrag.c
@@ -418,10 +418,17 @@ static void ut_teardown(void)
 		}
 
-		if (tests[i].ipv == 4)
-			len = rte_ipv4_fragment_packet(b, pkts_out, BURST,
+		if (tests[i].ipv == 4) {
+			if (i % 2)
+				len = rte_ipv4_fragment_packet(b, pkts_out, BURST,
 						       tests[i].mtu_size,
 						       direct_pool,
 						       indirect_pool);
-		else if (tests[i].ipv == 6)
+			else
+				len = rte_ipv4_fragment_copy_nonseg_packet(b,
+						       pkts_out,
+						       BURST,
+						       tests[i].mtu_size,
+						       direct_pool);
+		} else if (tests[i].ipv == 6)
 			len = rte_ipv6_fragment_packet(b, pkts_out, BURST,
 						       tests[i].mtu_size,
diff --git a/lib/ip_frag/rte_ip_frag.h b/lib/ip_frag/rte_ip_frag.h
index 7d2abe1..4a2b150 100644
--- a/lib/ip_frag/rte_ip_frag.h
+++ b/lib/ip_frag/rte_ip_frag.h
@@ -179,6 +179,40 @@ int32_t rte_ipv4_fragment_packet(struct rte_mbuf *pkt_in,
 			struct rte_mempool *pool_indirect);
 
 /**
+ * IPv4 fragmentation by copy.
+ *
+ * This function implements fragmentation of IPv4 packets by copying them
+ * into newly allocated, non-segmented mbufs. It is mainly intended for use
+ * with the TX MBUF_FAST_FREE offload: the device supports optimized, fast
+ * release of mbufs, and when the offload is set the application must
+ * guarantee that, per queue, all mbufs come from the same mempool, have
+ * refcnt = 1, and are direct and non-segmented.
+ *
+ * @param pkt_in
+ *   The input packet.
+ * @param pkts_out
+ *   Array storing the output fragments.
+ * @param nb_pkts_out
+ *   Maximum number of fragments that can be stored in the pkts_out array.
+ * @param mtu_size
+ *   Size in bytes of the Maximum Transfer Unit (MTU) for the outgoing IPv4
+ *   datagrams. This value includes the size of the IPv4 header.
+ * @param pool_direct
+ *   MBUF pool used for allocating direct buffers for the output fragments.
+ * @return
+ *   Upon successful completion - number of output fragments placed
+ *   in the pkts_out array.
+ *   Otherwise - (-1) * errno.
+ */
+__rte_experimental
+int32_t
+rte_ipv4_fragment_copy_nonseg_packet(struct rte_mbuf *pkt_in,
+	struct rte_mbuf **pkts_out,
+	uint16_t nb_pkts_out,
+	uint16_t mtu_size,
+	struct rte_mempool *pool_direct);
+
+/**
  * This function implements reassembly of fragmented IPv4 packets.
  * Incoming mbufs should have its l2_len/l3_len fields setup correctly.
  *
diff --git a/lib/ip_frag/rte_ipv4_fragmentation.c b/lib/ip_frag/rte_ipv4_fragmentation.c
index 27a8ad2..e6ec408 100644
--- a/lib/ip_frag/rte_ipv4_fragmentation.c
+++ b/lib/ip_frag/rte_ipv4_fragmentation.c
@@ -259,3 +259,178 @@ static inline uint16_t __create_ipopt_frag_hdr(uint8_t *iph,
 
 	return out_pkt_pos;
 }
+
+/**
+ * IPv4 fragmentation by copy.
+ *
+ * This function implements fragmentation of IPv4 packets by copying them
+ * into newly allocated, non-segmented mbufs. It is mainly intended for use
+ * with the TX MBUF_FAST_FREE offload: the device supports optimized, fast
+ * release of mbufs, and when the offload is set the application must
+ * guarantee that, per queue, all mbufs come from the same mempool, have
+ * refcnt = 1, and are direct and non-segmented.
+ *
+ * @param pkt_in
+ *   The input packet.
+ * @param pkts_out
+ *   Array storing the output fragments.
+ * @param nb_pkts_out
+ *   Maximum number of fragments that can be stored in the pkts_out array.
+ * @param mtu_size
+ *   Size in bytes of the Maximum Transfer Unit (MTU) for the outgoing IPv4
+ *   datagrams. This value includes the size of the IPv4 header.
+ * @param pool_direct
+ *   MBUF pool used for allocating direct buffers for the output fragments.
+ * @return
+ *   Upon successful completion - number of output fragments placed
+ *   in the pkts_out array.
+ *   Otherwise - (-1) * errno.
+ */
+int32_t
+rte_ipv4_fragment_copy_nonseg_packet(struct rte_mbuf *pkt_in,
+	struct rte_mbuf **pkts_out,
+	uint16_t nb_pkts_out,
+	uint16_t mtu_size,
+	struct rte_mempool *pool_direct)
+{
+	struct rte_mbuf *in_seg = NULL;
+	struct rte_ipv4_hdr *in_hdr;
+	uint32_t out_pkt_pos, in_seg_data_pos;
+	uint32_t more_in_segs;
+	uint16_t fragment_offset, flag_offset, frag_size, header_len;
+	uint16_t frag_bytes_remaining;
+	uint8_t ipopt_frag_hdr[IPV4_HDR_MAX_LEN];
+	uint16_t ipopt_len;
+
+	/*
+	 * Formal parameter checking.
+	 */
+	if (unlikely(pkt_in == NULL) || unlikely(pkts_out == NULL) ||
+	    unlikely(nb_pkts_out == 0) || unlikely(pool_direct == NULL) ||
+	    unlikely(mtu_size < RTE_ETHER_MIN_MTU))
+		return -EINVAL;
+
+	in_hdr = rte_pktmbuf_mtod(pkt_in, struct rte_ipv4_hdr *);
+	header_len = (in_hdr->version_ihl & RTE_IPV4_HDR_IHL_MASK) *
+	    RTE_IPV4_IHL_MULTIPLIER;
+
+	/* Check IP header length */
+	if (unlikely(pkt_in->data_len < header_len) ||
+	    unlikely(mtu_size < header_len))
+		return -EINVAL;
+
+	/*
+	 * Ensure the IP payload length of all fragments is aligned to a
+	 * multiple of 8 bytes as per RFC791 section 2.3.
+	 */
+	frag_size = RTE_ALIGN_FLOOR((mtu_size - header_len),
+				    IPV4_HDR_FO_ALIGN);
+
+	flag_offset = rte_cpu_to_be_16(in_hdr->fragment_offset);
+
+	/* If Don't Fragment flag is set */
+	if (unlikely((flag_offset & IPV4_HDR_DF_MASK) != 0))
+		return -ENOTSUP;
+
+	/* Check that pkts_out is big enough to hold all fragments */
+	if (unlikely(frag_size * nb_pkts_out <
+	    (uint16_t)(pkt_in->pkt_len - header_len)))
+		return -EINVAL;
+
+	in_seg = pkt_in;
+	in_seg_data_pos = header_len;
+	out_pkt_pos = 0;
+	fragment_offset = 0;
+
+	ipopt_len = header_len - sizeof(struct rte_ipv4_hdr);
+	if (unlikely(ipopt_len > RTE_IPV4_HDR_OPT_MAX_LEN))
+		return -EINVAL;
+
+	more_in_segs = 1;
+	while (likely(more_in_segs)) {
+		struct rte_mbuf *out_pkt = NULL;
+		uint32_t more_out_segs;
+		struct rte_ipv4_hdr *out_hdr;
+
+		/* Allocate direct buffer */
+		out_pkt = rte_pktmbuf_alloc(pool_direct);
+		if (unlikely(out_pkt == NULL)) {
+			__free_fragments(pkts_out, out_pkt_pos);
+			return -ENOMEM;
+		}
+		if (unlikely(out_pkt->buf_len - rte_pktmbuf_headroom(out_pkt) <
+				frag_size + header_len)) {
+			rte_pktmbuf_free(out_pkt);
+			__free_fragments(pkts_out, out_pkt_pos);
+			return -EINVAL;
+		}
+
+		/* Reserve space for the IP header that will be built later */
+		out_pkt->data_len = header_len;
+		out_pkt->pkt_len = header_len;
+		frag_bytes_remaining = frag_size;
+
+		more_out_segs = 1;
+		while (likely(more_out_segs && more_in_segs)) {
+			uint32_t len;
+
+			len = frag_bytes_remaining;
+			if (len > (in_seg->data_len - in_seg_data_pos))
+				len = in_seg->data_len - in_seg_data_pos;
+
+			memcpy(rte_pktmbuf_mtod_offset(out_pkt, char *,
+					out_pkt->data_len),
+				rte_pktmbuf_mtod_offset(in_seg, char *,
+					in_seg_data_pos),
+				len);
+
+			in_seg_data_pos += len;
+			frag_bytes_remaining -= len;
+			out_pkt->data_len += len;
+
+			/* Current output packet (i.e. fragment) done ? */
+			if (unlikely(frag_bytes_remaining == 0))
+				more_out_segs = 0;
+
+			/* Current input segment done ? */
+			if (unlikely(in_seg_data_pos == in_seg->data_len)) {
+				in_seg = in_seg->next;
+				in_seg_data_pos = 0;
+
+				if (unlikely(in_seg == NULL))
+					more_in_segs = 0;
+			}
+		}
+
+		/* Build the IP header */
+
+		out_pkt->pkt_len = out_pkt->data_len;
+		out_hdr = rte_pktmbuf_mtod(out_pkt, struct rte_ipv4_hdr *);
+
+		__fill_ipv4hdr_frag(out_hdr, in_hdr, header_len,
+		    (uint16_t)out_pkt->pkt_len,
+		    flag_offset, fragment_offset, more_in_segs);
+
+		if (unlikely((fragment_offset == 0) && (ipopt_len) &&
+			    ((flag_offset & RTE_IPV4_HDR_OFFSET_MASK) == 0))) {
+			ipopt_len = __create_ipopt_frag_hdr((uint8_t *)in_hdr,
+				ipopt_len, ipopt_frag_hdr);
+			fragment_offset = (uint16_t)(fragment_offset +
+				out_pkt->pkt_len - header_len);
+			out_pkt->l3_len = header_len;
+
+			header_len = sizeof(struct rte_ipv4_hdr) + ipopt_len;
+			in_hdr = (struct rte_ipv4_hdr *)ipopt_frag_hdr;
+		} else {
+			fragment_offset = (uint16_t)(fragment_offset +
+				out_pkt->pkt_len - header_len);
+			out_pkt->l3_len = header_len;
+		}
+
+		/* Write the fragment to the output list */
+		pkts_out[out_pkt_pos] = out_pkt;
+		out_pkt_pos++;
+	}
+
+	return out_pkt_pos;
+}
diff --git a/lib/ip_frag/version.map b/lib/ip_frag/version.map
index b9c1cca..8aad839 100644
--- a/lib/ip_frag/version.map
+++ b/lib/ip_frag/version.map
@@ -17,4 +17,5 @@ EXPERIMENTAL {
 	global:
 
 	rte_ip_frag_table_del_expired_entries;
+	rte_ipv4_fragment_copy_nonseg_packet;
 };
-- 
1.8.3.1



Thread overview: 26+ messages
2022-06-09  2:39 [PATCH v1] ip_frag: add IPv4 fragment copy packet API Huichao Cai
2022-06-09 14:19 ` [PATCH v2] " Huichao Cai
2022-07-10 23:35   ` Konstantin Ananyev
2022-07-11  9:14     ` Konstantin Ananyev
2022-07-15  8:05       ` Huichao Cai
2022-07-19  8:19         ` Konstantin Ananyev
2022-07-22 13:01   ` [PATCH v3] " Huichao Cai
2022-07-22 14:42     ` Morten Brørup
2022-07-22 14:49     ` Stephen Hemminger
2022-07-22 15:52       ` Morten Brørup
2022-07-22 15:58         ` Huichao Cai
2022-07-22 16:14           ` Morten Brørup
2022-07-22 22:35             ` Konstantin Ananyev
2022-07-23  8:24               ` Morten Brørup
2022-07-23 18:25                 ` Konstantin Ananyev
2022-07-23 22:27                   ` Morten Brørup
2022-07-22 14:49     ` [PATCH v4] " Huichao Cai
2022-07-24  4:50       ` [PATCH v5] " Huichao Cai
2022-07-24  8:10         ` Huichao Cai [this message]
2022-07-25 15:42           ` [PATCH v6] " Stephen Hemminger
2022-07-26  1:22             ` Huichao Cai
2022-08-07 11:49               ` Konstantin Ananyev
2022-08-07 11:45           ` Konstantin Ananyev
2022-08-08  1:48           ` [PATCH v7] " Huichao Cai
2022-08-08 22:29             ` Konstantin Ananyev
2022-08-29 14:22               ` Thomas Monjalon
