* [Intel-wired-lan] [PATCH intel-next v4 0/8] i40e: support XDP multi-buffer
From: Tirthendu Sarkar @ 2023-02-15 12:42 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

This patchset adds multi-buffer support for XDP. The Tx side already
supports multi-buffer; this patchset focuses on the Rx side. The last
patch contains the actual multi-buffer changes while the preceding ones
are preparatory patches.

On receiving the first buffer of a packet, an xdp_buff is built and the
packet's subsequent buffers are added to it as frags. While
'next_to_clean' keeps pointing to the first descriptor, the newly
introduced 'next_to_process' tracks every descriptor of the packet.

On receiving the EOP buffer, the XDP program is called and the
appropriate action is taken: building an skb for XDP_PASS, reusing the
page for XDP_DROP, or adjusting page offsets for XDP_{REDIRECT,TX}.
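
A simplified sketch of this Rx flow in pseudo-C (the helper names here
are illustrative stand-ins for the real functions added by this series):

    while (budget--) {
            rx_desc = I40E_RX_DESC(rx_ring, rx_ring->next_to_process);
            rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_process);

            if (!xdp->data)
                    prepare_first_buffer(xdp, rx_buffer);  /* packet head */
            else
                    add_frag(xdp, rx_buffer);              /* tail buffers */

            advance_next_to_process(rx_ring);
            if (!is_eop(rx_desc))
                    continue;   /* next_to_clean stays at packet head */

            run_xdp_and_act(rx_ring, xdp);  /* PASS/DROP/TX/REDIRECT */
            rx_ring->next_to_clean = rx_ring->next_to_process;
    }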

The patchset also streamlines page offset adjustments for buffer reuse
to make it easier to post-process the rx_buffers after running the XDP
program.

With this patchset there does not appear to be any performance
degradation for XDP_PASS, and there is some improvement (~1% for
XDP_TX, ~5% for XDP_DROP) when measured with the xdp_rxq_info program
from samples/bpf/ for 64B packets.

Changelog:
    v3 -> v4:
    - Added non-linear XDP buffer support to xdp_features. [Maciej]
    - Removed double space. [Maciej]

    v2 -> v3:
    - Fixed buffer cleanup for single buffer packets on skb alloc
      failure.
    - Better naming of cleanup function.
    - Stop incrementing nr_frags for overflowing packets.
 
    v1 -> v2:
    - Instead of building the xdp_buff on EOP, it is now built
      incrementally.
    - xdp_buff is now added to the i40e_ring struct to preserve it
      across napi calls. [Alexander Duyck]
    - Post-XDP-program rx_buffer processing has been simplified.
    - Rx buffer allocation pull-out is reverted to avoid performance
      issues for smaller ring sizes; allocation is now done when at
      least half of the ring has been cleaned. With v1 there was a
      ~75% drop for XDP_PASS at the smallest ring size of 64, which
      v2 mitigates. [Alexander Duyck]
    - Instead of retrying skb allocation on a previous failure, the
      packet is now dropped. [Maciej]
    - Simplified page offset adjustments by using xdp->frame_sz
      instead of recalculating truesize. [Maciej]
    - Changed i40e_trace() to use xdp instead of skb. [Maciej]
    - Reserved tailroom for legacy-rx. [Maciej]
    - Centralized max frame size calculation.

Tirthendu Sarkar (8):
  i40e: consolidate maximum frame size calculation for vsi
  i40e: change Rx buffer size for legacy-rx to support XDP multi-buffer
  i40e: add pre-xdp page_count in rx_buffer
  i40e: Change size to truesize when using i40e_rx_buffer_flip()
  i40e: use frame_sz instead of recalculating truesize for building skb
  i40e: introduce next_to_process to i40e_ring
  i40e: add xdp_buff to i40e_ring struct
  i40e: add support for XDP multi-buffer Rx

 drivers/net/ethernet/intel/i40e/i40e_main.c  |  78 ++--
 drivers/net/ethernet/intel/i40e/i40e_trace.h |  20 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c  | 420 +++++++++++--------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h  |  21 +-
 4 files changed, 307 insertions(+), 232 deletions(-)

-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 1/8] i40e: consolidate maximum frame size calculation for vsi
From: Tirthendu Sarkar @ 2023-02-15 12:42 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

Introduce a new helper function to calculate the maximum frame size
used for validating and setting the VSI frame size. It is used when
configuring the VSI, changing the MTU, and attaching an XDP program to
the VSI.

This is in preparation for the legacy-rx and multi-buffer changes
introduced in later patches.
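
As a worked example of the consolidated calculation (assuming a 4k-page
system, where I40E_MAX_CHAINED_RX_BUFFERS is 5 and I40E_MAX_RXBUFFER is
9728):

    /* no XDP program: 3072-byte buffers, chaining allowed */
    min(3072 * 5, 9728) = 9728
    /* XDP program attached: frame must fit in a single buffer */
    min(3072 * 1, 9728) = 3072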

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 71 +++++++++++----------
 1 file changed, 38 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 3ee00c3bc319..672038801d1d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2896,15 +2896,35 @@ static void i40e_sync_filters_subtask(struct i40e_pf *pf)
 }
 
 /**
- * i40e_max_xdp_frame_size - returns the maximum allowed frame size for XDP
+ * i40e_calculate_vsi_rx_buf_len - Calculates buffer length
+ *
+ * @vsi: VSI to calculate rx_buf_len from
+ */
+static u16 i40e_calculate_vsi_rx_buf_len(struct i40e_vsi *vsi)
+{
+	if (!vsi->netdev || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
+		return I40E_RXBUFFER_2048;
+
+	return PAGE_SIZE < 8192 ? I40E_RXBUFFER_3072 : I40E_RXBUFFER_2048;
+}
+
+/**
+ * i40e_max_vsi_frame_size - returns the maximum allowed frame size for VSI
  * @vsi: the vsi
+ * @xdp_prog: XDP program
  **/
-static int i40e_max_xdp_frame_size(struct i40e_vsi *vsi)
+static int i40e_max_vsi_frame_size(struct i40e_vsi *vsi,
+				   struct bpf_prog *xdp_prog)
 {
-	if (PAGE_SIZE >= 8192 || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
-		return I40E_RXBUFFER_2048;
+	u16 rx_buf_len = i40e_calculate_vsi_rx_buf_len(vsi);
+	u16 chain_len;
+
+	if (xdp_prog)
+		chain_len = 1;
 	else
-		return I40E_RXBUFFER_3072;
+		chain_len = I40E_MAX_CHAINED_RX_BUFFERS;
+
+	return min_t(u16, rx_buf_len * chain_len, I40E_MAX_RXBUFFER);
 }
 
 /**
@@ -2919,12 +2939,13 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 	struct i40e_pf *pf = vsi->back;
+	int frame_size;
 
-	if (i40e_enabled_xdp_vsi(vsi)) {
-		int frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
-
-		if (frame_size > i40e_max_xdp_frame_size(vsi))
-			return -EINVAL;
+	frame_size = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
+	if (new_mtu > frame_size - I40E_PACKET_HDR_PAD) {
+		netdev_err(netdev, "Error changing mtu to %d, Max is %d\n",
+			   new_mtu, frame_size - I40E_PACKET_HDR_PAD);
+		return -EINVAL;
 	}
 
 	netdev_dbg(netdev, "changing MTU from %d to %d\n",
@@ -3693,24 +3714,6 @@ static int i40e_vsi_configure_tx(struct i40e_vsi *vsi)
 	return err;
 }
 
-/**
- * i40e_calculate_vsi_rx_buf_len - Calculates buffer length
- *
- * @vsi: VSI to calculate rx_buf_len from
- */
-static u16 i40e_calculate_vsi_rx_buf_len(struct i40e_vsi *vsi)
-{
-	if (!vsi->netdev || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
-		return I40E_RXBUFFER_2048;
-
-#if (PAGE_SIZE < 8192)
-	if (!I40E_2K_TOO_SMALL_WITH_PADDING && vsi->netdev->mtu <= ETH_DATA_LEN)
-		return I40E_RXBUFFER_1536 - NET_IP_ALIGN;
-#endif
-
-	return PAGE_SIZE < 8192 ? I40E_RXBUFFER_3072 : I40E_RXBUFFER_2048;
-}
-
 /**
  * i40e_vsi_configure_rx - Configure the VSI for Rx
  * @vsi: the VSI being configured
@@ -3722,13 +3725,15 @@ static int i40e_vsi_configure_rx(struct i40e_vsi *vsi)
 	int err = 0;
 	u16 i;
 
-	vsi->max_frame = I40E_MAX_RXBUFFER;
+	vsi->max_frame = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
 	vsi->rx_buf_len = i40e_calculate_vsi_rx_buf_len(vsi);
 
 #if (PAGE_SIZE < 8192)
 	if (vsi->netdev && !I40E_2K_TOO_SMALL_WITH_PADDING &&
-	    vsi->netdev->mtu <= ETH_DATA_LEN)
-		vsi->max_frame = I40E_RXBUFFER_1536 - NET_IP_ALIGN;
+	    vsi->netdev->mtu <= ETH_DATA_LEN) {
+		vsi->rx_buf_len = I40E_RXBUFFER_1536 - NET_IP_ALIGN;
+		vsi->max_frame = vsi->rx_buf_len;
+	}
 #endif
 
 	/* set up individual rings */
@@ -13314,14 +13319,14 @@ static netdev_features_t i40e_features_check(struct sk_buff *skb,
 static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
 			  struct netlink_ext_ack *extack)
 {
-	int frame_size = vsi->netdev->mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
+	int frame_size = i40e_max_vsi_frame_size(vsi, prog);
 	struct i40e_pf *pf = vsi->back;
 	struct bpf_prog *old_prog;
 	bool need_reset;
 	int i;
 
 	/* Don't allow frames that span over multiple buffers */
-	if (frame_size > i40e_calculate_vsi_rx_buf_len(vsi)) {
+	if (vsi->netdev->mtu > frame_size - I40E_PACKET_HDR_PAD) {
 		NL_SET_ERR_MSG_MOD(extack, "MTU too large to enable XDP");
 		return -EINVAL;
 	}
-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 2/8] i40e: change Rx buffer size for legacy-rx to support XDP multi-buffer
From: Tirthendu Sarkar @ 2023-02-15 12:42 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

Adding support for XDP multi-buffer entails storing information about
all the fragments of a packet in the xdp_buff. This approach implies
that the underlying buffer has to provide tailroom for skb_shared_info.

In legacy-rx mode, the driver can only configure Rx buffers of up to
2k, and with 2k-sized Rx buffers there is no way to reserve tailroom
for skb_shared_info. Hence the Rx buffer size is now lowered to 1664
(2k - sizeof(skb_shared_info)). Also, the driver can only chain up to
5 Rx buffers, which means the maximum frame size supported for
legacy-rx is now 8320.
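
A rough sanity check of these numbers (assuming the ~320-byte
skb_shared_info of typical 64-bit builds; Rx buffer sizes must be
128-byte multiples):

    2048 - SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
         = 2048 - 320 = 1728, rounded down to 1664
    1664 * 5 chained buffers = 8320 max frame size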

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 672038801d1d..d7c08f1d486a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2903,7 +2903,7 @@ static void i40e_sync_filters_subtask(struct i40e_pf *pf)
 static u16 i40e_calculate_vsi_rx_buf_len(struct i40e_vsi *vsi)
 {
 	if (!vsi->netdev || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
-		return I40E_RXBUFFER_2048;
+		return I40E_RXBUFFER_1664;
 
 	return PAGE_SIZE < 8192 ? I40E_RXBUFFER_3072 : I40E_RXBUFFER_2048;
 }
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 768290dc6f48..1382efb43ffd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -97,6 +97,7 @@ enum i40e_dyn_idx_t {
 /* Supported Rx Buffer Sizes (a multiple of 128) */
 #define I40E_RXBUFFER_256   256
 #define I40E_RXBUFFER_1536  1536  /* 128B aligned standard Ethernet frame */
+#define I40E_RXBUFFER_1664  1664  /* For legacy Rx with tailroom for frags */
 #define I40E_RXBUFFER_2048  2048
 #define I40E_RXBUFFER_3072  3072  /* Used for large frames w/ padding */
 #define I40E_MAX_RXBUFFER   9728  /* largest size for single descriptor */
-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 3/8] i40e: add pre-xdp page_count in rx_buffer
From: Tirthendu Sarkar @ 2023-02-15 12:43 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

The page count of an rx_buffer needs to be stored prior to the XDP
call to prevent page recycling in case the buffer is freed within the
XDP redirect path. Instead of keeping it on the stack, it is now
stored in the rx_buffer struct. This helps with multi-buffer
processing, as the page counts of all rx_buffers of the same packet
no longer need to be kept on the stack.
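
The recycling heuristic that consumes this stored snapshot (see
i40e_can_reuse_rx_page() below) effectively becomes:

    /* PAGE_SIZE < 8192: reuse only if we are the sole owner of the
     * page. page_count was sampled before the XDP program ran, so a
     * buffer freed inside the redirect path cannot be recycled.
     */
    if (unlikely((rx_buffer->page_count - pagecnt_bias) > 1)) {
            rx_stats->page_busy_count++;
            return false;
    }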

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 23 +++++++--------------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |  1 +
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 924f972b91fa..a7fba294a8f4 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1970,7 +1970,6 @@ static bool i40e_cleanup_headers(struct i40e_ring *rx_ring, struct sk_buff *skb,
  * i40e_can_reuse_rx_page - Determine if page can be reused for another Rx
  * @rx_buffer: buffer containing the page
  * @rx_stats: rx stats structure for the rx ring
- * @rx_buffer_pgcnt: buffer page refcount pre xdp_do_redirect() call
  *
  * If page is reusable, we have a green light for calling i40e_reuse_rx_page,
  * which will assign the current buffer to the buffer that next_to_alloc is
@@ -1981,8 +1980,7 @@ static bool i40e_cleanup_headers(struct i40e_ring *rx_ring, struct sk_buff *skb,
  * or busy if it could not be reused.
  */
 static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
-				   struct i40e_rx_queue_stats *rx_stats,
-				   int rx_buffer_pgcnt)
+				   struct i40e_rx_queue_stats *rx_stats)
 {
 	unsigned int pagecnt_bias = rx_buffer->pagecnt_bias;
 	struct page *page = rx_buffer->page;
@@ -1995,7 +1993,7 @@ static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
 
 #if (PAGE_SIZE < 8192)
 	/* if we are only owner of page we can reuse it */
-	if (unlikely((rx_buffer_pgcnt - pagecnt_bias) > 1)) {
+	if (unlikely((rx_buffer->page_count - pagecnt_bias) > 1)) {
 		rx_stats->page_busy_count++;
 		return false;
 	}
@@ -2058,19 +2056,17 @@ static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
  * i40e_get_rx_buffer - Fetch Rx buffer and synchronize data for use
  * @rx_ring: rx descriptor ring to transact packets on
  * @size: size of buffer to add to skb
- * @rx_buffer_pgcnt: buffer page refcount
  *
  * This function will pull an Rx buffer from the ring and synchronize it
  * for use by the CPU.
  */
 static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
-						 const unsigned int size,
-						 int *rx_buffer_pgcnt)
+						 const unsigned int size)
 {
 	struct i40e_rx_buffer *rx_buffer;
 
 	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
-	*rx_buffer_pgcnt =
+	rx_buffer->page_count =
 #if (PAGE_SIZE < 8192)
 		page_count(rx_buffer->page);
 #else
@@ -2226,16 +2222,14 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
  * i40e_put_rx_buffer - Clean up used buffer and either recycle or free
  * @rx_ring: rx descriptor ring to transact packets on
  * @rx_buffer: rx buffer to pull data from
- * @rx_buffer_pgcnt: rx buffer page refcount pre xdp_do_redirect() call
  *
  * This function will clean up the contents of the rx_buffer.  It will
  * either recycle the buffer or unmap it and free the associated resources.
  */
 static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
-			       struct i40e_rx_buffer *rx_buffer,
-			       int rx_buffer_pgcnt)
+			       struct i40e_rx_buffer *rx_buffer)
 {
-	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats, rx_buffer_pgcnt)) {
+	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats)) {
 		/* hand second half of page back to the ring */
 		i40e_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
@@ -2457,7 +2451,6 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 	while (likely(total_rx_packets < (unsigned int)budget)) {
 		struct i40e_rx_buffer *rx_buffer;
 		union i40e_rx_desc *rx_desc;
-		int rx_buffer_pgcnt;
 		unsigned int size;
 		u64 qword;
 
@@ -2500,7 +2493,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			break;
 
 		i40e_trace(clean_rx_irq, rx_ring, rx_desc, skb);
-		rx_buffer = i40e_get_rx_buffer(rx_ring, size, &rx_buffer_pgcnt);
+		rx_buffer = i40e_get_rx_buffer(rx_ring, size);
 
 		/* retrieve a buffer from the ring */
 		if (!skb) {
@@ -2541,7 +2534,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			break;
 		}
 
-		i40e_put_rx_buffer(rx_ring, rx_buffer, rx_buffer_pgcnt);
+		i40e_put_rx_buffer(rx_ring, rx_buffer);
 		cleaned_count++;
 
 		i40e_inc_ntc(rx_ring);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 1382efb43ffd..3e2935365104 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -278,6 +278,7 @@ struct i40e_rx_buffer {
 	struct page *page;
 	__u32 page_offset;
 	__u16 pagecnt_bias;
+	__u32 page_count;
 };
 
 struct i40e_queue_stats {
-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to truesize when using i40e_rx_buffer_flip()
From: Tirthendu Sarkar @ 2023-02-15 12:43 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

Truesize is now passed directly to i40e_rx_buffer_flip() instead of
size, so the function no longer needs to recalculate truesize from
size using i40e_rx_frame_truesize() before adjusting the page offset.

With this change the function can now be used during skb building and
when adding frags. In later patches it will also make adjusting page
offsets for multi-buffers easier.
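
For reference, the flip itself relies on truesize being exactly half
the page when PAGE_SIZE < 8192, so an XOR toggles between the two page
halves (numbers for a 4k page):

    rx_buffer->page_offset ^= truesize;   /* 0 <-> 2048 */
    /* PAGE_SIZE >= 8192: offsets simply advance linearly */
    rx_buffer->page_offset += truesize;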

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 54 ++++++++-------------
 1 file changed, 19 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index a7fba294a8f4..019abd7273a2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2018,6 +2018,21 @@ static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
 	return true;
 }
 
+/**
+ * i40e_rx_buffer_flip - adjust rx_buffer to point to an unused region
+ * @rx_buffer: Rx buffer to adjust
+ * @truesize: Size of adjustment
+ **/
+static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
+				unsigned int truesize)
+{
+#if (PAGE_SIZE < 8192)
+	rx_buffer->page_offset ^= truesize;
+#else
+	rx_buffer->page_offset += truesize;
+#endif
+}
+
 /**
  * i40e_add_rx_frag - Add contents of Rx buffer to sk_buff
  * @rx_ring: rx descriptor ring to transact packets on
@@ -2045,11 +2060,7 @@ static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
 			rx_buffer->page_offset, size, truesize);
 
 	/* page is being used so we must update the page offset */
-#if (PAGE_SIZE < 8192)
-	rx_buffer->page_offset ^= truesize;
-#else
-	rx_buffer->page_offset += truesize;
-#endif
+	i40e_rx_buffer_flip(rx_buffer, truesize);
 }
 
 /**
@@ -2154,11 +2165,7 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
 				size, truesize);
 
 		/* buffer is used by skb, update page_offset */
-#if (PAGE_SIZE < 8192)
-		rx_buffer->page_offset ^= truesize;
-#else
-		rx_buffer->page_offset += truesize;
-#endif
+		i40e_rx_buffer_flip(rx_buffer, truesize);
 	} else {
 		/* buffer is unused, reset bias back to rx_buffer */
 		rx_buffer->pagecnt_bias++;
@@ -2209,11 +2216,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 		skb_metadata_set(skb, metasize);
 
 	/* buffer is used by skb, update page_offset */
-#if (PAGE_SIZE < 8192)
-	rx_buffer->page_offset ^= truesize;
-#else
-	rx_buffer->page_offset += truesize;
-#endif
+	i40e_rx_buffer_flip(rx_buffer, truesize);
 
 	return skb;
 }
@@ -2326,25 +2329,6 @@ static int i40e_run_xdp(struct i40e_ring *rx_ring, struct xdp_buff *xdp, struct
 	return result;
 }
 
-/**
- * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
- * @rx_ring: Rx ring
- * @rx_buffer: Rx buffer to adjust
- * @size: Size of adjustment
- **/
-static void i40e_rx_buffer_flip(struct i40e_ring *rx_ring,
-				struct i40e_rx_buffer *rx_buffer,
-				unsigned int size)
-{
-	unsigned int truesize = i40e_rx_frame_truesize(rx_ring, size);
-
-#if (PAGE_SIZE < 8192)
-	rx_buffer->page_offset ^= truesize;
-#else
-	rx_buffer->page_offset += truesize;
-#endif
-}
-
 /**
  * i40e_xdp_ring_update_tail - Updates the XDP Tx ring tail register
  * @xdp_ring: XDP Tx ring
@@ -2513,7 +2497,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 		if (xdp_res) {
 			if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
 				xdp_xmit |= xdp_res;
-				i40e_rx_buffer_flip(rx_ring, rx_buffer, size);
+				i40e_rx_buffer_flip(rx_buffer, xdp.frame_sz);
 			} else {
 				rx_buffer->pagecnt_bias++;
 			}
-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 5/8] i40e: use frame_sz instead of recalculating truesize for building skb
From: Tirthendu Sarkar @ 2023-02-15 12:43 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

In the skb path, truesize is calculated while building the skb. This
is now avoided and xdp->frame_sz is used instead, for both
i40e_build_skb() and i40e_construct_skb().
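
A quick equivalence check for the common 4k-page case (values per the
existing helpers):

    truesize      = i40e_rx_pg_size(rx_ring) / 2;        /* 4096 / 2 = 2048 */
    xdp->frame_sz = i40e_rx_frame_truesize(rx_ring, 0);  /* also 2048 for 4k pages */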

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 019abd7273a2..01340f620d96 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2113,11 +2113,6 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
 					  struct xdp_buff *xdp)
 {
 	unsigned int size = xdp->data_end - xdp->data;
-#if (PAGE_SIZE < 8192)
-	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
-#else
-	unsigned int truesize = SKB_DATA_ALIGN(size);
-#endif
 	unsigned int headlen;
 	struct sk_buff *skb;
 
@@ -2162,10 +2157,10 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
 	if (size) {
 		skb_add_rx_frag(skb, 0, rx_buffer->page,
 				rx_buffer->page_offset + headlen,
-				size, truesize);
+				size, xdp->frame_sz);
 
 		/* buffer is used by skb, update page_offset */
-		i40e_rx_buffer_flip(rx_buffer, truesize);
+		i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 	} else {
 		/* buffer is unused, reset bias back to rx_buffer */
 		rx_buffer->pagecnt_bias++;
@@ -2188,13 +2183,6 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 				      struct xdp_buff *xdp)
 {
 	unsigned int metasize = xdp->data - xdp->data_meta;
-#if (PAGE_SIZE < 8192)
-	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
-#else
-	unsigned int truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
-				SKB_DATA_ALIGN(xdp->data_end -
-					       xdp->data_hard_start);
-#endif
 	struct sk_buff *skb;
 
 	/* Prefetch first cache line of first page. If xdp->data_meta
@@ -2205,7 +2193,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 	net_prefetch(xdp->data_meta);
 
 	/* build an skb around the page buffer */
-	skb = napi_build_skb(xdp->data_hard_start, truesize);
+	skb = napi_build_skb(xdp->data_hard_start, xdp->frame_sz);
 	if (unlikely(!skb))
 		return NULL;
 
@@ -2216,7 +2204,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 		skb_metadata_set(skb, metasize);
 
 	/* buffer is used by skb, update page_offset */
-	i40e_rx_buffer_flip(rx_buffer, truesize);
+	i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 
 	return skb;
 }
-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 6/8] i40e: introduce next_to_process to i40e_ring
From: Tirthendu Sarkar @ 2023-02-15 12:43 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

Add a new field called next_to_process to the i40e_ring that is
advanced for every buffer, and change the semantics of next_to_clean
to point to the first buffer of a packet. The driver will use
next_to_process in the same way next_to_clean was used previously.

For the non-multi-buffer case, next_to_process and next_to_clean will
always be equal since each packet consists of a single buffer.
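
To illustrate the intended semantics (fully exercised by the final
multi-buffer patch), consider a hypothetical packet spanning four
descriptors D0..D3, with D3 the EOP:

    /* buffer handled:   D0   D1   D2   D3 (EOP)
     * next_to_process:  D1   D2   D3   D4
     * next_to_clean:    D0   D0   D0   D4 (advanced on EOP)
     *
     * i.e. next_to_clean marks the packet head while next_to_process
     * walks every buffer.
     */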

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 26 ++++++++++++---------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |  4 ++++
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 01340f620d96..94c50fa223bd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1524,6 +1524,7 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
 
 	rx_ring->next_to_alloc = 0;
 	rx_ring->next_to_clean = 0;
+	rx_ring->next_to_process = 0;
 	rx_ring->next_to_use = 0;
 }
 
@@ -1576,6 +1577,7 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring)
 
 	rx_ring->next_to_alloc = 0;
 	rx_ring->next_to_clean = 0;
+	rx_ring->next_to_process = 0;
 	rx_ring->next_to_use = 0;
 
 	/* XDP RX-queue info only needed for RX rings exposed to XDP */
@@ -2076,7 +2078,7 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
 {
 	struct i40e_rx_buffer *rx_buffer;
 
-	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
+	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_process);
 	rx_buffer->page_count =
 #if (PAGE_SIZE < 8192)
 		page_count(rx_buffer->page);
@@ -2375,16 +2377,16 @@ void i40e_finalize_xdp_rx(struct i40e_ring *rx_ring, unsigned int xdp_res)
 }
 
 /**
- * i40e_inc_ntc: Advance the next_to_clean index
+ * i40e_inc_ntp: Advance the next_to_process index
  * @rx_ring: Rx ring
  **/
-static void i40e_inc_ntc(struct i40e_ring *rx_ring)
+static void i40e_inc_ntp(struct i40e_ring *rx_ring)
 {
-	u32 ntc = rx_ring->next_to_clean + 1;
+	u32 ntp = rx_ring->next_to_process + 1;
 
-	ntc = (ntc < rx_ring->count) ? ntc : 0;
-	rx_ring->next_to_clean = ntc;
-	prefetch(I40E_RX_DESC(rx_ring, ntc));
+	ntp = (ntp < rx_ring->count) ? ntp : 0;
+	rx_ring->next_to_process = ntp;
+	prefetch(I40E_RX_DESC(rx_ring, ntp));
 }
 
 /**
@@ -2421,6 +2423,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 
 	while (likely(total_rx_packets < (unsigned int)budget)) {
+		u16 ntp = rx_ring->next_to_process;
 		struct i40e_rx_buffer *rx_buffer;
 		union i40e_rx_desc *rx_desc;
 		unsigned int size;
@@ -2433,7 +2436,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			cleaned_count = 0;
 		}
 
-		rx_desc = I40E_RX_DESC(rx_ring, rx_ring->next_to_clean);
+		rx_desc = I40E_RX_DESC(rx_ring, ntp);
 
 		/* status_error_len will always be zero for unused descriptors
 		 * because it's cleared in cleanup, and overlaps with hdr_addr
@@ -2452,8 +2455,8 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			i40e_clean_programming_status(rx_ring,
 						      rx_desc->raw.qword[0],
 						      qword);
-			rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
-			i40e_inc_ntc(rx_ring);
+			rx_buffer = i40e_rx_bi(rx_ring, ntp);
+			i40e_inc_ntp(rx_ring);
 			i40e_reuse_rx_page(rx_ring, rx_buffer);
 			cleaned_count++;
 			continue;
@@ -2509,7 +2512,8 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 		i40e_put_rx_buffer(rx_ring, rx_buffer);
 		cleaned_count++;
 
-		i40e_inc_ntc(rx_ring);
+		i40e_inc_ntp(rx_ring);
+		rx_ring->next_to_clean = rx_ring->next_to_process;
 		if (i40e_is_non_eop(rx_ring, rx_desc))
 			continue;
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 3e2935365104..6e0fd73367df 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -338,6 +338,10 @@ struct i40e_ring {
 	u8 dcb_tc;			/* Traffic class of ring */
 	u8 __iomem *tail;
 
+	/* Next descriptor to be processed; next_to_clean is updated only on
+	 * processing EOP descriptor
+	 */
+	u16 next_to_process;
 	/* high bit set means dynamic, use accessor routines to read/write.
 	 * hardware only supports 2us resolution for the ITR registers.
 	 * these values always store the USER setting, and must be converted
-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 7/8] i40e: add xdp_buff to i40e_ring struct
From: Tirthendu Sarkar @ 2023-02-15 12:43 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

Store the xdp_buff in the Rx ring struct in preparation for XDP
multi-buffer support. This will allow us to combine fragmented frames
across separate NAPI cycles in the same way that skb fragments are
currently handled. This means the skb pointer on the Rx ring becomes
redundant and will be removed in a later patch. As a consequence,
i40e_trace() now uses the xdp pointer instead of the skb pointer.

Truesize only needs to be calculated for page sizes bigger than 4k, as
it is always half a page for 4k pages. With the xdp_buff on the ring,
the frame size can now be set during xdp_init_buff() and need not be
repopulated on each NAPI call for 4k pages. As a consequence,
i40e_rx_frame_truesize() is now used only for bigger pages.
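
A minimal sketch of why keeping the xdp_buff on the ring matters: the
"packet in progress" state survives an exhausted NAPI budget (the
xdp->data check shown here is how the final patch of the series
detects it; add_frag_to() is an illustrative name):

    struct xdp_buff *xdp = &rx_ring->xdp;   /* persists across polls */

    if (!xdp->data)
            /* start of a new packet */
            xdp_prepare_buff(xdp, hard_start, offset, size, true);
    else
            /* continue a packet left unfinished by the previous poll */
            add_frag_to(xdp, rx_buffer, size);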

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c  |  2 ++
 drivers/net/ethernet/intel/i40e/i40e_trace.h | 20 ++++++------
 drivers/net/ethernet/intel/i40e/i40e_txrx.c  | 33 ++++++++------------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h  |  7 +++++
 4 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index d7c08f1d486a..a6b0516a81c0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -3616,6 +3616,8 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring)
 		}
 	}
 
+	xdp_init_buff(&ring->xdp, i40e_rx_pg_size(ring) / 2, &ring->xdp_rxq);
+
 	rx_ctx.dbuff = DIV_ROUND_UP(ring->rx_buf_len,
 				    BIT_ULL(I40E_RXQ_CTX_DBUFF_SHIFT));
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_trace.h b/drivers/net/ethernet/intel/i40e/i40e_trace.h
index 79d587ad5409..33b4e30f5e00 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_trace.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_trace.h
@@ -162,45 +162,45 @@ DECLARE_EVENT_CLASS(
 
 	TP_PROTO(struct i40e_ring *ring,
 		 union i40e_16byte_rx_desc *desc,
-		 struct sk_buff *skb),
+		 struct xdp_buff *xdp),
 
-	TP_ARGS(ring, desc, skb),
+	TP_ARGS(ring, desc, xdp),
 
 	TP_STRUCT__entry(
 		__field(void*, ring)
 		__field(void*, desc)
-		__field(void*, skb)
+		__field(void*, xdp)
 		__string(devname, ring->netdev->name)
 	),
 
 	TP_fast_assign(
 		__entry->ring = ring;
 		__entry->desc = desc;
-		__entry->skb = skb;
+		__entry->xdp = xdp;
 		__assign_str(devname, ring->netdev->name);
 	),
 
 	TP_printk(
-		"netdev: %s ring: %p desc: %p skb %p",
+		"netdev: %s ring: %p desc: %p xdp %p",
 		__get_str(devname), __entry->ring,
-		__entry->desc, __entry->skb)
+		__entry->desc, __entry->xdp)
 );
 
 DEFINE_EVENT(
 	i40e_rx_template, i40e_clean_rx_irq,
 	TP_PROTO(struct i40e_ring *ring,
 		 union i40e_16byte_rx_desc *desc,
-		 struct sk_buff *skb),
+		 struct xdp_buff *xdp),
 
-	TP_ARGS(ring, desc, skb));
+	TP_ARGS(ring, desc, xdp));
 
 DEFINE_EVENT(
 	i40e_rx_template, i40e_clean_rx_irq_rx,
 	TP_PROTO(struct i40e_ring *ring,
 		 union i40e_16byte_rx_desc *desc,
-		 struct sk_buff *skb),
+		 struct xdp_buff *xdp),
 
-	TP_ARGS(ring, desc, skb));
+	TP_ARGS(ring, desc, xdp));
 
 DECLARE_EVENT_CLASS(
 	i40e_xmit_template,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 94c50fa223bd..dc2c9aae0ffe 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1619,21 +1619,19 @@ void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
 	writel(val, rx_ring->tail);
 }
 
+#if (PAGE_SIZE >= 8192)
 static unsigned int i40e_rx_frame_truesize(struct i40e_ring *rx_ring,
 					   unsigned int size)
 {
 	unsigned int truesize;
 
-#if (PAGE_SIZE < 8192)
-	truesize = i40e_rx_pg_size(rx_ring) / 2; /* Must be power-of-2 */
-#else
 	truesize = rx_ring->rx_offset ?
 		SKB_DATA_ALIGN(size + rx_ring->rx_offset) +
 		SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) :
 		SKB_DATA_ALIGN(size);
-#endif
 	return truesize;
 }
+#endif
 
 /**
  * i40e_alloc_mapped_page - recycle or make a new page
@@ -2405,21 +2403,16 @@ static void i40e_inc_ntp(struct i40e_ring *rx_ring)
 static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			     unsigned int *rx_cleaned)
 {
-	unsigned int total_rx_bytes = 0, total_rx_packets = 0, frame_sz = 0;
+	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
 	unsigned int offset = rx_ring->rx_offset;
+	struct xdp_buff *xdp = &rx_ring->xdp;
 	struct sk_buff *skb = rx_ring->skb;
 	unsigned int xdp_xmit = 0;
 	struct bpf_prog *xdp_prog;
 	bool failure = false;
-	struct xdp_buff xdp;
 	int xdp_res = 0;
 
-#if (PAGE_SIZE < 8192)
-	frame_sz = i40e_rx_frame_truesize(rx_ring, 0);
-#endif
-	xdp_init_buff(&xdp, frame_sz, &rx_ring->xdp_rxq);
-
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 
 	while (likely(total_rx_packets < (unsigned int)budget)) {
@@ -2467,7 +2460,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 		if (!size)
 			break;
 
-		i40e_trace(clean_rx_irq, rx_ring, rx_desc, skb);
+		i40e_trace(clean_rx_irq, rx_ring, rx_desc, xdp);
 		rx_buffer = i40e_get_rx_buffer(rx_ring, size);
 
 		/* retrieve a buffer from the ring */
@@ -2476,19 +2469,19 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 
 			hard_start = page_address(rx_buffer->page) +
 				     rx_buffer->page_offset - offset;
-			xdp_prepare_buff(&xdp, hard_start, offset, size, true);
-			xdp_buff_clear_frags_flag(&xdp);
+			xdp_prepare_buff(xdp, hard_start, offset, size, true);
+			xdp_buff_clear_frags_flag(xdp);
 #if (PAGE_SIZE > 4096)
 			/* At larger PAGE_SIZE, frame_sz depend on len size */
-			xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, size);
+			xdp->frame_sz = i40e_rx_frame_truesize(rx_ring, size);
 #endif
-			xdp_res = i40e_run_xdp(rx_ring, &xdp, xdp_prog);
+			xdp_res = i40e_run_xdp(rx_ring, xdp, xdp_prog);
 		}
 
 		if (xdp_res) {
 			if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
 				xdp_xmit |= xdp_res;
-				i40e_rx_buffer_flip(rx_buffer, xdp.frame_sz);
+				i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 			} else {
 				rx_buffer->pagecnt_bias++;
 			}
@@ -2497,9 +2490,9 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 		} else if (skb) {
 			i40e_add_rx_frag(rx_ring, rx_buffer, skb, size);
 		} else if (ring_uses_build_skb(rx_ring)) {
-			skb = i40e_build_skb(rx_ring, rx_buffer, &xdp);
+			skb = i40e_build_skb(rx_ring, rx_buffer, xdp);
 		} else {
-			skb = i40e_construct_skb(rx_ring, rx_buffer, &xdp);
+			skb = i40e_construct_skb(rx_ring, rx_buffer, xdp);
 		}
 
 		/* exit if we failed to retrieve a buffer */
@@ -2528,7 +2521,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 		/* populate checksum, VLAN, and protocol */
 		i40e_process_skb_fields(rx_ring, rx_desc, skb);
 
-		i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, skb);
+		i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, xdp);
 		napi_gro_receive(&rx_ring->q_vector->napi, skb);
 		skb = NULL;
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 6e0fd73367df..e86abc25bb5e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -338,6 +338,13 @@ struct i40e_ring {
 	u8 dcb_tc;			/* Traffic class of ring */
 	u8 __iomem *tail;
 
+	/* Storing xdp_buff on ring helps in saving the state of partially built
+	 * packet when i40e_clean_rx_ring_irq() must return before it sees EOP
+	 * and to resume packet building for this ring in the next call to
+	 * i40e_clean_rx_ring_irq().
+	 */
+	struct xdp_buff xdp;
+
 	/* Next descriptor to be processed; next_to_clean is updated only on
 	 * processing EOP descriptor
 	 */
-- 
2.34.1


* [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx
From: Tirthendu Sarkar @ 2023-02-15 12:43 UTC
  To: intel-wired-lan
  Cc: tirthendu.sarkar, netdev, jesse.brandeburg, anthony.l.nguyen,
	bpf, magnus.karlsson

This patch adds multi-buffer support to the i40e driver.

i40e_clean_rx_irq() is modified to collate all the buffers of a packet
before calling the XDP program. The xdp_buff is built for the first
frag of the packet and subsequent frags are added to it.
'next_to_process' is incremented for all non-EOP frags while
'next_to_clean' stays at the first descriptor of the packet. The XDP
program is called only on receiving the EOP frag.

New functions are added for adding frags to the xdp_buff and for post
processing of the buffers once the XDP program has run. For XDP_PASS
this results in an skb with multiple fragments.

i40e_build_skb() builds the skb around the xdp_buff, which already
contains the frag data, so the i40e_add_rx_frag() helper function is
now removed. Since the fields before 'dataref' in skb_shared_info are
cleared during napi_build_skb(), xdp_update_skb_shared_info() is
called to set them.

For i40e_construct_skb(), all the frag data needs to be copied from
the xdp_buff's skb_shared_info to the newly constructed skb's
skb_shared_info.

This also means the 'skb' does not need to be preserved across
i40e_napi_poll() calls and hence is removed from the i40e_ring
structure.

Previously, i40e_alloc_rx_buffers() was called for every 32 cleaned
buffers. For multi-buffers this may not be optimal, as there may be
many more cleaned buffers in each i40e_clean_rx_irq() call. So it is
now called once at least half of the ring size has been cleaned.
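
On the BPF side, multi-buffer delivery requires the program to declare
frags support; a minimal sketch of such a program (the "xdp.frags"
section name makes libbpf load it with BPF_F_XDP_HAS_FRAGS, which sets
the xdp_has_frags flag checked by the driver in this patch):

    // SPDX-License-Identifier: GPL-2.0
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    SEC("xdp.frags")
    int xdp_mb_pass(struct xdp_md *ctx)
    {
            /* full packet length including frags, unlike
             * data_end - data, which covers only the linear part
             */
            __u32 len = bpf_xdp_get_buff_len(ctx);

            return len > 3000 ? XDP_DROP : XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";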

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c |   7 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 314 +++++++++++++-------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |   8 -
 3 files changed, 211 insertions(+), 118 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index a6b0516a81c0..978ba7b61166 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2919,7 +2919,7 @@ static int i40e_max_vsi_frame_size(struct i40e_vsi *vsi,
 	u16 rx_buf_len = i40e_calculate_vsi_rx_buf_len(vsi);
 	u16 chain_len;
 
-	if (xdp_prog)
+	if (xdp_prog && !xdp_prog->aux->xdp_has_frags)
 		chain_len = 1;
 	else
 		chain_len = I40E_MAX_CHAINED_RX_BUFFERS;
@@ -13329,7 +13329,7 @@ static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
 
 	/* Don't allow frames that span over multiple buffers */
 	if (vsi->netdev->mtu > frame_size - I40E_PACKET_HDR_PAD) {
-		NL_SET_ERR_MSG_MOD(extack, "MTU too large to enable XDP");
+		NL_SET_ERR_MSG_MOD(extack, "MTU too large for linear frames and XDP prog does not support frags");
 		return -EINVAL;
 	}
 
@@ -13795,7 +13795,8 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 
 	netdev->features &= ~NETIF_F_HW_TC;
 	netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
-			       NETDEV_XDP_ACT_XSK_ZEROCOPY;
+			       NETDEV_XDP_ACT_XSK_ZEROCOPY |
+			       NETDEV_XDP_ACT_RX_SG;
 
 	if (vsi->type == I40E_VSI_MAIN) {
 		SET_NETDEV_DEV(netdev, &pf->pdev->dev);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index dc2c9aae0ffe..7ace7b7ec256 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1477,9 +1477,6 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
 	if (!rx_ring->rx_bi)
 		return;
 
-	dev_kfree_skb(rx_ring->skb);
-	rx_ring->skb = NULL;
-
 	if (rx_ring->xsk_pool) {
 		i40e_xsk_clean_rx_ring(rx_ring);
 		goto skip_free;
@@ -2033,36 +2030,6 @@ static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
 #endif
 }
 
-/**
- * i40e_add_rx_frag - Add contents of Rx buffer to sk_buff
- * @rx_ring: rx descriptor ring to transact packets on
- * @rx_buffer: buffer containing page to add
- * @skb: sk_buff to place the data into
- * @size: packet length from rx_desc
- *
- * This function will add the data contained in rx_buffer->page to the skb.
- * It will just attach the page as a frag to the skb.
- *
- * The function will then update the page offset.
- **/
-static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
-			     struct i40e_rx_buffer *rx_buffer,
-			     struct sk_buff *skb,
-			     unsigned int size)
-{
-#if (PAGE_SIZE < 8192)
-	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
-#else
-	unsigned int truesize = SKB_DATA_ALIGN(size + rx_ring->rx_offset);
-#endif
-
-	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buffer->page,
-			rx_buffer->page_offset, size, truesize);
-
-	/* page is being used so we must update the page offset */
-	i40e_rx_buffer_flip(rx_buffer, truesize);
-}
-
 /**
  * i40e_get_rx_buffer - Fetch Rx buffer and synchronize data for use
  * @rx_ring: rx descriptor ring to transact packets on
@@ -2099,20 +2066,82 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
 }
 
 /**
- * i40e_construct_skb - Allocate skb and populate it
+ * i40e_put_rx_buffer - Clean up used buffer and either recycle or free
  * @rx_ring: rx descriptor ring to transact packets on
  * @rx_buffer: rx buffer to pull data from
+ *
+ * This function will clean up the contents of the rx_buffer.  It will
+ * either recycle the buffer or unmap it and free the associated resources.
+ */
+static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
+			       struct i40e_rx_buffer *rx_buffer)
+{
+	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats)) {
+		/* hand second half of page back to the ring */
+		i40e_reuse_rx_page(rx_ring, rx_buffer);
+	} else {
+		/* we are not reusing the buffer so unmap it */
+		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
+				     i40e_rx_pg_size(rx_ring),
+				     DMA_FROM_DEVICE, I40E_RX_DMA_ATTR);
+		__page_frag_cache_drain(rx_buffer->page,
+					rx_buffer->pagecnt_bias);
+		/* clear contents of buffer_info */
+		rx_buffer->page = NULL;
+	}
+}
+
+/**
+ * i40e_process_rx_buffs- Processing of buffers post XDP prog or on error
+ * @rx_ring: Rx descriptor ring to transact packets on
+ * @xdp_res: Result of the XDP program
+ * @xdp: xdp_buff pointing to the data
+ **/
+static void i40e_process_rx_buffs(struct i40e_ring *rx_ring, int xdp_res,
+				  struct xdp_buff *xdp)
+{
+	u16 next = rx_ring->next_to_clean;
+	struct i40e_rx_buffer *rx_buffer;
+
+	xdp->flags = 0;
+
+	while (1) {
+		rx_buffer = i40e_rx_bi(rx_ring, next);
+		if (++next == rx_ring->count)
+			next = 0;
+
+		if (!rx_buffer->page)
+			continue;
+
+		if (xdp_res == I40E_XDP_CONSUMED)
+			rx_buffer->pagecnt_bias++;
+		else
+			i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
+
+		/* EOP buffer will be put in i40e_clean_rx_irq() */
+		if (next == rx_ring->next_to_process)
+			return;
+
+		i40e_put_rx_buffer(rx_ring, rx_buffer);
+	}
+}
+
+/**
+ * i40e_construct_skb - Allocate skb and populate it
+ * @rx_ring: rx descriptor ring to transact packets on
  * @xdp: xdp_buff pointing to the data
+ * @nr_frags: number of buffers for the packet
  *
  * This function allocates an skb.  It then populates it with the page
  * data from the current receive descriptor, taking care to set up the
  * skb correctly.
  */
 static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
-					  struct i40e_rx_buffer *rx_buffer,
-					  struct xdp_buff *xdp)
+					  struct xdp_buff *xdp,
+					  u32 nr_frags)
 {
 	unsigned int size = xdp->data_end - xdp->data;
+	struct i40e_rx_buffer *rx_buffer;
 	unsigned int headlen;
 	struct sk_buff *skb;
 
@@ -2152,13 +2181,17 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
 	memcpy(__skb_put(skb, headlen), xdp->data,
 	       ALIGN(headlen, sizeof(long)));
 
+	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
 	/* update all of the pointers */
 	size -= headlen;
 	if (size) {
+		if (unlikely(nr_frags >= MAX_SKB_FRAGS)) {
+			dev_kfree_skb(skb);
+			return NULL;
+		}
 		skb_add_rx_frag(skb, 0, rx_buffer->page,
 				rx_buffer->page_offset + headlen,
 				size, xdp->frame_sz);
-
 		/* buffer is used by skb, update page_offset */
 		i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 	} else {
@@ -2166,21 +2199,40 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
 		rx_buffer->pagecnt_bias++;
 	}
 
+	if (unlikely(xdp_buff_has_frags(xdp))) {
+		struct skb_shared_info *sinfo, *skinfo = skb_shinfo(skb);
+
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		memcpy(&skinfo->frags[skinfo->nr_frags], &sinfo->frags[0],
+		       sizeof(skb_frag_t) * nr_frags);
+
+		xdp_update_skb_shared_info(skb, skinfo->nr_frags + nr_frags,
+					   sinfo->xdp_frags_size,
+					   nr_frags * xdp->frame_sz,
+					   xdp_buff_is_frag_pfmemalloc(xdp));
+
+		/* First buffer has already been processed, so bump ntc */
+		if (++rx_ring->next_to_clean == rx_ring->count)
+			rx_ring->next_to_clean = 0;
+
+		i40e_process_rx_buffs(rx_ring, I40E_XDP_PASS, xdp);
+	}
+
 	return skb;
 }
 
 /**
  * i40e_build_skb - Build skb around an existing buffer
  * @rx_ring: Rx descriptor ring to transact packets on
- * @rx_buffer: Rx buffer to pull data from
  * @xdp: xdp_buff pointing to the data
+ * @nr_frags: number of buffers for the packet
  *
  * This function builds an skb around an existing Rx buffer, taking care
  * to set up the skb correctly and avoid any memcpy overhead.
  */
 static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
-				      struct i40e_rx_buffer *rx_buffer,
-				      struct xdp_buff *xdp)
+				      struct xdp_buff *xdp,
+				      u32 nr_frags)
 {
 	unsigned int metasize = xdp->data - xdp->data_meta;
 	struct sk_buff *skb;
@@ -2203,36 +2255,25 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 	if (metasize)
 		skb_metadata_set(skb, metasize);
 
-	/* buffer is used by skb, update page_offset */
-	i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
+	if (unlikely(xdp_buff_has_frags(xdp))) {
+		struct skb_shared_info *sinfo;
 
-	return skb;
-}
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		xdp_update_skb_shared_info(skb, nr_frags,
+					   sinfo->xdp_frags_size,
+					   nr_frags * xdp->frame_sz,
+					   xdp_buff_is_frag_pfmemalloc(xdp));
 
-/**
- * i40e_put_rx_buffer - Clean up used buffer and either recycle or free
- * @rx_ring: rx descriptor ring to transact packets on
- * @rx_buffer: rx buffer to pull data from
- *
- * This function will clean up the contents of the rx_buffer.  It will
- * either recycle the buffer or unmap it and free the associated resources.
- */
-static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
-			       struct i40e_rx_buffer *rx_buffer)
-{
-	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats)) {
-		/* hand second half of page back to the ring */
-		i40e_reuse_rx_page(rx_ring, rx_buffer);
+		i40e_process_rx_buffs(rx_ring, I40E_XDP_PASS, xdp);
 	} else {
-		/* we are not reusing the buffer so unmap it */
-		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
-				     i40e_rx_pg_size(rx_ring),
-				     DMA_FROM_DEVICE, I40E_RX_DMA_ATTR);
-		__page_frag_cache_drain(rx_buffer->page,
-					rx_buffer->pagecnt_bias);
-		/* clear contents of buffer_info */
-		rx_buffer->page = NULL;
+		struct i40e_rx_buffer *rx_buffer;
+
+		rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
+		/* buffer is used by skb, update page_offset */
+		i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 	}
+
+	return skb;
 }
 
 /**
@@ -2387,6 +2428,55 @@ static void i40e_inc_ntp(struct i40e_ring *rx_ring)
 	prefetch(I40E_RX_DESC(rx_ring, ntp));
 }
 
+/**
+ * i40e_add_xdp_frag: Add a frag to xdp_buff
+ * @xdp: xdp_buff pointing to the data
+ * @nr_frags: return number of buffers for the packet
+ * @rx_buffer: rx_buffer holding data of the current frag
+ * @size: size of data of current frag
+ */
+static int i40e_add_xdp_frag(struct xdp_buff *xdp, u32 *nr_frags,
+			     struct i40e_rx_buffer *rx_buffer, u32 size)
+{
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+
+	if (!xdp_buff_has_frags(xdp)) {
+		sinfo->nr_frags = 0;
+		sinfo->xdp_frags_size = 0;
+		xdp_buff_set_frags_flag(xdp);
+	} else if (unlikely(sinfo->nr_frags >= MAX_SKB_FRAGS)) {
+		/* Overflowing packet: All frags need to be dropped */
+		return -ENOMEM;
+	}
+
+	__skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buffer->page,
+				   rx_buffer->page_offset, size);
+
+	sinfo->xdp_frags_size += size;
+
+	if (page_is_pfmemalloc(rx_buffer->page))
+		xdp_buff_set_frag_pfmemalloc(xdp);
+	*nr_frags = sinfo->nr_frags;
+
+	return 0;
+}
+
+/**
+ * i40e_consume_xdp_buff - Consume all the buffers of the packet and update ntc
+ * @rx_ring: rx descriptor ring to transact packets on
+ * @xdp: xdp_buff pointing to the data
+ * @rx_buffer: rx_buffer of eop desc
+ */
+static void i40e_consume_xdp_buff(struct i40e_ring *rx_ring,
+				  struct xdp_buff *xdp,
+				  struct i40e_rx_buffer *rx_buffer)
+{
+	i40e_process_rx_buffs(rx_ring, I40E_XDP_CONSUMED, xdp);
+	i40e_put_rx_buffer(rx_ring, rx_buffer);
+	rx_ring->next_to_clean = rx_ring->next_to_process;
+	xdp->data = NULL;
+}
+
 /**
  * i40e_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
  * @rx_ring: rx descriptor ring to transact packets on
@@ -2405,9 +2495,9 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 {
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
+	u16 clean_threshold = rx_ring->count / 2;
 	unsigned int offset = rx_ring->rx_offset;
 	struct xdp_buff *xdp = &rx_ring->xdp;
-	struct sk_buff *skb = rx_ring->skb;
 	unsigned int xdp_xmit = 0;
 	struct bpf_prog *xdp_prog;
 	bool failure = false;
@@ -2419,11 +2509,14 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 		u16 ntp = rx_ring->next_to_process;
 		struct i40e_rx_buffer *rx_buffer;
 		union i40e_rx_desc *rx_desc;
+		struct sk_buff *skb;
 		unsigned int size;
+		u32 nfrags = 0;
+		bool neop;
 		u64 qword;
 
 		/* return some buffers to hardware, one at a time is too slow */
-		if (cleaned_count >= I40E_RX_BUFFER_WRITE) {
+		if (cleaned_count >= clean_threshold) {
 			failure = failure ||
 				  i40e_alloc_rx_buffers(rx_ring, cleaned_count);
 			cleaned_count = 0;
@@ -2461,76 +2554,83 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			break;
 
 		i40e_trace(clean_rx_irq, rx_ring, rx_desc, xdp);
+		/* retrieve a buffer from the ring */
 		rx_buffer = i40e_get_rx_buffer(rx_ring, size);
 
-		/* retrieve a buffer from the ring */
-		if (!skb) {
+		neop = i40e_is_non_eop(rx_ring, rx_desc);
+		i40e_inc_ntp(rx_ring);
+
+		if (!xdp->data) {
 			unsigned char *hard_start;
 
 			hard_start = page_address(rx_buffer->page) +
 				     rx_buffer->page_offset - offset;
 			xdp_prepare_buff(xdp, hard_start, offset, size, true);
-			xdp_buff_clear_frags_flag(xdp);
 #if (PAGE_SIZE > 4096)
+			/* At larger PAGE_SIZE, frame_sz depends on the frame length */
 			xdp->frame_sz = i40e_rx_frame_truesize(rx_ring, size);
 #endif
-			xdp_res = i40e_run_xdp(rx_ring, xdp, xdp_prog);
+		} else if (i40e_add_xdp_frag(xdp, &nfrags, rx_buffer, size) &&
+			   !neop) {
+			/* Overflowing packet: Drop all frags on EOP */
+			i40e_consume_xdp_buff(rx_ring, xdp, rx_buffer);
+			break;
 		}
 
+		if (neop)
+			continue;
+
+		xdp_res = i40e_run_xdp(rx_ring, xdp, xdp_prog);
+
 		if (xdp_res) {
-			if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
-				xdp_xmit |= xdp_res;
+			xdp_xmit |= xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR);
+
+			if (unlikely(xdp_buff_has_frags(xdp))) {
+				i40e_process_rx_buffs(rx_ring, xdp_res, xdp);
+				size = xdp_get_buff_len(xdp);
+			} else if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
 				i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 			} else {
 				rx_buffer->pagecnt_bias++;
 			}
 			total_rx_bytes += size;
-			total_rx_packets++;
-		} else if (skb) {
-			i40e_add_rx_frag(rx_ring, rx_buffer, skb, size);
-		} else if (ring_uses_build_skb(rx_ring)) {
-			skb = i40e_build_skb(rx_ring, rx_buffer, xdp);
 		} else {
-			skb = i40e_construct_skb(rx_ring, rx_buffer, xdp);
-		}
+			if (ring_uses_build_skb(rx_ring))
+				skb = i40e_build_skb(rx_ring, xdp, nfrags);
+			else
+				skb = i40e_construct_skb(rx_ring, xdp, nfrags);
+
+			/* drop if we failed to retrieve a buffer */
+			if (!skb) {
+				rx_ring->rx_stats.alloc_buff_failed++;
+				i40e_consume_xdp_buff(rx_ring, xdp, rx_buffer);
+				break;
+			}
 
-		/* exit if we failed to retrieve a buffer */
-		if (!xdp_res && !skb) {
-			rx_ring->rx_stats.alloc_buff_failed++;
-			rx_buffer->pagecnt_bias++;
-			break;
-		}
+			if (i40e_cleanup_headers(rx_ring, skb, rx_desc))
+				goto process_next;
 
-		i40e_put_rx_buffer(rx_ring, rx_buffer);
-		cleaned_count++;
+			/* probably a little skewed due to removing CRC */
+			total_rx_bytes += skb->len;
 
-		i40e_inc_ntp(rx_ring);
-		rx_ring->next_to_clean = rx_ring->next_to_process;
-		if (i40e_is_non_eop(rx_ring, rx_desc))
-			continue;
+			/* populate checksum, VLAN, and protocol */
+			i40e_process_skb_fields(rx_ring, rx_desc, skb);
 
-		if (xdp_res || i40e_cleanup_headers(rx_ring, skb, rx_desc)) {
-			skb = NULL;
-			continue;
+			i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, xdp);
+			napi_gro_receive(&rx_ring->q_vector->napi, skb);
 		}
 
-		/* probably a little skewed due to removing CRC */
-		total_rx_bytes += skb->len;
-
-		/* populate checksum, VLAN, and protocol */
-		i40e_process_skb_fields(rx_ring, rx_desc, skb);
-
-		i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, xdp);
-		napi_gro_receive(&rx_ring->q_vector->napi, skb);
-		skb = NULL;
-
 		/* update budget accounting */
 		total_rx_packets++;
+process_next:
+		cleaned_count += nfrags + 1;
+		i40e_put_rx_buffer(rx_ring, rx_buffer);
+		rx_ring->next_to_clean = rx_ring->next_to_process;
+
+		xdp->data = NULL;
 	}
 
 	i40e_finalize_xdp_rx(rx_ring, xdp_xmit);
-	rx_ring->skb = skb;
 
 	i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index e86abc25bb5e..14ad074639ab 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -393,14 +393,6 @@ struct i40e_ring {
 
 	struct rcu_head rcu;		/* to avoid race on free */
 	u16 next_to_alloc;
-	struct sk_buff *skb;		/* When i40e_clean_rx_ring_irq() must
-					 * return before it sees the EOP for
-					 * the current packet, we save that skb
-					 * here and resume receiving this
-					 * packet the next time
-					 * i40e_clean_rx_ring_irq() is called
-					 * for this ring.
-					 */
 
 	struct i40e_channel *ch;
 	u16 rx_offset;
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx
  2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx Tirthendu Sarkar
@ 2023-02-15 14:51   ` Lorenzo Bianconi
  2023-02-16  0:02     ` Tony Nguyen
  0 siblings, 1 reply; 17+ messages in thread
From: Lorenzo Bianconi @ 2023-02-15 14:51 UTC (permalink / raw)
  To: Tirthendu Sarkar
  Cc: netdev, jesse.brandeburg, anthony.l.nguyen, intel-wired-lan, bpf,
	magnus.karlsson


> This patch adds multi-buffer support for the i40e driver.
> 

[...]

>  
>  	netdev->features &= ~NETIF_F_HW_TC;
>  	netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
> -			       NETDEV_XDP_ACT_XSK_ZEROCOPY;
> +			       NETDEV_XDP_ACT_XSK_ZEROCOPY |
> +			       NETDEV_XDP_ACT_RX_SG;

Hi Tirthendu,

I guess we should set it just for I40E_VSI_MAIN, I posted a patch yesterday
to fix it:
https://patchwork.kernel.org/project/netdevbpf/patch/f2b537f86b34fc176fbc6b3d249b46a20a87a2f3.1676405131.git.lorenzo@kernel.org/

can you please rebase on top of it?
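
For reference, a minimal sketch of what gating this on the main VSI
might look like (shape assumed from the quoted hunk below; see the
linked patch for the actual fix):

	if (vsi->type == I40E_VSI_MAIN)
		netdev->xdp_features = NETDEV_XDP_ACT_BASIC |
				       NETDEV_XDP_ACT_REDIRECT |
				       NETDEV_XDP_ACT_XSK_ZEROCOPY |
				       NETDEV_XDP_ACT_RX_SG;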

Regards,
Lorenzo

>  
>  	if (vsi->type == I40E_VSI_MAIN) {
>  		SET_NETDEV_DEV(netdev, &pf->pdev->dev);
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index dc2c9aae0ffe..7ace7b7ec256 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -1477,9 +1477,6 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
>  	if (!rx_ring->rx_bi)
>  		return;
>  
> -	dev_kfree_skb(rx_ring->skb);
> -	rx_ring->skb = NULL;
> -
>  	if (rx_ring->xsk_pool) {
>  		i40e_xsk_clean_rx_ring(rx_ring);
>  		goto skip_free;
> @@ -2033,36 +2030,6 @@ static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
>  #endif
>  }
>  
> -/**
> - * i40e_add_rx_frag - Add contents of Rx buffer to sk_buff
> - * @rx_ring: rx descriptor ring to transact packets on
> - * @rx_buffer: buffer containing page to add
> - * @skb: sk_buff to place the data into
> - * @size: packet length from rx_desc
> - *
> - * This function will add the data contained in rx_buffer->page to the skb.
> - * It will just attach the page as a frag to the skb.
> - *
> - * The function will then update the page offset.
> - **/
> -static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
> -			     struct i40e_rx_buffer *rx_buffer,
> -			     struct sk_buff *skb,
> -			     unsigned int size)
> -{
> -#if (PAGE_SIZE < 8192)
> -	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
> -#else
> -	unsigned int truesize = SKB_DATA_ALIGN(size + rx_ring->rx_offset);
> -#endif
> -
> -	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buffer->page,
> -			rx_buffer->page_offset, size, truesize);
> -
> -	/* page is being used so we must update the page offset */
> -	i40e_rx_buffer_flip(rx_buffer, truesize);
> -}
> -
>  /**
>   * i40e_get_rx_buffer - Fetch Rx buffer and synchronize data for use
>   * @rx_ring: rx descriptor ring to transact packets on
> @@ -2099,20 +2066,82 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
>  }
>  
>  /**
> - * i40e_construct_skb - Allocate skb and populate it
> + * i40e_put_rx_buffer - Clean up used buffer and either recycle or free
>   * @rx_ring: rx descriptor ring to transact packets on
>   * @rx_buffer: rx buffer to pull data from
> + *
> + * This function will clean up the contents of the rx_buffer.  It will
> + * either recycle the buffer or unmap it and free the associated resources.
> + */
> +static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
> +			       struct i40e_rx_buffer *rx_buffer)
> +{
> +	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats)) {
> +		/* hand second half of page back to the ring */
> +		i40e_reuse_rx_page(rx_ring, rx_buffer);
> +	} else {
> +		/* we are not reusing the buffer so unmap it */
> +		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
> +				     i40e_rx_pg_size(rx_ring),
> +				     DMA_FROM_DEVICE, I40E_RX_DMA_ATTR);
> +		__page_frag_cache_drain(rx_buffer->page,
> +					rx_buffer->pagecnt_bias);
> +		/* clear contents of buffer_info */
> +		rx_buffer->page = NULL;
> +	}
> +}
> +
> +/**
> + * i40e_process_rx_buffs - Processing of buffers post XDP prog or on error
> + * @rx_ring: Rx descriptor ring to transact packets on
> + * @xdp_res: Result of the XDP program
> + * @xdp: xdp_buff pointing to the data
> + **/
> +static void i40e_process_rx_buffs(struct i40e_ring *rx_ring, int xdp_res,
> +				  struct xdp_buff *xdp)
> +{
> +	u16 next = rx_ring->next_to_clean;
> +	struct i40e_rx_buffer *rx_buffer;
> +
> +	xdp->flags = 0;
> +
> +	while (1) {
> +		rx_buffer = i40e_rx_bi(rx_ring, next);
> +		if (++next == rx_ring->count)
> +			next = 0;
> +
> +		if (!rx_buffer->page)
> +			continue;
> +
> +		if (xdp_res == I40E_XDP_CONSUMED)
> +			rx_buffer->pagecnt_bias++;
> +		else
> +			i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
> +
> +		/* EOP buffer will be put in i40e_clean_rx_irq() */
> +		if (next == rx_ring->next_to_process)
> +			return;
> +
> +		i40e_put_rx_buffer(rx_ring, rx_buffer);
> +	}
> +}
> +
> +/**
> + * i40e_construct_skb - Allocate skb and populate it
> + * @rx_ring: rx descriptor ring to transact packets on
>   * @xdp: xdp_buff pointing to the data
> + * @nr_frags: number of buffers for the packet
>   *
>   * This function allocates an skb.  It then populates it with the page
>   * data from the current receive descriptor, taking care to set up the
>   * skb correctly.
>   */
>  static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
> -					  struct i40e_rx_buffer *rx_buffer,
> -					  struct xdp_buff *xdp)
> +					  struct xdp_buff *xdp,
> +					  u32 nr_frags)
>  {
>  	unsigned int size = xdp->data_end - xdp->data;
> +	struct i40e_rx_buffer *rx_buffer;
>  	unsigned int headlen;
>  	struct sk_buff *skb;
>  
> @@ -2152,13 +2181,17 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
>  	memcpy(__skb_put(skb, headlen), xdp->data,
>  	       ALIGN(headlen, sizeof(long)));
>  
> +	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
>  	/* update all of the pointers */
>  	size -= headlen;
>  	if (size) {
> +		if (unlikely(nr_frags >= MAX_SKB_FRAGS)) {
> +			dev_kfree_skb(skb);
> +			return NULL;
> +		}
>  		skb_add_rx_frag(skb, 0, rx_buffer->page,
>  				rx_buffer->page_offset + headlen,
>  				size, xdp->frame_sz);
> -
>  		/* buffer is used by skb, update page_offset */
>  		i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
>  	} else {
> @@ -2166,21 +2199,40 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
>  		rx_buffer->pagecnt_bias++;
>  	}
>  
> +	if (unlikely(xdp_buff_has_frags(xdp))) {
> +		struct skb_shared_info *sinfo, *skinfo = skb_shinfo(skb);
> +
> +		sinfo = xdp_get_shared_info_from_buff(xdp);
> +		memcpy(&skinfo->frags[skinfo->nr_frags], &sinfo->frags[0],
> +		       sizeof(skb_frag_t) * nr_frags);
> +
> +		xdp_update_skb_shared_info(skb, skinfo->nr_frags + nr_frags,
> +					   sinfo->xdp_frags_size,
> +					   nr_frags * xdp->frame_sz,
> +					   xdp_buff_is_frag_pfmemalloc(xdp));
> +
> +		/* First buffer has already been processed, so bump ntc */
> +		if (++rx_ring->next_to_clean == rx_ring->count)
> +			rx_ring->next_to_clean = 0;
> +
> +		i40e_process_rx_buffs(rx_ring, I40E_XDP_PASS, xdp);
> +	}
> +
>  	return skb;
>  }
>  
>  /**
>   * i40e_build_skb - Build skb around an existing buffer
>   * @rx_ring: Rx descriptor ring to transact packets on
> - * @rx_buffer: Rx buffer to pull data from
>   * @xdp: xdp_buff pointing to the data
> + * @nr_frags: number of buffers for the packet
>   *
>   * This function builds an skb around an existing Rx buffer, taking care
>   * to set up the skb correctly and avoid any memcpy overhead.
>   */
>  static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
> -				      struct i40e_rx_buffer *rx_buffer,
> -				      struct xdp_buff *xdp)
> +				      struct xdp_buff *xdp,
> +				      u32 nr_frags)
>  {
>  	unsigned int metasize = xdp->data - xdp->data_meta;
>  	struct sk_buff *skb;
> @@ -2203,36 +2255,25 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
>  	if (metasize)
>  		skb_metadata_set(skb, metasize);
>  
> -	/* buffer is used by skb, update page_offset */
> -	i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
> +	if (unlikely(xdp_buff_has_frags(xdp))) {
> +		struct skb_shared_info *sinfo;
>  
> -	return skb;
> -}
> +		sinfo = xdp_get_shared_info_from_buff(xdp);
> +		xdp_update_skb_shared_info(skb, nr_frags,
> +					   sinfo->xdp_frags_size,
> +					   nr_frags * xdp->frame_sz,
> +					   xdp_buff_is_frag_pfmemalloc(xdp));
>  
> -/**
> - * i40e_put_rx_buffer - Clean up used buffer and either recycle or free
> - * @rx_ring: rx descriptor ring to transact packets on
> - * @rx_buffer: rx buffer to pull data from
> - *
> - * This function will clean up the contents of the rx_buffer.  It will
> - * either recycle the buffer or unmap it and free the associated resources.
> - */
> -static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
> -			       struct i40e_rx_buffer *rx_buffer)
> -{
> -	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats)) {
> -		/* hand second half of page back to the ring */
> -		i40e_reuse_rx_page(rx_ring, rx_buffer);
> +		i40e_process_rx_buffs(rx_ring, I40E_XDP_PASS, xdp);
>  	} else {
> -		/* we are not reusing the buffer so unmap it */
> -		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
> -				     i40e_rx_pg_size(rx_ring),
> -				     DMA_FROM_DEVICE, I40E_RX_DMA_ATTR);
> -		__page_frag_cache_drain(rx_buffer->page,
> -					rx_buffer->pagecnt_bias);
> -		/* clear contents of buffer_info */
> -		rx_buffer->page = NULL;
> +		struct i40e_rx_buffer *rx_buffer;
> +
> +		rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
> +		/* buffer is used by skb, update page_offset */
> +		i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
>  	}
> +
> +	return skb;
>  }
>  
>  /**
> @@ -2387,6 +2428,55 @@ static void i40e_inc_ntp(struct i40e_ring *rx_ring)
>  	prefetch(I40E_RX_DESC(rx_ring, ntp));
>  }
>  
> +/**
> + * i40e_add_xdp_frag - Add a frag to xdp_buff
> + * @xdp: xdp_buff pointing to the data
> + * @nr_frags: return number of buffers for the packet
> + * @rx_buffer: rx_buffer holding data of the current frag
> + * @size: size of data of current frag
> + */
> +static int i40e_add_xdp_frag(struct xdp_buff *xdp, u32 *nr_frags,
> +			     struct i40e_rx_buffer *rx_buffer, u32 size)
> +{
> +	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> +
> +	if (!xdp_buff_has_frags(xdp)) {
> +		sinfo->nr_frags = 0;
> +		sinfo->xdp_frags_size = 0;
> +		xdp_buff_set_frags_flag(xdp);
> +	} else if (unlikely(sinfo->nr_frags >= MAX_SKB_FRAGS)) {
> +		/* Overflowing packet: All frags need to be dropped */
> +		return -ENOMEM;
> +	}
> +
> +	__skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buffer->page,
> +				   rx_buffer->page_offset, size);
> +
> +	sinfo->xdp_frags_size += size;
> +
> +	if (page_is_pfmemalloc(rx_buffer->page))
> +		xdp_buff_set_frag_pfmemalloc(xdp);
> +	*nr_frags = sinfo->nr_frags;
> +
> +	return 0;
> +}
> +
> +/**
> + * i40e_consume_xdp_buff - Consume all the buffers of the packet and update ntc
> + * @rx_ring: rx descriptor ring to transact packets on
> + * @xdp: xdp_buff pointing to the data
> + * @rx_buffer: rx_buffer of eop desc
> + */
> +static void i40e_consume_xdp_buff(struct i40e_ring *rx_ring,
> +				  struct xdp_buff *xdp,
> +				  struct i40e_rx_buffer *rx_buffer)
> +{
> +	i40e_process_rx_buffs(rx_ring, I40E_XDP_CONSUMED, xdp);
> +	i40e_put_rx_buffer(rx_ring, rx_buffer);
> +	rx_ring->next_to_clean = rx_ring->next_to_process;
> +	xdp->data = NULL;
> +}
> +
>  /**
>   * i40e_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
>   * @rx_ring: rx descriptor ring to transact packets on
> @@ -2405,9 +2495,9 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
>  {
>  	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
>  	u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
> +	u16 clean_threshold = rx_ring->count / 2;
>  	unsigned int offset = rx_ring->rx_offset;
>  	struct xdp_buff *xdp = &rx_ring->xdp;
> -	struct sk_buff *skb = rx_ring->skb;
>  	unsigned int xdp_xmit = 0;
>  	struct bpf_prog *xdp_prog;
>  	bool failure = false;
> @@ -2419,11 +2509,14 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
>  		u16 ntp = rx_ring->next_to_process;
>  		struct i40e_rx_buffer *rx_buffer;
>  		union i40e_rx_desc *rx_desc;
> +		struct sk_buff *skb;
>  		unsigned int size;
> +		u32 nfrags = 0;
> +		bool neop;
>  		u64 qword;
>  
>  		/* return some buffers to hardware, one at a time is too slow */
> -		if (cleaned_count >= I40E_RX_BUFFER_WRITE) {
> +		if (cleaned_count >= clean_threshold) {
>  			failure = failure ||
>  				  i40e_alloc_rx_buffers(rx_ring, cleaned_count);
>  			cleaned_count = 0;
> @@ -2461,76 +2554,83 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
>  			break;
>  
>  		i40e_trace(clean_rx_irq, rx_ring, rx_desc, xdp);
> +		/* retrieve a buffer from the ring */
>  		rx_buffer = i40e_get_rx_buffer(rx_ring, size);
>  
> -		/* retrieve a buffer from the ring */
> -		if (!skb) {
> +		neop = i40e_is_non_eop(rx_ring, rx_desc);
> +		i40e_inc_ntp(rx_ring);
> +
> +		if (!xdp->data) {
>  			unsigned char *hard_start;
>  
>  			hard_start = page_address(rx_buffer->page) +
>  				     rx_buffer->page_offset - offset;
>  			xdp_prepare_buff(xdp, hard_start, offset, size, true);
> -			xdp_buff_clear_frags_flag(xdp);
>  #if (PAGE_SIZE > 4096)
> +			/* At larger PAGE_SIZE, frame_sz depends on the frame length */
>  			xdp->frame_sz = i40e_rx_frame_truesize(rx_ring, size);
>  #endif
> -			xdp_res = i40e_run_xdp(rx_ring, xdp, xdp_prog);
> +		} else if (i40e_add_xdp_frag(xdp, &nfrags, rx_buffer, size) &&
> +			   !neop) {
> +			/* Overflowing packet: Drop all frags on EOP */
> +			i40e_consume_xdp_buff(rx_ring, xdp, rx_buffer);
> +			break;
>  		}
>  
> +		if (neop)
> +			continue;
> +
> +		xdp_res = i40e_run_xdp(rx_ring, xdp, xdp_prog);
> +
>  		if (xdp_res) {
> -			if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
> -				xdp_xmit |= xdp_res;
> +			xdp_xmit |= xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR);
> +
> +			if (unlikely(xdp_buff_has_frags(xdp))) {
> +				i40e_process_rx_buffs(rx_ring, xdp_res, xdp);
> +				size = xdp_get_buff_len(xdp);
> +			} else if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
>  				i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
>  			} else {
>  				rx_buffer->pagecnt_bias++;
>  			}
>  			total_rx_bytes += size;
> -			total_rx_packets++;
> -		} else if (skb) {
> -			i40e_add_rx_frag(rx_ring, rx_buffer, skb, size);
> -		} else if (ring_uses_build_skb(rx_ring)) {
> -			skb = i40e_build_skb(rx_ring, rx_buffer, xdp);
>  		} else {
> -			skb = i40e_construct_skb(rx_ring, rx_buffer, xdp);
> -		}
> +			if (ring_uses_build_skb(rx_ring))
> +				skb = i40e_build_skb(rx_ring, xdp, nfrags);
> +			else
> +				skb = i40e_construct_skb(rx_ring, xdp, nfrags);
> +
> +			/* drop if we failed to retrieve a buffer */
> +			if (!skb) {
> +				rx_ring->rx_stats.alloc_buff_failed++;
> +				i40e_consume_xdp_buff(rx_ring, xdp, rx_buffer);
> +				break;
> +			}
>  
> -		/* exit if we failed to retrieve a buffer */
> -		if (!xdp_res && !skb) {
> -			rx_ring->rx_stats.alloc_buff_failed++;
> -			rx_buffer->pagecnt_bias++;
> -			break;
> -		}
> +			if (i40e_cleanup_headers(rx_ring, skb, rx_desc))
> +				goto process_next;
>  
> -		i40e_put_rx_buffer(rx_ring, rx_buffer);
> -		cleaned_count++;
> +			/* probably a little skewed due to removing CRC */
> +			total_rx_bytes += skb->len;
>  
> -		i40e_inc_ntp(rx_ring);
> -		rx_ring->next_to_clean = rx_ring->next_to_process;
> -		if (i40e_is_non_eop(rx_ring, rx_desc))
> -			continue;
> +			/* populate checksum, VLAN, and protocol */
> +			i40e_process_skb_fields(rx_ring, rx_desc, skb);
>  
> -		if (xdp_res || i40e_cleanup_headers(rx_ring, skb, rx_desc)) {
> -			skb = NULL;
> -			continue;
> +			i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, xdp);
> +			napi_gro_receive(&rx_ring->q_vector->napi, skb);
>  		}
>  
> -		/* probably a little skewed due to removing CRC */
> -		total_rx_bytes += skb->len;
> -
> -		/* populate checksum, VLAN, and protocol */
> -		i40e_process_skb_fields(rx_ring, rx_desc, skb);
> -
> -		i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, xdp);
> -		napi_gro_receive(&rx_ring->q_vector->napi, skb);
> -		skb = NULL;
> -
>  		/* update budget accounting */
>  		total_rx_packets++;
> +process_next:
> +		cleaned_count += nfrags + 1;
> +		i40e_put_rx_buffer(rx_ring, rx_buffer);
> +		rx_ring->next_to_clean = rx_ring->next_to_process;
> +
> +		xdp->data = NULL;
>  	}
>  
>  	i40e_finalize_xdp_rx(rx_ring, xdp_xmit);
> -	rx_ring->skb = skb;
>  
>  	i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
>  
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
> index e86abc25bb5e..14ad074639ab 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
> @@ -393,14 +393,6 @@ struct i40e_ring {
>  
>  	struct rcu_head rcu;		/* to avoid race on free */
>  	u16 next_to_alloc;
> -	struct sk_buff *skb;		/* When i40e_clean_rx_ring_irq() must
> -					 * return before it sees the EOP for
> -					 * the current packet, we save that skb
> -					 * here and resume receiving this
> -					 * packet the next time
> -					 * i40e_clean_rx_ring_irq() is called
> -					 * for this ring.
> -					 */
>  
>  	struct i40e_channel *ch;
>  	u16 rx_offset;
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to truesize when using i40e_rx_buffer_flip()
  2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to truesize when using i40e_rx_buffer_flip() Tirthendu Sarkar
@ 2023-02-15 23:11   ` Tony Nguyen
  2023-02-16  7:49   ` Paul Menzel
  1 sibling, 0 replies; 17+ messages in thread
From: Tony Nguyen @ 2023-02-15 23:11 UTC (permalink / raw)
  To: Tirthendu Sarkar, intel-wired-lan
  Cc: netdev, bpf, jesse.brandeburg, magnus.karlsson

On 2/15/2023 4:43 AM, Tirthendu Sarkar wrote:
> Truesize is now passed directly to i40e_rx_buffer_flip() instead of size
> so that it does not need to recalculate truesize from size using
> i40e_rx_frame_truesize() before adjusting page offset.
> 
> With this change the function can now be used during skb building and
> adding frags. In later patches it will also be easier for adjusting
> page offsets for multi-buffers.
> 
> Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_txrx.c | 54 ++++++++-------------
>   1 file changed, 19 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index a7fba294a8f4..019abd7273a2 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -2018,6 +2018,21 @@ static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
>   	return true;
>   }
>   
> +/**
> + * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
> + * @rx_buffer: Rx buffer to adjust
> + * @size: Size of adjustment

s/size/truesize
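
i.e., presumably:

	 * @truesize: Size of adjustment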

> + **/
> +static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
> +				unsigned int truesize)
> +{
> +#if (PAGE_SIZE < 8192)
> +	rx_buffer->page_offset ^= truesize;
> +#else
> +	rx_buffer->page_offset += truesize;
> +#endif
> +}
> +
>   /**
>    * i40e_add_rx_frag - Add contents of Rx buffer to sk_buff
>    * @rx_ring: rx descriptor ring to transact packets on

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 0/8] i40e: support XDP multi-buffer
  2023-02-15 12:42 [Intel-wired-lan] [PATCH intel-next v4 0/8] i40e: support XDP multi-buffer Tirthendu Sarkar
                   ` (7 preceding siblings ...)
  2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx Tirthendu Sarkar
@ 2023-02-15 23:13 ` Tony Nguyen
  8 siblings, 0 replies; 17+ messages in thread
From: Tony Nguyen @ 2023-02-15 23:13 UTC (permalink / raw)
  To: Tirthendu Sarkar, intel-wired-lan
  Cc: netdev, bpf, jesse.brandeburg, magnus.karlsson

On 2/15/2023 4:42 AM, Tirthendu Sarkar wrote:
> This patchset adds multi-buffer support for XDP. Tx side already has
> support for multi-buffer. This patchset focuses on Rx side. The last
> patch contains actual multi-buffer changes while the previous ones are
> preparatory patches.
> 
> On receiving the first buffer of a packet, xdp_buff is built and its
> subsequent buffers are added to it as frags. While 'next_to_clean' keeps
> pointing to the first descriptor, the newly introduced 'next_to_process'
> keeps track of every descriptor for the packet.
> 
> On receiving EOP buffer the XDP program is called and appropriate action
> is taken (building skb for XDP_PASS, reusing page for XDP_DROP, adjusting
> page offsets for XDP_{REDIRECT,TX}).
> 
> The patchset also streamlines page offset adjustments for buffer reuse
> to make it easier to post process the rx_buffers after running XDP prog.
> 
> With this patchset there does not seem to be any performance degradation
> for XDP_PASS and some improvement (~1% for XDP_TX, ~5% for XDP_DROP) when
> measured using xdp_rxq_info program from samples/bpf/ for 64B packets.

If you want this to go through Intel Wired LAN, can you base it off that 
tree [1]? This doesn't apply to next-queue/dev-queue.

Thanks,
Tony

[1] https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx
  2023-02-15 14:51   ` Lorenzo Bianconi
@ 2023-02-16  0:02     ` Tony Nguyen
  2023-02-16  3:15       ` Jakub Kicinski
  2023-02-16  4:56       ` Sarkar, Tirthendu
  0 siblings, 2 replies; 17+ messages in thread
From: Tony Nguyen @ 2023-02-16  0:02 UTC (permalink / raw)
  To: Lorenzo Bianconi, Tirthendu Sarkar
  Cc: netdev, jesse.brandeburg, intel-wired-lan, bpf, magnus.karlsson

On 2/15/2023 6:51 AM, Lorenzo Bianconi wrote:
>> This patch adds multi-buffer support for the i40e driver.
>>
> 
> [...]
> 
>>   
>>   	netdev->features &= ~NETIF_F_HW_TC;
>>   	netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
>> -			       NETDEV_XDP_ACT_XSK_ZEROCOPY;
>> +			       NETDEV_XDP_ACT_XSK_ZEROCOPY |
>> +			       NETDEV_XDP_ACT_RX_SG;
> 
> Hi Tirthendu,
> 
> I guess we should set it just for I40E_VSI_MAIN, I posted a patch yesterday
> to fix it:
> https://patchwork.kernel.org/project/netdevbpf/patch/f2b537f86b34fc176fbc6b3d249b46a20a87a2f3.1676405131.git.lorenzo@kernel.org/
> 
> can you please rebase on top of it?

Jakub,

I believe you are planning on taking Lorenzo's ice [1] and i40e [2] 
patches, based on the comment about taking follow-ups directly [3]?

If so, Tirthendu, I'll rebase after this is pulled by netdev; if you 
then base on next-queue, everything will apply nicely.

Thanks,
Tony

> Regards,
> Lorenzo

[1] 
https://lore.kernel.org/all/8a4781511ab6e3cd280e944eef69158954f1a15f.1676385351.git.lorenzo@kernel.org/
[2] 
https://lore.kernel.org/all/f2b537f86b34fc176fbc6b3d249b46a20a87a2f3.1676405131.git.lorenzo@kernel.org/
[3] https://lore.kernel.org/all/20230213172358.7df0f07c@kernel.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx
  2023-02-16  0:02     ` Tony Nguyen
@ 2023-02-16  3:15       ` Jakub Kicinski
  2023-02-16  4:56       ` Sarkar, Tirthendu
  1 sibling, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2023-02-16  3:15 UTC (permalink / raw)
  To: Tony Nguyen
  Cc: Tirthendu Sarkar, netdev, jesse.brandeburg, intel-wired-lan, bpf,
	magnus.karlsson

On Wed, 15 Feb 2023 16:02:45 -0800 Tony Nguyen wrote:
> I believe you are planning on taking Lorenzo's ice [1] and i40e [2] 
> patches, based on the comment about taking follow-ups directly [3]?
> 
> If so, Tirthendu, I'll rebase after this is pulled by netdev; if you 
> then base on next-queue, everything will apply nicely.

Yup, applied now!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx
  2023-02-16  0:02     ` Tony Nguyen
  2023-02-16  3:15       ` Jakub Kicinski
@ 2023-02-16  4:56       ` Sarkar, Tirthendu
  1 sibling, 0 replies; 17+ messages in thread
From: Sarkar, Tirthendu @ 2023-02-16  4:56 UTC (permalink / raw)
  To: Nguyen, Anthony L, Lorenzo Bianconi
  Cc: netdev, Brandeburg, Jesse, intel-wired-lan, bpf, Karlsson, Magnus

> From: Nguyen, Anthony L <anthony.l.nguyen@intel.com>
> Sent: Thursday, February 16, 2023 5:33 AM
> Subject: Re: [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer
> Rx
> 
> On 2/15/2023 6:51 AM, Lorenzo Bianconi wrote:
> >> This patch adds multi-buffer support for the i40e driver.
> >>
> >
> > [...]
> >
> >>
> >>   	netdev->features &= ~NETIF_F_HW_TC;
> >>   	netdev->xdp_features = NETDEV_XDP_ACT_BASIC |
> NETDEV_XDP_ACT_REDIRECT |
> >> -			       NETDEV_XDP_ACT_XSK_ZEROCOPY;
> >> +			       NETDEV_XDP_ACT_XSK_ZEROCOPY |
> >> +			       NETDEV_XDP_ACT_RX_SG;
> >
> > Hi Tirthendu,
> >
> > I guess we should set it just for I40E_VSI_MAIN, I posted a patch yesterday
> > to fix it:
> >
> https://patchwork.kernel.org/project/netdevbpf/patch/f2b537f86b34fc176f
> bc6b3d249b46a20a87a2f3.1676405131.git.lorenzo@kernel.org/
> >
> > can you please rebase on top of it?
> 
> Jakub,
> 
> I believe you are planning on taking Lorenzo's ice [1] and i40e [2]
> patches, based on the comment about taking follow-ups directly [3]?
> 
> If so, Tirthendu, I'll rebase after this is pulled by netdev; if you
> then base on next-queue, everything will apply nicely.
> 

I have rebased it and will send the v5 once the 24hr curfew on sending patches is over.

Regards
Tirthendu

> Thanks,
> Tony
> 
> > Regards,
> > Lorenzo
> 
> [1]
> https://lore.kernel.org/all/8a4781511ab6e3cd280e944eef69158954f1a15f.167
> 6385351.git.lorenzo@kernel.org/
> [2]
> https://lore.kernel.org/all/f2b537f86b34fc176fbc6b3d249b46a20a87a2f3.1676
> 405131.git.lorenzo@kernel.org/
> [3] https://lore.kernel.org/all/20230213172358.7df0f07c@kernel.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to truesize when using i40e_rx_buffer_flip()
  2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to truesize when using i40e_rx_buffer_flip() Tirthendu Sarkar
  2023-02-15 23:11   ` Tony Nguyen
@ 2023-02-16  7:49   ` Paul Menzel
  2023-02-16 13:53     ` Sarkar, Tirthendu
  1 sibling, 1 reply; 17+ messages in thread
From: Paul Menzel @ 2023-02-16  7:49 UTC (permalink / raw)
  To: Tirthendu Sarkar
  Cc: netdev, jesse.brandeburg, anthony.l.nguyen, intel-wired-lan, bpf,
	magnus.karlsson

Dear Tirthendu,


Thank you for your patch.

On 15.02.23 13:43, Tirthendu Sarkar wrote:
> Truesize is now passed directly to i40e_rx_buffer_flip() instead of size
> so that it does not need to recalculate truesize from size using
> i40e_rx_frame_truesize() before adjusting page offset.

Did the compiler not optimize that well enough?

> With this change the function can now be used during skb building and
> adding frags. In later patches it will also be easier for adjusting
> page offsets for multi-buffers.

Why couldn’t the function be used before?

> Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_txrx.c | 54 ++++++++-------------
>   1 file changed, 19 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index a7fba294a8f4..019abd7273a2 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -2018,6 +2018,21 @@ static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
>   	return true;
>   }
>   
> +/**
> + * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
> + * @rx_buffer: Rx buffer to adjust
> + * @size: Size of adjustment
> + **/
> +static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
> +				unsigned int truesize)
> +{
> +#if (PAGE_SIZE < 8192)
> +	rx_buffer->page_offset ^= truesize;
> +#else
> +	rx_buffer->page_offset += truesize;
> +#endif

It’d be great if you sent a patch on top, doing the check not in the 
preprocessor but in native C code.

> +}
> +
>   /**
>    * i40e_add_rx_frag - Add contents of Rx buffer to sk_buff
>    * @rx_ring: rx descriptor ring to transact packets on
> @@ -2045,11 +2060,7 @@ static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
>   			rx_buffer->page_offset, size, truesize);
>   
>   	/* page is being used so we must update the page offset */
> -#if (PAGE_SIZE < 8192)
> -	rx_buffer->page_offset ^= truesize;
> -#else
> -	rx_buffer->page_offset += truesize;
> -#endif
> +	i40e_rx_buffer_flip(rx_buffer, truesize);
>   }
>   
>   /**
> @@ -2154,11 +2165,7 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
>   				size, truesize);
>   
>   		/* buffer is used by skb, update page_offset */
> -#if (PAGE_SIZE < 8192)
> -		rx_buffer->page_offset ^= truesize;
> -#else
> -		rx_buffer->page_offset += truesize;
> -#endif
> +		i40e_rx_buffer_flip(rx_buffer, truesize);
>   	} else {
>   		/* buffer is unused, reset bias back to rx_buffer */
>   		rx_buffer->pagecnt_bias++;
> @@ -2209,11 +2216,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
>   		skb_metadata_set(skb, metasize);
>   
>   	/* buffer is used by skb, update page_offset */
> -#if (PAGE_SIZE < 8192)
> -	rx_buffer->page_offset ^= truesize;
> -#else
> -	rx_buffer->page_offset += truesize;
> -#endif
> +	i40e_rx_buffer_flip(rx_buffer, truesize);
>   
>   	return skb;
>   }
> @@ -2326,25 +2329,6 @@ static int i40e_run_xdp(struct i40e_ring *rx_ring, struct xdp_buff *xdp, struct
>   	return result;
>   }
>   
> -/**
> - * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
> - * @rx_ring: Rx ring
> - * @rx_buffer: Rx buffer to adjust
> - * @size: Size of adjustment
> - **/
> -static void i40e_rx_buffer_flip(struct i40e_ring *rx_ring,
> -				struct i40e_rx_buffer *rx_buffer,
> -				unsigned int size)
> -{
> -	unsigned int truesize = i40e_rx_frame_truesize(rx_ring, size);
> -
> -#if (PAGE_SIZE < 8192)
> -	rx_buffer->page_offset ^= truesize;
> -#else
> -	rx_buffer->page_offset += truesize;
> -#endif
> -}
> -
>   /**
>    * i40e_xdp_ring_update_tail - Updates the XDP Tx ring tail register
>    * @xdp_ring: XDP Tx ring
> @@ -2513,7 +2497,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
>   		if (xdp_res) {
>   			if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
>   				xdp_xmit |= xdp_res;
> -				i40e_rx_buffer_flip(rx_ring, rx_buffer, size);
> +				i40e_rx_buffer_flip(rx_buffer, xdp.frame_sz);

Why is `xdp.frame_sz` the correct size now?

>   			} else {
>   				rx_buffer->pagecnt_bias++;
>   			}


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to truesize when using i40e_rx_buffer_flip()
  2023-02-16  7:49   ` Paul Menzel
@ 2023-02-16 13:53     ` Sarkar, Tirthendu
  0 siblings, 0 replies; 17+ messages in thread
From: Sarkar, Tirthendu @ 2023-02-16 13:53 UTC (permalink / raw)
  To: Paul Menzel
  Cc: netdev, Brandeburg, Jesse, Nguyen, Anthony L, intel-wired-lan,
	bpf, Karlsson, Magnus

> -----Original Message-----
> From: Paul Menzel <pmenzel@molgen.mpg.de>
> Subject: Re: [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to
> truesize when using i40e_rx_buffer_flip()
> Importance: Low
> 
> Dear Tirthendu,
> 
> 
> Thank you for your patch.
> 
> On 15.02.23 13:43, Tirthendu Sarkar wrote:
> > Truesize is now passed directly to i40e_rx_buffer_flip() instead of size
> > so that it does not need to recalculate truesize from size using
> > i40e_rx_frame_truesize() before adjusting page offset.
> 
> Did the compiler not optimize that well enough?
> 

Flipping the buffer requires the data's 'truesize'. Previously,
i40e_rx_buffer_flip() had to call i40e_rx_frame_truesize() to derive
'truesize' because only 'size' was passed to it. With this patch the
caller provides 'truesize' directly to i40e_rx_buffer_flip().
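
As a small illustration, with the call shapes from this series:

	/* before: the helper recomputed truesize from size internally */
	i40e_rx_buffer_flip(rx_ring, rx_buffer, size);

	/* after: the caller passes the truesize it already knows, e.g. */
	i40e_rx_buffer_flip(rx_buffer, xdp.frame_sz);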

> > With this change the function can now be used during skb building and
> > adding frags. In later patches it will also be easier for adjusting
> > page offsets for multi-buffers.
> 
> Why couldn’t the function be used before?
> 

i40e_rx_frame_truesize() does not cover all cases of truesize
calculation, e.g. frags in non-legacy-rx mode. Since
i40e_rx_buffer_flip() depended on i40e_rx_frame_truesize() for
truesize, it could not be used for adding frags.
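
For reference, this is the frag truesize that the removed
i40e_add_rx_frag() computed by hand (from the hunk deleted in patch
8/8), which i40e_rx_frame_truesize() does not cover:

	#if (PAGE_SIZE < 8192)
		unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
	#else
		unsigned int truesize = SKB_DATA_ALIGN(size + rx_ring->rx_offset);
	#endif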

> > Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
> > ---
> >   drivers/net/ethernet/intel/i40e/i40e_txrx.c | 54 ++++++++-------------
> >   1 file changed, 19 insertions(+), 35 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > index a7fba294a8f4..019abd7273a2 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > @@ -2018,6 +2018,21 @@ static bool i40e_can_reuse_rx_page(struct
> i40e_rx_buffer *rx_buffer,
> >   	return true;
> >   }
> >
> > +/**
> > + * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
> > + * @rx_buffer: Rx buffer to adjust
> > + * @size: Size of adjustment
> > + **/
> > +static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
> > +				unsigned int truesize)
> > +{
> > +#if (PAGE_SIZE < 8192)
> > +	rx_buffer->page_offset ^= truesize;
> > +#else
> > +	rx_buffer->page_offset += truesize;
> > +#endif
> 
> It’d be great if you sent a patch on top, doing the check not in the
> preprocessor but in native C code.
> 

We don’t want to introduce branches. Also, note this part of the code is not
newly introduced in this patchset.
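
For completeness, the native-C form being suggested would look like the
sketch below (not part of this series). PAGE_SIZE is a compile-time
constant, so the compiler folds the branch away and no runtime branch
is emitted either way:

	static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
					unsigned int truesize)
	{
		if (PAGE_SIZE < 8192)
			rx_buffer->page_offset ^= truesize;
		else
			rx_buffer->page_offset += truesize;
	}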

> > +}
> > +
> >   /**
> >    * i40e_add_rx_frag - Add contents of Rx buffer to sk_buff
> >    * @rx_ring: rx descriptor ring to transact packets on
> > @@ -2045,11 +2060,7 @@ static void i40e_add_rx_frag(struct i40e_ring
> *rx_ring,
> >   			rx_buffer->page_offset, size, truesize);
> >
> >   	/* page is being used so we must update the page offset */
> > -#if (PAGE_SIZE < 8192)
> > -	rx_buffer->page_offset ^= truesize;
> > -#else
> > -	rx_buffer->page_offset += truesize;
> > -#endif
> > +	i40e_rx_buffer_flip(rx_buffer, truesize);
> >   }
> >
> >   /**
> > @@ -2154,11 +2165,7 @@ static struct sk_buff *i40e_construct_skb(struct
> i40e_ring *rx_ring,
> >   				size, truesize);
> >
> >   		/* buffer is used by skb, update page_offset */
> > -#if (PAGE_SIZE < 8192)
> > -		rx_buffer->page_offset ^= truesize;
> > -#else
> > -		rx_buffer->page_offset += truesize;
> > -#endif
> > +		i40e_rx_buffer_flip(rx_buffer, truesize);
> >   	} else {
> >   		/* buffer is unused, reset bias back to rx_buffer */
> >   		rx_buffer->pagecnt_bias++;
> > @@ -2209,11 +2216,7 @@ static struct sk_buff *i40e_build_skb(struct
> i40e_ring *rx_ring,
> >   		skb_metadata_set(skb, metasize);
> >
> >   	/* buffer is used by skb, update page_offset */
> > -#if (PAGE_SIZE < 8192)
> > -	rx_buffer->page_offset ^= truesize;
> > -#else
> > -	rx_buffer->page_offset += truesize;
> > -#endif
> > +	i40e_rx_buffer_flip(rx_buffer, truesize);
> >
> >   	return skb;
> >   }
> > @@ -2326,25 +2329,6 @@ static int i40e_run_xdp(struct i40e_ring *rx_ring,
> struct xdp_buff *xdp, struct
> >   	return result;
> >   }
> >
> > -/**
> > - * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
> > - * @rx_ring: Rx ring
> > - * @rx_buffer: Rx buffer to adjust
> > - * @size: Size of adjustment
> > - **/
> > -static void i40e_rx_buffer_flip(struct i40e_ring *rx_ring,
> > -				struct i40e_rx_buffer *rx_buffer,
> > -				unsigned int size)
> > -{
> > -	unsigned int truesize = i40e_rx_frame_truesize(rx_ring, size);
> > -
> > -#if (PAGE_SIZE < 8192)
> > -	rx_buffer->page_offset ^= truesize;
> > -#else
> > -	rx_buffer->page_offset += truesize;
> > -#endif
> > -}
> > -
> >   /**
> >    * i40e_xdp_ring_update_tail - Updates the XDP Tx ring tail register
> >    * @xdp_ring: XDP Tx ring
> > @@ -2513,7 +2497,7 @@ static int i40e_clean_rx_irq(struct i40e_ring
> *rx_ring, int budget,
> >   		if (xdp_res) {
> >   			if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
> >   				xdp_xmit |= xdp_res;
> > -				i40e_rx_buffer_flip(rx_ring, rx_buffer, size);
> > +				i40e_rx_buffer_flip(rx_buffer,
> xdp.frame_sz);
> 
> Why is `xdp.frame_sz` the correct size now?
> 

As explained before, i40e_rx_buffer_flip() used to calculate
'truesize' internally but now depends on the caller to provide it.
'xdp.frame_sz' is in fact the truesize of the buffer.
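
For reference, that value is set in the Rx path of this series; at
PAGE_SIZE > 4096 it is recomputed per packet when the xdp_buff head is
prepared:

	xdp->frame_sz = i40e_rx_frame_truesize(rx_ring, size);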

> >   			} else {
> >   				rx_buffer->pagecnt_bias++;
> >   			}
> 
> 
> Kind regards,
> 
> Paul

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-02-16 16:17 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-15 12:42 [Intel-wired-lan] [PATCH intel-next v4 0/8] i40e: support XDP multi-buffer Tirthendu Sarkar
2023-02-15 12:42 ` [Intel-wired-lan] [PATCH intel-next v4 1/8] i40e: consolidate maximum frame size calculation for vsi Tirthendu Sarkar
2023-02-15 12:42 ` [Intel-wired-lan] [PATCH intel-next v4 2/8] i40e: change Rx buffer size for legacy-rx to support XDP multi-buffer Tirthendu Sarkar
2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 3/8] i40e: add pre-xdp page_count in rx_buffer Tirthendu Sarkar
2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 4/8] i40e: Change size to truesize when using i40e_rx_buffer_flip() Tirthendu Sarkar
2023-02-15 23:11   ` Tony Nguyen
2023-02-16  7:49   ` Paul Menzel
2023-02-16 13:53     ` Sarkar, Tirthendu
2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 5/8] i40e: use frame_sz instead of recalculating truesize for building skb Tirthendu Sarkar
2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 6/8] i40e: introduce next_to_process to i40e_ring Tirthendu Sarkar
2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 7/8] i40e: add xdp_buff to i40e_ring struct Tirthendu Sarkar
2023-02-15 12:43 ` [Intel-wired-lan] [PATCH intel-next v4 8/8] i40e: add support for XDP multi-buffer Rx Tirthendu Sarkar
2023-02-15 14:51   ` Lorenzo Bianconi
2023-02-16  0:02     ` Tony Nguyen
2023-02-16  3:15       ` Jakub Kicinski
2023-02-16  4:56       ` Sarkar, Tirthendu
2023-02-15 23:13 ` [Intel-wired-lan] [PATCH intel-next v4 0/8] i40e: support XDP multi-buffer Tony Nguyen
