linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata
@ 2021-12-07 20:55 Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf

This is an interpolation of [0] to other Intel Ethernet drivers
(and is (re)based on its code).
The main aim is to keep XDP metadata not only in case with
build_skb(), but also when we do napi_alloc_skb() + memcpy().

All Intel drivers suffers from the same here:
 - metadata gets lost on XDP_PASS in legacy-rx;
 - excessive headroom allocation on XSK Rx to skbs;
 - metadata gets lost on XSK Rx to skbs.

Those get especially actual in XDP Hints upcoming.
I couldn't have addressed the first one for all Intel drivers due to
that they don't reserve any headroom for now in legacy-rx mode even
with XDP enabled. This is hugely wrong, but requires quite a bunch
of work and a separate series. Luckily, ice doesn't suffer from
that.
igc has 1 and 3 already fixed in [0].

From v2 (unreleased upstream):
 - tweaked 007 to pass bi->xdp directly and simplify code (Maciej);
 - picked Michal's Reviewed-by.

From v1 (unreleased upstream):
 - drop "fixes" of legacy-rx for i40e, igb and ixgbe since they have
   another flaw regarding headroom (see above);
 - drop igc cosmetic fixes since they landed upstream incorporated
   into Jesper's commits;
 - picked one Acked-by from Maciej.

[0] https://lore.kernel.org/netdev/163700856423.565980.10162564921347693758.stgit@firesoul

Alexander Lobakin (9):
  i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  i40e: respect metadata on XSK Rx to skb
  ice: respect metadata in legacy-rx/ice_construct_skb()
  ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  ice: respect metadata on XSK Rx to skb
  igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly
  ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  i40e: respect metadata on XSK Rx to skb

 drivers/net/ethernet/intel/i40e/i40e_xsk.c   | 16 +++++++-----
 drivers/net/ethernet/intel/ice/ice_txrx.c    | 15 ++++++++---
 drivers/net/ethernet/intel/ice/ice_xsk.c     | 16 +++++++-----
 drivers/net/ethernet/intel/igc/igc_main.c    | 13 +++++-----
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 27 ++++++++++++--------
 5 files changed, 54 insertions(+), 33 deletions(-)

-- 
Testing hints:

Setup an XDP and AF_XDP program which will prepend metadata at the
front of the frames and return XDP_PASS, then check that metadata
is present after frames reach kernel network stack.
--
2.33.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 2/9] i40e: respect metadata " Alexander Lobakin
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, i40e_construct_skb_zc() currently allocates and reserves
additional `xdp->data - xdp->data_hard_start`, which is
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only to __napi_alloc_skb() and
don't reserve anything. This will give enough headroom for stack
processing.

Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index f08d19b8c554..9564906b7da8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -245,13 +245,11 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		goto out;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2022-01-10 10:31   ` [Intel-wired-lan] " Bhandare, KiranX
  2021-12-07 20:55 ` [PATCH v3 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb() Alexander Lobakin
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf, Jesper Dangaard Brouer

For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
net_prefetch() xdp->data_meta and align the copy size to speed-up
memcpy() a little and better match i40e_costruct_skb().

Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 9564906b7da8..0e8cf275e084 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -240,19 +240,25 @@ bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count)
 static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
 					     struct xdp_buff *xdp)
 {
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
+	net_prefetch(xdp->data_meta);
+
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		goto out;
 
-	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
-	if (metasize)
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
+	if (metasize) {
 		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 out:
 	xsk_buff_free(xdp);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb()
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 2/9] i40e: respect metadata " Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf, Jesper Dangaard Brouer

In "legacy-rx" mode represented by ice_construct_skb(), we can
still use XDP (and XDP metadata), but after XDP_PASS the metadata
will be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
Point net_prefetch() to xdp->data_meta instead of data. This won't
change anything when the meta is not here, but will save some cache
misses otherwise.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index bc3ba19dc88f..d724b6376c43 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -968,15 +968,17 @@ static struct sk_buff *
 ice_construct_skb(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf,
 		  struct xdp_buff *xdp)
 {
+	unsigned int metasize = xdp->data - xdp->data_meta;
 	unsigned int size = xdp->data_end - xdp->data;
 	unsigned int headlen;
 	struct sk_buff *skb;
 
 	/* prefetch first cache line of first page */
-	net_prefetch(xdp->data);
+	net_prefetch(xdp->data_meta);
 
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, ICE_RX_HDR_SIZE,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
+			       ICE_RX_HDR_SIZE + metasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
@@ -988,8 +990,13 @@ ice_construct_skb(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf,
 		headlen = eth_get_headlen(skb->dev, xdp->data, ICE_RX_HDR_SIZE);
 
 	/* align pull length to size of long to optimize memcpy performance */
-	memcpy(__skb_put(skb, headlen), xdp->data, ALIGN(headlen,
-							 sizeof(long)));
+	memcpy(__skb_put(skb, headlen + metasize), xdp->data_meta,
+	       ALIGN(headlen + metasize, sizeof(long)));
+
+	if (metasize) {
+		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 	/* if we exhaust the linear part then add what is left as a frag */
 	size -= headlen;
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (2 preceding siblings ...)
  2021-12-07 20:55 ` [PATCH v3 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb() Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 5/9] ice: respect metadata " Alexander Lobakin
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, ice_construct_skb_zc() currently allocates and reserves
additional `xdp->data - xdp->data_hard_start`, which is
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only to __napi_alloc_skb() and
don't reserve anything. This will give enough headroom for stack
processing.

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index f8ea6b0633eb..f0bd8e1953bf 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -430,15 +430,13 @@ ice_construct_skb_zc(struct ice_rx_ring *rx_ring, struct xdp_buff **xdp_arr)
 	struct xdp_buff *xdp = *xdp_arr;
 	unsigned int metasize = xdp->data - xdp->data_meta;
 	unsigned int datasize = xdp->data_end - xdp->data;
-	unsigned int datasize_hard = xdp->data_end - xdp->data_hard_start;
 	struct sk_buff *skb;
 
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize_hard,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 5/9] ice: respect metadata on XSK Rx to skb
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (3 preceding siblings ...)
  2021-12-07 20:55 ` [PATCH v3 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM " Alexander Lobakin
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf, Jesper Dangaard Brouer

For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
net_prefetch() xdp->data_meta and align the copy size to speed-up
memcpy() a little and better match ice_costruct_skb().

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index f0bd8e1953bf..57e49e652439 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -428,18 +428,24 @@ static struct sk_buff *
 ice_construct_skb_zc(struct ice_rx_ring *rx_ring, struct xdp_buff **xdp_arr)
 {
 	struct xdp_buff *xdp = *xdp_arr;
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
+	net_prefetch(xdp->data_meta);
+
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
-	if (metasize)
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
+	if (metasize) {
 		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 	xsk_buff_free(xdp);
 	*xdp_arr = NULL;
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (4 preceding siblings ...)
  2021-12-07 20:55 ` [PATCH v3 net-next 5/9] ice: respect metadata " Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly Alexander Lobakin
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, igc_construct_skb_zc() currently allocates and reserves
additional `xdp->data_meta - xdp->data_hard_start`, which is about
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only (+ meta) to
__napi_alloc_skb() and don't reserve anything. This will give
enough headroom for stack processing.
Also, net_prefetch() xdp->data_meta and align the copy size to
speed-up memcpy() a little and better match igc_costruct_skb().

Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 142c57b7a451..a2e8d43be3a1 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2446,19 +2446,20 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
 					    struct xdp_buff *xdp)
 {
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
-	unsigned int totalsize = metasize + datasize;
 	struct sk_buff *skb;
 
-	skb = __napi_alloc_skb(&ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+	net_prefetch(xdp->data_meta);
+
+	skb = __napi_alloc_skb(&ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, xdp->data_meta - xdp->data_hard_start);
-	memcpy(__skb_put(skb, totalsize), xdp->data_meta, totalsize);
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
 	if (metasize) {
 		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (5 preceding siblings ...)
  2021-12-07 20:55 ` [PATCH v3 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM " Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 9/9] i40e: respect metadata " Alexander Lobakin
  8 siblings, 0 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf

To not dereference bi->xdp each time in ixgbe_construct_skb_zc(),
pass bi->xdp as an argument instead of bi. We can also call
xsk_buff_free() outside of the function as well as assign bi->xdp
to NULL, which seems to make it closer to its name.

Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index db2bc58dfcfd..1d74a7980d81 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -207,26 +207,24 @@ bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count)
 }
 
 static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
-					      struct ixgbe_rx_buffer *bi)
+					      const struct xdp_buff *xdp)
 {
-	unsigned int metasize = bi->xdp->data - bi->xdp->data_meta;
-	unsigned int datasize = bi->xdp->data_end - bi->xdp->data;
+	unsigned int metasize = xdp->data - xdp->data_meta;
+	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
 	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       bi->xdp->data_end - bi->xdp->data_hard_start,
+			       xdp->data_end - xdp->data_hard_start,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, bi->xdp->data - bi->xdp->data_hard_start);
-	memcpy(__skb_put(skb, datasize), bi->xdp->data, datasize);
+	skb_reserve(skb, xdp->data - xdp->data_hard_start);
+	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
 
-	xsk_buff_free(bi->xdp);
-	bi->xdp = NULL;
 	return skb;
 }
 
@@ -317,12 +315,15 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 		}
 
 		/* XDP_PASS path */
-		skb = ixgbe_construct_skb_zc(rx_ring, bi);
+		skb = ixgbe_construct_skb_zc(rx_ring, bi->xdp);
 		if (!skb) {
 			rx_ring->rx_stats.alloc_rx_buff_failed++;
 			break;
 		}
 
+		xsk_buff_free(bi->xdp);
+		bi->xdp = NULL;
+
 		cleaned_count++;
 		ixgbe_inc_ntc(rx_ring);
 
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (6 preceding siblings ...)
  2021-12-07 20:55 ` [PATCH v3 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-07 20:55 ` [PATCH v3 net-next 9/9] i40e: respect metadata " Alexander Lobakin
  8 siblings, 0 replies; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, ixgbe_construct_skb_zc() currently allocates and reserves
additional `xdp->data - xdp->data_hard_start`, which is
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only to __napi_alloc_skb() and
don't reserve anything. This will give enough headroom for stack
processing.

Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index 1d74a7980d81..db20dc4c2488 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -214,13 +214,11 @@ static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 net-next 9/9] i40e: respect metadata on XSK Rx to skb
  2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (7 preceding siblings ...)
  2021-12-07 20:55 ` [PATCH v3 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2021-12-07 20:55 ` Alexander Lobakin
  2021-12-08 13:47   ` Jesper Dangaard Brouer
  8 siblings, 1 reply; 12+ messages in thread
From: Alexander Lobakin @ 2021-12-07 20:55 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jeff Kirsher, Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes,
	Vedang Patel, netdev, linux-kernel, bpf, Jesper Dangaard Brouer

For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
net_prefetch() xdp->data_meta and align the copy size to speed-up
memcpy() a little and better match ixgbee_costruct_skb().

Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support")
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index db20dc4c2488..ec1e2da72676 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -209,19 +209,25 @@ bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count)
 static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
 					      const struct xdp_buff *xdp)
 {
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
+	net_prefetch(xdp->data_meta);
+
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
-	if (metasize)
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
+	if (metasize) {
 		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 	return skb;
 }
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 net-next 9/9] i40e: respect metadata on XSK Rx to skb
  2021-12-07 20:55 ` [PATCH v3 net-next 9/9] i40e: respect metadata " Alexander Lobakin
@ 2021-12-08 13:47   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 12+ messages in thread
From: Jesper Dangaard Brouer @ 2021-12-08 13:47 UTC (permalink / raw)
  To: Alexander Lobakin, intel-wired-lan
  Cc: brouer, Jesse Brandeburg, Tony Nguyen, David S. Miller,
	Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Björn Töpel,
	Maciej Fijalkowski, Michal Swiatkowski, Jeff Kirsher,
	Krzysztof Kazimierczak, Jithu Joseph, Andre Guedes, Vedang Patel,
	netdev, linux-kernel, bpf


On 07/12/2021 21.55, Alexander Lobakin wrote:
> For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
> be lost as it doesn't get copied to the skb.
> Copy it along with the frame headers. Account its size on skb
> allocation, and when copying just treat it as a part of the frame
> and do a pull after to "move" it to the "reserved" zone.
> net_prefetch() xdp->data_meta and align the copy size to speed-up
> memcpy() a little and better match ixgbee_costruct_skb().
                                      ^^^^^^^^^^^^^^^^^^^
Misspelling function name.

> 
> Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support")
> Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>   drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)

Wrong driver (i40e:) "tagged" in subject.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [Intel-wired-lan] [PATCH v3 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-07 20:55 ` [PATCH v3 net-next 2/9] i40e: respect metadata " Alexander Lobakin
@ 2022-01-10 10:31   ` Bhandare, KiranX
  0 siblings, 0 replies; 12+ messages in thread
From: Bhandare, KiranX @ 2022-01-10 10:31 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Andre Guedes, Jesper Dangaard Brouer, Daniel Borkmann, netdev,
	Krzysztof Kazimierczak, bpf, John Fastabend, Alexei Starovoitov,
	Brouer, Jesper, Björn Töpel, Jeff Kirsher,
	Jakub Kicinski, linux-kernel, David S. Miller, Vedang Patel


> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Lobakin, Alexandr
> Sent: Wednesday, December 8, 2021 2:25 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Andre Guedes <andre.guedes@intel.com>; Jesper Dangaard Brouer
> <hawk@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>;
> netdev@vger.kernel.org; Krzysztof Kazimierczak
> <krzysztof.kazimierczak@intel.com>; bpf@vger.kernel.org; John Fastabend
> <john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; Brouer,
> Jesper <brouer@redhat.com>; Björn Töpel <bjorn@kernel.org>; Jeff Kirsher
> <jeffrey.t.kirsher@intel.com>; Jakub Kicinski <kuba@kernel.org>; linux-
> kernel@vger.kernel.org; David S. Miller <davem@davemloft.net>; Vedang
> Patel <vedang.patel@intel.com>
> Subject: [Intel-wired-lan] [PATCH v3 net-next 2/9] i40e: respect metadata on
> XSK Rx to skb
> 
> For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost
> as it doesn't get copied to the skb.
> Copy it along with the frame headers. Account its size on skb allocation, and
> when copying just treat it as a part of the frame and do a pull after to "move"
> it to the "reserved" zone.
> net_prefetch() xdp->data_meta and align the copy size to speed-up
> memcpy() a little and better match i40e_costruct_skb().
> 
> Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-01-10 10:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-07 20:55 [PATCH v3 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 2/9] i40e: respect metadata " Alexander Lobakin
2022-01-10 10:31   ` [Intel-wired-lan] " Bhandare, KiranX
2021-12-07 20:55 ` [PATCH v3 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb() Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 5/9] ice: respect metadata " Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM " Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
2021-12-07 20:55 ` [PATCH v3 net-next 9/9] i40e: respect metadata " Alexander Lobakin
2021-12-08 13:47   ` Jesper Dangaard Brouer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).