bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata
@ 2021-12-08 14:06 Alexander Lobakin
  2021-12-08 14:06 ` [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
                   ` (9 more replies)
  0 siblings, 10 replies; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:06 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

This is an interpolation of [0] to other Intel Ethernet drivers
(and is (re)based on its code).
The main aim is to keep XDP metadata not only in case with
build_skb(), but also when we do napi_alloc_skb() + memcpy().

All Intel drivers suffers from the same here:
 - metadata gets lost on XDP_PASS in legacy-rx;
 - excessive headroom allocation on XSK Rx to skbs;
 - metadata gets lost on XSK Rx to skbs.

Those get especially actual in XDP Hints upcoming.
I couldn't have addressed the first one for all Intel drivers due to
that they don't reserve any headroom for now in legacy-rx mode even
with XDP enabled. This is hugely wrong, but requires quite a bunch
of work and a separate series. Luckily, ice doesn't suffer from
that.
igc has 1 and 3 already fixed in [0].

From v3 ([1]):
 - fix driver name and ixgbe_construct_skb() function name in the
   commit message of #9 (Jesper);
 - no functional changes.

From v2 (unreleased upstream):
 - tweaked 007 to pass bi->xdp directly and simplify code (Maciej);
 - picked Michal's Reviewed-by.

From v1 (unreleased upstream):
 - drop "fixes" of legacy-rx for i40e, igb and ixgbe since they have
   another flaw regarding headroom (see above);
 - drop igc cosmetic fixes since they landed upstream incorporated
   into Jesper's commits;
 - picked one Acked-by from Maciej.

[0] https://lore.kernel.org/netdev/163700856423.565980.10162564921347693758.stgit@firesoul
[1] https://lore.kernel.org/netdev/20211207205536.563550-1-alexandr.lobakin@intel.com

Alexander Lobakin (9):
  i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  i40e: respect metadata on XSK Rx to skb
  ice: respect metadata in legacy-rx/ice_construct_skb()
  ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  ice: respect metadata on XSK Rx to skb
  igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly
  ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  ixgbe: respect metadata on XSK Rx to skb

 drivers/net/ethernet/intel/i40e/i40e_xsk.c   | 16 +++++++-----
 drivers/net/ethernet/intel/ice/ice_txrx.c    | 15 ++++++++---
 drivers/net/ethernet/intel/ice/ice_xsk.c     | 16 +++++++-----
 drivers/net/ethernet/intel/igc/igc_main.c    | 13 +++++-----
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 27 ++++++++++++--------
 5 files changed, 54 insertions(+), 33 deletions(-)

-- 
Testing hints:

Setup an XDP and AF_XDP program which will prepend metadata at the
front of the frames and return XDP_PASS, then check that metadata
is present after frames reach kernel network stack.
--
2.33.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
@ 2021-12-08 14:06 ` Alexander Lobakin
  2021-12-09  8:19   ` Jesper Dangaard Brouer
  2022-01-29  8:55   ` [Intel-wired-lan] " Bhandare, KiranX
  2021-12-08 14:06 ` [PATCH v4 net-next 2/9] i40e: respect metadata " Alexander Lobakin
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:06 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, i40e_construct_skb_zc() currently allocates and reserves
additional `xdp->data - xdp->data_hard_start`, which is
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only to __napi_alloc_skb() and
don't reserve anything. This will give enough headroom for stack
processing.

Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index f08d19b8c554..9564906b7da8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -245,13 +245,11 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		goto out;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
  2021-12-08 14:06 ` [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2021-12-08 14:06 ` Alexander Lobakin
  2021-12-09  8:27   ` Jesper Dangaard Brouer
  2022-01-10 11:24   ` [Intel-wired-lan] " Bhandare, KiranX
  2021-12-08 14:06 ` [PATCH v4 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb() Alexander Lobakin
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:06 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf,
	Jesper Dangaard Brouer

For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
net_prefetch() xdp->data_meta and align the copy size to speed-up
memcpy() a little and better match i40e_costruct_skb().

Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 9564906b7da8..0e8cf275e084 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -240,19 +240,25 @@ bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count)
 static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
 					     struct xdp_buff *xdp)
 {
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
+	net_prefetch(xdp->data_meta);
+
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		goto out;
 
-	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
-	if (metasize)
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
+	if (metasize) {
 		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 out:
 	xsk_buff_free(xdp);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb()
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
  2021-12-08 14:06 ` [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
  2021-12-08 14:06 ` [PATCH v4 net-next 2/9] i40e: respect metadata " Alexander Lobakin
@ 2021-12-08 14:06 ` Alexander Lobakin
  2022-01-10 10:16   ` [Intel-wired-lan] " Bhandare, KiranX
  2021-12-08 14:06 ` [PATCH v4 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:06 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf,
	Jesper Dangaard Brouer

In "legacy-rx" mode represented by ice_construct_skb(), we can
still use XDP (and XDP metadata), but after XDP_PASS the metadata
will be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
Point net_prefetch() to xdp->data_meta instead of data. This won't
change anything when the meta is not here, but will save some cache
misses otherwise.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index bc3ba19dc88f..d724b6376c43 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -968,15 +968,17 @@ static struct sk_buff *
 ice_construct_skb(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf,
 		  struct xdp_buff *xdp)
 {
+	unsigned int metasize = xdp->data - xdp->data_meta;
 	unsigned int size = xdp->data_end - xdp->data;
 	unsigned int headlen;
 	struct sk_buff *skb;
 
 	/* prefetch first cache line of first page */
-	net_prefetch(xdp->data);
+	net_prefetch(xdp->data_meta);
 
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, ICE_RX_HDR_SIZE,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
+			       ICE_RX_HDR_SIZE + metasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
@@ -988,8 +990,13 @@ ice_construct_skb(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf,
 		headlen = eth_get_headlen(skb->dev, xdp->data, ICE_RX_HDR_SIZE);
 
 	/* align pull length to size of long to optimize memcpy performance */
-	memcpy(__skb_put(skb, headlen), xdp->data, ALIGN(headlen,
-							 sizeof(long)));
+	memcpy(__skb_put(skb, headlen + metasize), xdp->data_meta,
+	       ALIGN(headlen + metasize, sizeof(long)));
+
+	if (metasize) {
+		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 	/* if we exhaust the linear part then add what is left as a frag */
 	size -= headlen;
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (2 preceding siblings ...)
  2021-12-08 14:06 ` [PATCH v4 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb() Alexander Lobakin
@ 2021-12-08 14:06 ` Alexander Lobakin
  2022-01-10 10:13   ` [Intel-wired-lan] " Bhandare, KiranX
  2021-12-08 14:06 ` [PATCH v4 net-next 5/9] ice: respect metadata " Alexander Lobakin
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:06 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, ice_construct_skb_zc() currently allocates and reserves
additional `xdp->data - xdp->data_hard_start`, which is
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only to __napi_alloc_skb() and
don't reserve anything. This will give enough headroom for stack
processing.

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index f8ea6b0633eb..f0bd8e1953bf 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -430,15 +430,13 @@ ice_construct_skb_zc(struct ice_rx_ring *rx_ring, struct xdp_buff **xdp_arr)
 	struct xdp_buff *xdp = *xdp_arr;
 	unsigned int metasize = xdp->data - xdp->data_meta;
 	unsigned int datasize = xdp->data_end - xdp->data;
-	unsigned int datasize_hard = xdp->data_end - xdp->data_hard_start;
 	struct sk_buff *skb;
 
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize_hard,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 5/9] ice: respect metadata on XSK Rx to skb
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (3 preceding siblings ...)
  2021-12-08 14:06 ` [PATCH v4 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2021-12-08 14:06 ` Alexander Lobakin
  2022-01-10 11:37   ` [Intel-wired-lan] " Bhandare, KiranX
  2021-12-08 14:06 ` [PATCH v4 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM " Alexander Lobakin
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:06 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf,
	Jesper Dangaard Brouer

For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
net_prefetch() xdp->data_meta and align the copy size to speed-up
memcpy() a little and better match ice_costruct_skb().

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index f0bd8e1953bf..57e49e652439 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -428,18 +428,24 @@ static struct sk_buff *
 ice_construct_skb_zc(struct ice_rx_ring *rx_ring, struct xdp_buff **xdp_arr)
 {
 	struct xdp_buff *xdp = *xdp_arr;
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
+	net_prefetch(xdp->data_meta);
+
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
-	if (metasize)
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
+	if (metasize) {
 		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 	xsk_buff_free(xdp);
 	*xdp_arr = NULL;
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (4 preceding siblings ...)
  2021-12-08 14:06 ` [PATCH v4 net-next 5/9] ice: respect metadata " Alexander Lobakin
@ 2021-12-08 14:06 ` Alexander Lobakin
  2021-12-27 20:34   ` [Intel-wired-lan] " Kraus, NechamaX
  2021-12-08 14:07 ` [PATCH v4 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly Alexander Lobakin
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:06 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, igc_construct_skb_zc() currently allocates and reserves
additional `xdp->data_meta - xdp->data_hard_start`, which is about
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only (+ meta) to
__napi_alloc_skb() and don't reserve anything. This will give
enough headroom for stack processing.
Also, net_prefetch() xdp->data_meta and align the copy size to
speed-up memcpy() a little and better match igc_costruct_skb().

Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 142c57b7a451..a2e8d43be3a1 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2446,19 +2446,20 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
 					    struct xdp_buff *xdp)
 {
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
-	unsigned int totalsize = metasize + datasize;
 	struct sk_buff *skb;
 
-	skb = __napi_alloc_skb(&ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+	net_prefetch(xdp->data_meta);
+
+	skb = __napi_alloc_skb(&ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, xdp->data_meta - xdp->data_hard_start);
-	memcpy(__skb_put(skb, totalsize), xdp->data_meta, totalsize);
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
 	if (metasize) {
 		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (5 preceding siblings ...)
  2021-12-08 14:06 ` [PATCH v4 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM " Alexander Lobakin
@ 2021-12-08 14:07 ` Alexander Lobakin
  2022-01-11  7:30   ` [Intel-wired-lan] " Penigalapati, Sandeep
  2021-12-08 14:07 ` [PATCH v4 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:07 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

To not dereference bi->xdp each time in ixgbe_construct_skb_zc(),
pass bi->xdp as an argument instead of bi. We can also call
xsk_buff_free() outside of the function as well as assign bi->xdp
to NULL, which seems to make it closer to its name.

Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index db2bc58dfcfd..1d74a7980d81 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -207,26 +207,24 @@ bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count)
 }
 
 static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
-					      struct ixgbe_rx_buffer *bi)
+					      const struct xdp_buff *xdp)
 {
-	unsigned int metasize = bi->xdp->data - bi->xdp->data_meta;
-	unsigned int datasize = bi->xdp->data_end - bi->xdp->data;
+	unsigned int metasize = xdp->data - xdp->data_meta;
+	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
 	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       bi->xdp->data_end - bi->xdp->data_hard_start,
+			       xdp->data_end - xdp->data_hard_start,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, bi->xdp->data - bi->xdp->data_hard_start);
-	memcpy(__skb_put(skb, datasize), bi->xdp->data, datasize);
+	skb_reserve(skb, xdp->data - xdp->data_hard_start);
+	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
 
-	xsk_buff_free(bi->xdp);
-	bi->xdp = NULL;
 	return skb;
 }
 
@@ -317,12 +315,15 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector,
 		}
 
 		/* XDP_PASS path */
-		skb = ixgbe_construct_skb_zc(rx_ring, bi);
+		skb = ixgbe_construct_skb_zc(rx_ring, bi->xdp);
 		if (!skb) {
 			rx_ring->rx_stats.alloc_rx_buff_failed++;
 			break;
 		}
 
+		xsk_buff_free(bi->xdp);
+		bi->xdp = NULL;
+
 		cleaned_count++;
 		ixgbe_inc_ntc(rx_ring);
 
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (6 preceding siblings ...)
  2021-12-08 14:07 ` [PATCH v4 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly Alexander Lobakin
@ 2021-12-08 14:07 ` Alexander Lobakin
  2022-01-11 11:51   ` [Intel-wired-lan] " Penigalapati, Sandeep
  2021-12-08 14:07 ` [PATCH v4 net-next 9/9] ixgbe: respect metadata " Alexander Lobakin
  2022-01-10 10:11 ` [Intel-wired-lan] [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Bhandare, KiranX
  9 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:07 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
+ NET_IP_ALIGN for any skb.
OTOH, ixgbe_construct_skb_zc() currently allocates and reserves
additional `xdp->data - xdp->data_hard_start`, which is
XDP_PACKET_HEADROOM for XSK frames.
There's no need for that at all as the frame is post-XDP and will
go only to the networking stack core.
Pass the size of the actual data only to __napi_alloc_skb() and
don't reserve anything. This will give enough headroom for stack
processing.

Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index 1d74a7980d81..db20dc4c2488 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -214,13 +214,11 @@ static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 9/9] ixgbe: respect metadata on XSK Rx to skb
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (7 preceding siblings ...)
  2021-12-08 14:07 ` [PATCH v4 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2021-12-08 14:07 ` Alexander Lobakin
  2022-01-11 11:52   ` [Intel-wired-lan] " Penigalapati, Sandeep
  2022-01-10 10:11 ` [Intel-wired-lan] [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Bhandare, KiranX
  9 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-08 14:07 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Alexander Lobakin, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf,
	Jesper Dangaard Brouer

For now, if the XDP prog returns XDP_PASS on XSK, the metadata
will be lost as it doesn't get copied to the skb.
Copy it along with the frame headers. Account its size on skb
allocation, and when copying just treat it as a part of the frame
and do a pull after to "move" it to the "reserved" zone.
net_prefetch() xdp->data_meta and align the copy size to speed-up
memcpy() a little and better match ixgbe_costruct_skb().

Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support")
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index db20dc4c2488..ec1e2da72676 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -209,19 +209,25 @@ bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count)
 static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
 					      const struct xdp_buff *xdp)
 {
+	unsigned int totalsize = xdp->data_end - xdp->data_meta;
 	unsigned int metasize = xdp->data - xdp->data_meta;
-	unsigned int datasize = xdp->data_end - xdp->data;
 	struct sk_buff *skb;
 
+	net_prefetch(xdp->data_meta);
+
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, totalsize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		return NULL;
 
-	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
-	if (metasize)
+	memcpy(__skb_put(skb, totalsize), xdp->data_meta,
+	       ALIGN(totalsize, sizeof(long)));
+
+	if (metasize) {
 		skb_metadata_set(skb, metasize);
+		__skb_pull(skb, metasize);
+	}
 
 	return skb;
 }
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 ` [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2021-12-09  8:19   ` Jesper Dangaard Brouer
  2021-12-09 17:33     ` Alexander Lobakin
  2022-01-29  8:55   ` [Intel-wired-lan] " Bhandare, KiranX
  1 sibling, 1 reply; 27+ messages in thread
From: Jesper Dangaard Brouer @ 2021-12-09  8:19 UTC (permalink / raw)
  To: Alexander Lobakin, intel-wired-lan
  Cc: brouer, Jesse Brandeburg, Tony Nguyen, David S. Miller,
	Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Björn Töpel,
	Maciej Fijalkowski, Michal Swiatkowski, Jithu Joseph,
	Martin KaFai Lau, Song Liu, KP Singh, Yonghong Song,
	Andrii Nakryiko, netdev, linux-kernel, bpf



On 08/12/2021 15.06, Alexander Lobakin wrote:
> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> + NET_IP_ALIGN for any skb.
> OTOH, i40e_construct_skb_zc() currently allocates and reserves
> additional `xdp->data - xdp->data_hard_start`, which is
> XDP_PACKET_HEADROOM for XSK frames.
> There's no need for that at all as the frame is post-XDP and will
> go only to the networking stack core.

I disagree with this assumption, that headroom is not needed by netstack.
Why "no need for that at all" for netstack?

Having headroom is important for netstack in general.  When packet will 
grow we avoid realloc of SKB.  Use-case could also be cpumap or veth 
redirect, or XDP-generic, that expect this headroom.


> Pass the size of the actual data only to __napi_alloc_skb() and
> don't reserve anything. This will give enough headroom for stack
> processing.
> 
> Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index f08d19b8c554..9564906b7da8 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -245,13 +245,11 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
>   	struct sk_buff *skb;
>   
>   	/* allocate a skb to store the frags */
> -	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
> -			       xdp->data_end - xdp->data_hard_start,
> +	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
>   			       GFP_ATOMIC | __GFP_NOWARN);
>   	if (unlikely(!skb))
>   		goto out;
>   
> -	skb_reserve(skb, xdp->data - xdp->data_hard_start);
>   	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
>   	if (metasize)
>   		skb_metadata_set(skb, metasize);
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-08 14:06 ` [PATCH v4 net-next 2/9] i40e: respect metadata " Alexander Lobakin
@ 2021-12-09  8:27   ` Jesper Dangaard Brouer
  2021-12-09 17:38     ` Alexander Lobakin
  2022-01-10 11:24   ` [Intel-wired-lan] " Bhandare, KiranX
  1 sibling, 1 reply; 27+ messages in thread
From: Jesper Dangaard Brouer @ 2021-12-09  8:27 UTC (permalink / raw)
  To: Alexander Lobakin, intel-wired-lan
  Cc: brouer, Jesse Brandeburg, Tony Nguyen, David S. Miller,
	Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Björn Töpel,
	Maciej Fijalkowski, Michal Swiatkowski, Jithu Joseph,
	Martin KaFai Lau, Song Liu, KP Singh, Yonghong Song,
	Andrii Nakryiko, netdev, linux-kernel, bpf



On 08/12/2021 15.06, Alexander Lobakin wrote:
> For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
> be lost as it doesn't get copied to the skb.

I have an urge to add a newline here, when reading this, as IMHO it is a 
paragraph with the problem statement.

> Copy it along with the frame headers. Account its size on skb
> allocation, and when copying just treat it as a part of the frame
> and do a pull after to "move" it to the "reserved" zone.

Also newline here, as next paragraph are some extra details, you felt a 
need to explain to the reader.

> net_prefetch() xdp->data_meta and align the copy size to speed-up
> memcpy() a little and better match i40e_costruct_skb().
                                      ^^^^^^xx^^^^^^^^^

You have a general misspelling of this function name in all of your 
commit messages.

--Jesper


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-09  8:19   ` Jesper Dangaard Brouer
@ 2021-12-09 17:33     ` Alexander Lobakin
  2021-12-10 13:31       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-09 17:33 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Alexander Lobakin, intel-wired-lan, brouer, Jesse Brandeburg,
	Tony Nguyen, David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Thu, 9 Dec 2021 09:19:46 +0100

> On 08/12/2021 15.06, Alexander Lobakin wrote:
> > {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> > + NET_IP_ALIGN for any skb.
> > OTOH, i40e_construct_skb_zc() currently allocates and reserves
> > additional `xdp->data - xdp->data_hard_start`, which is
> > XDP_PACKET_HEADROOM for XSK frames.
> > There's no need for that at all as the frame is post-XDP and will
> > go only to the networking stack core.
> 
> I disagree with this assumption, that headroom is not needed by netstack.
> Why "no need for that at all" for netstack?

napi_alloc_skb() in our particular case will reserve 64 bytes, it is
sufficient for {TCP,UDP,SCTP,...}/IPv{4,6} etc.

> 
> Having headroom is important for netstack in general.  When packet will 
> grow we avoid realloc of SKB.  Use-case could also be cpumap or veth 
> redirect, or XDP-generic, that expect this headroom.

Well, those are not common cases at all.
Allocating 256 bytes more for some hypothetical usecases (and having
320 in total) is more expensive than expanding headroom in-place.
I don't know any other drivers or ifaces which reserve
XDP_PACKET_HEADROOM just for the case of using both driver-side
and generic XDP at the same time. To be more precise, I can't
remember any driver which would check whether generic XDP is enabled
for its netdev(s).

As a second option, I was trying to get exactly XDP_PACKET_HEADROOM
of headroom, but this involves either __alloc_skb() which is slower
than napi_alloc_skb(), or

	skb = napi_alloc_skb(napi, xdp->data_end -
				   xdp->data_hard_start -
				   NET_SKB_PAD);
	skb_reserve(skb, xdp->data_meta - xdp->data_hard_start -
			 NET_SKB_PAD);

Doesn't look good for me.

We could probably introduce a version of napi_alloc_skb() which
wouldn't reserve any headroom for you to have more control over it,
but that's more global material than these local fixes I'd say.

> 
> 
> > Pass the size of the actual data only to __napi_alloc_skb() and
> > don't reserve anything. This will give enough headroom for stack
> > processing.
> > 
> > Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> > Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> > Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> > ---
> >   drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
> >   1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > index f08d19b8c554..9564906b7da8 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > @@ -245,13 +245,11 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
> >   	struct sk_buff *skb;
> >   
> >   	/* allocate a skb to store the frags */
> > -	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
> > -			       xdp->data_end - xdp->data_hard_start,
> > +	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
> >   			       GFP_ATOMIC | __GFP_NOWARN);
> >   	if (unlikely(!skb))
> >   		goto out;
> >   
> > -	skb_reserve(skb, xdp->data - xdp->data_hard_start);
> >   	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
> >   	if (metasize)
> >   		skb_metadata_set(skb, metasize);

Thanks,
Al

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-09  8:27   ` Jesper Dangaard Brouer
@ 2021-12-09 17:38     ` Alexander Lobakin
  2021-12-09 18:50       ` Nguyen, Anthony L
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-09 17:38 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Alexander Lobakin, intel-wired-lan, brouer, Jesse Brandeburg,
	Tony Nguyen, David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf

From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Thu, 9 Dec 2021 09:27:37 +0100

> On 08/12/2021 15.06, Alexander Lobakin wrote:
> > For now, if the XDP prog returns XDP_PASS on XSK, the metadata will
> > be lost as it doesn't get copied to the skb.
> 
> I have an urge to add a newline here, when reading this, as IMHO it is a 
> paragraph with the problem statement.
> 
> > Copy it along with the frame headers. Account its size on skb
> > allocation, and when copying just treat it as a part of the frame
> > and do a pull after to "move" it to the "reserved" zone.
> 
> Also newline here, as next paragraph are some extra details, you felt a 
> need to explain to the reader.
> 
> > net_prefetch() xdp->data_meta and align the copy size to speed-up
> > memcpy() a little and better match i40e_costruct_skb().
>                                       ^^^^^^xx^^^^^^^^^
> 
> You have a general misspelling of this function name in all of your 
> commit messages.

Oh gosh, I thought I don't have attention deficit. Thanks, maybe
Tony will fix it for me or I could send a follow-up (or resend if
needed, I saw those were already applied to dev-queue).

> 
> --Jesper

Al

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-09 17:38     ` Alexander Lobakin
@ 2021-12-09 18:50       ` Nguyen, Anthony L
  2021-12-10 11:08         ` Alexander Lobakin
  0 siblings, 1 reply; 27+ messages in thread
From: Nguyen, Anthony L @ 2021-12-09 18:50 UTC (permalink / raw)
  To: Lobakin, Alexandr, jbrouer
  Cc: songliubraving, hawk, kafai, andrii, davem, Fijalkowski, Maciej,
	john.fastabend, Brandeburg, Jesse, kpsingh, ast, linux-kernel,
	brouer, yhs, Joseph, Jithu, intel-wired-lan, kuba, bjorn, daniel,
	netdev, michal.swiatkowski, bpf

On Thu, 2021-12-09 at 18:38 +0100, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
> Date: Thu, 9 Dec 2021 09:27:37 +0100
> 
> > On 08/12/2021 15.06, Alexander Lobakin wrote:
> > > For now, if the XDP prog returns XDP_PASS on XSK, the metadata
> > > will
> > > be lost as it doesn't get copied to the skb.
> > 
> > I have an urge to add a newline here, when reading this, as IMHO it
> > is a 
> > paragraph with the problem statement.
> > 
> > > Copy it along with the frame headers. Account its size on skb
> > > allocation, and when copying just treat it as a part of the frame
> > > and do a pull after to "move" it to the "reserved" zone.
> > 
> > Also newline here, as next paragraph are some extra details, you
> > felt a 
> > need to explain to the reader.
> > 
> > > net_prefetch() xdp->data_meta and align the copy size to speed-up
> > > memcpy() a little and better match i40e_costruct_skb().
> >                                       ^^^^^^xx^^^^^^^^^
> > 
> > commit messages.
> 
> Oh gosh, I thought I don't have attention deficit. Thanks, maybe
> Tony will fix it for me or I could send a follow-up (or resend if
> needed, I saw those were already applied to dev-queue).

If there's no need for follow-ups beyond this change, I'll fix it up.

Thanks,
Tony

> > --Jesper
> 
> Al


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-09 18:50       ` Nguyen, Anthony L
@ 2021-12-10 11:08         ` Alexander Lobakin
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Lobakin @ 2021-12-10 11:08 UTC (permalink / raw)
  To: Tony Nguyen
  Cc: Alexander Lobakin, jbrouer, songliubraving, hawk, kafai, andrii,
	davem, Maciej Fijalkowski, john.fastabend, Jesse Brandeburg,
	kpsingh, ast, linux-kernel, brouer, yhs, Jith Joseph,
	intel-wired-lan, kuba, bjorn, daniel, netdev, Michal Swiatkowski,
	bpf

From: Tony Nguyen <anthony.l.nguyen@intel.com>
Date: Thu, 9 Dec 2021 18:50:07 +0000

> On Thu, 2021-12-09 at 18:38 +0100, Alexander Lobakin wrote:
> > From: Jesper Dangaard Brouer <jbrouer@redhat.com>
> > Date: Thu, 9 Dec 2021 09:27:37 +0100
> > 
> > > On 08/12/2021 15.06, Alexander Lobakin wrote:
> > > > For now, if the XDP prog returns XDP_PASS on XSK, the metadata
> > > > will
> > > > be lost as it doesn't get copied to the skb.
> > > 
> > > I have an urge to add a newline here, when reading this, as IMHO it
> > > is a 
> > > paragraph with the problem statement.
> > > 
> > > > Copy it along with the frame headers. Account its size on skb
> > > > allocation, and when copying just treat it as a part of the frame
> > > > and do a pull after to "move" it to the "reserved" zone.
> > > 
> > > Also newline here, as next paragraph are some extra details, you
> > > felt a 
> > > need to explain to the reader.
> > > 
> > > > net_prefetch() xdp->data_meta and align the copy size to speed-up
> > > > memcpy() a little and better match i40e_costruct_skb().
> > >                                       ^^^^^^xx^^^^^^^^^
> > > 
> > > commit messages.
> > 
> > Oh gosh, I thought I don't have attention deficit. Thanks, maybe
> > Tony will fix it for me or I could send a follow-up (or resend if
> > needed, I saw those were already applied to dev-queue).
> 
> If there's no need for follow-ups beyond this change, I'll fix it up.

The rest is fine, thank you!

> Thanks,
> Tony
> 
> > > --Jesper
> > 
> > Al

Al

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-09 17:33     ` Alexander Lobakin
@ 2021-12-10 13:31       ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 27+ messages in thread
From: Jesper Dangaard Brouer @ 2021-12-10 13:31 UTC (permalink / raw)
  To: Alexander Lobakin, Jesper Dangaard Brouer
  Cc: brouer, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Michal Swiatkowski,
	Jithu Joseph, Martin KaFai Lau, Song Liu, KP Singh,
	Yonghong Song, Andrii Nakryiko, netdev, linux-kernel, bpf



On 09/12/2021 18.33, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
> Date: Thu, 9 Dec 2021 09:19:46 +0100
> 
>> On 08/12/2021 15.06, Alexander Lobakin wrote:
>>> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
>>> + NET_IP_ALIGN for any skb.
>>> OTOH, i40e_construct_skb_zc() currently allocates and reserves
>>> additional `xdp->data - xdp->data_hard_start`, which is
>>> XDP_PACKET_HEADROOM for XSK frames.
>>> There's no need for that at all as the frame is post-XDP and will
>>> go only to the networking stack core.
>>
>> I disagree with this assumption, that headroom is not needed by netstack.
>> Why "no need for that at all" for netstack?
> 
> napi_alloc_skb() in our particular case will reserve 64 bytes, it is
> sufficient for {TCP,UDP,SCTP,...}/IPv{4,6} etc.

My bad, I misunderstood you. I now see (looking at code) that (as you 
say) 64 bytes of headroom *is* reserved (in bottom of __napi_alloc_skb).
Thus, the SKB *do* have headroom, so this patch should be fine.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

Do watch out that 64 bytes is not always enough. Notice the define 
LL_MAX_HEADER and MAX_HEADER in include/linux/netdevice.h (that tries to 
determine worst-case header length) which is above 64 bytes. It is also 
affected by HyperV and WiFi configs.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-wired-lan] [PATCH v4 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 ` [PATCH v4 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM " Alexander Lobakin
@ 2021-12-27 20:34   ` Kraus, NechamaX
  0 siblings, 0 replies; 27+ messages in thread
From: Kraus, NechamaX @ 2021-12-27 20:34 UTC (permalink / raw)
  To: Alexander Lobakin, intel-wired-lan
  Cc: Song Liu, Jesper Dangaard Brouer, Daniel Borkmann, Yonghong Song,
	Martin KaFai Lau, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Björn Töpel, netdev, Jakub Kicinski,
	KP Singh, bpf, David S. Miller, linux-kernel

On 12/8/2021 16:06, Alexander Lobakin wrote:
> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> + NET_IP_ALIGN for any skb.
> OTOH, igc_construct_skb_zc() currently allocates and reserves
> additional `xdp->data_meta - xdp->data_hard_start`, which is about
> XDP_PACKET_HEADROOM for XSK frames.
> There's no need for that at all as the frame is post-XDP and will
> go only to the networking stack core.
> Pass the size of the actual data only (+ meta) to
> __napi_alloc_skb() and don't reserve anything. This will give
> enough headroom for stack processing.
> Also, net_prefetch() xdp->data_meta and align the copy size to
> speed-up memcpy() a little and better match igc_costruct_skb().
> 
> Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>   drivers/net/ethernet/intel/igc/igc_main.c | 13 +++++++------
>   1 file changed, 7 insertions(+), 6 deletions(-)
>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata
  2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
                   ` (8 preceding siblings ...)
  2021-12-08 14:07 ` [PATCH v4 net-next 9/9] ixgbe: respect metadata " Alexander Lobakin
@ 2022-01-10 10:11 ` Bhandare, KiranX
  9 siblings, 0 replies; 27+ messages in thread
From: Bhandare, KiranX @ 2022-01-10 10:11 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Jesper Dangaard Brouer, Daniel Borkmann, Yonghong Song,
	Martin KaFai Lau, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Björn Töpel, netdev, Jakub Kicinski,
	KP Singh, bpf, David S. Miller, linux-kernel


> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Alexander Lobakin
> Sent: Wednesday, December 8, 2021 7:37 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Song Liu <songliubraving@fb.com>; Jesper Dangaard Brouer
> <hawk@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Yonghong
> Song <yhs@fb.com>; Martin KaFai Lau <kafai@fb.com>; John Fastabend
> <john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; Andrii
> Nakryiko <andrii@kernel.org>; Björn Töpel <bjorn@kernel.org>;
> netdev@vger.kernel.org; Jakub Kicinski <kuba@kernel.org>; KP Singh
> <kpsingh@kernel.org>; bpf@vger.kernel.org; David S. Miller
> <davem@davemloft.net>; linux-kernel@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb()
> vs metadata
> 
> This is an interpolation of [0] to other Intel Ethernet drivers (and is (re)based
> on its code).
> The main aim is to keep XDP metadata not only in case with build_skb(), but
> also when we do napi_alloc_skb() + memcpy().
> 
> All Intel drivers suffers from the same here:
>  - metadata gets lost on XDP_PASS in legacy-rx;
>  - excessive headroom allocation on XSK Rx to skbs;
>  - metadata gets lost on XSK Rx to skbs.
> 
> Those get especially actual in XDP Hints upcoming.
> I couldn't have addressed the first one for all Intel drivers due to that they
> don't reserve any headroom for now in legacy-rx mode even with XDP
> enabled. This is hugely wrong, but requires quite a bunch of work and a
> separate series. Luckily, ice doesn't suffer from that.
> igc has 1 and 3 already fixed in [0].
> 
> From v3 ([1]):
>  - fix driver name and ixgbe_construct_skb() function name in the
>    commit message of #9 (Jesper);
>  - no functional changes.
> 
> From v2 (unreleased upstream):
>  - tweaked 007 to pass bi->xdp directly and simplify code (Maciej);
>  - picked Michal's Reviewed-by.
> 
> From v1 (unreleased upstream):
>  - drop "fixes" of legacy-rx for i40e, igb and ixgbe since they have
>    another flaw regarding headroom (see above);
>  - drop igc cosmetic fixes since they landed upstream incorporated
>    into Jesper's commits;
>  - picked one Acked-by from Maciej.
> 
> [0]
> https://lore.kernel.org/netdev/163700856423.565980.101625649213476937
> 58.stgit@firesoul
> [1] https://lore.kernel.org/netdev/20211207205536.563550-1-
> alexandr.lobakin@intel.com
> 
> Alexander Lobakin (9):
>   i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
>   i40e: respect metadata on XSK Rx to skb
>   ice: respect metadata in legacy-rx/ice_construct_skb()
>   ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
>   ice: respect metadata on XSK Rx to skb
>   igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
>   ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly
>   ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
>   ixgbe: respect metadata on XSK Rx to skb
> 
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c   | 16 +++++++-----
>  drivers/net/ethernet/intel/ice/ice_txrx.c    | 15 ++++++++---
>  drivers/net/ethernet/intel/ice/ice_xsk.c     | 16 +++++++-----
>  drivers/net/ethernet/intel/igc/igc_main.c    | 13 +++++-----
>  drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 27 ++++++++++++--------
>  5 files changed, 54 insertions(+), 33 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 ` [PATCH v4 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2022-01-10 10:13   ` Bhandare, KiranX
  0 siblings, 0 replies; 27+ messages in thread
From: Bhandare, KiranX @ 2022-01-10 10:13 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Jesper Dangaard Brouer, Daniel Borkmann, Yonghong Song,
	Martin KaFai Lau, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Björn Töpel, netdev, Jakub Kicinski,
	KP Singh, bpf, David S. Miller, linux-kernel


> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Alexander Lobakin
> Sent: Wednesday, December 8, 2021 7:37 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Song Liu <songliubraving@fb.com>; Jesper Dangaard Brouer
> <hawk@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Yonghong
> Song <yhs@fb.com>; Martin KaFai Lau <kafai@fb.com>; John Fastabend
> <john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; Andrii
> Nakryiko <andrii@kernel.org>; Björn Töpel <bjorn@kernel.org>;
> netdev@vger.kernel.org; Jakub Kicinski <kuba@kernel.org>; KP Singh
> <kpsingh@kernel.org>; bpf@vger.kernel.org; David S. Miller
> <davem@davemloft.net>; linux-kernel@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH v4 net-next 4/9] ice: don't reserve excessive
> XDP_PACKET_HEADROOM on XSK Rx to skb
> 
> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> + NET_IP_ALIGN for any skb.
> OTOH, ice_construct_skb_zc() currently allocates and reserves additional
> `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for
> XSK frames.
> There's no need for that at all as the frame is post-XDP and will go only to the
> networking stack core.
> Pass the size of the actual data only to __napi_alloc_skb() and don't reserve
> anything. This will give enough headroom for stack processing.
> 
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb()
  2021-12-08 14:06 ` [PATCH v4 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb() Alexander Lobakin
@ 2022-01-10 10:16   ` Bhandare, KiranX
  0 siblings, 0 replies; 27+ messages in thread
From: Bhandare, KiranX @ 2022-01-10 10:16 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, Brouer, Jesper, Yonghong Song,
	Jesper Dangaard Brouer, KP Singh, Jakub Kicinski, netdev,
	linux-kernel, David S. Miller, Björn Töpel, bpf,
	Martin KaFai Lau


> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Alexander Lobakin
> Sent: Wednesday, December 8, 2021 7:37 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Song Liu <songliubraving@fb.com>; Alexei Starovoitov <ast@kernel.org>;
> Andrii Nakryiko <andrii@kernel.org>; Daniel Borkmann
> <daniel@iogearbox.net>; John Fastabend <john.fastabend@gmail.com>;
> Jesper Dangaard Brouer <brouer@redhat.com>; Yonghong Song
> <yhs@fb.com>; Jesper Dangaard Brouer <hawk@kernel.org>; KP Singh
> <kpsingh@kernel.org>; Jakub Kicinski <kuba@kernel.org>;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; David S. Miller
> <davem@davemloft.net>; Björn Töpel <bjorn@kernel.org>;
> bpf@vger.kernel.org; Martin KaFai Lau <kafai@fb.com>
> Subject: [Intel-wired-lan] [PATCH v4 net-next 3/9] ice: respect metadata in
> legacy-rx/ice_construct_skb()
> 
> In "legacy-rx" mode represented by ice_construct_skb(), we can still use XDP
> (and XDP metadata), but after XDP_PASS the metadata will be lost as it
> doesn't get copied to the skb.
> Copy it along with the frame headers. Account its size on skb allocation, and
> when copying just treat it as a part of the frame and do a pull after to "move"
> it to the "reserved" zone.
> Point net_prefetch() to xdp->data_meta instead of data. This won't change
> anything when the meta is not here, but will save some cache misses
> otherwise.
> 
> Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_txrx.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 2/9] i40e: respect metadata on XSK Rx to skb
  2021-12-08 14:06 ` [PATCH v4 net-next 2/9] i40e: respect metadata " Alexander Lobakin
  2021-12-09  8:27   ` Jesper Dangaard Brouer
@ 2022-01-10 11:24   ` Bhandare, KiranX
  1 sibling, 0 replies; 27+ messages in thread
From: Bhandare, KiranX @ 2022-01-10 11:24 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, Brouer, Jesper, Yonghong Song,
	Jesper Dangaard Brouer, KP Singh, Jakub Kicinski, netdev,
	linux-kernel, David S. Miller, Björn Töpel, bpf,
	Martin KaFai Lau


> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Lobakin, Alexandr
> Sent: Wednesday, December 8, 2021 7:37 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Song Liu <songliubraving@fb.com>; Alexei Starovoitov <ast@kernel.org>;
> Andrii Nakryiko <andrii@kernel.org>; Daniel Borkmann
> <daniel@iogearbox.net>; John Fastabend <john.fastabend@gmail.com>;
> Brouer, Jesper <brouer@redhat.com>; Yonghong Song <yhs@fb.com>;
> Jesper Dangaard Brouer <hawk@kernel.org>; KP Singh
> <kpsingh@kernel.org>; Jakub Kicinski <kuba@kernel.org>;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; David S. Miller
> <davem@davemloft.net>; Björn Töpel <bjorn@kernel.org>;
> bpf@vger.kernel.org; Martin KaFai Lau <kafai@fb.com>
> Subject: [Intel-wired-lan] [PATCH v4 net-next 2/9] i40e: respect metadata on
> XSK Rx to skb
> 
> For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost
> as it doesn't get copied to the skb.
> Copy it along with the frame headers. Account its size on skb allocation, and
> when copying just treat it as a part of the frame and do a pull after to "move"
> it to the "reserved" zone.
> net_prefetch() xdp->data_meta and align the copy size to speed-up
> memcpy() a little and better match i40e_costruct_skb().
> 
> Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 5/9] ice: respect metadata on XSK Rx to skb
  2021-12-08 14:06 ` [PATCH v4 net-next 5/9] ice: respect metadata " Alexander Lobakin
@ 2022-01-10 11:37   ` Bhandare, KiranX
  0 siblings, 0 replies; 27+ messages in thread
From: Bhandare, KiranX @ 2022-01-10 11:37 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, Brouer, Jesper, Yonghong Song,
	Jesper Dangaard Brouer, KP Singh, Jakub Kicinski, netdev,
	linux-kernel, David S. Miller, Björn Töpel, bpf,
	Martin KaFai Lau


> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Alexander Lobakin
> Sent: Wednesday, December 8, 2021 7:37 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Song Liu <songliubraving@fb.com>; Alexei Starovoitov <ast@kernel.org>;
> Andrii Nakryiko <andrii@kernel.org>; Daniel Borkmann
> <daniel@iogearbox.net>; John Fastabend <john.fastabend@gmail.com>;
> Jesper Dangaard Brouer <brouer@redhat.com>; Yonghong Song
> <yhs@fb.com>; Jesper Dangaard Brouer <hawk@kernel.org>; KP Singh
> <kpsingh@kernel.org>; Jakub Kicinski <kuba@kernel.org>;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; David S. Miller
> <davem@davemloft.net>; Björn Töpel <bjorn@kernel.org>;
> bpf@vger.kernel.org; Martin KaFai Lau <kafai@fb.com>
> Subject: [Intel-wired-lan] [PATCH v4 net-next 5/9] ice: respect metadata on
> XSK Rx to skb
> 
> For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost
> as it doesn't get copied to the skb.
> Copy it along with the frame headers. Account its size on skb allocation, and
> when copying just treat it as a part of the frame and do a pull after to "move"
> it to the "reserved" zone.
> net_prefetch() xdp->data_meta and align the copy size to speed-up
> memcpy() a little and better match ice_costruct_skb().
> 
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly
  2021-12-08 14:07 ` [PATCH v4 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly Alexander Lobakin
@ 2022-01-11  7:30   ` Penigalapati, Sandeep
  0 siblings, 0 replies; 27+ messages in thread
From: Penigalapati, Sandeep @ 2022-01-11  7:30 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Jesper Dangaard Brouer, Daniel Borkmann, Yonghong Song,
	Martin KaFai Lau, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Björn Töpel, netdev, Jakub Kicinski,
	KP Singh, bpf, David S. Miller, linux-kernel

>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Alexander Lobakin
>Sent: Wednesday, December 8, 2021 7:37 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Song Liu <songliubraving@fb.com>; Jesper Dangaard Brouer
><hawk@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Yonghong
>Song <yhs@fb.com>; Martin KaFai Lau <kafai@fb.com>; John Fastabend
><john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; Andrii
>Nakryiko <andrii@kernel.org>; Björn Töpel <bjorn@kernel.org>;
>netdev@vger.kernel.org; Jakub Kicinski <kuba@kernel.org>; KP Singh
><kpsingh@kernel.org>; bpf@vger.kernel.org; David S. Miller
><davem@davemloft.net>; linux-kernel@vger.kernel.org
>Subject: [Intel-wired-lan] [PATCH v4 net-next 7/9] ixgbe: pass bi->xdp to
>ixgbe_construct_skb_zc() directly
>
>To not dereference bi->xdp each time in ixgbe_construct_skb_zc(), pass bi-
>>xdp as an argument instead of bi. We can also call
>xsk_buff_free() outside of the function as well as assign bi->xdp to NULL,
>which seems to make it closer to its name.
>
>Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>---
> drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 19 ++++++++++---------
> 1 file changed, 10 insertions(+), 9 deletions(-)
>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:07 ` [PATCH v4 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
@ 2022-01-11 11:51   ` Penigalapati, Sandeep
  0 siblings, 0 replies; 27+ messages in thread
From: Penigalapati, Sandeep @ 2022-01-11 11:51 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Jesper Dangaard Brouer, Daniel Borkmann, Yonghong Song,
	Martin KaFai Lau, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Björn Töpel, netdev, Jakub Kicinski,
	KP Singh, bpf, David S. Miller, linux-kernel

>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Alexander Lobakin
>Sent: Wednesday, December 8, 2021 7:37 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Song Liu <songliubraving@fb.com>; Jesper Dangaard Brouer
><hawk@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Yonghong
>Song <yhs@fb.com>; Martin KaFai Lau <kafai@fb.com>; John Fastabend
><john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; Andrii
>Nakryiko <andrii@kernel.org>; Björn Töpel <bjorn@kernel.org>;
>netdev@vger.kernel.org; Jakub Kicinski <kuba@kernel.org>; KP Singh
><kpsingh@kernel.org>; bpf@vger.kernel.org; David S. Miller
><davem@davemloft.net>; linux-kernel@vger.kernel.org
>Subject: [Intel-wired-lan] [PATCH v4 net-next 8/9] ixgbe: don't reserve
>excessive XDP_PACKET_HEADROOM on XSK Rx to skb
>
>{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
>+ NET_IP_ALIGN for any skb.
>OTOH, ixgbe_construct_skb_zc() currently allocates and reserves additional
>`xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for
>XSK frames.
>There's no need for that at all as the frame is post-XDP and will go only to the
>networking stack core.
>Pass the size of the actual data only to __napi_alloc_skb() and don't reserve
>anything. This will give enough headroom for stack processing.
>
>Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support")
>Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
>---
> drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 9/9] ixgbe: respect metadata on XSK Rx to skb
  2021-12-08 14:07 ` [PATCH v4 net-next 9/9] ixgbe: respect metadata " Alexander Lobakin
@ 2022-01-11 11:52   ` Penigalapati, Sandeep
  0 siblings, 0 replies; 27+ messages in thread
From: Penigalapati, Sandeep @ 2022-01-11 11:52 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	John Fastabend, Brouer, Jesper, Yonghong Song,
	Jesper Dangaard Brouer, KP Singh, Jakub Kicinski, netdev,
	linux-kernel, David S. Miller, Björn Töpel, bpf,
	Martin KaFai Lau

>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Alexander Lobakin
>Sent: Wednesday, December 8, 2021 7:37 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Song Liu <songliubraving@fb.com>; Alexei Starovoitov <ast@kernel.org>;
>Andrii Nakryiko <andrii@kernel.org>; Daniel Borkmann
><daniel@iogearbox.net>; John Fastabend <john.fastabend@gmail.com>;
>Jesper Dangaard Brouer <brouer@redhat.com>; Yonghong Song
><yhs@fb.com>; Jesper Dangaard Brouer <hawk@kernel.org>; KP Singh
><kpsingh@kernel.org>; Jakub Kicinski <kuba@kernel.org>;
>netdev@vger.kernel.org; linux-kernel@vger.kernel.org; David S. Miller
><davem@davemloft.net>; Björn Töpel <bjorn@kernel.org>;
>bpf@vger.kernel.org; Martin KaFai Lau <kafai@fb.com>
>Subject: [Intel-wired-lan] [PATCH v4 net-next 9/9] ixgbe: respect metadata on
>XSK Rx to skb
>
>For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost
>as it doesn't get copied to the skb.
>Copy it along with the frame headers. Account its size on skb allocation, and
>when copying just treat it as a part of the frame and do a pull after to "move"
>it to the "reserved" zone.
>net_prefetch() xdp->data_meta and align the copy size to speed-up
>memcpy() a little and better match ixgbe_costruct_skb().
>
>Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support")
>Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
>Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
>---
> drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [Intel-wired-lan] [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
  2021-12-08 14:06 ` [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
  2021-12-09  8:19   ` Jesper Dangaard Brouer
@ 2022-01-29  8:55   ` Bhandare, KiranX
  1 sibling, 0 replies; 27+ messages in thread
From: Bhandare, KiranX @ 2022-01-29  8:55 UTC (permalink / raw)
  To: Lobakin, Alexandr, intel-wired-lan
  Cc: Song Liu, Jesper Dangaard Brouer, Daniel Borkmann, Yonghong Song,
	Martin KaFai Lau, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Björn Töpel, netdev, Jakub Kicinski,
	KP Singh, bpf, David S. Miller, linux-kernel


> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Alexander Lobakin
> Sent: Wednesday, December 8, 2021 7:37 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Song Liu <songliubraving@fb.com>; Jesper Dangaard Brouer
> <hawk@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Yonghong
> Song <yhs@fb.com>; Martin KaFai Lau <kafai@fb.com>; John Fastabend
> <john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; Andrii
> Nakryiko <andrii@kernel.org>; Björn Töpel <bjorn@kernel.org>;
> netdev@vger.kernel.org; Jakub Kicinski <kuba@kernel.org>; KP Singh
> <kpsingh@kernel.org>; bpf@vger.kernel.org; David S. Miller
> <davem@davemloft.net>; linux-kernel@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH v4 net-next 1/9] i40e: don't reserve
> excessive XDP_PACKET_HEADROOM on XSK Rx to skb
> 
> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> + NET_IP_ALIGN for any skb.
> OTOH, i40e_construct_skb_zc() currently allocates and reserves additional
> `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for
> XSK frames.
> There's no need for that at all as the frame is post-XDP and will go only to the
> networking stack core.
> Pass the size of the actual data only to __napi_alloc_skb() and don't reserve
> anything. This will give enough headroom for stack processing.
> 
> Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-01-29  8:55 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-08 14:06 [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Alexander Lobakin
2021-12-08 14:06 ` [PATCH v4 net-next 1/9] i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
2021-12-09  8:19   ` Jesper Dangaard Brouer
2021-12-09 17:33     ` Alexander Lobakin
2021-12-10 13:31       ` Jesper Dangaard Brouer
2022-01-29  8:55   ` [Intel-wired-lan] " Bhandare, KiranX
2021-12-08 14:06 ` [PATCH v4 net-next 2/9] i40e: respect metadata " Alexander Lobakin
2021-12-09  8:27   ` Jesper Dangaard Brouer
2021-12-09 17:38     ` Alexander Lobakin
2021-12-09 18:50       ` Nguyen, Anthony L
2021-12-10 11:08         ` Alexander Lobakin
2022-01-10 11:24   ` [Intel-wired-lan] " Bhandare, KiranX
2021-12-08 14:06 ` [PATCH v4 net-next 3/9] ice: respect metadata in legacy-rx/ice_construct_skb() Alexander Lobakin
2022-01-10 10:16   ` [Intel-wired-lan] " Bhandare, KiranX
2021-12-08 14:06 ` [PATCH v4 net-next 4/9] ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
2022-01-10 10:13   ` [Intel-wired-lan] " Bhandare, KiranX
2021-12-08 14:06 ` [PATCH v4 net-next 5/9] ice: respect metadata " Alexander Lobakin
2022-01-10 11:37   ` [Intel-wired-lan] " Bhandare, KiranX
2021-12-08 14:06 ` [PATCH v4 net-next 6/9] igc: don't reserve excessive XDP_PACKET_HEADROOM " Alexander Lobakin
2021-12-27 20:34   ` [Intel-wired-lan] " Kraus, NechamaX
2021-12-08 14:07 ` [PATCH v4 net-next 7/9] ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly Alexander Lobakin
2022-01-11  7:30   ` [Intel-wired-lan] " Penigalapati, Sandeep
2021-12-08 14:07 ` [PATCH v4 net-next 8/9] ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb Alexander Lobakin
2022-01-11 11:51   ` [Intel-wired-lan] " Penigalapati, Sandeep
2021-12-08 14:07 ` [PATCH v4 net-next 9/9] ixgbe: respect metadata " Alexander Lobakin
2022-01-11 11:52   ` [Intel-wired-lan] " Penigalapati, Sandeep
2022-01-10 10:11 ` [Intel-wired-lan] [PATCH v4 net-next 0/9] net: intel: napi_alloc_skb() vs metadata Bhandare, KiranX

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).