* [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5
@ 2023-08-24 19:26 Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 01/23] ice: make RX hash reading code more reusable Larysa Zaremba
                   ` (24 more replies)
  0 siblings, 25 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Alexei has requested an implementation of VLAN and checksum XDP hints
for one more driver [0].

This series is exactly the v5 of "XDP metadata via kfuncs for ice" [1]
with 2 additional patches for mlx5.

Firstly, there is a VLAN hint implementation. I am pretty sure this
one works and would not object to adding it to the main series, if
someone from nvidia ACKs it.

The second patch is a checksum hint implementation and it is very rough.
There is logic duplication and some missing features, but I am sure it
captures the main points of the potential end implementation.

I think it is unrealistic for me to provide a fully working mlx5 checksum
hint implementation (complex logic, no HW), so I would much rather not
have it in my main series. My main intention with this RFC is to prove
that the proposed hint functions are suitable for non-Intel HW.

[0] https://lore.kernel.org/bpf/CAADnVQLNeO81zc4f_z_UDCi+tJ2LS4dj2E1+au5TbXM+CPSyXQ@mail.gmail.com/
[1] https://lore.kernel.org/bpf/20230811161509.19722-1-larysa.zaremba@intel.com/
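
For context on the consumer side (this series only touches the driver
half), below is a minimal sketch of how an XDP program reads such hints,
using the already-merged timestamp and hash kfuncs; the VLAN and checksum
kfuncs added in this series are meant to be called the same way, under
the names defined in the respective patches. The program name and
printouts are only illustrative.

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
					 __u64 *timestamp) __ksym;
extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
				    enum xdp_rss_hash_type *rss_type) __ksym;

SEC("xdp")
int read_rx_hints(struct xdp_md *ctx)
{
	enum xdp_rss_hash_type rss_type;
	__u64 ts;
	__u32 hash;

	/* Each kfunc returns 0 on success or a negative errno
	 * (e.g. -ENODATA) when the driver cannot provide the hint
	 * for this frame.
	 */
	if (!bpf_xdp_metadata_rx_timestamp(ctx, &ts))
		bpf_printk("hw ts: %llu", ts);

	if (!bpf_xdp_metadata_rx_hash(ctx, &hash, &rss_type))
		bpf_printk("hash: 0x%x type: %u", hash, rss_type);

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";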

Aleksander Lobakin (1):
  net, xdp: allow metadata > 32

Larysa Zaremba (22):
  ice: make RX hash reading code more reusable
  ice: make RX HW timestamp reading code more reusable
  ice: make RX checksum checking code more reusable
  ice: Make ptype internal to descriptor info processing
  ice: Introduce ice_xdp_buff
  ice: Support HW timestamp hint
  ice: Support RX hash XDP hint
  ice: Support XDP hints in AF_XDP ZC mode
  xdp: Add VLAN tag hint
  ice: Implement VLAN tag hint
  ice: use VLAN proto from ring packet context in skb path
  xdp: Add checksum hint
  ice: Implement checksum hint
  selftests/bpf: Allow VLAN packets in xdp_hw_metadata
  selftests/bpf: Add flags and new hints to xdp_hw_metadata
  veth: Implement VLAN tag and checksum XDP hint
  net: make vlan_get_tag() return -ENODATA instead of -EINVAL
  selftests/bpf: Use AF_INET for TX in xdp_metadata
  selftests/bpf: Check VLAN tag and proto in xdp_metadata
  selftests/bpf: check checksum state in xdp_metadata
  mlx5: implement VLAN tag XDP hint
  mlx5: implement RX checksum XDP hint

 Documentation/networking/xdp-rx-metadata.rst  |  11 +-
 drivers/net/ethernet/intel/ice/ice.h          |   2 +
 drivers/net/ethernet/intel/ice/ice_ethtool.c  |   2 +-
 .../net/ethernet/intel/ice/ice_lan_tx_rx.h    | 412 +++++++++---------
 drivers/net/ethernet/intel/ice/ice_lib.c      |   2 +-
 drivers/net/ethernet/intel/ice/ice_main.c     |  23 +
 drivers/net/ethernet/intel/ice/ice_ptp.c      |  27 +-
 drivers/net/ethernet/intel/ice/ice_ptp.h      |  15 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  19 +-
 drivers/net/ethernet/intel/ice/ice_txrx.h     |  29 +-
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 343 ++++++++++++---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  18 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c      |  26 +-
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  10 +
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 116 +++++
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  12 +-
 drivers/net/veth.c                            |  42 ++
 include/linux/if_vlan.h                       |   4 +-
 include/linux/mlx5/device.h                   |   4 +-
 include/linux/skbuff.h                        |  13 +-
 include/net/xdp.h                             |  29 +-
 kernel/bpf/offload.c                          |   4 +
 net/core/xdp.c                                |  57 +++
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 187 ++++----
 .../selftests/bpf/progs/xdp_hw_metadata.c     |  48 +-
 .../selftests/bpf/progs/xdp_metadata.c        |  16 +
 tools/testing/selftests/bpf/testing_helpers.h |   3 +
 tools/testing/selftests/bpf/xdp_hw_metadata.c |  67 ++-
 tools/testing/selftests/bpf/xdp_metadata.h    |  42 +-
 29 files changed, 1124 insertions(+), 459 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [RFC bpf-next 01/23] ice: make RX hash reading code more reusable
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 14:37   ` [xdp-hints] " Maciej Fijalkowski
  2023-09-14 16:12   ` Alexander Lobakin
  2023-08-24 19:26 ` [RFC bpf-next 02/23] ice: make RX HW timestamp " Larysa Zaremba
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Previously, we only needed RX hash in skb path,
hence all related code was written with skb in mind.
But with the addition of XDP hints via kfuncs to the ice driver,
the same logic will be needed in .xmo_() callbacks.

Factor the generic process of reading the RX hash from a descriptor
out into a separate function.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 37 +++++++++++++------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index c8322fb6f2b3..8f7f6d78f7bf 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -63,28 +63,43 @@ static enum pkt_hash_types ice_ptype_to_htype(u16 ptype)
 }
 
 /**
- * ice_rx_hash - set the hash value in the skb
+ * ice_get_rx_hash - get RX hash value from descriptor
+ * @rx_desc: specific descriptor
+ *
+ * Returns hash, if present, 0 otherwise.
+ */
+static u32
+ice_get_rx_hash(const union ice_32b_rx_flex_desc *rx_desc)
+{
+	const struct ice_32b_rx_flex_desc_nic *nic_mdid;
+
+	if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)
+		return 0;
+
+	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
+	return le32_to_cpu(nic_mdid->rss_hash);
+}
+
+/**
+ * ice_rx_hash_to_skb - set the hash value in the skb
  * @rx_ring: descriptor ring
  * @rx_desc: specific descriptor
  * @skb: pointer to current skb
  * @rx_ptype: the ptype value from the descriptor
  */
 static void
-ice_rx_hash(struct ice_rx_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc,
-	    struct sk_buff *skb, u16 rx_ptype)
+ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
+		   const union ice_32b_rx_flex_desc *rx_desc,
+		   struct sk_buff *skb, u16 rx_ptype)
 {
-	struct ice_32b_rx_flex_desc_nic *nic_mdid;
 	u32 hash;
 
 	if (!(rx_ring->netdev->features & NETIF_F_RXHASH))
 		return;
 
-	if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)
-		return;
-
-	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
-	hash = le32_to_cpu(nic_mdid->rss_hash);
-	skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
+	hash = ice_get_rx_hash(rx_desc);
+	if (likely(hash))
+		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
 }
 
 /**
@@ -186,7 +201,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       union ice_32b_rx_flex_desc *rx_desc,
 		       struct sk_buff *skb, u16 ptype)
 {
-	ice_rx_hash(rx_ring, rx_desc, skb, ptype);
+	ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype);
 
 	/* modifies the skb - consumes the enet header */
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 02/23] ice: make RX HW timestamp reading code more reusable
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 01/23] ice: make RX hash reading code more reusable Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 14:56   ` Maciej Fijalkowski
  2023-08-24 19:26 ` [RFC bpf-next 03/23] ice: make RX checksum checking " Larysa Zaremba
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Previously, we only needed RX HW timestamp in skb path,
hence all related code was written with skb in mind.
But with the addition of XDP hints via kfuncs to the ice driver,
the same logic will be needed in .xmo_() callbacks.

Put the generic process of reading the RX HW timestamp from a descriptor
into a separate function.
Move the skb-related code into another source file.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_ptp.c      | 24 ++++++------------
 drivers/net/ethernet/intel/ice/ice_ptp.h      | 15 ++++++-----
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 25 ++++++++++++++++++-
 3 files changed, 41 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 81d96a40d5a7..a31333972c68 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -2147,30 +2147,24 @@ int ice_ptp_set_ts_config(struct ice_pf *pf, struct ifreq *ifr)
 }
 
 /**
- * ice_ptp_rx_hwtstamp - Check for an Rx timestamp
- * @rx_ring: Ring to get the VSI info
+ * ice_ptp_get_rx_hwts - Get packet Rx timestamp
  * @rx_desc: Receive descriptor
- * @skb: Particular skb to send timestamp with
+ * @cached_time: Cached PHC time
  *
  * The driver receives a notification in the receive descriptor with timestamp.
- * The timestamp is in ns, so we must convert the result first.
  */
-void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb)
+u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+			u64 cached_time)
 {
-	struct skb_shared_hwtstamps *hwtstamps;
-	u64 ts_ns, cached_time;
 	u32 ts_high;
+	u64 ts_ns;
 
 	if (!(rx_desc->wb.time_stamp_low & ICE_PTP_TS_VALID))
-		return;
-
-	cached_time = READ_ONCE(rx_ring->cached_phctime);
+		return 0;
 
 	/* Do not report a timestamp if we don't have a cached PHC time */
 	if (!cached_time)
-		return;
+		return 0;
 
 	/* Use ice_ptp_extend_32b_ts directly, using the ring-specific cached
 	 * PHC value, rather than accessing the PF. This also allows us to
@@ -2181,9 +2175,7 @@ ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
 	ts_high = le32_to_cpu(rx_desc->wb.flex_ts.ts_high);
 	ts_ns = ice_ptp_extend_32b_ts(cached_time, ts_high);
 
-	hwtstamps = skb_hwtstamps(skb);
-	memset(hwtstamps, 0, sizeof(*hwtstamps));
-	hwtstamps->hwtstamp = ns_to_ktime(ts_ns);
+	return ts_ns;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h
index 995a57019ba7..523eefbfdf95 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.h
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.h
@@ -268,9 +268,8 @@ void ice_ptp_extts_event(struct ice_pf *pf);
 s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb);
 enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf);
 
-void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb);
+u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+			u64 cached_time);
 void ice_ptp_reset(struct ice_pf *pf);
 void ice_ptp_prepare_for_reset(struct ice_pf *pf);
 void ice_ptp_init(struct ice_pf *pf);
@@ -304,9 +303,13 @@ static inline bool ice_ptp_process_ts(struct ice_pf *pf)
 {
 	return true;
 }
-static inline void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb) { }
+
+static inline u64
+ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc, u64 cached_time)
+{
+	return 0;
+}
+
 static inline void ice_ptp_reset(struct ice_pf *pf) { }
 static inline void ice_ptp_prepare_for_reset(struct ice_pf *pf) { }
 static inline void ice_ptp_init(struct ice_pf *pf) { }
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 8f7f6d78f7bf..b2f241b73934 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -185,6 +185,29 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
 	ring->vsi->back->hw_csum_rx_error++;
 }
 
+/**
+ * ice_ptp_rx_hwts_to_skb - Put RX timestamp into skb
+ * @rx_ring: Ring to get the VSI info
+ * @rx_desc: Receive descriptor
+ * @skb: Particular skb to send timestamp with
+ *
+ * The timestamp is in ns, so we must convert the result first.
+ */
+static void
+ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
+		       const union ice_32b_rx_flex_desc *rx_desc,
+		       struct sk_buff *skb)
+{
+	u64 ts_ns, cached_time;
+
+	cached_time = READ_ONCE(rx_ring->cached_phctime);
+	ts_ns = ice_ptp_get_rx_hwts(rx_desc, cached_time);
+
+	*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
+		.hwtstamp	= ns_to_ktime(ts_ns),
+	};
+}
+
 /**
  * ice_process_skb_fields - Populate skb header fields from Rx descriptor
  * @rx_ring: Rx descriptor ring packet is being transacted on
@@ -209,7 +232,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
 
 	if (rx_ring->ptp_rx)
-		ice_ptp_rx_hwtstamp(rx_ring, rx_desc, skb);
+		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
 }
 
 /**
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 03/23] ice: make RX checksum checking code more reusable
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 01/23] ice: make RX hash reading code more reusable Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 02/23] ice: make RX HW timestamp " Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 15:02   ` [xdp-hints] " Maciej Fijalkowski
  2023-08-24 19:26 ` [RFC bpf-next 04/23] ice: Make ptype internal to descriptor info processing Larysa Zaremba
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Previously, we only needed RX checksum flags in skb path,
hence all related code was written with skb in mind.
But with the addition of XDP hints via kfuncs to the ice driver,
the same logic will be needed in .xmo_() callbacks.

Put the generic process of determining the checksum status into
a separate function.

Now we cannot operate directly on the skb when deducing the checksum
status, therefore introduce an intermediate enum for it. Fortunately,
in ice, there are only 4 possibilities: checksum validated at level 0,
validated at level 1, no checksum, checksum error. Use 3 bits for more
convenient conversion: e.g. ICE_RX_CSUM_LVL_1 translates to
CHECKSUM_UNNECESSARY with csum_level = 1, while any status with
ICE_RX_CSUM_NONE set translates to CHECKSUM_NONE.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 105 ++++++++++++------
 1 file changed, 69 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index b2f241b73934..8b155a502b3b 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -102,18 +102,41 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
 		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
 }
 
+enum ice_rx_csum_status {
+	ICE_RX_CSUM_LVL_0	= 0,
+	ICE_RX_CSUM_LVL_1	= BIT(0),
+	ICE_RX_CSUM_NONE	= BIT(1),
+	ICE_RX_CSUM_ERROR	= BIT(2),
+	ICE_RX_CSUM_FAIL	= ICE_RX_CSUM_NONE | ICE_RX_CSUM_ERROR,
+};
+
 /**
- * ice_rx_csum - Indicate in skb if checksum is good
- * @ring: the ring we care about
- * @skb: skb currently being received and modified
+ * ice_rx_csum_lvl - Get checksum level from status
+ * @status: driver-specific checksum status
+ */
+static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
+{
+	return status & ICE_RX_CSUM_LVL_1;
+}
+
+/**
+ * ice_rx_csum_ip_summed - Checksum status from driver-specific to generic
+ * @status: driver-specific checksum status
+ */
+static u8 ice_rx_csum_ip_summed(enum ice_rx_csum_status status)
+{
+	return status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
+}
+
+/**
+ * ice_get_rx_csum_status - Deduce checksum status from descriptor
  * @rx_desc: the receive descriptor
  * @ptype: the packet type decoded by hardware
  *
- * skb->protocol must be set before this function is called
+ * Returns driver-specific checksum status
  */
-static void
-ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
-	    union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
+static enum ice_rx_csum_status
+ice_get_rx_csum_status(const union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
 {
 	struct ice_rx_ptype_decoded decoded;
 	u16 rx_status0, rx_status1;
@@ -124,20 +147,12 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
 
 	decoded = ice_decode_rx_desc_ptype(ptype);
 
-	/* Start with CHECKSUM_NONE and by default csum_level = 0 */
-	skb->ip_summed = CHECKSUM_NONE;
-	skb_checksum_none_assert(skb);
-
-	/* check if Rx checksum is enabled */
-	if (!(ring->netdev->features & NETIF_F_RXCSUM))
-		return;
-
 	/* check if HW has decoded the packet and checksum */
 	if (!(rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_L3L4P_S)))
-		return;
+		return ICE_RX_CSUM_NONE;
 
 	if (!(decoded.known && decoded.outer_ip))
-		return;
+		return ICE_RX_CSUM_NONE;
 
 	ipv4 = (decoded.outer_ip == ICE_RX_PTYPE_OUTER_IP) &&
 	       (decoded.outer_ip_ver == ICE_RX_PTYPE_OUTER_IPV4);
@@ -146,43 +161,61 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
 
 	if (ipv4 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_IPE_S) |
 				   BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S))))
-		goto checksum_fail;
+		return ICE_RX_CSUM_FAIL;
 
 	if (ipv6 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_IPV6EXADD_S))))
-		goto checksum_fail;
+		return ICE_RX_CSUM_FAIL;
 
 	/* check for L4 errors and handle packets that were not able to be
 	 * checksummed due to arrival speed
 	 */
 	if (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_L4E_S))
-		goto checksum_fail;
+		return ICE_RX_CSUM_FAIL;
 
 	/* check for outer UDP checksum error in tunneled packets */
 	if ((rx_status1 & BIT(ICE_RX_FLEX_DESC_STATUS1_NAT_S)) &&
 	    (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EUDPE_S)))
-		goto checksum_fail;
-
-	/* If there is an outer header present that might contain a checksum
-	 * we need to bump the checksum level by 1 to reflect the fact that
-	 * we are indicating we validated the inner checksum.
-	 */
-	if (decoded.tunnel_type >= ICE_RX_PTYPE_TUNNEL_IP_GRENAT)
-		skb->csum_level = 1;
+		return ICE_RX_CSUM_FAIL;
 
 	/* Only report checksum unnecessary for TCP, UDP, or SCTP */
 	switch (decoded.inner_prot) {
 	case ICE_RX_PTYPE_INNER_PROT_TCP:
 	case ICE_RX_PTYPE_INNER_PROT_UDP:
 	case ICE_RX_PTYPE_INNER_PROT_SCTP:
-		skb->ip_summed = CHECKSUM_UNNECESSARY;
-		break;
-	default:
-		break;
+		/* If there is an outer header present that might contain
+		 * a checksum we need to bump the checksum level by 1 to reflect
+		 * the fact that we have validated the inner checksum.
+		 */
+		return decoded.tunnel_type >= ICE_RX_PTYPE_TUNNEL_IP_GRENAT ?
+		       ICE_RX_CSUM_LVL_1 : ICE_RX_CSUM_LVL_0;
 	}
-	return;
 
-checksum_fail:
-	ring->vsi->back->hw_csum_rx_error++;
+	return ICE_RX_CSUM_NONE;
+}
+
+/**
+ * ice_rx_csum_into_skb - Indicate in skb if checksum is good
+ * @ring: the ring we care about
+ * @skb: skb currently being received and modified
+ * @rx_desc: the receive descriptor
+ * @ptype: the packet type decoded by hardware
+ */
+static void
+ice_rx_csum_into_skb(struct ice_rx_ring *ring, struct sk_buff *skb,
+		     const union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
+{
+	enum ice_rx_csum_status csum_status;
+
+	/* check if Rx checksum is enabled */
+	if (!(ring->netdev->features & NETIF_F_RXCSUM))
+		return;
+
+	csum_status = ice_get_rx_csum_status(rx_desc, ptype);
+	if (csum_status & ICE_RX_CSUM_ERROR)
+		ring->vsi->back->hw_csum_rx_error++;
+
+	skb->ip_summed = ice_rx_csum_ip_summed(csum_status);
+	skb->csum_level = ice_rx_csum_lvl(csum_status);
 }
 
 /**
@@ -229,7 +262,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 	/* modifies the skb - consumes the enet header */
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
 
-	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
+	ice_rx_csum_into_skb(rx_ring, skb, rx_desc, ptype);
 
 	if (rx_ring->ptp_rx)
 		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 04/23] ice: Make ptype internal to descriptor info processing
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (2 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 03/23] ice: make RX checksum checking " Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 15:04   ` Maciej Fijalkowski
  2023-08-24 19:26 ` [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff Larysa Zaremba
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Currently, the rx_ptype variable is used only as an argument
to ice_process_skb_fields() and is computed just before the function
call.

Therefore, there is no reason to pass this value as an argument.
Instead, remove the argument and compute the value directly inside
ice_process_skb_fields().

Also, separate its calculation into a short function, so the code
can later be reused in the .xmo_() callbacks.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 +-----
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 15 +++++++++++++--
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  2 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 +-----
 4 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 52d0a126eb61..40f2f6dabb81 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1181,7 +1181,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		unsigned int size;
 		u16 stat_err_bits;
 		u16 vlan_tag = 0;
-		u16 rx_ptype;
 
 		/* get the Rx desc from Rx ring based on 'next_to_clean' */
 		rx_desc = ICE_RX_DESC(rx_ring, ntc);
@@ -1286,10 +1285,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		total_rx_bytes += skb->len;
 
 		/* populate checksum, VLAN, and protocol */
-		rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
-			ICE_RX_FLEX_DESC_PTYPE_M;
-
-		ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype);
+		ice_process_skb_fields(rx_ring, rx_desc, skb);
 
 		ice_trace(clean_rx_irq_indicate, rx_ring, rx_desc, skb);
 		/* send completed skb up the stack */
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 8b155a502b3b..07241f4229b7 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -241,12 +241,21 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
 	};
 }
 
+/**
+ * ice_get_ptype - Read HW packet type from the descriptor
+ * @rx_desc: RX descriptor
+ */
+static u16 ice_get_ptype(const union ice_32b_rx_flex_desc *rx_desc)
+{
+	return le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
+	       ICE_RX_FLEX_DESC_PTYPE_M;
+}
+
 /**
  * ice_process_skb_fields - Populate skb header fields from Rx descriptor
  * @rx_ring: Rx descriptor ring packet is being transacted on
  * @rx_desc: pointer to the EOP Rx descriptor
  * @skb: pointer to current skb being populated
- * @ptype: the packet type decoded by hardware
  *
  * This function checks the ring, descriptor, and packet information in
  * order to populate the hash, checksum, VLAN, protocol, and
@@ -255,8 +264,10 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
 void
 ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       union ice_32b_rx_flex_desc *rx_desc,
-		       struct sk_buff *skb, u16 ptype)
+		       struct sk_buff *skb)
 {
+	u16 ptype = ice_get_ptype(rx_desc);
+
 	ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype);
 
 	/* modifies the skb - consumes the enet header */
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index 115969ecdf7b..e1d49e1235b3 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -148,7 +148,7 @@ void ice_release_rx_desc(struct ice_rx_ring *rx_ring, u16 val);
 void
 ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       union ice_32b_rx_flex_desc *rx_desc,
-		       struct sk_buff *skb, u16 ptype);
+		       struct sk_buff *skb);
 void
 ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
 #endif /* !_ICE_TXRX_LIB_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 2a3f0834e139..ef778b8e6d1b 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -870,7 +870,6 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 		struct sk_buff *skb;
 		u16 stat_err_bits;
 		u16 vlan_tag = 0;
-		u16 rx_ptype;
 
 		rx_desc = ICE_RX_DESC(rx_ring, ntc);
 
@@ -950,10 +949,7 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 
 		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
 
-		rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
-				       ICE_RX_FLEX_DESC_PTYPE_M;
-
-		ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype);
+		ice_process_skb_fields(rx_ring, rx_desc, skb);
 		ice_receive_skb(rx_ring, skb, vlan_tag);
 	}
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (3 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 04/23] ice: Make ptype internal to descriptor info processing Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 15:32   ` [xdp-hints] " Maciej Fijalkowski
  2023-08-24 19:26 ` [RFC bpf-next 06/23] ice: Support HW timestamp hint Larysa Zaremba
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

In order to use XDP hints via kfuncs, we need to put the RX descriptor
and ring pointers right next to the xdp_buff. Same as in the hints
implementations in other drivers, we achieve this by putting the
xdp_buff into a child structure.

Currently, the xdp_buff is stored in the ring structure, so replace it
with a union that includes the child structure. This way enough memory
is available while the existing XDP code remains isolated from hints.

The minimum size of the new child structure (ice_xdp_buff) is exactly
64 bytes (a single cache line). To place it at the start of a cache
line, move the 'next' field from CL1 to CL3, as it is not used often.
This still leaves 128 bits available in CL3 for packet context
extensions.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
 drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
 3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 40f2f6dabb81..4e6546d9cf85 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
  * @xdp_prog: XDP program to run
  * @xdp_ring: ring to be used for XDP_TX action
  * @rx_buf: Rx buffer to store the XDP action
+ * @eop_desc: Last descriptor in packet to read metadata from
  *
  * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
  */
 static void
 ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
-	    struct ice_rx_buf *rx_buf)
+	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
 {
 	unsigned int ret = ICE_XDP_PASS;
 	u32 act;
@@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 	if (!xdp_prog)
 		goto exit;
 
+	ice_xdp_meta_set_desc(xdp, eop_desc);
+
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
 	switch (act) {
 	case XDP_PASS:
@@ -1240,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
-		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf);
+		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
 		if (rx_buf->act == ICE_XDP_PASS)
 			goto construct_skb;
 		total_rx_bytes += xdp_get_buff_len(xdp);
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index 166413fc33f4..d0ab2c4c0c91 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -257,6 +257,18 @@ enum ice_rx_dtype {
 	ICE_RX_DTYPE_SPLIT_ALWAYS	= 2,
 };
 
+struct ice_pkt_ctx {
+	const union ice_32b_rx_flex_desc *eop_desc;
+};
+
+struct ice_xdp_buff {
+	struct xdp_buff xdp_buff;
+	struct ice_pkt_ctx pkt_ctx;
+};
+
+/* Required for compatibility with xdp_buffs from xsk_pool */
+static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0);
+
 /* indices into GLINT_ITR registers */
 #define ICE_RX_ITR	ICE_IDX_ITR0
 #define ICE_TX_ITR	ICE_IDX_ITR1
@@ -298,7 +310,6 @@ enum ice_dynamic_itr {
 /* descriptor ring, associated with a VSI */
 struct ice_rx_ring {
 	/* CL1 - 1st cacheline starts here */
-	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
 	void *desc;			/* Descriptor ring memory */
 	struct device *dev;		/* Used for DMA mapping */
 	struct net_device *netdev;	/* netdev ring maps to */
@@ -310,12 +321,19 @@ struct ice_rx_ring {
 	u16 count;			/* Number of descriptors */
 	u16 reg_idx;			/* HW register index of the ring */
 	u16 next_to_alloc;
-	/* CL2 - 2nd cacheline starts here */
+
 	union {
 		struct ice_rx_buf *rx_buf;
 		struct xdp_buff **xdp_buf;
 	};
-	struct xdp_buff xdp;
+	/* CL2 - 2nd cacheline starts here */
+	union {
+		struct ice_xdp_buff xdp_ext;
+		struct {
+			struct xdp_buff xdp;
+			struct ice_pkt_ctx pkt_ctx;
+		};
+	};
 	/* CL3 - 3rd cacheline starts here */
 	struct bpf_prog *xdp_prog;
 	u16 rx_offset;
@@ -325,6 +343,8 @@ struct ice_rx_ring {
 	u16 next_to_clean;
 	u16 first_desc;
 
+	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
+
 	/* stats structs */
 	struct ice_ring_stats *ring_stats;
 
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index e1d49e1235b3..145883eec129 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -151,4 +151,14 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       struct sk_buff *skb);
 void
 ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
+
+static inline void
+ice_xdp_meta_set_desc(struct xdp_buff *xdp,
+		      union ice_32b_rx_flex_desc *eop_desc)
+{
+	struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff,
+						    xdp_buff);
+
+	xdp_ext->pkt_ctx.eop_desc = eop_desc;
+}
 #endif /* !_ICE_TXRX_LIB_H_ */
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 06/23] ice: Support HW timestamp hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (4 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 15:38   ` Maciej Fijalkowski
  2023-08-24 19:26 ` [RFC bpf-next 07/23] ice: Support RX hash XDP hint Larysa Zaremba
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Use the previously refactored code and create a function
that allows XDP code to read the HW timestamp.

Also, move cached_phctime into the packet context; this way the data
still stays in the ring structure, just at a different address.

The HW timestamp is the first supported hint in the driver,
so also add xdp_metadata_ops.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          |  2 ++
 drivers/net/ethernet/intel/ice/ice_ethtool.c  |  2 +-
 drivers/net/ethernet/intel/ice/ice_lib.c      |  2 +-
 drivers/net/ethernet/intel/ice/ice_main.c     |  1 +
 drivers/net/ethernet/intel/ice/ice_ptp.c      |  3 ++-
 drivers/net/ethernet/intel/ice/ice_txrx.h     |  2 +-
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 ++++++++++++++++++-
 7 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 5ac0ad12f9f1..34e4731b5d5f 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -951,4 +951,6 @@ static inline void ice_clear_rdma_cap(struct ice_pf *pf)
 	set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
 	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
 }
+
+extern const struct xdp_metadata_ops ice_xdp_md_ops;
 #endif /* _ICE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index ad4d4702129f..f740e0ad0e3c 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -2846,7 +2846,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
 		/* clone ring and setup updated count */
 		rx_rings[i] = *vsi->rx_rings[i];
 		rx_rings[i].count = new_rx_cnt;
-		rx_rings[i].cached_phctime = pf->ptp.cached_phc_time;
+		rx_rings[i].pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
 		rx_rings[i].desc = NULL;
 		rx_rings[i].rx_buf = NULL;
 		/* this is to allow wr32 to have something to write to
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 927518fcad51..12290defb730 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -1445,7 +1445,7 @@ static int ice_vsi_alloc_rings(struct ice_vsi *vsi)
 		ring->netdev = vsi->netdev;
 		ring->dev = dev;
 		ring->count = vsi->num_rx_desc;
-		ring->cached_phctime = pf->ptp.cached_phc_time;
+		ring->pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
 		WRITE_ONCE(vsi->rx_rings[i], ring);
 	}
 
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 0f04347eda39..557c6326ff87 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -3395,6 +3395,7 @@ static void ice_set_ops(struct ice_vsi *vsi)
 
 	netdev->netdev_ops = &ice_netdev_ops;
 	netdev->udp_tunnel_nic_info = &pf->hw.udp_tunnel_nic;
+	netdev->xdp_metadata_ops = &ice_xdp_md_ops;
 	ice_set_ethtool_ops(netdev);
 
 	if (vsi->type != ICE_VSI_PF)
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index a31333972c68..26fad7038996 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1038,7 +1038,8 @@ static int ice_ptp_update_cached_phctime(struct ice_pf *pf)
 		ice_for_each_rxq(vsi, j) {
 			if (!vsi->rx_rings[j])
 				continue;
-			WRITE_ONCE(vsi->rx_rings[j]->cached_phctime, systime);
+			WRITE_ONCE(vsi->rx_rings[j]->pkt_ctx.cached_phctime,
+				   systime);
 		}
 	}
 	clear_bit(ICE_CFG_BUSY, pf->state);
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index d0ab2c4c0c91..4237702a58a9 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -259,6 +259,7 @@ enum ice_rx_dtype {
 
 struct ice_pkt_ctx {
 	const union ice_32b_rx_flex_desc *eop_desc;
+	u64 cached_phctime;
 };
 
 struct ice_xdp_buff {
@@ -354,7 +355,6 @@ struct ice_rx_ring {
 	struct ice_tx_ring *xdp_ring;
 	struct xsk_buff_pool *xsk_pool;
 	dma_addr_t dma;			/* physical address of ring */
-	u64 cached_phctime;
 	u16 rx_buf_len;
 	u8 dcb_tc;			/* Traffic class of ring */
 	u8 ptp_rx;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 07241f4229b7..463d9e5cbe05 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -233,7 +233,7 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
 {
 	u64 ts_ns, cached_time;
 
-	cached_time = READ_ONCE(rx_ring->cached_phctime);
+	cached_time = READ_ONCE(rx_ring->pkt_ctx.cached_phctime);
 	ts_ns = ice_ptp_get_rx_hwts(rx_desc, cached_time);
 
 	*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
@@ -546,3 +546,27 @@ void ice_finalize_xdp_rx(struct ice_tx_ring *xdp_ring, unsigned int xdp_res,
 			spin_unlock(&xdp_ring->tx_lock);
 	}
 }
+
+/**
+ * ice_xdp_rx_hw_ts - HW timestamp XDP hint handler
+ * @ctx: XDP buff pointer
+ * @ts_ns: destination address
+ *
+ * Copy HW timestamp (if available) to the destination address.
+ */
+static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
+{
+	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
+	u64 cached_time;
+
+	cached_time = READ_ONCE(xdp_ext->pkt_ctx.cached_phctime);
+	*ts_ns = ice_ptp_get_rx_hwts(xdp_ext->pkt_ctx.eop_desc, cached_time);
+	if (!*ts_ns)
+		return -ENODATA;
+
+	return 0;
+}
+
+const struct xdp_metadata_ops ice_xdp_md_ops = {
+	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
+};
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 07/23] ice: Support RX hash XDP hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (5 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 06/23] ice: Support HW timestamp hint Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-05 15:42   ` [xdp-hints] " Maciej Fijalkowski
  2023-09-14 16:54   ` Alexander Lobakin
  2023-08-24 19:26 ` [RFC bpf-next 08/23] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
                   ` (17 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

The RX hash XDP hint requests both the hash value and its type.
The type is XDP-specific, so we need a separate way to map these
values to the hardware ptypes; create a lookup table for that.

Instead of creating a new long list, reuse the contents
of ice_decode_rx_desc_ptype[] through the preprocessor, as sketched
below.
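
Roughly, ICE_PTT()/ICE_PTT_UNUSED_ENTRY() are redefined before each
expansion of the ICE_PTYPES list, so the same list can fill both the
existing ice_ptype_lkup[] decode table and the new ptype-to-hash-type
table. A simplified sketch of the idea follows; the per-entry expansion
here is only a placeholder, the real mapping to enum xdp_rss_hash_type
is in the later hunks of this patch.

#undef ICE_PTT
#undef ICE_PTT_UNUSED_ENTRY

/* Placeholder expansion: the real table derives the hash type from the
 * outer IP version and inner protocol columns of each ICE_PTT() entry.
 */
#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL) \
	[PTYPE] = XDP_RSS_TYPE_NONE
#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = XDP_RSS_TYPE_NONE

static const enum xdp_rss_hash_type
ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
	ICE_PTYPES
};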

The current hash type enum does not contain an ICMP packet type,
but ice devices support it, so also add a new type to the core code.

Then use the previously refactored code and create a function
that allows XDP code to read the RX hash.
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 .../net/ethernet/intel/ice/ice_lan_tx_rx.h    | 412 +++++++++---------
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c |  73 ++++
 include/net/xdp.h                             |   3 +
 3 files changed, 284 insertions(+), 204 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
index 89f986a75cc8..d384ddfcb83e 100644
--- a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
+++ b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
@@ -673,6 +673,212 @@ struct ice_tlan_ctx {
  *      Use the enum ice_rx_l2_ptype to decode the packet type
  * ENDIF
  */
+#define ICE_PTYPES								\
+	/* L2 Packet types */							\
+	ICE_PTT_UNUSED_ENTRY(0),						\
+	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),			\
+	ICE_PTT_UNUSED_ENTRY(2),						\
+	ICE_PTT_UNUSED_ENTRY(3),						\
+	ICE_PTT_UNUSED_ENTRY(4),						\
+	ICE_PTT_UNUSED_ENTRY(5),						\
+	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),			\
+	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),			\
+	ICE_PTT_UNUSED_ENTRY(8),						\
+	ICE_PTT_UNUSED_ENTRY(9),						\
+	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),		\
+	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),		\
+	ICE_PTT_UNUSED_ENTRY(12),						\
+	ICE_PTT_UNUSED_ENTRY(13),						\
+	ICE_PTT_UNUSED_ENTRY(14),						\
+	ICE_PTT_UNUSED_ENTRY(15),						\
+	ICE_PTT_UNUSED_ENTRY(16),						\
+	ICE_PTT_UNUSED_ENTRY(17),						\
+	ICE_PTT_UNUSED_ENTRY(18),						\
+	ICE_PTT_UNUSED_ENTRY(19),						\
+	ICE_PTT_UNUSED_ENTRY(20),						\
+	ICE_PTT_UNUSED_ENTRY(21),						\
+										\
+	/* Non Tunneled IPv4 */							\
+	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),		\
+	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),		\
+	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(25),						\
+	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP,  PAY4),		\
+	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),		\
+	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),		\
+										\
+	/* IPv4 --> IPv4 */							\
+	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3),		\
+	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3),		\
+	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(32),						\
+	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),		\
+	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),		\
+	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),		\
+										\
+	/* IPv4 --> IPv6 */							\
+	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3),		\
+	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3),		\
+	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(39),						\
+	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),		\
+	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),		\
+	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),		\
+										\
+	/* IPv4 --> GRE/NAT */							\
+	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),		\
+										\
+	/* IPv4 --> GRE/NAT --> IPv4 */						\
+	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),		\
+	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),		\
+	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(47),						\
+	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),		\
+	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),		\
+	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),		\
+										\
+	/* IPv4 --> GRE/NAT --> IPv6 */						\
+	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),		\
+	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),		\
+	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(54),						\
+	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),		\
+	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),		\
+	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),		\
+										\
+	/* IPv4 --> GRE/NAT --> MAC */						\
+	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),	\
+										\
+	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */					\
+	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),	\
+	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),	\
+	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(62),						\
+	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),	\
+	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),	\
+	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),	\
+										\
+	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */					\
+	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),	\
+	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),	\
+	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(69),						\
+	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),	\
+	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),	\
+	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),	\
+										\
+	/* IPv4 --> GRE/NAT --> MAC/VLAN */					\
+	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),	\
+										\
+	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */				\
+	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),	\
+	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),	\
+	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(77),						\
+	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),	\
+	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),	\
+	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),	\
+										\
+	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */				\
+	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),	\
+	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),	\
+	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(84),						\
+	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),	\
+	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),	\
+	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),	\
+										\
+	/* Non Tunneled IPv6 */							\
+	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),		\
+	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),		\
+	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(91),						\
+	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP,  PAY4),		\
+	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),		\
+	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),		\
+										\
+	/* IPv6 --> IPv4 */							\
+	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3),		\
+	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3),		\
+	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(98),						\
+	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),		\
+	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),		\
+	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),		\
+										\
+	/* IPv6 --> IPv6 */							\
+	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3),		\
+	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3),		\
+	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(105),						\
+	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),		\
+	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),		\
+	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),		\
+										\
+	/* IPv6 --> GRE/NAT */							\
+	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),		\
+										\
+	/* IPv6 --> GRE/NAT -> IPv4 */						\
+	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),		\
+	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),		\
+	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(113),						\
+	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),		\
+	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),		\
+	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),		\
+										\
+	/* IPv6 --> GRE/NAT -> IPv6 */						\
+	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),		\
+	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),		\
+	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),		\
+	ICE_PTT_UNUSED_ENTRY(120),						\
+	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),		\
+	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),		\
+	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),		\
+										\
+	/* IPv6 --> GRE/NAT -> MAC */						\
+	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),	\
+										\
+	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */					\
+	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),	\
+	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),	\
+	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(128),						\
+	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),	\
+	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),	\
+	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),	\
+										\
+	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */					\
+	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),	\
+	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),	\
+	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(135),						\
+	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),	\
+	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),	\
+	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),	\
+										\
+	/* IPv6 --> GRE/NAT -> MAC/VLAN */					\
+	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),	\
+										\
+	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */				\
+	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),	\
+	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),	\
+	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(143),						\
+	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),	\
+	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),	\
+	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),	\
+										\
+	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */				\
+	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),	\
+	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),	\
+	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),	\
+	ICE_PTT_UNUSED_ENTRY(150),						\
+	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),	\
+	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),	\
+	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
+
+#define ICE_NUM_DEFINED_PTYPES	154
 
 /* macro to make the table lines short, use explicit indexing with [PTYPE] */
 #define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
@@ -695,212 +901,10 @@ struct ice_tlan_ctx {
 
 /* Lookup table mapping in the 10-bit HW PTYPE to the bit field for decoding */
 static const struct ice_rx_ptype_decoded ice_ptype_lkup[BIT(10)] = {
-	/* L2 Packet types */
-	ICE_PTT_UNUSED_ENTRY(0),
-	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),
-	ICE_PTT_UNUSED_ENTRY(2),
-	ICE_PTT_UNUSED_ENTRY(3),
-	ICE_PTT_UNUSED_ENTRY(4),
-	ICE_PTT_UNUSED_ENTRY(5),
-	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT_UNUSED_ENTRY(8),
-	ICE_PTT_UNUSED_ENTRY(9),
-	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT_UNUSED_ENTRY(12),
-	ICE_PTT_UNUSED_ENTRY(13),
-	ICE_PTT_UNUSED_ENTRY(14),
-	ICE_PTT_UNUSED_ENTRY(15),
-	ICE_PTT_UNUSED_ENTRY(16),
-	ICE_PTT_UNUSED_ENTRY(17),
-	ICE_PTT_UNUSED_ENTRY(18),
-	ICE_PTT_UNUSED_ENTRY(19),
-	ICE_PTT_UNUSED_ENTRY(20),
-	ICE_PTT_UNUSED_ENTRY(21),
-
-	/* Non Tunneled IPv4 */
-	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(25),
-	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP,  PAY4),
-	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),
-	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),
-
-	/* IPv4 --> IPv4 */
-	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(32),
-	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 --> IPv6 */
-	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(39),
-	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT */
-	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
-
-	/* IPv4 --> GRE/NAT --> IPv4 */
-	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(47),
-	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT --> IPv6 */
-	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(54),
-	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT --> MAC */
-	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
-
-	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */
-	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(62),
-	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */
-	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(69),
-	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT --> MAC/VLAN */
-	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
-
-	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */
-	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(77),
-	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */
-	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(84),
-	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
-
-	/* Non Tunneled IPv6 */
-	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(91),
-	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP,  PAY4),
-	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),
-	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),
-
-	/* IPv6 --> IPv4 */
-	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(98),
-	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> IPv6 */
-	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(105),
-	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT */
-	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
-
-	/* IPv6 --> GRE/NAT -> IPv4 */
-	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(113),
-	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> IPv6 */
-	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(120),
-	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC */
-	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
-
-	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */
-	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(128),
-	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */
-	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(135),
-	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC/VLAN */
-	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
-
-	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */
-	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(143),
-	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */
-	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(150),
-	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
+	ICE_PTYPES
 
 	/* unused entries */
-	[154 ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
+	[ICE_NUM_DEFINED_PTYPES ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
 };
 
 static inline struct ice_rx_ptype_decoded ice_decode_rx_desc_ptype(u16 ptype)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 463d9e5cbe05..b11cfaedb81c 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -567,6 +567,79 @@ static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
 	return 0;
 }
 
+/* Define a ptype index -> XDP hash type lookup table.
+ * It uses the same ptype definitions as the ice_ptype_lkup[] table,
+ * avoiding possible copy-paste errors.
+ */
+#undef ICE_PTT
+#undef ICE_PTT_UNUSED_ENTRY
+
+#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
+	[PTYPE] = XDP_RSS_L3_##OUTER_IP_VER | XDP_RSS_L4_##I | XDP_RSS_TYPE_##PL
+
+#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0
+
+/* A few supplementary definitions for when XDP hash types do not coincide
+ * with what can be generated from ptype definitions
+ * by means of preprocessor concatenation.
+ */
+#define XDP_RSS_L3_NONE		XDP_RSS_TYPE_NONE
+#define XDP_RSS_L4_NONE		XDP_RSS_TYPE_NONE
+#define XDP_RSS_TYPE_PAY2	XDP_RSS_TYPE_L2
+#define XDP_RSS_TYPE_PAY3	XDP_RSS_TYPE_NONE
+#define XDP_RSS_TYPE_PAY4	XDP_RSS_L4
+
+static const enum xdp_rss_hash_type
+ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
+	ICE_PTYPES
+};
+
+#undef XDP_RSS_L3_NONE
+#undef XDP_RSS_L4_NONE
+#undef XDP_RSS_TYPE_PAY2
+#undef XDP_RSS_TYPE_PAY3
+#undef XDP_RSS_TYPE_PAY4
+
+#undef ICE_PTT
+#undef ICE_PTT_UNUSED_ENTRY
+
+/**
+ * ice_xdp_rx_hash_type - Get XDP-specific hash type from the RX descriptor
+ * @eop_desc: End of Packet descriptor
+ */
+static enum xdp_rss_hash_type
+ice_xdp_rx_hash_type(const union ice_32b_rx_flex_desc *eop_desc)
+{
+	u16 ptype = ice_get_ptype(eop_desc);
+
+	if (unlikely(ptype >= ICE_NUM_DEFINED_PTYPES))
+		return 0;
+
+	return ice_ptype_to_xdp_hash[ptype];
+}
+
+/**
+ * ice_xdp_rx_hash - RX hash XDP hint handler
+ * @ctx: XDP buff pointer
+ * @hash: hash destination address
+ * @rss_type: XDP hash type destination address
+ *
+ * Copy RX hash (if available) and its type to the destination address.
+ */
+static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
+			   enum xdp_rss_hash_type *rss_type)
+{
+	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
+
+	*hash = ice_get_rx_hash(xdp_ext->pkt_ctx.eop_desc);
+	*rss_type = ice_xdp_rx_hash_type(xdp_ext->pkt_ctx.eop_desc);
+	if (unlikely(!*hash))
+		return -ENODATA;
+
+	return 0;
+}
+
 const struct xdp_metadata_ops ice_xdp_md_ops = {
 	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
+	.xmo_rx_hash			= ice_xdp_rx_hash,
 };
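
The ICE_PTT redefinition above is an X-macro pattern: the single ICE_PTYPES
list is expanded once in ice_lan_tx_rx.h to fill ice_ptype_lkup[] and once
here to fill ice_ptype_to_xdp_hash[]. Below is a standalone sketch of the
idea; every name in it is illustrative and not taken from the driver.

/* Toy version of the trick: one entry list, two tables, no copy-paste. */
#include <stdio.h>

#define EXAMPLE_PTYPES				\
	EX_PTT(1, L2,   NONE),			\
	EX_PTT(2, IPV4, TCP),			\
	EX_PTT(3, IPV6, UDP),

enum { HASH_NONE, HASH_L2, HASH_IPV4, HASH_IPV6 };

/* First expansion: ptype -> human-readable name */
#define EX_PTT(id, l3, l4)	[id] = #l3 "/" #l4
static const char * const ptype_names[] = { EXAMPLE_PTYPES };
#undef EX_PTT

/* Second expansion: ptype -> hash type, reusing the very same list */
#define EX_PTT(id, l3, l4)	[id] = HASH_##l3
static const int ptype_hash[] = { EXAMPLE_PTYPES };
#undef EX_PTT

int main(void)
{
	printf("%s -> %d\n", ptype_names[2], ptype_hash[2]);
	return 0;
}
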
diff --git a/include/net/xdp.h b/include/net/xdp.h
index de08c8e0d134..1e9870d5f025 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -416,6 +416,7 @@ enum xdp_rss_hash_type {
 	XDP_RSS_L4_UDP		= BIT(5),
 	XDP_RSS_L4_SCTP		= BIT(6),
 	XDP_RSS_L4_IPSEC	= BIT(7), /* L4 based hash include IPSEC SPI */
+	XDP_RSS_L4_ICMP		= BIT(8),
 
 	/* Second part: RSS hash type combinations used for driver HW mapping */
 	XDP_RSS_TYPE_NONE            = 0,
@@ -431,11 +432,13 @@ enum xdp_rss_hash_type {
 	XDP_RSS_TYPE_L4_IPV4_UDP     = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
 	XDP_RSS_TYPE_L4_IPV4_SCTP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
 	XDP_RSS_TYPE_L4_IPV4_IPSEC   = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
+	XDP_RSS_TYPE_L4_IPV4_ICMP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,
 
 	XDP_RSS_TYPE_L4_IPV6_TCP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_TCP,
 	XDP_RSS_TYPE_L4_IPV6_UDP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
 	XDP_RSS_TYPE_L4_IPV6_SCTP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
 	XDP_RSS_TYPE_L4_IPV6_IPSEC   = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
+	XDP_RSS_TYPE_L4_IPV6_ICMP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,
 
 	XDP_RSS_TYPE_L4_IPV6_TCP_EX  = XDP_RSS_TYPE_L4_IPV6_TCP  | XDP_RSS_L3_DYNHDR,
 	XDP_RSS_TYPE_L4_IPV6_UDP_EX  = XDP_RSS_TYPE_L4_IPV6_UDP  | XDP_RSS_L3_DYNHDR,
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 08/23] ice: Support XDP hints in AF_XDP ZC mode
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (6 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 07/23] ice: Support RX hash XDP hint Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 15:42   ` [xdp-hints] " Maciej Fijalkowski
  2023-08-24 19:26 ` [RFC bpf-next 09/23] xdp: Add VLAN tag hint Larysa Zaremba
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

In AF_XDP ZC, the xdp_buff is not stored on the ring; instead, it is
provided by the xsk_pool.
Space for metadata sources right after such buffers was already reserved
in commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk").
This makes the implementation rather straightforward.

Update AF_XDP ZC packet processing to support XDP hints.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
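Not part of the patch: a compile-time sketch of why the cast from the
pool-provided xdp_buff to ice_xdp_buff is safe. The structs below are
stand-ins; see include/net/xsk_buff_pool.h for the real xdp_buff_xsk and
XSK_CHECK_PRIV_TYPE() definitions.

#include <stddef.h>

/* Toy model of the cb[] area, not the kernel definitions */
struct buff      { void *data, *data_end, *data_meta; };
struct pool_buff { struct buff b; unsigned char cb[24]; };  /* ~ xdp_buff_xsk */
struct drv_buff  { struct buff b; const void *eop_desc; };  /* ~ ice_xdp_buff */

/* Build-time check in the spirit of XSK_CHECK_PRIV_TYPE(): the driver view
 * must fit into the generic buff plus the reserved cb[] bytes.
 */
_Static_assert(sizeof(struct drv_buff) <=
	       offsetof(struct pool_buff, cb) +
	       sizeof(((struct pool_buff *)0)->cb),
	       "driver-private buff must fit into the reserved area");
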
 drivers/net/ethernet/intel/ice/ice_xsk.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index ef778b8e6d1b..fdeddad9b639 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -758,16 +758,25 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
  * @xdp: xdp_buff used as input to the XDP program
  * @xdp_prog: XDP program to run
  * @xdp_ring: ring to be used for XDP_TX action
+ * @rx_desc: packet descriptor
  *
  * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
  */
 static int
 ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
-	       struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring)
+	       struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
+	       union ice_32b_rx_flex_desc *rx_desc)
 {
 	int err, result = ICE_XDP_PASS;
 	u32 act;
 
+	/* We can safely convert xdp_buff_xsk to ice_xdp_buff,
+	 * because there are XSK_PRIV_MAX bytes reserved in xdp_buff_xsk
+	 * right after xdp_buff, for our private use.
+	 * The macro ensures we do not go above the limit.
+	 */
+	XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
+	ice_xdp_meta_set_desc(xdp, rx_desc);
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
 
 	if (likely(act == XDP_REDIRECT)) {
@@ -907,7 +916,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
-		xdp_res = ice_run_xdp_zc(rx_ring, first, xdp_prog, xdp_ring);
+		xdp_res = ice_run_xdp_zc(rx_ring, xdp, xdp_prog, xdp_ring,
+					 rx_desc);
 		if (likely(xdp_res & (ICE_XDP_TX | ICE_XDP_REDIR))) {
 			xdp_xmit |= xdp_res;
 		} else if (xdp_res == ICE_XDP_EXIT) {
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 09/23] xdp: Add VLAN tag hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (7 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 08/23] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 22:02   ` kernel test robot
  2023-09-14 16:18   ` Alexander Lobakin
  2023-08-24 19:26 ` [RFC bpf-next 10/23] ice: Implement " Larysa Zaremba
                   ` (15 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Implement functionality that enables drivers to expose the VLAN tag
to XDP code.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
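Not part of the patch: a minimal XDP-side sketch of consuming the new kfunc,
following the byte-order rules from the kernel-doc below (the program name
and the local ETH_P_8021Q define are illustrative).

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#define ETH_P_8021Q	0x8100	/* not exported by vmlinux.h */

extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
					__u16 *vlan_tci,
					__be16 *vlan_proto) __ksym;

SEC("xdp")
int read_vlan(struct xdp_md *ctx)
{
	__be16 proto;
	__u16 tci;

	/* -ENODATA simply means no (stripped) VLAN tag on this frame */
	if (!bpf_xdp_metadata_rx_vlan_tag(ctx, &tci, &proto) &&
	    proto == bpf_htons(ETH_P_8021Q))
		bpf_printk("VID=%u PCP=%u", tci & 0xfff, tci >> 13);

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
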
 Documentation/networking/xdp-rx-metadata.rst |  8 ++++-
 include/net/xdp.h                            |  4 +++
 kernel/bpf/offload.c                         |  2 ++
 net/core/xdp.c                               | 34 ++++++++++++++++++++
 4 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index 25ce72af81c2..ea6dd79a21d3 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -18,7 +18,13 @@ Currently, the following kfuncs are supported. In the future, as more
 metadata is supported, this set will grow:
 
 .. kernel-doc:: net/core/xdp.c
-   :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
+   :identifiers: bpf_xdp_metadata_rx_timestamp
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_metadata_rx_hash
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_metadata_rx_vlan_tag
 
 An XDP program can use these kfuncs to read the metadata into stack
 variables for its own consumption. Or, to pass the metadata on to other
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 1e9870d5f025..8bb64fc76498 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -388,6 +388,8 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
 			   bpf_xdp_metadata_rx_timestamp) \
 	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_HASH, \
 			   bpf_xdp_metadata_rx_hash) \
+	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
+			   bpf_xdp_metadata_rx_vlan_tag) \
 
 enum {
 #define XDP_METADATA_KFUNC(name, _) name,
@@ -449,6 +451,8 @@ struct xdp_metadata_ops {
 	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
 	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
 			       enum xdp_rss_hash_type *rss_type);
+	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, u16 *vlan_tci,
+				   __be16 *vlan_proto);
 };
 
 #ifdef CONFIG_NET
diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index 3e4f2ec1af06..8be340cf06f9 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -849,6 +849,8 @@ void *bpf_dev_bound_resolve_kfunc(struct bpf_prog *prog, u32 func_id)
 		p = ops->xmo_rx_timestamp;
 	else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_HASH))
 		p = ops->xmo_rx_hash;
+	else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_VLAN_TAG))
+		p = ops->xmo_rx_vlan_tag;
 out:
 	up_read(&bpf_devs_lock);
 
diff --git a/net/core/xdp.c b/net/core/xdp.c
index a70670fe9a2d..856e02bb4ce6 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -738,6 +738,40 @@ __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash,
 	return -EOPNOTSUPP;
 }
 
+/**
+ * bpf_xdp_metadata_rx_vlan_tag - Get XDP packet outermost VLAN tag
+ * @ctx: XDP context pointer.
+ * @vlan_tci: Destination pointer for VLAN TCI (VID + DEI + PCP)
+ * @vlan_proto: Destination pointer for VLAN Tag protocol identifier (TPID).
+ *
+ * In case of success, ``vlan_proto`` contains *Tag protocol identifier (TPID)*,
+ * usually ``ETH_P_8021Q`` or ``ETH_P_8021AD``, but some networks can use
+ * custom TPIDs. ``vlan_proto`` is stored in **network byte order (BE)**
+ * and should be used as follows:
+ * ``if (vlan_proto == bpf_htons(ETH_P_8021Q)) do_something();``
+ *
+ * ``vlan_tci`` contains the remaining 16 bits of a VLAN tag.
+ * Driver is expected to provide those in **host byte order (usually LE)**,
+ * so the bpf program should not perform byte conversion.
+ * According to the 802.1Q standard, *VLAN TCI (Tag control information)*
+ * is a bit field that contains:
+ * *VLAN identifier (VID)* that can be read with ``vlan_tci & 0xfff``,
+ * *Drop eligible indicator (DEI)* - 1 bit,
+ * *Priority code point (PCP)* - 3 bits.
+ * For detailed meaning of DEI and PCP, please refer to other sources.
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
+ * * ``-ENODATA``    : VLAN tag was not stripped or is not available
+ */
+__bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
+					     u16 *vlan_tci,
+					     __be16 *vlan_proto)
+{
+	return -EOPNOTSUPP;
+}
+
 __diag_pop();
 
 BTF_SET8_START(xdp_metadata_kfunc_ids)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (8 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 09/23] xdp: Add VLAN tag hint Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-04 16:00   ` Maciej Fijalkowski
  2023-09-14 16:25   ` [xdp-hints] " Alexander Lobakin
  2023-08-24 19:26 ` [RFC bpf-next 11/23] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba
                   ` (14 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Implement the .xmo_rx_vlan_tag callback to allow XDP code to read
the packet's VLAN tag.

At the same time, use vlan_tci instead of vlan_tag in touched code,
because vlan_tag is misleading.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c     | 22 ++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 ++---
 drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 +
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  4 +--
 drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 ++---
 6 files changed, 57 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 557c6326ff87..aff4fa1a75f8 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6007,6 +6007,23 @@ ice_fix_features(struct net_device *netdev, netdev_features_t features)
 	return features;
 }
 
+/**
+ * ice_set_rx_rings_vlan_proto - update rings with new stripped VLAN proto
+ * @vsi: PF's VSI
+ * @vlan_ethertype: VLAN ethertype (802.1Q or 802.1ad) in network byte order
+ *
+ * Store current stripped VLAN proto in ring packet context,
+ * so it can be accessed more efficiently by packet processing code.
+ */
+static void
+ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype)
+{
+	u16 i;
+
+	ice_for_each_alloc_rxq(vsi, i)
+		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
+}
+
 /**
  * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI
  * @vsi: PF's VSI
@@ -6049,6 +6066,11 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features)
 	if (strip_err || insert_err)
 		return -EIO;
 
+	if (enable_stripping)
+		ice_set_rx_rings_vlan_proto(vsi, htons(vlan_ethertype));
+	else
+		ice_set_rx_rings_vlan_proto(vsi, 0);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 4e6546d9cf85..4fd7614f243d 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1183,7 +1183,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		struct sk_buff *skb;
 		unsigned int size;
 		u16 stat_err_bits;
-		u16 vlan_tag = 0;
+		u16 vlan_tci;
 
 		/* get the Rx desc from Rx ring based on 'next_to_clean' */
 		rx_desc = ICE_RX_DESC(rx_ring, ntc);
@@ -1278,7 +1278,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 			continue;
 		}
 
-		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
+		vlan_tci = ice_get_vlan_tci(rx_desc);
 
 		/* pad the skb if needed, to make a valid ethernet frame */
 		if (eth_skb_pad(skb))
@@ -1292,7 +1292,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 
 		ice_trace(clean_rx_irq_indicate, rx_ring, rx_desc, skb);
 		/* send completed skb up the stack */
-		ice_receive_skb(rx_ring, skb, vlan_tag);
+		ice_receive_skb(rx_ring, skb, vlan_tci);
 
 		/* update budget accounting */
 		total_rx_pkts++;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index 4237702a58a9..41e0b14e6643 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -260,6 +260,7 @@ enum ice_rx_dtype {
 struct ice_pkt_ctx {
 	const union ice_32b_rx_flex_desc *eop_desc;
 	u64 cached_phctime;
+	__be16 vlan_proto;
 };
 
 struct ice_xdp_buff {
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index b11cfaedb81c..10e7ec51f4ef 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -639,7 +639,33 @@ static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
 	return 0;
 }
 
+/**
+ * ice_xdp_rx_vlan_tag - VLAN tag XDP hint handler
+ * @ctx: XDP buff pointer
+ * @vlan_tci: destination address for VLAN tag
+ * @vlan_proto: destination address for VLAN protocol
+ *
+ * Copy the VLAN tag (if it was stripped) and the corresponding protocol
+ * to the destination addresses.
+ */
+static int ice_xdp_rx_vlan_tag(const struct xdp_md *ctx, u16 *vlan_tci,
+			       __be16 *vlan_proto)
+{
+	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
+
+	*vlan_proto = xdp_ext->pkt_ctx.vlan_proto;
+	if (!*vlan_proto)
+		return -ENODATA;
+
+	*vlan_tci = ice_get_vlan_tci(xdp_ext->pkt_ctx.eop_desc);
+	if (!*vlan_tci)
+		return -ENODATA;
+
+	return 0;
+}
+
 const struct xdp_metadata_ops ice_xdp_md_ops = {
 	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
 	.xmo_rx_hash			= ice_xdp_rx_hash,
+	.xmo_rx_vlan_tag		= ice_xdp_rx_vlan_tag,
 };
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index 145883eec129..b7205826fea8 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -84,7 +84,7 @@ ice_build_ctob(u64 td_cmd, u64 td_offset, unsigned int size, u64 td_tag)
 }
 
 /**
- * ice_get_vlan_tag_from_rx_desc - get VLAN from Rx flex descriptor
+ * ice_get_vlan_tci - get VLAN TCI from Rx flex descriptor
  * @rx_desc: Rx 32b flex descriptor with RXDID=2
  *
  * The OS and current PF implementation only support stripping a single VLAN tag
@@ -92,7 +92,7 @@ ice_build_ctob(u64 td_cmd, u64 td_offset, unsigned int size, u64 td_tag)
  * one is found return the tag, else return 0 to mean no VLAN tag was found.
  */
 static inline u16
-ice_get_vlan_tag_from_rx_desc(union ice_32b_rx_flex_desc *rx_desc)
+ice_get_vlan_tci(const union ice_32b_rx_flex_desc *rx_desc)
 {
 	u16 stat_err_bits;
 
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index fdeddad9b639..eeb02f76b4a6 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -878,7 +878,7 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 		struct xdp_buff *xdp;
 		struct sk_buff *skb;
 		u16 stat_err_bits;
-		u16 vlan_tag = 0;
+		u16 vlan_tci;
 
 		rx_desc = ICE_RX_DESC(rx_ring, ntc);
 
@@ -957,10 +957,10 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 		total_rx_bytes += skb->len;
 		total_rx_packets++;
 
-		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
+		vlan_tci = ice_get_vlan_tci(rx_desc);
 
 		ice_process_skb_fields(rx_ring, rx_desc, skb);
-		ice_receive_skb(rx_ring, skb, vlan_tag);
+		ice_receive_skb(rx_ring, skb, vlan_tci);
 	}
 
 	rx_ring->next_to_clean = ntc;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 11/23] ice: use VLAN proto from ring packet context in skb path
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (9 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 10/23] ice: Implement " Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-09-14 16:30   ` Alexander Lobakin
  2023-08-24 19:26 ` [RFC bpf-next 12/23] xdp: Add checksum hint Larysa Zaremba
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

The VLAN proto used in the ice XDP hints implementation is stored in the
ring packet context. Utilize this value in skb VLAN processing too,
instead of checking netdev features.

At the same time, use vlan_tci instead of vlan_tag in touched code,
because vlan_tag is misleading.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 14 +++++---------
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  2 +-
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 10e7ec51f4ef..6ae57a98a4d8 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -283,21 +283,17 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
  * ice_receive_skb - Send a completed packet up the stack
  * @rx_ring: Rx ring in play
  * @skb: packet to send up
- * @vlan_tag: VLAN tag for packet
+ * @vlan_tci: VLAN TCI for packet
  *
  * This function sends the completed packet (via. skb) up the stack using
  * gro receive functions (with/without VLAN tag)
  */
 void
-ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag)
+ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci)
 {
-	netdev_features_t features = rx_ring->netdev->features;
-	bool non_zero_vlan = !!(vlan_tag & VLAN_VID_MASK);
-
-	if ((features & NETIF_F_HW_VLAN_CTAG_RX) && non_zero_vlan)
-		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);
-	else if ((features & NETIF_F_HW_VLAN_STAG_RX) && non_zero_vlan)
-		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021AD), vlan_tag);
+	if (vlan_tci & VLAN_VID_MASK && rx_ring->pkt_ctx.vlan_proto)
+		__vlan_hwaccel_put_tag(skb, rx_ring->pkt_ctx.vlan_proto,
+				       vlan_tci);
 
 	napi_gro_receive(&rx_ring->q_vector->napi, skb);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index b7205826fea8..8487884bf5c4 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -150,7 +150,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       union ice_32b_rx_flex_desc *rx_desc,
 		       struct sk_buff *skb);
 void
-ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
+ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci);
 
 static inline void
 ice_xdp_meta_set_desc(struct xdp_buff *xdp,
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 12/23] xdp: Add checksum hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (10 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 11/23] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 22:56   ` kernel test robot
  2023-09-14 16:34   ` Alexander Lobakin
  2023-08-24 19:26 ` [RFC bpf-next 13/23] ice: Implement " Larysa Zaremba
                   ` (12 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Implement functionality that enables drivers to expose checksum
information to XDP code. It consists of:

- Checksum status - 2 non-exclusive flags:
  - XDP_CHECKSUM_VERIFIED indicating HW has validated the checksum
    (corresponding to CHECKSUM_UNNECESSARY in sk_buff)
  - XDP_CHECKSUM_COMPLETE signifies the validity of the second argument
    (corresponding to CHECKSUM_COMPLETE in sk_buff)
- Checksum, calculated over the entire packet, valid if the second flag is
  set

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
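Not part of the patch: a minimal XDP-side sketch of reading the checksum
hint, assuming vmlinux.h was generated from a kernel that carries this
series (so enum xdp_csum_status is visible to the program).

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

extern int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
				    enum xdp_csum_status *csum_status,
				    __wsum *csum) __ksym;

SEC("xdp")
int read_csum(struct xdp_md *ctx)
{
	enum xdp_csum_status status;
	__wsum csum;

	if (bpf_xdp_metadata_rx_csum(ctx, &status, &csum))
		return XDP_PASS;	/* kfunc missing or status unknown */

	if (status & XDP_CHECKSUM_VERIFIED)
		bpf_printk("HW verified the outermost checksum");
	if (status & XDP_CHECKSUM_COMPLETE)
		bpf_printk("checksum over the full packet: 0x%x", csum);

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
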
 Documentation/networking/xdp-rx-metadata.rst |  3 +++
 include/net/xdp.h                            | 15 +++++++++++++
 kernel/bpf/offload.c                         |  2 ++
 net/core/xdp.c                               | 23 ++++++++++++++++++++
 4 files changed, 43 insertions(+)

diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index ea6dd79a21d3..7f056a44f682 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -26,6 +26,9 @@ metadata is supported, this set will grow:
 .. kernel-doc:: net/core/xdp.c
    :identifiers: bpf_xdp_metadata_rx_vlan_tag
 
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_metadata_rx_csum
+
 An XDP program can use these kfuncs to read the metadata into stack
 variables for its own consumption. Or, to pass the metadata on to other
 consumers, an XDP program can store it into the metadata area carried
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 8bb64fc76498..495c4d2a2c50 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -390,6 +390,8 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
 			   bpf_xdp_metadata_rx_hash) \
 	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
 			   bpf_xdp_metadata_rx_vlan_tag) \
+	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CSUM, \
+			   bpf_xdp_metadata_rx_csum) \
 
 enum {
 #define XDP_METADATA_KFUNC(name, _) name,
@@ -447,12 +449,25 @@ enum xdp_rss_hash_type {
 	XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP | XDP_RSS_L3_DYNHDR,
 };
 
+enum xdp_csum_status {
+	/* HW had parsed headers and validated the outermost checksum,
+	 * same as ``CHECKSUM_UNNECESSARY`` in ``sk_buff``.
+	 */
+	XDP_CHECKSUM_VERIFIED		= BIT(0),
+
+	/* Checksum, calculated over the entire packet is provided */
+	XDP_CHECKSUM_COMPLETE		= BIT(1),
+};
+
 struct xdp_metadata_ops {
 	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
 	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
 			       enum xdp_rss_hash_type *rss_type);
 	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, u16 *vlan_tci,
 				   __be16 *vlan_proto);
+	int	(*xmo_rx_csum)(const struct xdp_md *ctx,
+			       enum xdp_csum_status *csum_status,
+			       __wsum *csum);
 };
 
 #ifdef CONFIG_NET
diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index 8be340cf06f9..ee35f33a96d1 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -851,6 +851,8 @@ void *bpf_dev_bound_resolve_kfunc(struct bpf_prog *prog, u32 func_id)
 		p = ops->xmo_rx_hash;
 	else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_VLAN_TAG))
 		p = ops->xmo_rx_vlan_tag;
+	else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_CSUM))
+		p = ops->xmo_rx_csum;
 out:
 	up_read(&bpf_devs_lock);
 
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 856e02bb4ce6..b197287d7196 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -772,6 +772,29 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 	return -EOPNOTSUPP;
 }
 
+/**
+ * bpf_xdp_metadata_rx_csum - Get checksum status with additional info.
+ * @ctx: XDP context pointer.
+ * @csum_status: Destination for checksum status.
+ * @csum: Destination for complete checksum.
+ *
+ * Status (@csum_status) is a bitfield that indicates what checksum
+ * processing was performed. If ``XDP_CHECKSUM_COMPLETE`` in status is set,
+ * second argument (@csum) contains a checksum, calculated over the entire
+ * packet.
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
+ * * ``-ENODATA``    : Checksum status is unknown
+ */
+__bpf_kfunc int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
+					 enum xdp_csum_status *csum_status,
+					 __wsum *csum)
+{
+	return -EOPNOTSUPP;
+}
+
 __diag_pop();
 
 BTF_SET8_START(xdp_metadata_kfunc_ids)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 13/23] ice: Implement checksum hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (11 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 12/23] xdp: Add checksum hint Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 14/23] selftests/bpf: Allow VLAN packets in xdp_hw_metadata Larysa Zaremba
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Implement the .xmo_rx_csum callback to allow XDP code to determine
whether HW has validated any checksums.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 +++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 6ae57a98a4d8..f11a245705bc 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -660,8 +660,34 @@ static int ice_xdp_rx_vlan_tag(const struct xdp_md *ctx, u16 *vlan_tci,
 	return 0;
 }
 
+/**
+ * ice_xdp_rx_csum - RX checksum XDP hint handler
+ * @ctx: XDP buff pointer
+ * @csum_status: status destination address
+ * @csum: not used
+ */
+static int ice_xdp_rx_csum(const struct xdp_md *ctx,
+			   enum xdp_csum_status *csum_status, __wsum *csum)
+{
+	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
+	const union ice_32b_rx_flex_desc *eop_desc;
+	enum ice_rx_csum_status status;
+	u16 ptype;
+
+	eop_desc = xdp_ext->pkt_ctx.eop_desc;
+	ptype = ice_get_ptype(eop_desc);
+
+	status = ice_get_rx_csum_status(eop_desc, ptype);
+	if (status & ICE_RX_CSUM_FAIL)
+		return -ENODATA;
+
+	*csum_status = XDP_CHECKSUM_VERIFIED;
+	return 0;
+}
+
 const struct xdp_metadata_ops ice_xdp_md_ops = {
 	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
 	.xmo_rx_hash			= ice_xdp_rx_hash,
 	.xmo_rx_vlan_tag		= ice_xdp_rx_vlan_tag,
+	.xmo_rx_csum			= ice_xdp_rx_csum,
 };
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 14/23] selftests/bpf: Allow VLAN packets in xdp_hw_metadata
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (12 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 13/23] ice: Implement " Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 15/23] net, xdp: allow metadata > 32 Larysa Zaremba
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Make VLAN c-tag and s-tag XDP hint testing more convenient
by not skipping VLAN-tagged packets.

Allow both 802.1ad and 802.1Q headers.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 tools/testing/selftests/bpf/progs/xdp_hw_metadata.c | 10 +++++++++-
 tools/testing/selftests/bpf/xdp_metadata.h          |  8 ++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index b2dfd7066c6e..63d7de6c6bbb 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -26,15 +26,23 @@ int rx(struct xdp_md *ctx)
 {
 	void *data, *data_meta, *data_end;
 	struct ipv6hdr *ip6h = NULL;
-	struct ethhdr *eth = NULL;
 	struct udphdr *udp = NULL;
 	struct iphdr *iph = NULL;
 	struct xdp_meta *meta;
+	struct ethhdr *eth;
 	int err;
 
 	data = (void *)(long)ctx->data;
 	data_end = (void *)(long)ctx->data_end;
 	eth = data;
+
+	if (eth + 1 < data_end && (eth->h_proto == bpf_htons(ETH_P_8021AD) ||
+				   eth->h_proto == bpf_htons(ETH_P_8021Q)))
+		eth = (void *)eth + sizeof(struct vlan_hdr);
+
+	if (eth + 1 < data_end && eth->h_proto == bpf_htons(ETH_P_8021Q))
+		eth = (void *)eth + sizeof(struct vlan_hdr);
+
 	if (eth + 1 < data_end) {
 		if (eth->h_proto == bpf_htons(ETH_P_IP)) {
 			iph = (void *)(eth + 1);
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index 938a729bd307..6664893c2c77 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -9,6 +9,14 @@
 #define ETH_P_IPV6 0x86DD
 #endif
 
+#ifndef ETH_P_8021Q
+#define ETH_P_8021Q 0x8100
+#endif
+
+#ifndef ETH_P_8021AD
+#define ETH_P_8021AD 0x88A8
+#endif
+
 struct xdp_meta {
 	__u64 rx_timestamp;
 	__u64 xdp_timestamp;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 15/23] net, xdp: allow metadata > 32
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (13 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 14/23] selftests/bpf: Allow VLAN packets in xdp_hw_metadata Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 16/23] selftests/bpf: Add flags and new hints to xdp_hw_metadata Larysa Zaremba
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed,
	Aleksander Lobakin

From: Aleksander Lobakin <aleksander.lobakin@intel.com>

When using XDP hints, metadata sometimes has to be much bigger
than 32 bytes. Relax the restriction, allow metadata larger than 32 bytes
and make __skb_metadata_differs() work with bigger lengths.

Now the metadata size is only limited by the fact that it is stored as a u8
in skb_shared_info, so the maximum possible value is 255. Other important
conditions, such as having enough space for xdp_frame building, are already
checked in bpf_xdp_adjust_meta().

The requirement of having its length aligned to 4 bytes is still
valid.

Signed-off-by: Aleksander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
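Not part of the patch: a sketch of what the relaxed limit allows on the BPF
side, reserving a metadata block bigger than the old 32-byte cap. The
struct below is made up; only its 4-byte-aligned size matters.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* 40 bytes: rejected by xdp_metalen_invalid() before this patch, accepted
 * after it (the size must stay 4-byte aligned and below 255).
 */
struct big_meta {
	__u64 rx_timestamp;
	__u64 xdp_timestamp;
	__u32 rx_hash;
	__u32 rx_hash_type;
	__u16 rx_vlan_tci;
	__be16 rx_vlan_proto;
	__u32 rx_csum_status;
	__u32 rx_csum;
	__u32 flags;
};

SEC("xdp")
int reserve_big_meta(struct xdp_md *ctx)
{
	struct big_meta *meta;
	void *data;

	if (bpf_xdp_adjust_meta(ctx, -(int)sizeof(*meta)))
		return XDP_PASS;	/* too big, misaligned or no headroom */

	data = (void *)(long)ctx->data;
	meta = (void *)(long)ctx->data_meta;
	if ((void *)(meta + 1) > data)	/* verifier-mandated bounds check */
		return XDP_PASS;

	meta->flags = 0;
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
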
 include/linux/skbuff.h | 13 ++++++++-----
 include/net/xdp.h      |  7 ++++++-
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index aa57e2eca33b..0e455678cb8a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4216,10 +4216,13 @@ static inline bool __skb_metadata_differs(const struct sk_buff *skb_a,
 {
 	const void *a = skb_metadata_end(skb_a);
 	const void *b = skb_metadata_end(skb_b);
-	/* Using more efficient varaiant than plain call to memcmp(). */
-#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
 	u64 diffs = 0;
 
+	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) ||
+	    BITS_PER_LONG != 64)
+		goto slow;
+
+	/* Using more efficient variant than plain call to memcmp(). */
 	switch (meta_len) {
 #define __it(x, op) (x -= sizeof(u##op))
 #define __it_diff(a, b, op) (*(u##op *)__it(a, op)) ^ (*(u##op *)__it(b, op))
@@ -4239,11 +4242,11 @@ static inline bool __skb_metadata_differs(const struct sk_buff *skb_a,
 		fallthrough;
 	case  4: diffs |= __it_diff(a, b, 32);
 		break;
+	default:
+slow:
+		return memcmp(a - meta_len, b - meta_len, meta_len);
 	}
 	return diffs;
-#else
-	return memcmp(a - meta_len, b - meta_len, meta_len);
-#endif
 }
 
 static inline bool skb_metadata_differs(const struct sk_buff *skb_a,
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 495c4d2a2c50..05234f156a73 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -369,7 +369,12 @@ xdp_data_meta_unsupported(const struct xdp_buff *xdp)
 
 static inline bool xdp_metalen_invalid(unsigned long metalen)
 {
-	return (metalen & (sizeof(__u32) - 1)) || (metalen > 32);
+	typeof(metalen) meta_max;
+
+	meta_max = type_max(typeof_member(struct skb_shared_info, meta_len));
+	BUILD_BUG_ON(!__builtin_constant_p(meta_max));
+
+	return !IS_ALIGNED(metalen, sizeof(u32)) || metalen > meta_max;
 }
 
 struct xdp_attachment_info {
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 16/23] selftests/bpf: Add flags and new hints to xdp_hw_metadata
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (14 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 15/23] net, xdp: allow metadata > 32 Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 17/23] veth: Implement VLAN tag and checksum XDP hint Larysa Zaremba
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Add the hints introduced in the previous patches (VLAN tag and checksum)
to the xdp_hw_metadata program.

Also, to make the metadata layout more straightforward, add a flags field
that reports the validity of each hint separately.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
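Not part of the patch: a userspace-side sketch of how the flags field
removes the old "is zero a real value or a missing hint?" ambiguity.
'pkt' is assumed to point at the packet data of a received AF_XDP frame.

#include <stdio.h>
#include <arpa/inet.h>
#include <linux/types.h>
#include "xdp_metadata.h"	/* struct xdp_meta from this patch */

static void dump_vlan_hint(const unsigned char *pkt)
{
	/* The XDP program stores struct xdp_meta right before the data */
	const struct xdp_meta *meta = (const void *)(pkt - sizeof(*meta));

	if (meta->hint_valid & XDP_META_FIELD_VLAN_TAG)
		printf("VID %u, TPID 0x%04x\n",
		       meta->rx_vlan_tci & 0xfff, ntohs(meta->rx_vlan_proto));
	else
		printf("no VLAN tag (err=%d)\n", meta->rx_vlan_tag_err);
}
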
 .../selftests/bpf/progs/xdp_hw_metadata.c     | 38 +++++++++--
 tools/testing/selftests/bpf/xdp_hw_metadata.c | 67 +++++++++++++++++--
 tools/testing/selftests/bpf/xdp_metadata.h    | 34 +++++++++-
 3 files changed, 126 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index 63d7de6c6bbb..95b17eaf8f05 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -20,6 +20,12 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
 					 __u64 *timestamp) __ksym;
 extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 				    enum xdp_rss_hash_type *rss_type) __ksym;
+extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
+					__u16 *vlan_tci,
+					__be16 *vlan_proto) __ksym;
+extern int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
+				    enum xdp_csum_status *csum_status,
+				    __wsum *csum) __ksym;
 
 SEC("xdp")
 int rx(struct xdp_md *ctx)
@@ -84,15 +90,35 @@ int rx(struct xdp_md *ctx)
 		return XDP_PASS;
 	}
 
+	meta->hint_valid = 0;
+
+	meta->xdp_timestamp = bpf_ktime_get_tai_ns();
 	err = bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp);
-	if (!err)
-		meta->xdp_timestamp = bpf_ktime_get_tai_ns();
+	if (err)
+		meta->rx_timestamp_err = err;
+	else
+		meta->hint_valid |= XDP_META_FIELD_TS;
+
+	err = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash,
+				       &meta->rx_hash_type);
+	if (err)
+		meta->rx_hash_err = err;
 	else
-		meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
+		meta->hint_valid |= XDP_META_FIELD_RSS;
 
-	err = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, &meta->rx_hash_type);
-	if (err < 0)
-		meta->rx_hash_err = err; /* Used by AF_XDP as no hash signal */
+	err = bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_tci,
+					   &meta->rx_vlan_proto);
+	if (err)
+		meta->rx_vlan_tag_err = err;
+	else
+		meta->hint_valid |= XDP_META_FIELD_VLAN_TAG;
+
+	err = bpf_xdp_metadata_rx_csum(ctx, &meta->rx_csum_status,
+				       &meta->rx_csum);
+	if (err)
+		meta->rx_csum_err = err;
+	else
+		meta->hint_valid |= XDP_META_FIELD_CSUM;
 
 	__sync_add_and_fetch(&pkts_redir, 1);
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 613321eb84c1..7535baa7e7ef 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -19,6 +19,9 @@
 #include "xsk.h"
 
 #include <error.h>
+#include <linux/kernel.h>
+#include <linux/bits.h>
+#include <linux/bitfield.h>
 #include <linux/errqueue.h>
 #include <linux/if_link.h>
 #include <linux/net_tstamp.h>
@@ -150,21 +153,58 @@ static __u64 gettime(clockid_t clock_id)
 	return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec;
 }
 
+#define VLAN_PRIO_MASK		GENMASK(15, 13) /* Priority Code Point */
+#define VLAN_DEI_MASK		GENMASK(12, 12) /* Drop Eligible Indicator */
+#define VLAN_VID_MASK		GENMASK(11, 0)	/* VLAN Identifier */
+static void print_vlan_tci(__u16 tag)
+{
+	__u16 vlan_id = FIELD_GET(VLAN_VID_MASK, tag);
+	__u8 pcp = FIELD_GET(VLAN_PRIO_MASK, tag);
+	bool dei = FIELD_GET(VLAN_DEI_MASK, tag);
+
+	printf("PCP=%u, DEI=%d, VID=0x%X\n", pcp, dei, vlan_id);
+}
+
+#define XDP_CHECKSUM_VERIFIED		BIT(0)
+#define XDP_CHECKSUM_COMPLETE		BIT(1)
+
+struct partial_csum_info {
+	__u16 csum_start;
+	__u16 csum_offset;
+};
+
+static void print_csum_state(__u32 status, __u32 info)
+{
+	bool is_verified = status & XDP_CHECKSUM_VERIFIED;
+
+	printf("Checksum status: ");
+	if (status & ~(XDP_CHECKSUM_COMPLETE | XDP_CHECKSUM_VERIFIED))
+		printf("cannot be interpreted, status=0x%X\n", status);
+
+	if (status & XDP_CHECKSUM_COMPLETE)
+		printf("complete, checksum=0x%X%s", info,
+		       is_verified ? ", " : "\n");
+
+	if (is_verified)
+		printf("outermost checksum is verified\n");
+}
+
 static void verify_xdp_metadata(void *data, clockid_t clock_id)
 {
 	struct xdp_meta *meta;
 
 	meta = data - sizeof(*meta);
 
-	if (meta->rx_hash_err < 0)
-		printf("No rx_hash err=%d\n", meta->rx_hash_err);
-	else
+	if (meta->hint_valid & XDP_META_FIELD_RSS)
 		printf("rx_hash: 0x%X with RSS type:0x%X\n",
 		       meta->rx_hash, meta->rx_hash_type);
+	else
+		printf("No rx_hash, err=%d\n", meta->rx_hash_err);
+
+	if (meta->hint_valid & XDP_META_FIELD_TS) {
+		printf("rx_timestamp:  %llu (sec:%0.4f)\n", meta->rx_timestamp,
+		       (double)meta->rx_timestamp / NANOSEC_PER_SEC);
 
-	printf("rx_timestamp:  %llu (sec:%0.4f)\n", meta->rx_timestamp,
-	       (double)meta->rx_timestamp / NANOSEC_PER_SEC);
-	if (meta->rx_timestamp) {
 		__u64 usr_clock = gettime(clock_id);
 		__u64 xdp_clock = meta->xdp_timestamp;
 		__s64 delta_X = xdp_clock - meta->rx_timestamp;
@@ -179,8 +219,23 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id)
 		       usr_clock, (double)usr_clock / NANOSEC_PER_SEC,
 		       (double)delta_X2U / NANOSEC_PER_SEC,
 		       (double)delta_X2U / 1000);
+	} else {
+		printf("No rx_timestamp, err=%d\n", meta->rx_timestamp_err);
+	}
+
+	if (meta->hint_valid & XDP_META_FIELD_VLAN_TAG) {
+		printf("rx_vlan_proto: 0x%X\n", ntohs(meta->rx_vlan_proto));
+		printf("rx_vlan_tci: ");
+		print_vlan_tci(meta->rx_vlan_tci);
+	} else {
+		printf("No rx_vlan_tci or rx_vlan_proto, err=%d\n",
+		       meta->rx_vlan_tag_err);
 	}
 
+	if (meta->hint_valid & XDP_META_FIELD_CSUM)
+		print_csum_state(meta->rx_csum_status, meta->rx_csum);
+	else
+		printf("Checksum was not checked, err=%d\n", meta->rx_csum_err);
 }
 
 static void verify_skb_metadata(int fd)
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index 6664893c2c77..0b749bfed2de 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -17,12 +17,44 @@
 #define ETH_P_8021AD 0x88A8
 #endif
 
+#ifndef BIT
+#define BIT(nr)			(1 << (nr))
+#endif
+
+/* Non-existent checksum status */
+#define XDP_CHECKSUM_MAGIC	BIT(2)
+
+enum xdp_meta_field {
+	XDP_META_FIELD_TS	= BIT(0),
+	XDP_META_FIELD_RSS	= BIT(1),
+	XDP_META_FIELD_VLAN_TAG	= BIT(2),
+	XDP_META_FIELD_CSUM	= BIT(3),
+};
+
 struct xdp_meta {
-	__u64 rx_timestamp;
+	union {
+		__u64 rx_timestamp;
+		__s32 rx_timestamp_err;
+	};
 	__u64 xdp_timestamp;
 	__u32 rx_hash;
 	union {
 		__u32 rx_hash_type;
 		__s32 rx_hash_err;
 	};
+	union {
+		struct {
+			__u16 rx_vlan_tci;
+			__be16 rx_vlan_proto;
+		};
+		__s32 rx_vlan_tag_err;
+	};
+	union {
+		struct {
+			__u32 rx_csum_status;
+			__wsum rx_csum;
+		};
+		__s32 rx_csum_err;
+	};
+	enum xdp_meta_field hint_valid;
 };
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 17/23] veth: Implement VLAN tag and checksum XDP hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (15 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 16/23] selftests/bpf: Add flags and new hints to xdp_hw_metadata Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 18/23] net: make vlan_get_tag() return -ENODATA instead of -EINVAL Larysa Zaremba
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

In order to test the VLAN tag and checksum XDP hints in hardware-independent
selftests, implement the newly added XDP hints in the veth driver.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/veth.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 953f6d8f8db0..f3ee85aa5edf 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1732,6 +1732,46 @@ static int veth_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
 	return 0;
 }
 
+static int veth_xdp_rx_vlan_tag(const struct xdp_md *ctx, u16 *vlan_tci,
+				__be16 *vlan_proto)
+{
+	struct veth_xdp_buff *_ctx = (void *)ctx;
+	struct sk_buff *skb = _ctx->skb;
+	int err;
+
+	if (!skb)
+		return -ENODATA;
+
+	err = __vlan_hwaccel_get_tag(skb, vlan_tci);
+	if (err)
+		return err;
+
+	*vlan_proto = skb->vlan_proto;
+	return err;
+}
+
+static int veth_xdp_rx_csum(const struct xdp_md *ctx,
+			    enum xdp_csum_status *csum_status,
+			    __wsum *csum)
+{
+	struct veth_xdp_buff *_ctx = (void *)ctx;
+	struct sk_buff *skb = _ctx->skb;
+
+	if (!skb)
+		return -ENODATA;
+
+	if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
+		*csum_status = XDP_CHECKSUM_VERIFIED;
+	} else if (skb->ip_summed == CHECKSUM_COMPLETE) {
+		*csum_status = XDP_CHECKSUM_COMPLETE;
+		*csum = skb->csum;
+	} else {
+		return -ENODATA;
+	}
+
+	return 0;
+}
+
 static const struct net_device_ops veth_netdev_ops = {
 	.ndo_init            = veth_dev_init,
 	.ndo_open            = veth_open,
@@ -1756,6 +1796,8 @@ static const struct net_device_ops veth_netdev_ops = {
 static const struct xdp_metadata_ops veth_xdp_metadata_ops = {
 	.xmo_rx_timestamp		= veth_xdp_rx_timestamp,
 	.xmo_rx_hash			= veth_xdp_rx_hash,
+	.xmo_rx_vlan_tag		= veth_xdp_rx_vlan_tag,
+	.xmo_rx_csum			= veth_xdp_rx_csum,
 };
 
 #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 18/23] net: make vlan_get_tag() return -ENODATA instead of -EINVAL
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (16 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 17/23] veth: Implement VLAN tag and checksum XDP hint Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 19/23] selftests/bpf: Use AF_INET for TX in xdp_metadata Larysa Zaremba
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed,
	Jesper Dangaard Brouer

__vlan_hwaccel_get_tag() is used in the veth XDP hints implementation,
and its return value (-EINVAL if the skb is not VLAN-tagged) is passed to
BPF code, but the XDP hints specification requires drivers to return
-ENODATA if a hint cannot be provided for a particular packet.

Resolve this inconsistency by changing the error return value of
__vlan_hwaccel_get_tag() from -EINVAL to -ENODATA, and do the same for
__vlan_get_tag(), because that function is supposed to follow the same
convention. This, in turn, makes -ENODATA the only non-zero value
vlan_get_tag() can return. We can do this with no side effects, because
none of the users of the three above-mentioned functions rely on the
exact value.
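
For illustration, a hint consumer on the BPF side can now treat -ENODATA
as "no VLAN tag present" and -EOPNOTSUPP as "kfunc not implemented by the
driver". This is only a rough sketch, not part of the patch; the kfunc
declaration matches the one used by the selftests later in this series:

	extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
						__u16 *vlan_tci,
						__be16 *vlan_proto) __ksym;

	static __always_inline bool xdp_has_vlan_tag(const struct xdp_md *ctx,
						     __u16 *tci, __be16 *proto)
	{
		/* 0 - tag read, -ENODATA - no (stripped) VLAN tag,
		 * -EOPNOTSUPP - driver does not implement the kfunc.
		 */
		return bpf_xdp_metadata_rx_vlan_tag(ctx, tci, proto) == 0;
	}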

Suggested-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 include/linux/if_vlan.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 3028af87716e..c1645c86eed9 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -540,7 +540,7 @@ static inline int __vlan_get_tag(const struct sk_buff *skb, u16 *vlan_tci)
 	struct vlan_ethhdr *veth = skb_vlan_eth_hdr(skb);
 
 	if (!eth_type_vlan(veth->h_vlan_proto))
-		return -EINVAL;
+		return -ENODATA;
 
 	*vlan_tci = ntohs(veth->h_vlan_TCI);
 	return 0;
@@ -561,7 +561,7 @@ static inline int __vlan_hwaccel_get_tag(const struct sk_buff *skb,
 		return 0;
 	} else {
 		*vlan_tci = 0;
-		return -EINVAL;
+		return -ENODATA;
 	}
 }
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 19/23] selftests/bpf: Use AF_INET for TX in xdp_metadata
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (17 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 18/23] net: make vlan_get_tag() return -ENODATA instead of -EINVAL Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:26 ` [RFC bpf-next 20/23] selftests/bpf: Check VLAN tag and proto " Larysa Zaremba
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

The easiest way to simulate a stripped VLAN tag in veth is to send a
packet from a VLAN interface attached to the veth. Unfortunately, this
approach is incompatible with AF_XDP on the TX side, because VLAN
interfaces do not support that feature.

Replace the AF_XDP packet generation with sending the same datagram via
an AF_INET socket.

This does not change the packet contents or hint values, with one notable
exception: rx_hash_type, which was previously expected to be 0, is now
expected to be at least XDP_RSS_TYPE_L4.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 167 +++++++-----------
 1 file changed, 59 insertions(+), 108 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
index 626c461fa34d..1877e5c6d6c7 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
@@ -20,7 +20,7 @@
 
 #define UDP_PAYLOAD_BYTES 4
 
-#define AF_XDP_SOURCE_PORT 1234
+#define UDP_SOURCE_PORT 1234
 #define AF_XDP_CONSUMER_PORT 8080
 
 #define UMEM_NUM 16
@@ -33,6 +33,12 @@
 #define RX_ADDR "10.0.0.2"
 #define PREFIX_LEN "8"
 #define FAMILY AF_INET
+#define TX_NETNS_NAME "xdp_metadata_tx"
+#define RX_NETNS_NAME "xdp_metadata_rx"
+#define TX_MAC "00:00:00:00:00:01"
+#define RX_MAC "00:00:00:00:00:02"
+
+#define XDP_RSS_TYPE_L4 BIT(3)
 
 struct xsk {
 	void *umem_area;
@@ -119,90 +125,28 @@ static void close_xsk(struct xsk *xsk)
 	munmap(xsk->umem_area, UMEM_SIZE);
 }
 
-static void ip_csum(struct iphdr *iph)
+static int generate_packet_udp(void)
 {
-	__u32 sum = 0;
-	__u16 *p;
-	int i;
-
-	iph->check = 0;
-	p = (void *)iph;
-	for (i = 0; i < sizeof(*iph) / sizeof(*p); i++)
-		sum += p[i];
-
-	while (sum >> 16)
-		sum = (sum & 0xffff) + (sum >> 16);
-
-	iph->check = ~sum;
-}
-
-static int generate_packet(struct xsk *xsk, __u16 dst_port)
-{
-	struct xdp_desc *tx_desc;
-	struct udphdr *udph;
-	struct ethhdr *eth;
-	struct iphdr *iph;
-	void *data;
-	__u32 idx;
-	int ret;
-
-	ret = xsk_ring_prod__reserve(&xsk->tx, 1, &idx);
-	if (!ASSERT_EQ(ret, 1, "xsk_ring_prod__reserve"))
-		return -1;
-
-	tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx);
-	tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE;
-	printf("%p: tx_desc[%u]->addr=%llx\n", xsk, idx, tx_desc->addr);
-	data = xsk_umem__get_data(xsk->umem_area, tx_desc->addr);
-
-	eth = data;
-	iph = (void *)(eth + 1);
-	udph = (void *)(iph + 1);
-
-	memcpy(eth->h_dest, "\x00\x00\x00\x00\x00\x02", ETH_ALEN);
-	memcpy(eth->h_source, "\x00\x00\x00\x00\x00\x01", ETH_ALEN);
-	eth->h_proto = htons(ETH_P_IP);
-
-	iph->version = 0x4;
-	iph->ihl = 0x5;
-	iph->tos = 0x9;
-	iph->tot_len = htons(sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES);
-	iph->id = 0;
-	iph->frag_off = 0;
-	iph->ttl = 0;
-	iph->protocol = IPPROTO_UDP;
-	ASSERT_EQ(inet_pton(FAMILY, TX_ADDR, &iph->saddr), 1, "inet_pton(TX_ADDR)");
-	ASSERT_EQ(inet_pton(FAMILY, RX_ADDR, &iph->daddr), 1, "inet_pton(RX_ADDR)");
-	ip_csum(iph);
-
-	udph->source = htons(AF_XDP_SOURCE_PORT);
-	udph->dest = htons(dst_port);
-	udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES);
-	udph->check = 0;
-
-	memset(udph + 1, 0xAA, UDP_PAYLOAD_BYTES);
-
-	tx_desc->len = sizeof(*eth) + sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES;
-	xsk_ring_prod__submit(&xsk->tx, 1);
-
-	ret = sendto(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, 0);
-	if (!ASSERT_GE(ret, 0, "sendto"))
-		return ret;
-
-	return 0;
-}
-
-static void complete_tx(struct xsk *xsk)
-{
-	__u32 idx;
-	__u64 addr;
-
-	if (ASSERT_EQ(xsk_ring_cons__peek(&xsk->comp, 1, &idx), 1, "xsk_ring_cons__peek")) {
-		addr = *xsk_ring_cons__comp_addr(&xsk->comp, idx);
-
-		printf("%p: complete tx idx=%u addr=%llx\n", xsk, idx, addr);
-		xsk_ring_cons__release(&xsk->comp, 1);
-	}
+	char udp_payload[UDP_PAYLOAD_BYTES];
+	struct sockaddr_in rx_addr;
+	int sock_fd, err = 0;
+
+	/* Build a packet */
+	memset(udp_payload, 0xAA, UDP_PAYLOAD_BYTES);
+	rx_addr.sin_addr.s_addr = inet_addr(RX_ADDR);
+	rx_addr.sin_family = AF_INET;
+	rx_addr.sin_port = htons(UDP_SOURCE_PORT);
+
+	sock_fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
+	if (!ASSERT_GE(sock_fd, 0, "socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)"))
+		return sock_fd;
+
+	err = sendto(sock_fd, udp_payload, UDP_PAYLOAD_BYTES, MSG_DONTWAIT,
+		     (void *)&rx_addr, sizeof(rx_addr));
+	ASSERT_GE(err, 0, "sendto");
+
+	close(sock_fd);
+	return err;
 }
 
 static void refill_rx(struct xsk *xsk, __u64 addr)
@@ -268,7 +212,8 @@ static int verify_xsk_metadata(struct xsk *xsk)
 	if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash"))
 		return -1;
 
-	ASSERT_EQ(meta->rx_hash_type, 0, "rx_hash_type");
+	if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type"))
+		return -1;
 
 	xsk_ring_cons__release(&xsk->rx, 1);
 	refill_rx(xsk, comp_addr);
@@ -284,36 +229,38 @@ void test_xdp_metadata(void)
 	struct nstoken *tok = NULL;
 	__u32 queue_id = QUEUE_ID;
 	struct bpf_map *prog_arr;
-	struct xsk tx_xsk = {};
 	struct xsk rx_xsk = {};
 	__u32 val, key = 0;
 	int retries = 10;
 	int rx_ifindex;
-	int tx_ifindex;
 	int sock_fd;
 	int ret;
 
-	/* Setup new networking namespace, with a veth pair. */
+	/* Setup new networking namespaces, with a veth pair. */
 
-	SYS(out, "ip netns add xdp_metadata");
-	tok = open_netns("xdp_metadata");
+	SYS(out, "ip netns add " TX_NETNS_NAME);
+	SYS(out, "ip netns add " RX_NETNS_NAME);
+
+	tok = open_netns(TX_NETNS_NAME);
 	SYS(out, "ip link add numtxqueues 1 numrxqueues 1 " TX_NAME
 	    " type veth peer " RX_NAME " numtxqueues 1 numrxqueues 1");
-	SYS(out, "ip link set dev " TX_NAME " address 00:00:00:00:00:01");
-	SYS(out, "ip link set dev " RX_NAME " address 00:00:00:00:00:02");
+	SYS(out, "ip link set " RX_NAME " netns " RX_NETNS_NAME);
+
+	SYS(out, "ip link set dev " TX_NAME " address " TX_MAC);
 	SYS(out, "ip link set dev " TX_NAME " up");
-	SYS(out, "ip link set dev " RX_NAME " up");
 	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME);
-	SYS(out, "ip addr add " RX_ADDR "/" PREFIX_LEN " dev " RX_NAME);
 
-	rx_ifindex = if_nametoindex(RX_NAME);
-	tx_ifindex = if_nametoindex(TX_NAME);
+	/* Avoid ARP calls */
+	SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME);
+	close_netns(tok);
 
-	/* Setup separate AF_XDP for TX and RX interfaces. */
+	tok = open_netns(RX_NETNS_NAME);
+	SYS(out, "ip link set dev " RX_NAME " address " RX_MAC);
+	SYS(out, "ip link set dev " RX_NAME " up");
+	SYS(out, "ip addr add " RX_ADDR "/" PREFIX_LEN " dev " RX_NAME);
+	rx_ifindex = if_nametoindex(RX_NAME);
 
-	ret = open_xsk(tx_ifindex, &tx_xsk);
-	if (!ASSERT_OK(ret, "open_xsk(TX_NAME)"))
-		goto out;
+	/* Setup AF_XDP for RX interface. */
 
 	ret = open_xsk(rx_ifindex, &rx_xsk);
 	if (!ASSERT_OK(ret, "open_xsk(RX_NAME)"))
@@ -353,19 +300,20 @@ void test_xdp_metadata(void)
 	ret = bpf_map_update_elem(bpf_map__fd(bpf_obj->maps.xsk), &queue_id, &sock_fd, 0);
 	if (!ASSERT_GE(ret, 0, "bpf_map_update_elem"))
 		goto out;
+	close_netns(tok);
 
 	/* Send packet destined to RX AF_XDP socket. */
-	if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0,
-		       "generate AF_XDP_CONSUMER_PORT"))
+	tok = open_netns(TX_NETNS_NAME);
+	if (!ASSERT_GE(generate_packet_udp(), 0, "generate UDP packet"))
 		goto out;
+	close_netns(tok);
 
 	/* Verify AF_XDP RX packet has proper metadata. */
+	tok = open_netns(RX_NETNS_NAME);
 	if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk), 0,
 		       "verify_xsk_metadata"))
 		goto out;
 
-	complete_tx(&tx_xsk);
-
 	/* Make sure freplace correctly picks up original bound device
 	 * and doesn't crash.
 	 */
@@ -382,12 +330,15 @@ void test_xdp_metadata(void)
 
 	if (!ASSERT_OK(xdp_metadata2__attach(bpf_obj2), "attach freplace"))
 		goto out;
+	close_netns(tok);
 
 	/* Send packet to trigger . */
-	if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0,
-		       "generate freplace packet"))
+	tok = open_netns(TX_NETNS_NAME);
+	if (!ASSERT_GE(generate_packet_udp(), 0, "generate freplace packet"))
 		goto out;
+	close_netns(tok);
 
+	tok = open_netns(RX_NETNS_NAME);
 	while (!retries--) {
 		if (bpf_obj2->bss->called)
 			break;
@@ -397,10 +348,10 @@ void test_xdp_metadata(void)
 
 out:
 	close_xsk(&rx_xsk);
-	close_xsk(&tx_xsk);
 	xdp_metadata2__destroy(bpf_obj2);
 	xdp_metadata__destroy(bpf_obj);
 	if (tok)
 		close_netns(tok);
-	SYS_NOFAIL("ip netns del xdp_metadata");
+	SYS_NOFAIL("ip netns del " RX_NETNS_NAME);
+	SYS_NOFAIL("ip netns del " TX_NETNS_NAME);
 }
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 20/23] selftests/bpf: Check VLAN tag and proto in xdp_metadata
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (18 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 19/23] selftests/bpf: Use AF_INET for TX in xdp_metadata Larysa Zaremba
@ 2023-08-24 19:26 ` Larysa Zaremba
  2023-08-24 19:27 ` [RFC bpf-next 21/23] selftests/bpf: check checksum state " Larysa Zaremba
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Verify whether the VLAN tag and protocol are set correctly.

To simulate a "stripped" VLAN tag on veth, send the test packet from a
VLAN interface.

Also, add a TO_STR() macro for convenience (a short illustration
follows).
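
For example (illustration only, assuming TX_NAME expands to "veth0",
which is not spelled out in this patch), the two-level expansion is what
makes the macro value, rather than its name, get stringified:

	#define __TO_STR(x)	#x
	#define TO_STR(x)	__TO_STR(x)
	#define VLAN_ID		59

	/* TO_STR(VLAN_ID) -> "59", so TX_NAME "." TO_STR(VLAN_ID) -> "veth0.59";
	 * a single-level __TO_STR(VLAN_ID) would produce "VLAN_ID" instead.
	 */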

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 21 +++++++++++++++++--
 .../selftests/bpf/progs/xdp_metadata.c        |  5 +++++
 tools/testing/selftests/bpf/testing_helpers.h |  3 +++
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
index 1877e5c6d6c7..61e1b073a4b2 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
@@ -38,7 +38,14 @@
 #define TX_MAC "00:00:00:00:00:01"
 #define RX_MAC "00:00:00:00:00:02"
 
+#define VLAN_ID 59
+#define VLAN_PROTO "802.1Q"
+#define VLAN_PID htons(ETH_P_8021Q)
+#define TX_NAME_VLAN TX_NAME "." TO_STR(VLAN_ID)
+#define RX_NAME_VLAN RX_NAME "." TO_STR(VLAN_ID)
+
 #define XDP_RSS_TYPE_L4 BIT(3)
+#define VLAN_VID_MASK 0xfff
 
 struct xsk {
 	void *umem_area;
@@ -215,6 +222,12 @@ static int verify_xsk_metadata(struct xsk *xsk)
 	if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type"))
 		return -1;
 
+	if (!ASSERT_EQ(meta->rx_vlan_tci & VLAN_VID_MASK, VLAN_ID, "rx_vlan_tci"))
+		return -1;
+
+	if (!ASSERT_EQ(meta->rx_vlan_proto, VLAN_PID, "rx_vlan_proto"))
+		return -1;
+
 	xsk_ring_cons__release(&xsk->rx, 1);
 	refill_rx(xsk, comp_addr);
 
@@ -248,10 +261,14 @@ void test_xdp_metadata(void)
 
 	SYS(out, "ip link set dev " TX_NAME " address " TX_MAC);
 	SYS(out, "ip link set dev " TX_NAME " up");
-	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME);
+
+	SYS(out, "ip link add link " TX_NAME " " TX_NAME_VLAN
+		 " type vlan proto " VLAN_PROTO " id " TO_STR(VLAN_ID));
+	SYS(out, "ip link set dev " TX_NAME_VLAN " up");
+	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME_VLAN);
 
 	/* Avoid ARP calls */
-	SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME);
+	SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME_VLAN);
 	close_netns(tok);
 
 	tok = open_netns(RX_NETNS_NAME);
diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
index d151d406a123..f3db5cef4726 100644
--- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
@@ -23,6 +23,9 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
 					 __u64 *timestamp) __ksym;
 extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 				    enum xdp_rss_hash_type *rss_type) __ksym;
+extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
+					__u16 *vlan_tci,
+					__be16 *vlan_proto) __ksym;
 
 SEC("xdp")
 int rx(struct xdp_md *ctx)
@@ -57,6 +60,8 @@ int rx(struct xdp_md *ctx)
 		meta->rx_timestamp = 1;
 
 	bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, &meta->rx_hash_type);
+	bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_tci,
+				     &meta->rx_vlan_proto);
 
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
diff --git a/tools/testing/selftests/bpf/testing_helpers.h b/tools/testing/selftests/bpf/testing_helpers.h
index 5b7a55136741..35284faff4f2 100644
--- a/tools/testing/selftests/bpf/testing_helpers.h
+++ b/tools/testing/selftests/bpf/testing_helpers.h
@@ -9,6 +9,9 @@
 #include <bpf/libbpf.h>
 #include <time.h>
 
+#define __TO_STR(x) #x
+#define TO_STR(x) __TO_STR(x)
+
 int parse_num_list(const char *s, bool **set, int *set_len);
 __u32 link_info_prog_id(const struct bpf_link *link, struct bpf_link_info *info);
 int bpf_prog_test_load(const char *file, enum bpf_prog_type type,
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 21/23] selftests/bpf: check checksum state in xdp_metadata
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (19 preceding siblings ...)
  2023-08-24 19:26 ` [RFC bpf-next 20/23] selftests/bpf: Check VLAN tag and proto " Larysa Zaremba
@ 2023-08-24 19:27 ` Larysa Zaremba
  2023-08-24 19:27 ` [RFC bpf-next 22/23] mlx5: implement VLAN tag XDP hint Larysa Zaremba
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:27 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Verify whether the kfuncs in the xdp_metadata test correctly represent
the veth metadata state.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c |  3 +++
 tools/testing/selftests/bpf/progs/xdp_metadata.c      | 11 +++++++++++
 2 files changed, 14 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
index 61e1b073a4b2..aeb2701efba5 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
@@ -228,6 +228,9 @@ static int verify_xsk_metadata(struct xsk *xsk)
 	if (!ASSERT_EQ(meta->rx_vlan_proto, VLAN_PID, "rx_vlan_proto"))
 		return -1;
 
+	if (!ASSERT_EQ(meta->rx_csum_status, XDP_CHECKSUM_MAGIC, "rx_csum_status"))
+		return -1;
+
 	xsk_ring_cons__release(&xsk->rx, 1);
 	refill_rx(xsk, comp_addr);
 
diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
index f3db5cef4726..766477e0a31d 100644
--- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
@@ -26,6 +26,9 @@ extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 					__u16 *vlan_tci,
 					__be16 *vlan_proto) __ksym;
+extern int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
+				    enum xdp_csum_status *csum_status,
+				    __wsum *csum) __ksym;
 
 SEC("xdp")
 int rx(struct xdp_md *ctx)
@@ -63,6 +66,14 @@ int rx(struct xdp_md *ctx)
 	bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_tci,
 				     &meta->rx_vlan_proto);
 
+	/* This is supposed to fail on veth, so tell userspace
+	 * everything is OK by passing a magic status.
+	 */
+	ret = bpf_xdp_metadata_rx_csum(ctx, &meta->rx_csum_status,
+				       &meta->rx_csum);
+	if (ret)
+		meta->rx_csum_status = XDP_CHECKSUM_MAGIC;
+
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 22/23] mlx5: implement VLAN tag XDP hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (20 preceding siblings ...)
  2023-08-24 19:27 ` [RFC bpf-next 21/23] selftests/bpf: check checksum state " Larysa Zaremba
@ 2023-08-24 19:27 ` Larysa Zaremba
  2023-08-24 19:27 ` [RFC bpf-next 23/23] mlx5: implement RX checksum " Larysa Zaremba
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:27 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Implement the newly added .xmo_rx_vlan_tag() hint function.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 15 +++++++++++++++
 include/linux/mlx5/device.h                      |  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 12f56d0db0af..e8319ab0fa85 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -256,9 +256,24 @@ static int mlx5e_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
 	return 0;
 }
 
+static int mlx5e_xdp_rx_vlan_tag(const struct xdp_md *ctx, u16 *vlan_tci,
+				 __be16 *vlan_proto)
+{
+	const struct mlx5e_xdp_buff *_ctx = (void *)ctx;
+	const struct mlx5_cqe64 *cqe = _ctx->cqe;
+
+	if (!cqe_has_vlan(cqe))
+		return -ENODATA;
+
+	*vlan_proto = htons(ETH_P_8021Q);
+	*vlan_tci = be16_to_cpu(cqe->vlan_info);
+	return 0;
+}
+
 const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = {
 	.xmo_rx_timestamp		= mlx5e_xdp_rx_timestamp,
 	.xmo_rx_hash			= mlx5e_xdp_rx_hash,
+	.xmo_rx_vlan_tag		= mlx5e_xdp_rx_vlan_tag,
 };
 
 /* returns true if packet was consumed by xdp */
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 93399802ba77..95ffd78546a7 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -913,7 +913,7 @@ static inline u8 get_cqe_tls_offload(struct mlx5_cqe64 *cqe)
 	return (cqe->tls_outer_l3_tunneled >> 3) & 0x3;
 }
 
-static inline bool cqe_has_vlan(struct mlx5_cqe64 *cqe)
+static inline bool cqe_has_vlan(const struct mlx5_cqe64 *cqe)
 {
 	return cqe->l4_l3_hdr_type & 0x1;
 }
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC bpf-next 23/23] mlx5: implement RX checksum XDP hint
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (21 preceding siblings ...)
  2023-08-24 19:27 ` [RFC bpf-next 22/23] mlx5: implement VLAN tag XDP hint Larysa Zaremba
@ 2023-08-24 19:27 ` Larysa Zaremba
  2023-08-31 14:50 ` [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
  2023-09-04 16:06 ` [xdp-hints] " Maciej Fijalkowski
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-24 19:27 UTC (permalink / raw)
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

Implement the .xmo_rx_csum() callback to expose checksum information
to XDP code.

This version contains a lot of logic duplicated from the skb path,
because refactoring would be much more complex than the implementation
itself; the checksum code is too tightly coupled with the skb concept.

Intended logic differences from the skb path:
- when the checksum does not cover the whole packet, no fixups are
  performed; such a packet is treated as one without a complete checksum.
  This is just to prevent the patch from ballooning with hints-unrelated
  code.
- with the hints API, we can now report both the complete and the
  validated checksum statuses, which is why XDP_CHECKSUM_VERIFIED is ORed
  into the status. I hope this represents the HW logic well.
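
A rough sketch of the intended consumer on the BPF side (illustration
only, not part of this patch; the status is a bitfield, so both flags
can be set for the same packet):

	enum xdp_csum_status csum_status = 0;
	__wsum csum = 0;

	if (!bpf_xdp_metadata_rx_csum(ctx, &csum_status, &csum)) {
		if (csum_status & XDP_CHECKSUM_VERIFIED) {
			/* HW has validated the L3/L4 checksums */
		}
		if (csum_status & XDP_CHECKSUM_COMPLETE) {
			/* csum holds a checksum over the whole packet */
		}
	}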

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  10 ++
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 100 ++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  12 +--
 include/linux/mlx5/device.h                   |   2 +-
 4 files changed, 112 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index 879d698b6119..9467a0dea6ae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -506,4 +506,14 @@ static inline struct mlx5e_mpw_info *mlx5e_get_mpw_info(struct mlx5e_rq *rq, int
 
 	return (struct mlx5e_mpw_info *)((char *)rq->mpwqe.info + array_size(i, isz));
 }
+
+static inline u8 get_ip_proto(void *data, int network_depth, __be16 proto)
+{
+	void *ip_p = data + network_depth;
+
+	return (proto == htons(ETH_P_IP)) ? ((struct iphdr *)ip_p)->protocol :
+					    ((struct ipv6hdr *)ip_p)->nexthdr;
+}
+
+#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index e8319ab0fa85..e08b2ad56442 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -270,10 +270,110 @@ static int mlx5e_xdp_rx_vlan_tag(const struct xdp_md *ctx, u16 *vlan_tci,
 	return 0;
 }
 
+static __be16 xdp_buff_last_ethertype(const struct xdp_buff *xdp,
+				      int *network_offset)
+{
+	__be16 proto = ((struct ethhdr *)xdp->data)->h_proto;
+	struct vlan_hdr *remaining_data = xdp->data + ETH_HLEN;
+	u8 allowed_depth = VLAN_MAX_DEPTH;
+
+	while (eth_type_vlan(proto)) {
+		struct vlan_hdr *next_data = remaining_data + 1;
+
+		if ((void *)next_data > xdp->data_end || !--allowed_depth)
+			return 0;
+		proto = remaining_data->h_vlan_encapsulated_proto;
+		remaining_data = next_data;
+	}
+
+	*network_offset = (void *)remaining_data - xdp->data;
+	return proto;
+}
+
+static bool xdp_csum_needs_fixup(const struct xdp_buff *xdp, int network_depth,
+				 __be16 proto)
+{
+	struct ipv6hdr *ip6;
+	struct iphdr   *ip4;
+	int pkt_len;
+
+	if (network_depth > ETH_HLEN)
+		return true;
+
+	switch (proto) {
+	case htons(ETH_P_IP):
+		ip4 = (struct iphdr *)(xdp->data + network_depth);
+		pkt_len = network_depth + ntohs(ip4->tot_len);
+		break;
+	case htons(ETH_P_IPV6):
+		ip6 = (struct ipv6hdr *)(xdp->data + network_depth);
+		pkt_len = network_depth + sizeof(*ip6) + ntohs(ip6->payload_len);
+		break;
+	default:
+		return true;
+	}
+
+	if (likely(pkt_len >= xdp->data_end - xdp->data))
+		return false;
+
+	return true;
+}
+
+static int mlx5e_xdp_rx_csum(const struct xdp_md *ctx,
+			     enum xdp_csum_status *csum_status,
+			     __wsum *csum)
+{
+	const struct mlx5e_xdp_buff *_ctx = (void *)ctx;
+	const struct mlx5_cqe64 *cqe = _ctx->cqe;
+	const struct mlx5e_rq *rq = _ctx->rq;
+	__be16 last_ethertype;
+	int network_offset;
+	u8 lro_num_seg;
+
+	lro_num_seg = be32_to_cpu(cqe->srqn) >> 24;
+	if (lro_num_seg) {
+		*csum_status = XDP_CHECKSUM_VERIFIED;
+		return 0;
+	}
+
+	if (test_bit(MLX5E_RQ_STATE_NO_CSUM_COMPLETE, &rq->state) ||
+	    get_cqe_tls_offload(cqe))
+		goto csum_unnecessary;
+
+	if (short_frame(ctx->data_end - ctx->data))
+		goto csum_unnecessary;
+
+	last_ethertype = xdp_buff_last_ethertype(&_ctx->xdp, &network_offset);
+	if (last_ethertype != htons(ETH_P_IP) && last_ethertype != htons(ETH_P_IPV6))
+		goto csum_unnecessary;
+	if (unlikely(get_ip_proto(_ctx->xdp.data, network_offset,
+				  last_ethertype) == IPPROTO_SCTP))
+		goto csum_unnecessary;
+
+	*csum_status = XDP_CHECKSUM_COMPLETE;
+	*csum = csum_unfold((__force __sum16)cqe->check_sum);
+
+	if (test_bit(MLX5E_RQ_STATE_CSUM_FULL, &rq->state))
+		goto csum_unnecessary;
+
+	if (unlikely(xdp_csum_needs_fixup(&_ctx->xdp, network_offset,
+					  last_ethertype)))
+		*csum_status = 0;
+
+csum_unnecessary:
+	if (likely((cqe->hds_ip_ext & CQE_L3_OK) &&
+		   (cqe->hds_ip_ext & CQE_L4_OK))) {
+		*csum_status |= XDP_CHECKSUM_VERIFIED;
+	}
+
+	return *csum_status ? 0 : -ENODATA;
+}
+
 const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = {
 	.xmo_rx_timestamp		= mlx5e_xdp_rx_timestamp,
 	.xmo_rx_hash			= mlx5e_xdp_rx_hash,
 	.xmo_rx_vlan_tag		= mlx5e_xdp_rx_vlan_tag,
+	.xmo_rx_csum			= mlx5e_xdp_rx_csum,
 };
 
 /* returns true if packet was consumed by xdp */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 3fd11b0761e0..c303ab8b928c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1374,16 +1374,6 @@ static inline void mlx5e_enable_ecn(struct mlx5e_rq *rq, struct sk_buff *skb)
 	rq->stats->ecn_mark += !!rc;
 }
 
-static u8 get_ip_proto(struct sk_buff *skb, int network_depth, __be16 proto)
-{
-	void *ip_p = skb->data + network_depth;
-
-	return (proto == htons(ETH_P_IP)) ? ((struct iphdr *)ip_p)->protocol :
-					    ((struct ipv6hdr *)ip_p)->nexthdr;
-}
-
-#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
-
 #define MAX_PADDING 8
 
 static void
@@ -1493,7 +1483,7 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 		goto csum_unnecessary;
 
 	if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) {
-		if (unlikely(get_ip_proto(skb, network_depth, proto) == IPPROTO_SCTP))
+		if (unlikely(get_ip_proto(skb->data, network_depth, proto) == IPPROTO_SCTP))
 			goto csum_unnecessary;
 
 		stats->csum_complete++;
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 95ffd78546a7..82813efae79d 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -908,7 +908,7 @@ static inline bool cqe_is_tunneled(struct mlx5_cqe64 *cqe)
 	return cqe->tls_outer_l3_tunneled & 0x1;
 }
 
-static inline u8 get_cqe_tls_offload(struct mlx5_cqe64 *cqe)
+static inline u8 get_cqe_tls_offload(const struct mlx5_cqe64 *cqe)
 {
 	return (cqe->tls_outer_l3_tunneled >> 3) & 0x3;
 }
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 09/23] xdp: Add VLAN tag hint
  2023-08-24 19:26 ` [RFC bpf-next 09/23] xdp: Add VLAN tag hint Larysa Zaremba
@ 2023-08-24 22:02   ` kernel test robot
  2023-09-14 16:18   ` Alexander Lobakin
  1 sibling, 0 replies; 72+ messages in thread
From: kernel test robot @ 2023-08-24 22:02 UTC (permalink / raw)
  To: Larysa Zaremba; +Cc: oe-kbuild-all

Hi Larysa,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Larysa-Zaremba/ice-make-RX-hash-reading-code-more-reusable/20230825-034643
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230824192703.712881-10-larysa.zaremba%40intel.com
patch subject: [RFC bpf-next 09/23] xdp: Add VLAN tag hint
config: x86_64-randconfig-r013-20230825 (https://download.01.org/0day-ci/archive/20230825/202308250535.zDjN5U2y-lkp@intel.com/config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce: (https://download.01.org/0day-ci/archive/20230825/202308250535.zDjN5U2y-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308250535.zDjN5U2y-lkp@intel.com/

All warnings (new ones prefixed by >>):

   net/core/xdp.c:713:17: warning: no previous declaration for 'bpf_xdp_metadata_rx_timestamp' [-Wmissing-declarations]
    __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/xdp.c:735:17: warning: no previous declaration for 'bpf_xdp_metadata_rx_hash' [-Wmissing-declarations]
    __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash,
                    ^~~~~~~~~~~~~~~~~~~~~~~~
>> net/core/xdp.c:768:17: warning: no previous declaration for 'bpf_xdp_metadata_rx_vlan_tag' [-Wmissing-declarations]
    __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~


vim +/bpf_xdp_metadata_rx_vlan_tag +768 net/core/xdp.c

   740	
   741	/**
   742	 * bpf_xdp_metadata_rx_vlan_tag - Get XDP packet outermost VLAN tag
   743	 * @ctx: XDP context pointer.
   744	 * @vlan_tci: Destination pointer for VLAN TCI (VID + DEI + PCP)
   745	 * @vlan_proto: Destination pointer for VLAN Tag protocol identifier (TPID).
   746	 *
   747	 * In case of success, ``vlan_proto`` contains *Tag protocol identifier (TPID)*,
   748	 * usually ``ETH_P_8021Q`` or ``ETH_P_8021AD``, but some networks can use
   749	 * custom TPIDs. ``vlan_proto`` is stored in **network byte order (BE)**
   750	 * and should be used as follows:
   751	 * ``if (vlan_proto == bpf_htons(ETH_P_8021Q)) do_something();``
   752	 *
   753	 * ``vlan_tci`` contains the remaining 16 bits of a VLAN tag.
   754	 * Driver is expected to provide those in **host byte order (usually LE)**,
   755	 * so the bpf program should not perform byte conversion.
   756	 * According to 802.1Q standard, *VLAN TCI (Tag control information)*
   757	 * is a bit field that contains:
   758	 * *VLAN identifier (VID)* that can be read with ``vlan_tci & 0xfff``,
   759	 * *Drop eligible indicator (DEI)* - 1 bit,
   760	 * *Priority code point (PCP)* - 3 bits.
   761	 * For detailed meaning of DEI and PCP, please refer to other sources.
   762	 *
   763	 * Return:
   764	 * * Returns 0 on success or ``-errno`` on error.
   765	 * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
   766	 * * ``-ENODATA``    : VLAN tag was not stripped or is not available
   767	 */
 > 768	__bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
   769						     u16 *vlan_tci,
   770						     __be16 *vlan_proto)
   771	{
   772		return -EOPNOTSUPP;
   773	}
   774	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 12/23] xdp: Add checksum hint
  2023-08-24 19:26 ` [RFC bpf-next 12/23] xdp: Add checksum hint Larysa Zaremba
@ 2023-08-24 22:56   ` kernel test robot
  2023-09-14 16:34   ` Alexander Lobakin
  1 sibling, 0 replies; 72+ messages in thread
From: kernel test robot @ 2023-08-24 22:56 UTC (permalink / raw)
  To: Larysa Zaremba; +Cc: oe-kbuild-all

Hi Larysa,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Larysa-Zaremba/ice-make-RX-hash-reading-code-more-reusable/20230825-034643
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230824192703.712881-13-larysa.zaremba%40intel.com
patch subject: [RFC bpf-next 12/23] xdp: Add checksum hint
config: x86_64-randconfig-r013-20230825 (https://download.01.org/0day-ci/archive/20230825/202308250652.iZvvascZ-lkp@intel.com/config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce: (https://download.01.org/0day-ci/archive/20230825/202308250652.iZvvascZ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308250652.iZvvascZ-lkp@intel.com/

All warnings (new ones prefixed by >>):

   net/core/xdp.c:713:17: warning: no previous declaration for 'bpf_xdp_metadata_rx_timestamp' [-Wmissing-declarations]
    __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/xdp.c:735:17: warning: no previous declaration for 'bpf_xdp_metadata_rx_hash' [-Wmissing-declarations]
    __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash,
                    ^~~~~~~~~~~~~~~~~~~~~~~~
   net/core/xdp.c:768:17: warning: no previous declaration for 'bpf_xdp_metadata_rx_vlan_tag' [-Wmissing-declarations]
    __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> net/core/xdp.c:791:17: warning: no previous declaration for 'bpf_xdp_metadata_rx_csum' [-Wmissing-declarations]
    __bpf_kfunc int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
                    ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/bpf_xdp_metadata_rx_csum +791 net/core/xdp.c

   774	
   775	/**
   776	 * bpf_xdp_metadata_rx_csum - Get checksum status with additional info.
   777	 * @ctx: XDP context pointer.
   778	 * @csum_status: Destination for checksum status.
   779	 * @csum: Destination for complete checksum.
   780	 *
   781	 * Status (@csum_status) is a bitfield that informs, what checksum
   782	 * processing was performed. If ``XDP_CHECKSUM_COMPLETE`` in status is set,
   783	 * second argument (@csum) contains a checksum, calculated over the entire
   784	 * packet.
   785	 *
   786	 * Return:
   787	 * * Returns 0 on success or ``-errno`` on error.
   788	 * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
   789	 * * ``-ENODATA``    : Checksum status is unknown
   790	 */
 > 791	__bpf_kfunc int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
   792						 enum xdp_csum_status *csum_status,
   793						 __wsum *csum)
   794	{
   795		return -EOPNOTSUPP;
   796	}
   797	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (22 preceding siblings ...)
  2023-08-24 19:27 ` [RFC bpf-next 23/23] mlx5: implement RX checksum " Larysa Zaremba
@ 2023-08-31 14:50 ` Larysa Zaremba
  2023-09-04 16:06 ` [xdp-hints] " Maciej Fijalkowski
  24 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-08-31 14:50 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed, bpf

On Thu, Aug 24, 2023 at 09:26:39PM +0200, Larysa Zaremba wrote:
> Alexei has requested an implementation of VLAN and checksum XDP hints
> for one more driver [0].
> 
> This series is exactly the v5 of "XDP metadata via kfuncs for ice" [1]
> with 2 additional patches for mlx5.
> 
> Firstly, there is a VLAN hint implementation. I am pretty sure this
> one works and would not object adding it to the main series, if someone
> from nvidia ACKs it.
> 
> The second patch is a checksum hint implementation and it is very rough.
> There is logic duplication and some missing features, but I am sure it
> captures the main points of the potential end implementation.
> 
> I think it is unrealistic for me to provide a fully working mlx5 checksum
> hint implementation (complex logic, no HW), so would much rather prefer
> not having it in my main series. My main intension with this RFC is
> to prove proposed hints functions are suitable for non-intel HW.
> 
> [0] https://lore.kernel.org/bpf/CAADnVQLNeO81zc4f_z_UDCi+tJ2LS4dj2E1+au5TbXM+CPSyXQ@mail.gmail.com/
> [1] https://lore.kernel.org/bpf/20230811161509.19722-1-larysa.zaremba@intel.com/
 
[...]

Is this an OK approach to your request, or did you expect something else?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 01/23] ice: make RX hash reading code more reusable
  2023-08-24 19:26 ` [RFC bpf-next 01/23] ice: make RX hash reading code more reusable Larysa Zaremba
@ 2023-09-04 14:37   ` Maciej Fijalkowski
  2023-09-06 12:23     ` Alexander Lobakin
  2023-09-14 16:12   ` Alexander Lobakin
  1 sibling, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 14:37 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:40PM +0200, Larysa Zaremba wrote:
> Previously, we only needed RX hash in skb path,
> hence all related code was written with skb in mind.
> But with the addition of XDP hints via kfuncs to the ice driver,
> the same logic will be needed in .xmo_() callbacks.
> 
> Separate generic process of reading RX hash from a descriptor
> into a separate function.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 37 +++++++++++++------
>  1 file changed, 26 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index c8322fb6f2b3..8f7f6d78f7bf 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -63,28 +63,43 @@ static enum pkt_hash_types ice_ptype_to_htype(u16 ptype)
>  }
>  
>  /**
> - * ice_rx_hash - set the hash value in the skb
> + * ice_get_rx_hash - get RX hash value from descriptor
> + * @rx_desc: specific descriptor
> + *
> + * Returns hash, if present, 0 otherwise.
> + */
> +static u32
> +ice_get_rx_hash(const union ice_32b_rx_flex_desc *rx_desc)
> +{
> +	const struct ice_32b_rx_flex_desc_nic *nic_mdid;
> +
> +	if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)
> +		return 0;
> +
> +	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
> +	return le32_to_cpu(nic_mdid->rss_hash);
> +}
> +
> +/**
> + * ice_rx_hash_to_skb - set the hash value in the skb
>   * @rx_ring: descriptor ring
>   * @rx_desc: specific descriptor
>   * @skb: pointer to current skb
>   * @rx_ptype: the ptype value from the descriptor
>   */
>  static void
> -ice_rx_hash(struct ice_rx_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc,
> -	    struct sk_buff *skb, u16 rx_ptype)
> +ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,

nit: maybe ice_rx_skb_hash, but I have not seen the xdp side yet.

Another idea would be to turn ice_get_rx_hash into __ice_rx_hash and keep
the ice_rx_hash name as-is. That is the usual way of naming internal funcs.

Take it or leave it :)

> +		   const union ice_32b_rx_flex_desc *rx_desc,
> +		   struct sk_buff *skb, u16 rx_ptype)
>  {
> -	struct ice_32b_rx_flex_desc_nic *nic_mdid;
>  	u32 hash;
>  
>  	if (!(rx_ring->netdev->features & NETIF_F_RXHASH))
>  		return;
>  
> -	if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)
> -		return;
> -
> -	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
> -	hash = le32_to_cpu(nic_mdid->rss_hash);
> -	skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
> +	hash = ice_get_rx_hash(rx_desc);
> +	if (likely(hash))
> +		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));

Looks like a behavior change, as you wouldn't be setting l4_hash and
sw_hash on the skb in case of !hash? When can we get hash == 0?

>  }
>  
>  /**
> @@ -186,7 +201,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
>  		       union ice_32b_rx_flex_desc *rx_desc,
>  		       struct sk_buff *skb, u16 ptype)
>  {
> -	ice_rx_hash(rx_ring, rx_desc, skb, ptype);
> +	ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype);
>  
>  	/* modifies the skb - consumes the enet header */
>  	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 02/23] ice: make RX HW timestamp reading code more reusable
  2023-08-24 19:26 ` [RFC bpf-next 02/23] ice: make RX HW timestamp " Larysa Zaremba
@ 2023-09-04 14:56   ` Maciej Fijalkowski
  2023-09-04 16:29     ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 14:56 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:41PM +0200, Larysa Zaremba wrote:
> Previously, we only needed RX HW timestamp in skb path,
> hence all related code was written with skb in mind.
> But with the addition of XDP hints via kfuncs to the ice driver,
> the same logic will be needed in .xmo_() callbacks.
> 
> Put generic process of reading RX HW timestamp from a descriptor
> into a separate function.
> Move skb-related code into another source file.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_ptp.c      | 24 ++++++------------
>  drivers/net/ethernet/intel/ice/ice_ptp.h      | 15 ++++++-----
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 25 ++++++++++++++++++-
>  3 files changed, 41 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
> index 81d96a40d5a7..a31333972c68 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> @@ -2147,30 +2147,24 @@ int ice_ptp_set_ts_config(struct ice_pf *pf, struct ifreq *ifr)
>  }
>  
>  /**
> - * ice_ptp_rx_hwtstamp - Check for an Rx timestamp
> - * @rx_ring: Ring to get the VSI info
> + * ice_ptp_get_rx_hwts - Get packet Rx timestamp
>   * @rx_desc: Receive descriptor
> - * @skb: Particular skb to send timestamp with
> + * @cached_time: Cached PHC time
>   *
>   * The driver receives a notification in the receive descriptor with timestamp.
> - * The timestamp is in ns, so we must convert the result first.
>   */
> -void
> -ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
> -		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb)
> +u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
> +			u64 cached_time)
>  {
> -	struct skb_shared_hwtstamps *hwtstamps;
> -	u64 ts_ns, cached_time;
>  	u32 ts_high;
> +	u64 ts_ns;
>  
>  	if (!(rx_desc->wb.time_stamp_low & ICE_PTP_TS_VALID))
> -		return;
> -
> -	cached_time = READ_ONCE(rx_ring->cached_phctime);
> +		return 0;
>  
>  	/* Do not report a timestamp if we don't have a cached PHC time */
>  	if (!cached_time)
> -		return;
> +		return 0;
>  
>  	/* Use ice_ptp_extend_32b_ts directly, using the ring-specific cached
>  	 * PHC value, rather than accessing the PF. This also allows us to
> @@ -2181,9 +2175,7 @@ ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
>  	ts_high = le32_to_cpu(rx_desc->wb.flex_ts.ts_high);
>  	ts_ns = ice_ptp_extend_32b_ts(cached_time, ts_high);
>  
> -	hwtstamps = skb_hwtstamps(skb);
> -	memset(hwtstamps, 0, sizeof(*hwtstamps));
> -	hwtstamps->hwtstamp = ns_to_ktime(ts_ns);
> +	return ts_ns;
>  }
>  
>  /**
> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h
> index 995a57019ba7..523eefbfdf95 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ptp.h
> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.h
> @@ -268,9 +268,8 @@ void ice_ptp_extts_event(struct ice_pf *pf);
>  s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb);
>  enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf);
>  
> -void
> -ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
> -		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb);
> +u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
> +			u64 cached_time);
>  void ice_ptp_reset(struct ice_pf *pf);
>  void ice_ptp_prepare_for_reset(struct ice_pf *pf);
>  void ice_ptp_init(struct ice_pf *pf);
> @@ -304,9 +303,13 @@ static inline bool ice_ptp_process_ts(struct ice_pf *pf)
>  {
>  	return true;
>  }
> -static inline void
> -ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
> -		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb) { }
> +
> +static inline u64
> +ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc, u64 cached_time)
> +{
> +	return 0;
> +}
> +
>  static inline void ice_ptp_reset(struct ice_pf *pf) { }
>  static inline void ice_ptp_prepare_for_reset(struct ice_pf *pf) { }
>  static inline void ice_ptp_init(struct ice_pf *pf) { }
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index 8f7f6d78f7bf..b2f241b73934 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -185,6 +185,29 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
>  	ring->vsi->back->hw_csum_rx_error++;
>  }
>  
> +/**
> + * ice_ptp_rx_hwts_to_skb - Put RX timestamp into skb
> + * @rx_ring: Ring to get the VSI info
> + * @rx_desc: Receive descriptor
> + * @skb: Particular skb to send timestamp with
> + *
> + * The timestamp is in ns, so we must convert the result first.
> + */
> +static void
> +ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
> +		       const union ice_32b_rx_flex_desc *rx_desc,
> +		       struct sk_buff *skb)
> +{
> +	u64 ts_ns, cached_time;
> +
> +	cached_time = READ_ONCE(rx_ring->cached_phctime);

Any reason for not reading cached_phctime within ice_ptp_get_rx_hwts?

> +	ts_ns = ice_ptp_get_rx_hwts(rx_desc, cached_time);
> +
> +	*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
> +		.hwtstamp	= ns_to_ktime(ts_ns),
> +	};
> +}
> +
>  /**
>   * ice_process_skb_fields - Populate skb header fields from Rx descriptor
>   * @rx_ring: Rx descriptor ring packet is being transacted on
> @@ -209,7 +232,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
>  	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
>  
>  	if (rx_ring->ptp_rx)
> -		ice_ptp_rx_hwtstamp(rx_ring, rx_desc, skb);
> +		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
>  }
>  
>  /**
> -- 
> 2.41.0
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 03/23] ice: make RX checksum checking code more reusable
  2023-08-24 19:26 ` [RFC bpf-next 03/23] ice: make RX checksum checking " Larysa Zaremba
@ 2023-09-04 15:02   ` Maciej Fijalkowski
  2023-09-04 18:01     ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 15:02 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:42PM +0200, Larysa Zaremba wrote:
> Previously, we only needed RX checksum flags in skb path,
> hence all related code was written with skb in mind.
> But with the addition of XDP hints via kfuncs to the ice driver,
> the same logic will be needed in .xmo_() callbacks.
> 
> Put generic process of determining checksum status into
> a separate function.
> 
> Now we cannot operate directly on skb, when deducing
> checksum status, therefore introduce an intermediate enum for checksum
> status. Fortunately, in ice, we have only 4 possibilities: checksum
> validated at level 0, validated at level 1, no checksum, checksum error.
> Use 3 bits for more convenient conversion.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 105 ++++++++++++------
>  1 file changed, 69 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index b2f241b73934..8b155a502b3b 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -102,18 +102,41 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
>  		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
>  }
>  
> +enum ice_rx_csum_status {
> +	ICE_RX_CSUM_LVL_0	= 0,
> +	ICE_RX_CSUM_LVL_1	= BIT(0),
> +	ICE_RX_CSUM_NONE	= BIT(1),
> +	ICE_RX_CSUM_ERROR	= BIT(2),
> +	ICE_RX_CSUM_FAIL	= ICE_RX_CSUM_NONE | ICE_RX_CSUM_ERROR,
> +};
> +
>  /**
> - * ice_rx_csum - Indicate in skb if checksum is good
> - * @ring: the ring we care about
> - * @skb: skb currently being received and modified
> + * ice_rx_csum_lvl - Get checksum level from status
> + * @status: driver-specific checksum status
> + */
> +static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
> +{
> +	return status & ICE_RX_CSUM_LVL_1;
> +}
> +
> +/**
> + * ice_rx_csum_ip_summed - Checksum status from driver-specific to generic
> + * @status: driver-specific checksum status
> + */
> +static u8 ice_rx_csum_ip_summed(enum ice_rx_csum_status status)
> +{
> +	return status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;

	return !(status & ICE_RX_CSUM_NONE);

?

> +}
> +
> +/**
> + * ice_get_rx_csum_status - Deduce checksum status from descriptor
>   * @rx_desc: the receive descriptor
>   * @ptype: the packet type decoded by hardware
>   *
> - * skb->protocol must be set before this function is called
> + * Returns driver-specific checksum status
>   */
> -static void
> -ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
> -	    union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
> +static enum ice_rx_csum_status
> +ice_get_rx_csum_status(const union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
>  {
>  	struct ice_rx_ptype_decoded decoded;
>  	u16 rx_status0, rx_status1;
> @@ -124,20 +147,12 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
>  
>  	decoded = ice_decode_rx_desc_ptype(ptype);
>  
> -	/* Start with CHECKSUM_NONE and by default csum_level = 0 */
> -	skb->ip_summed = CHECKSUM_NONE;
> -	skb_checksum_none_assert(skb);
> -
> -	/* check if Rx checksum is enabled */
> -	if (!(ring->netdev->features & NETIF_F_RXCSUM))
> -		return;
> -
>  	/* check if HW has decoded the packet and checksum */
>  	if (!(rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_L3L4P_S)))
> -		return;
> +		return ICE_RX_CSUM_NONE;
>  
>  	if (!(decoded.known && decoded.outer_ip))
> -		return;
> +		return ICE_RX_CSUM_NONE;
>  
>  	ipv4 = (decoded.outer_ip == ICE_RX_PTYPE_OUTER_IP) &&
>  	       (decoded.outer_ip_ver == ICE_RX_PTYPE_OUTER_IPV4);
> @@ -146,43 +161,61 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
>  
>  	if (ipv4 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_IPE_S) |
>  				   BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S))))
> -		goto checksum_fail;
> +		return ICE_RX_CSUM_FAIL;
>  
>  	if (ipv6 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_IPV6EXADD_S))))
> -		goto checksum_fail;
> +		return ICE_RX_CSUM_FAIL;
>  
>  	/* check for L4 errors and handle packets that were not able to be
>  	 * checksummed due to arrival speed
>  	 */
>  	if (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_L4E_S))
> -		goto checksum_fail;
> +		return ICE_RX_CSUM_FAIL;
>  
>  	/* check for outer UDP checksum error in tunneled packets */
>  	if ((rx_status1 & BIT(ICE_RX_FLEX_DESC_STATUS1_NAT_S)) &&
>  	    (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EUDPE_S)))
> -		goto checksum_fail;
> -
> -	/* If there is an outer header present that might contain a checksum
> -	 * we need to bump the checksum level by 1 to reflect the fact that
> -	 * we are indicating we validated the inner checksum.
> -	 */
> -	if (decoded.tunnel_type >= ICE_RX_PTYPE_TUNNEL_IP_GRENAT)
> -		skb->csum_level = 1;
> +		return ICE_RX_CSUM_FAIL;
>  
>  	/* Only report checksum unnecessary for TCP, UDP, or SCTP */
>  	switch (decoded.inner_prot) {
>  	case ICE_RX_PTYPE_INNER_PROT_TCP:
>  	case ICE_RX_PTYPE_INNER_PROT_UDP:
>  	case ICE_RX_PTYPE_INNER_PROT_SCTP:
> -		skb->ip_summed = CHECKSUM_UNNECESSARY;
> -		break;
> -	default:
> -		break;
> +		/* If there is an outer header present that might contain
> +		 * a checksum we need to bump the checksum level by 1 to reflect
> +		 * the fact that we have validated the inner checksum.
> +		 */
> +		return decoded.tunnel_type >= ICE_RX_PTYPE_TUNNEL_IP_GRENAT ?
> +		       ICE_RX_CSUM_LVL_1 : ICE_RX_CSUM_LVL_0;
>  	}
> -	return;
>  
> -checksum_fail:
> -	ring->vsi->back->hw_csum_rx_error++;
> +	return ICE_RX_CSUM_NONE;
> +}
> +
> +/**
> + * ice_rx_csum_into_skb - Indicate in skb if checksum is good
> + * @ring: the ring we care about
> + * @skb: skb currently being received and modified
> + * @rx_desc: the receive descriptor
> + * @ptype: the packet type decoded by hardware
> + */
> +static void
> +ice_rx_csum_into_skb(struct ice_rx_ring *ring, struct sk_buff *skb,
> +		     const union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
> +{
> +	enum ice_rx_csum_status csum_status;
> +
> +	/* check if Rx checksum is enabled */
> +	if (!(ring->netdev->features & NETIF_F_RXCSUM))
> +		return;
> +
> +	csum_status = ice_get_rx_csum_status(rx_desc, ptype);
> +	if (csum_status & ICE_RX_CSUM_ERROR)
> +		ring->vsi->back->hw_csum_rx_error++;
> +
> +	skb->ip_summed = ice_rx_csum_ip_summed(csum_status);
> +	skb->csum_level = ice_rx_csum_lvl(csum_status);
>  }
>  
>  /**
> @@ -229,7 +262,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
>  	/* modifies the skb - consumes the enet header */
>  	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
>  
> -	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
> +	ice_rx_csum_into_skb(rx_ring, skb, rx_desc, ptype);
>  
>  	if (rx_ring->ptp_rx)
>  		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
> -- 
> 2.41.0
> 


* Re: [RFC bpf-next 04/23] ice: Make ptype internal to descriptor info processing
  2023-08-24 19:26 ` [RFC bpf-next 04/23] ice: Make ptype internal to descriptor info processing Larysa Zaremba
@ 2023-09-04 15:04   ` Maciej Fijalkowski
  0 siblings, 0 replies; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 15:04 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:43PM +0200, Larysa Zaremba wrote:
> Currently, rx_ptype variable is used only as an argument
> to ice_process_skb_fields() and is computed
> just before the function call.
> 
> Therefore, there is no reason to pass this value as an argument.
> Instead, remove this argument and compute the value directly inside
> ice_process_skb_fields() function.
> 
> Also, separate its calculation into a short function, so the code
> can later be reused in .xmo_() callbacks.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> ---
>  drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 +-----
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 15 +++++++++++++--
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  2 +-
>  drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 +-----
>  4 files changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 52d0a126eb61..40f2f6dabb81 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -1181,7 +1181,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  		unsigned int size;
>  		u16 stat_err_bits;
>  		u16 vlan_tag = 0;
> -		u16 rx_ptype;
>  
>  		/* get the Rx desc from Rx ring based on 'next_to_clean' */
>  		rx_desc = ICE_RX_DESC(rx_ring, ntc);
> @@ -1286,10 +1285,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  		total_rx_bytes += skb->len;
>  
>  		/* populate checksum, VLAN, and protocol */
> -		rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
> -			ICE_RX_FLEX_DESC_PTYPE_M;
> -
> -		ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype);
> +		ice_process_skb_fields(rx_ring, rx_desc, skb);
>  
>  		ice_trace(clean_rx_irq_indicate, rx_ring, rx_desc, skb);
>  		/* send completed skb up the stack */
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index 8b155a502b3b..07241f4229b7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -241,12 +241,21 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
>  	};
>  }
>  
> +/**
> + * ice_get_ptype - Read HW packet type from the descriptor
> + * @rx_desc: RX descriptor
> + */
> +static u16 ice_get_ptype(const union ice_32b_rx_flex_desc *rx_desc)
> +{
> +	return le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
> +	       ICE_RX_FLEX_DESC_PTYPE_M;
> +}
> +
>  /**
>   * ice_process_skb_fields - Populate skb header fields from Rx descriptor
>   * @rx_ring: Rx descriptor ring packet is being transacted on
>   * @rx_desc: pointer to the EOP Rx descriptor
>   * @skb: pointer to current skb being populated
> - * @ptype: the packet type decoded by hardware
>   *
>   * This function checks the ring, descriptor, and packet information in
>   * order to populate the hash, checksum, VLAN, protocol, and
> @@ -255,8 +264,10 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
>  void
>  ice_process_skb_fields(struct ice_rx_ring *rx_ring,
>  		       union ice_32b_rx_flex_desc *rx_desc,
> -		       struct sk_buff *skb, u16 ptype)
> +		       struct sk_buff *skb)
>  {
> +	u16 ptype = ice_get_ptype(rx_desc);
> +
>  	ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype);
>  
>  	/* modifies the skb - consumes the enet header */
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> index 115969ecdf7b..e1d49e1235b3 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> @@ -148,7 +148,7 @@ void ice_release_rx_desc(struct ice_rx_ring *rx_ring, u16 val);
>  void
>  ice_process_skb_fields(struct ice_rx_ring *rx_ring,
>  		       union ice_32b_rx_flex_desc *rx_desc,
> -		       struct sk_buff *skb, u16 ptype);
> +		       struct sk_buff *skb);
>  void
>  ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
>  #endif /* !_ICE_TXRX_LIB_H_ */
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 2a3f0834e139..ef778b8e6d1b 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -870,7 +870,6 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
>  		struct sk_buff *skb;
>  		u16 stat_err_bits;
>  		u16 vlan_tag = 0;
> -		u16 rx_ptype;
>  
>  		rx_desc = ICE_RX_DESC(rx_ring, ntc);
>  
> @@ -950,10 +949,7 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
>  
>  		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
>  
> -		rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
> -				       ICE_RX_FLEX_DESC_PTYPE_M;
> -
> -		ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype);
> +		ice_process_skb_fields(rx_ring, rx_desc, skb);
>  		ice_receive_skb(rx_ring, skb, vlan_tag);
>  	}
>  
> -- 
> 2.41.0
> 
> 


* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-08-24 19:26 ` [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff Larysa Zaremba
@ 2023-09-04 15:32   ` Maciej Fijalkowski
  2023-09-04 18:11     ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 15:32 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> In order to use XDP hints via kfuncs we need to put
> RX descriptor and ring pointers just next to xdp_buff.
> Same as in hints implementations in other drivers, we achieve
> this through putting xdp_buff into a child structure.

Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
if i'm reading this right.

> 
> Currently, xdp_buff is stored in the ring structure,
> so replace it with union that includes child structure.
> This way enough memory is available while existing XDP code
> remains isolated from hints.
> 
> Minimum size of the new child structure (ice_xdp_buff) is exactly
> 64 bytes (single cache line). To place it at the start of a cache line,
> move 'next' field from CL1 to CL3, as it isn't used often. This still
> leaves 128 bits available in CL3 for packet context extensions.

I believe ice_xdp_buff will be beefed up in later patches, so what is the
point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
CL anyway.

> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
>  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
>  3 files changed, 38 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 40f2f6dabb81..4e6546d9cf85 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
>   * @xdp_prog: XDP program to run
>   * @xdp_ring: ring to be used for XDP_TX action
>   * @rx_buf: Rx buffer to store the XDP action
> + * @eop_desc: Last descriptor in packet to read metadata from
>   *
>   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
>   */
>  static void
>  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
>  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> -	    struct ice_rx_buf *rx_buf)
> +	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
>  {
>  	unsigned int ret = ICE_XDP_PASS;
>  	u32 act;
> @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
>  	if (!xdp_prog)
>  		goto exit;
>  
> +	ice_xdp_meta_set_desc(xdp, eop_desc);

I am currently not sure if for multi-buffer case HW repeats all the
necessary info within each descriptor for every frag? IOW shouldn't you be
using the ice_rx_ring::first_desc?

Would be good to test hints for mbuf case for sure.

> +
>  	act = bpf_prog_run_xdp(xdp_prog, xdp);
>  	switch (act) {
>  	case XDP_PASS:
> @@ -1240,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  		if (ice_is_non_eop(rx_ring, rx_desc))
>  			continue;
>  
> -		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf);
> +		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
>  		if (rx_buf->act == ICE_XDP_PASS)
>  			goto construct_skb;
>  		total_rx_bytes += xdp_get_buff_len(xdp);
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
> index 166413fc33f4..d0ab2c4c0c91 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.h
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
> @@ -257,6 +257,18 @@ enum ice_rx_dtype {
>  	ICE_RX_DTYPE_SPLIT_ALWAYS	= 2,
>  };
>  
> +struct ice_pkt_ctx {
> +	const union ice_32b_rx_flex_desc *eop_desc;
> +};
> +
> +struct ice_xdp_buff {
> +	struct xdp_buff xdp_buff;
> +	struct ice_pkt_ctx pkt_ctx;
> +};
> +
> +/* Required for compatibility with xdp_buffs from xsk_pool */
> +static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0);
> +
>  /* indices into GLINT_ITR registers */
>  #define ICE_RX_ITR	ICE_IDX_ITR0
>  #define ICE_TX_ITR	ICE_IDX_ITR1
> @@ -298,7 +310,6 @@ enum ice_dynamic_itr {
>  /* descriptor ring, associated with a VSI */
>  struct ice_rx_ring {
>  	/* CL1 - 1st cacheline starts here */
> -	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
>  	void *desc;			/* Descriptor ring memory */
>  	struct device *dev;		/* Used for DMA mapping */
>  	struct net_device *netdev;	/* netdev ring maps to */
> @@ -310,12 +321,19 @@ struct ice_rx_ring {
>  	u16 count;			/* Number of descriptors */
>  	u16 reg_idx;			/* HW register index of the ring */
>  	u16 next_to_alloc;
> -	/* CL2 - 2nd cacheline starts here */
> +
>  	union {
>  		struct ice_rx_buf *rx_buf;
>  		struct xdp_buff **xdp_buf;
>  	};
> -	struct xdp_buff xdp;
> +	/* CL2 - 2nd cacheline starts here */
> +	union {
> +		struct ice_xdp_buff xdp_ext;
> +		struct {
> +			struct xdp_buff xdp;
> +			struct ice_pkt_ctx pkt_ctx;
> +		};
> +	};
>  	/* CL3 - 3rd cacheline starts here */
>  	struct bpf_prog *xdp_prog;
>  	u16 rx_offset;
> @@ -325,6 +343,8 @@ struct ice_rx_ring {
>  	u16 next_to_clean;
>  	u16 first_desc;
>  
> +	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
> +
>  	/* stats structs */
>  	struct ice_ring_stats *ring_stats;
>  
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> index e1d49e1235b3..145883eec129 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> @@ -151,4 +151,14 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
>  		       struct sk_buff *skb);
>  void
>  ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
> +
> +static inline void
> +ice_xdp_meta_set_desc(struct xdp_buff *xdp,
> +		      union ice_32b_rx_flex_desc *eop_desc)
> +{
> +	struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff,
> +						    xdp_buff);
> +
> +	xdp_ext->pkt_ctx.eop_desc = eop_desc;
> +}
>  #endif /* !_ICE_TXRX_LIB_H_ */
> -- 
> 2.41.0
> 


* Re: [RFC bpf-next 06/23] ice: Support HW timestamp hint
  2023-08-24 19:26 ` [RFC bpf-next 06/23] ice: Support HW timestamp hint Larysa Zaremba
@ 2023-09-04 15:38   ` Maciej Fijalkowski
  2023-09-04 18:12     ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 15:38 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:45PM +0200, Larysa Zaremba wrote:
> Use previously refactored code and create a function
> that allows XDP code to read HW timestamp.
> 
> Also, move cached_phctime into packet context, this way this data still
> stays in the ring structure, just at the different address.
> 
> HW timestamp is the first supported hint in the driver,
> so also add xdp_metadata_ops.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice.h          |  2 ++
>  drivers/net/ethernet/intel/ice/ice_ethtool.c  |  2 +-
>  drivers/net/ethernet/intel/ice/ice_lib.c      |  2 +-
>  drivers/net/ethernet/intel/ice/ice_main.c     |  1 +
>  drivers/net/ethernet/intel/ice/ice_ptp.c      |  3 ++-
>  drivers/net/ethernet/intel/ice/ice_txrx.h     |  2 +-
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 ++++++++++++++++++-
>  7 files changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> index 5ac0ad12f9f1..34e4731b5d5f 100644
> --- a/drivers/net/ethernet/intel/ice/ice.h
> +++ b/drivers/net/ethernet/intel/ice/ice.h
> @@ -951,4 +951,6 @@ static inline void ice_clear_rdma_cap(struct ice_pf *pf)
>  	set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
>  	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
>  }
> +
> +extern const struct xdp_metadata_ops ice_xdp_md_ops;
>  #endif /* _ICE_H_ */
> diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> index ad4d4702129f..f740e0ad0e3c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> @@ -2846,7 +2846,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
>  		/* clone ring and setup updated count */
>  		rx_rings[i] = *vsi->rx_rings[i];
>  		rx_rings[i].count = new_rx_cnt;
> -		rx_rings[i].cached_phctime = pf->ptp.cached_phc_time;
> +		rx_rings[i].pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
>  		rx_rings[i].desc = NULL;
>  		rx_rings[i].rx_buf = NULL;
>  		/* this is to allow wr32 to have something to write to
> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
> index 927518fcad51..12290defb730 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> @@ -1445,7 +1445,7 @@ static int ice_vsi_alloc_rings(struct ice_vsi *vsi)
>  		ring->netdev = vsi->netdev;
>  		ring->dev = dev;
>  		ring->count = vsi->num_rx_desc;
> -		ring->cached_phctime = pf->ptp.cached_phc_time;
> +		ring->pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
>  		WRITE_ONCE(vsi->rx_rings[i], ring);
>  	}
>  
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 0f04347eda39..557c6326ff87 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -3395,6 +3395,7 @@ static void ice_set_ops(struct ice_vsi *vsi)
>  
>  	netdev->netdev_ops = &ice_netdev_ops;
>  	netdev->udp_tunnel_nic_info = &pf->hw.udp_tunnel_nic;
> +	netdev->xdp_metadata_ops = &ice_xdp_md_ops;
>  	ice_set_ethtool_ops(netdev);
>  
>  	if (vsi->type != ICE_VSI_PF)
> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
> index a31333972c68..26fad7038996 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> @@ -1038,7 +1038,8 @@ static int ice_ptp_update_cached_phctime(struct ice_pf *pf)
>  		ice_for_each_rxq(vsi, j) {
>  			if (!vsi->rx_rings[j])
>  				continue;
> -			WRITE_ONCE(vsi->rx_rings[j]->cached_phctime, systime);
> +			WRITE_ONCE(vsi->rx_rings[j]->pkt_ctx.cached_phctime,
> +				   systime);
>  		}
>  	}
>  	clear_bit(ICE_CFG_BUSY, pf->state);
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
> index d0ab2c4c0c91..4237702a58a9 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.h
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
> @@ -259,6 +259,7 @@ enum ice_rx_dtype {
>  
>  struct ice_pkt_ctx {
>  	const union ice_32b_rx_flex_desc *eop_desc;
> +	u64 cached_phctime;
>  };
>  
>  struct ice_xdp_buff {
> @@ -354,7 +355,6 @@ struct ice_rx_ring {
>  	struct ice_tx_ring *xdp_ring;
>  	struct xsk_buff_pool *xsk_pool;
>  	dma_addr_t dma;			/* physical address of ring */
> -	u64 cached_phctime;
>  	u16 rx_buf_len;
>  	u8 dcb_tc;			/* Traffic class of ring */
>  	u8 ptp_rx;
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index 07241f4229b7..463d9e5cbe05 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -233,7 +233,7 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
>  {
>  	u64 ts_ns, cached_time;
>  
> -	cached_time = READ_ONCE(rx_ring->cached_phctime);
> +	cached_time = READ_ONCE(rx_ring->pkt_ctx.cached_phctime);
>  	ts_ns = ice_ptp_get_rx_hwts(rx_desc, cached_time);
>  
>  	*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
> @@ -546,3 +546,27 @@ void ice_finalize_xdp_rx(struct ice_tx_ring *xdp_ring, unsigned int xdp_res,
>  			spin_unlock(&xdp_ring->tx_lock);
>  	}
>  }
> +
> +/**
> + * ice_xdp_rx_hw_ts - HW timestamp XDP hint handler
> + * @ctx: XDP buff pointer
> + * @ts_ns: destination address
> + *
> + * Copy HW timestamp (if available) to the destination address.
> + */
> +static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
> +{
> +	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
> +	u64 cached_time;
> +
> +	cached_time = READ_ONCE(xdp_ext->pkt_ctx.cached_phctime);
> +	*ts_ns = ice_ptp_get_rx_hwts(xdp_ext->pkt_ctx.eop_desc, cached_time);

having cached_phctime within pkt_ctx doesn't stop skb side from using it
right? so again, why not read it within ice_ptp_get_rx_hwts()?

> +	if (!*ts_ns)
> +		return -ENODATA;
> +
> +	return 0;
> +}
> +
> +const struct xdp_metadata_ops ice_xdp_md_ops = {
> +	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
> +};
> -- 
> 2.41.0
> 
> 
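For context, the consumer side of this hint looks roughly like the sketch below
(illustrative only; the program and section names are made up, and the kfunc
declaration follows the existing bpf_xdp_metadata_rx_timestamp() interface that
xdp_metadata_ops implementations are called through):

	// SPDX-License-Identifier: GPL-2.0
	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
						 __u64 *timestamp) __ksym;

	SEC("xdp")
	int rx_hw_ts_example(struct xdp_md *ctx)
	{
		__u64 ts;

		/* 0 on success, -errno (e.g. -ENODATA from ice_xdp_rx_hw_ts) otherwise */
		if (!bpf_xdp_metadata_rx_timestamp(ctx, &ts))
			bpf_printk("rx hw timestamp: %llu ns", ts);

		return XDP_PASS;
	}

	char LICENSE[] SEC("license") = "GPL";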


* Re: [xdp-hints] [RFC bpf-next 08/23] ice: Support XDP hints in AF_XDP ZC mode
  2023-08-24 19:26 ` [RFC bpf-next 08/23] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
@ 2023-09-04 15:42   ` Maciej Fijalkowski
  2023-09-04 18:14     ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 15:42 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:47PM +0200, Larysa Zaremba wrote:
> In AF_XDP ZC, xdp_buff is not stored on ring,
> instead it is provided by xsk_pool.

xsk_buff_pool

> Space for metadata sources right after such buffers was already reserved
> in commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk").
> This makes the implementation rather straightforward.
> 
> Update AF_XDP ZC packet processing to support XDP hints.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index ef778b8e6d1b..fdeddad9b639 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -758,16 +758,25 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
>   * @xdp: xdp_buff used as input to the XDP program
>   * @xdp_prog: XDP program to run
>   * @xdp_ring: ring to be used for XDP_TX action
> + * @rx_desc: packet descriptor
>   *
>   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
>   */
>  static int
>  ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> -	       struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring)
> +	       struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> +	       union ice_32b_rx_flex_desc *rx_desc)
>  {
>  	int err, result = ICE_XDP_PASS;
>  	u32 act;
>  
> +	/* We can safely convert xdp_buff_xsk to ice_xdp_buff,
> +	 * because there are XSK_PRIV_MAX bytes reserved in xdp_buff_xsk
> +	 * right after xdp_buff, for our private use.
> +	 * Macro insures we do not go above the limit.

ensures?

> +	 */
> +	XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
> +	ice_xdp_meta_set_desc(xdp, rx_desc);
>  	act = bpf_prog_run_xdp(xdp_prog, xdp);
>  
>  	if (likely(act == XDP_REDIRECT)) {
> @@ -907,7 +916,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
>  		if (ice_is_non_eop(rx_ring, rx_desc))
>  			continue;
>  
> -		xdp_res = ice_run_xdp_zc(rx_ring, first, xdp_prog, xdp_ring);
> +		xdp_res = ice_run_xdp_zc(rx_ring, xdp, xdp_prog, xdp_ring,
> +					 rx_desc);
>  		if (likely(xdp_res & (ICE_XDP_TX | ICE_XDP_REDIR))) {
>  			xdp_xmit |= xdp_res;
>  		} else if (xdp_res == ICE_XDP_EXIT) {
> -- 
> 2.41.0
> 
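For context, the reserved area being referred to looks roughly like this
(simplified sketch of struct xdp_buff_xsk from include/net/xsk_buff_pool.h;
unrelated fields elided):

	struct xdp_buff_xsk {
		struct xdp_buff xdp;	/* what the driver gets from the pool */
		u8 cb[XSK_PRIV_MAX];	/* driver-private scratch space */
		/* ... DMA addresses, pool pointer, free list node, ... */
	};

XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff) build-asserts that the driver
structure does not spill past xdp + cb, so the container_of() cast in
ice_xdp_meta_set_desc() stays within the reserved area for pool-provided
buffers as well.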


* Re: [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-08-24 19:26 ` [RFC bpf-next 10/23] ice: Implement " Larysa Zaremba
@ 2023-09-04 16:00   ` Maciej Fijalkowski
  2023-09-04 18:18     ` Larysa Zaremba
  2023-09-14 16:25   ` [xdp-hints] " Alexander Lobakin
  1 sibling, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 16:00 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:49PM +0200, Larysa Zaremba wrote:
> Implement .xmo_rx_vlan_tag callback to allow XDP code to read
> packet's VLAN tag.
> 
> At the same time, use vlan_tci instead of vlan_tag in touched code,
> because vlan_tag is misleading.

misleading...because? ;)

> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_main.c     | 22 ++++++++++++++++
>  drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 ++---
>  drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 +
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 +++++++++++++++++++
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  4 +--
>  drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 ++---
>  6 files changed, 57 insertions(+), 8 deletions(-)
> 


* Re: [xdp-hints] [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5
  2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
                   ` (23 preceding siblings ...)
  2023-08-31 14:50 ` [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
@ 2023-09-04 16:06 ` Maciej Fijalkowski
  2023-09-06 14:09   ` Larysa Zaremba
  24 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-04 16:06 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:39PM +0200, Larysa Zaremba wrote:
> Alexei has requested an implementation of VLAN and checksum XDP hints
> for one more driver [0].
> 
> This series is exactly the v5 of "XDP metadata via kfuncs for ice" [1]
> with 2 additional patches for mlx5.
> 
> Firstly, there is a VLAN hint implementation. I am pretty sure this
> one works and would not object adding it to the main series, if someone
> from nvidia ACKs it.
> 
> The second patch is a checksum hint implementation and it is very rough.
> There is logic duplication and some missing features, but I am sure it
> captures the main points of the potential end implementation.
> 
> I think it is unrealistic for me to provide a fully working mlx5 checksum
> hint implementation (complex logic, no HW), so would much rather prefer
> not having it in my main series. My main intension with this RFC is
> to prove proposed hints functions are suitable for non-intel HW.

I went through the ice patches mostly. Can you provide performance numbers for
XDP workloads without metadata in the picture? I'd like to see whether
standard 64b traffic gets affected or not, since you're modifying the
ice_rx_ring layout.

> 
> [0] https://lore.kernel.org/bpf/CAADnVQLNeO81zc4f_z_UDCi+tJ2LS4dj2E1+au5TbXM+CPSyXQ@mail.gmail.com/
> [1] https://lore.kernel.org/bpf/20230811161509.19722-1-larysa.zaremba@intel.com/
> 
> Aleksander Lobakin (1):
>   net, xdp: allow metadata > 32
> 
> Larysa Zaremba (22):
>   ice: make RX hash reading code more reusable
>   ice: make RX HW timestamp reading code more reusable
>   ice: make RX checksum checking code more reusable
>   ice: Make ptype internal to descriptor info processing
>   ice: Introduce ice_xdp_buff
>   ice: Support HW timestamp hint
>   ice: Support RX hash XDP hint
>   ice: Support XDP hints in AF_XDP ZC mode
>   xdp: Add VLAN tag hint
>   ice: Implement VLAN tag hint
>   ice: use VLAN proto from ring packet context in skb path
>   xdp: Add checksum hint
>   ice: Implement checksum hint
>   selftests/bpf: Allow VLAN packets in xdp_hw_metadata
>   selftests/bpf: Add flags and new hints to xdp_hw_metadata
>   veth: Implement VLAN tag and checksum XDP hint
>   net: make vlan_get_tag() return -ENODATA instead of -EINVAL
>   selftests/bpf: Use AF_INET for TX in xdp_metadata
>   selftests/bpf: Check VLAN tag and proto in xdp_metadata
>   selftests/bpf: check checksum state in xdp_metadata
>   mlx5: implement VLAN tag XDP hint
>   mlx5: implement RX checksum XDP hint
> 
>  Documentation/networking/xdp-rx-metadata.rst  |  11 +-
>  drivers/net/ethernet/intel/ice/ice.h          |   2 +
>  drivers/net/ethernet/intel/ice/ice_ethtool.c  |   2 +-
>  .../net/ethernet/intel/ice/ice_lan_tx_rx.h    | 412 +++++++++---------
>  drivers/net/ethernet/intel/ice/ice_lib.c      |   2 +-
>  drivers/net/ethernet/intel/ice/ice_main.c     |  23 +
>  drivers/net/ethernet/intel/ice/ice_ptp.c      |  27 +-
>  drivers/net/ethernet/intel/ice/ice_ptp.h      |  15 +-
>  drivers/net/ethernet/intel/ice/ice_txrx.c     |  19 +-
>  drivers/net/ethernet/intel/ice/ice_txrx.h     |  29 +-
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 343 ++++++++++++---
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  18 +-
>  drivers/net/ethernet/intel/ice/ice_xsk.c      |  26 +-
>  .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  10 +
>  .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 116 +++++
>  .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  12 +-
>  drivers/net/veth.c                            |  42 ++
>  include/linux/if_vlan.h                       |   4 +-
>  include/linux/mlx5/device.h                   |   4 +-
>  include/linux/skbuff.h                        |  13 +-
>  include/net/xdp.h                             |  29 +-
>  kernel/bpf/offload.c                          |   4 +
>  net/core/xdp.c                                |  57 +++
>  .../selftests/bpf/prog_tests/xdp_metadata.c   | 187 ++++----
>  .../selftests/bpf/progs/xdp_hw_metadata.c     |  48 +-
>  .../selftests/bpf/progs/xdp_metadata.c        |  16 +
>  tools/testing/selftests/bpf/testing_helpers.h |   3 +
>  tools/testing/selftests/bpf/xdp_hw_metadata.c |  67 ++-
>  tools/testing/selftests/bpf/xdp_metadata.h    |  42 +-
>  29 files changed, 1124 insertions(+), 459 deletions(-)
> 
> -- 
> 2.41.0
> 


* Re: [RFC bpf-next 02/23] ice: make RX HW timestamp reading code more reusable
  2023-09-04 14:56   ` Maciej Fijalkowski
@ 2023-09-04 16:29     ` Larysa Zaremba
  2023-09-05 15:22       ` Maciej Fijalkowski
  0 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-04 16:29 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 04:56:32PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:41PM +0200, Larysa Zaremba wrote:
> > Previously, we only needed RX HW timestamp in skb path,
> > hence all related code was written with skb in mind.
> > But with the addition of XDP hints via kfuncs to the ice driver,
> > the same logic will be needed in .xmo_() callbacks.
> > 
> > Put generic process of reading RX HW timestamp from a descriptor
> > into a separate function.
> > Move skb-related code into another source file.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_ptp.c      | 24 ++++++------------
> >  drivers/net/ethernet/intel/ice/ice_ptp.h      | 15 ++++++-----
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 25 ++++++++++++++++++-
> >  3 files changed, 41 insertions(+), 23 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > index 81d96a40d5a7..a31333972c68 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > @@ -2147,30 +2147,24 @@ int ice_ptp_set_ts_config(struct ice_pf *pf, struct ifreq *ifr)
> >  }
> >  
> >  /**
> > - * ice_ptp_rx_hwtstamp - Check for an Rx timestamp
> > - * @rx_ring: Ring to get the VSI info
> > + * ice_ptp_get_rx_hwts - Get packet Rx timestamp
> >   * @rx_desc: Receive descriptor
> > - * @skb: Particular skb to send timestamp with
> > + * @cached_time: Cached PHC time
> >   *
> >   * The driver receives a notification in the receive descriptor with timestamp.
> > - * The timestamp is in ns, so we must convert the result first.
> >   */
> > -void
> > -ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
> > -		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb)
> > +u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
> > +			u64 cached_time)
> >  {
> > -	struct skb_shared_hwtstamps *hwtstamps;
> > -	u64 ts_ns, cached_time;
> >  	u32 ts_high;
> > +	u64 ts_ns;
> >  
> >  	if (!(rx_desc->wb.time_stamp_low & ICE_PTP_TS_VALID))
> > -		return;
> > -
> > -	cached_time = READ_ONCE(rx_ring->cached_phctime);
> > +		return 0;
> >  
> >  	/* Do not report a timestamp if we don't have a cached PHC time */
> >  	if (!cached_time)
> > -		return;
> > +		return 0;
> >  
> >  	/* Use ice_ptp_extend_32b_ts directly, using the ring-specific cached
> >  	 * PHC value, rather than accessing the PF. This also allows us to
> > @@ -2181,9 +2175,7 @@ ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
> >  	ts_high = le32_to_cpu(rx_desc->wb.flex_ts.ts_high);
> >  	ts_ns = ice_ptp_extend_32b_ts(cached_time, ts_high);
> >  
> > -	hwtstamps = skb_hwtstamps(skb);
> > -	memset(hwtstamps, 0, sizeof(*hwtstamps));
> > -	hwtstamps->hwtstamp = ns_to_ktime(ts_ns);
> > +	return ts_ns;
> >  }
> >  
> >  /**
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h
> > index 995a57019ba7..523eefbfdf95 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ptp.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_ptp.h
> > @@ -268,9 +268,8 @@ void ice_ptp_extts_event(struct ice_pf *pf);
> >  s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb);
> >  enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf);
> >  
> > -void
> > -ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
> > -		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb);
> > +u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
> > +			u64 cached_time);
> >  void ice_ptp_reset(struct ice_pf *pf);
> >  void ice_ptp_prepare_for_reset(struct ice_pf *pf);
> >  void ice_ptp_init(struct ice_pf *pf);
> > @@ -304,9 +303,13 @@ static inline bool ice_ptp_process_ts(struct ice_pf *pf)
> >  {
> >  	return true;
> >  }
> > -static inline void
> > -ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
> > -		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb) { }
> > +
> > +static inline u64
> > +ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc, u64 cached_time)
> > +{
> > +	return 0;
> > +}
> > +
> >  static inline void ice_ptp_reset(struct ice_pf *pf) { }
> >  static inline void ice_ptp_prepare_for_reset(struct ice_pf *pf) { }
> >  static inline void ice_ptp_init(struct ice_pf *pf) { }
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > index 8f7f6d78f7bf..b2f241b73934 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > @@ -185,6 +185,29 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
> >  	ring->vsi->back->hw_csum_rx_error++;
> >  }
> >  
> > +/**
> > + * ice_ptp_rx_hwts_to_skb - Put RX timestamp into skb
> > + * @rx_ring: Ring to get the VSI info
> > + * @rx_desc: Receive descriptor
> > + * @skb: Particular skb to send timestamp with
> > + *
> > + * The timestamp is in ns, so we must convert the result first.
> > + */
> > +static void
> > +ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
> > +		       const union ice_32b_rx_flex_desc *rx_desc,
> > +		       struct sk_buff *skb)
> > +{
> > +	u64 ts_ns, cached_time;
> > +
> > +	cached_time = READ_ONCE(rx_ring->cached_phctime);
> 
> any reason for not reading cached_phctime within ice_ptp_get_rx_hwts?
>

Not at this point, but later for hints, this is read from the xdp_buff tail
instead of the ring.

But maybe it would actually be better to leave the cached time where it used to
be for now and instead, later in the hints patch, replace rx_ring with
ice_pkt_ctx in ice_ptp_get_rx_hwts(). I guess that would look better.
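Roughly what I have in mind for later (sketch only, not part of this patch,
naming not final):

	/* hints patch: let the helper read the cached PHC time itself */
	u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
				const struct ice_pkt_ctx *pkt_ctx);

with the READ_ONCE(pkt_ctx->cached_phctime) done inside the helper, so both the
skb path and the .xmo_rx_timestamp() callback just hand over their packet
context.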
 
> > +	ts_ns = ice_ptp_get_rx_hwts(rx_desc, cached_time);
> > +
> > +	*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
> > +		.hwtstamp	= ns_to_ktime(ts_ns),
> > +	};
> > +}
> > +
> >  /**
> >   * ice_process_skb_fields - Populate skb header fields from Rx descriptor
> >   * @rx_ring: Rx descriptor ring packet is being transacted on
> > @@ -209,7 +232,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
> >  	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
> >  
> >  	if (rx_ring->ptp_rx)
> > -		ice_ptp_rx_hwtstamp(rx_ring, rx_desc, skb);
> > +		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
> >  }
> >  
> >  /**
> > -- 
> > 2.41.0
> > 
> > 


* Re: [xdp-hints] [RFC bpf-next 03/23] ice: make RX checksum checking code more reusable
  2023-09-04 15:02   ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-04 18:01     ` Larysa Zaremba
  2023-09-05 15:37       ` Maciej Fijalkowski
  0 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-04 18:01 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 05:02:40PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:42PM +0200, Larysa Zaremba wrote:
> > Previously, we only needed RX checksum flags in skb path,
> > hence all related code was written with skb in mind.
> > But with the addition of XDP hints via kfuncs to the ice driver,
> > the same logic will be needed in .xmo_() callbacks.
> > 
> > Put generic process of determining checksum status into
> > a separate function.
> > 
> > Now we cannot operate directly on skb, when deducing
> > checksum status, therefore introduce an intermediate enum for checksum
> > status. Fortunately, in ice, we have only 4 possibilities: checksum
> > validated at level 0, validated at level 1, no checksum, checksum error.
> > Use 3 bits for more convenient conversion.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 105 ++++++++++++------
> >  1 file changed, 69 insertions(+), 36 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > index b2f241b73934..8b155a502b3b 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > @@ -102,18 +102,41 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
> >  		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
> >  }
> >  
> > +enum ice_rx_csum_status {
> > +	ICE_RX_CSUM_LVL_0	= 0,
> > +	ICE_RX_CSUM_LVL_1	= BIT(0),
> > +	ICE_RX_CSUM_NONE	= BIT(1),
> > +	ICE_RX_CSUM_ERROR	= BIT(2),
> > +	ICE_RX_CSUM_FAIL	= ICE_RX_CSUM_NONE | ICE_RX_CSUM_ERROR,
> > +};
> > +
> >  /**
> > - * ice_rx_csum - Indicate in skb if checksum is good
> > - * @ring: the ring we care about
> > - * @skb: skb currently being received and modified
> > + * ice_rx_csum_lvl - Get checksum level from status
> > + * @status: driver-specific checksum status
> > + */
> > +static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
> > +{
> > +	return status & ICE_RX_CSUM_LVL_1;
> > +}
> > +
> > +/**
> > + * ice_rx_csum_ip_summed - Checksum status from driver-specific to generic
> > + * @status: driver-specific checksum status
> > + */
> > +static u8 ice_rx_csum_ip_summed(enum ice_rx_csum_status status)
> > +{
> > +	return status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> 
> 	return !(status & ICE_RX_CSUM_NONE);
> 
> ?

status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;

is immediately understandable and results in 3 asm operations (I have checked):

result = status >> 1;
result ^= 1;
result &= 1;

I do not think "!(status & ICE_RX_CSUM_NONE);" could produce less.
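For completeness, the two forms only coincide because of the values of the
generic constants; a minimal sketch, assuming CHECKSUM_NONE == 0 and
CHECKSUM_UNNECESSARY == 1 as defined in include/linux/skbuff.h (the _alt name
is just for illustration):

	static u8 ice_rx_csum_ip_summed_alt(enum ice_rx_csum_status status)
	{
		/* ICE_RX_CSUM_NONE set   -> 0 == CHECKSUM_NONE
		 * ICE_RX_CSUM_NONE clear -> 1 == CHECKSUM_UNNECESSARY
		 */
		return !(status & ICE_RX_CSUM_NONE);
	}

so the difference really is only readability vs. relying on those two values.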

> 
> > +}
> > +
> > +/**
> > + * ice_get_rx_csum_status - Deduce checksum status from descriptor
> >   * @rx_desc: the receive descriptor
> >   * @ptype: the packet type decoded by hardware
> >   *
> > - * skb->protocol must be set before this function is called
> > + * Returns driver-specific checksum status
> >   */
> > -static void
> > -ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
> > -	    union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
> > +static enum ice_rx_csum_status
> > +ice_get_rx_csum_status(const union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
> >  {
> >  	struct ice_rx_ptype_decoded decoded;
> >  	u16 rx_status0, rx_status1;
> > @@ -124,20 +147,12 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
> >  
> >  	decoded = ice_decode_rx_desc_ptype(ptype);
> >  
> > -	/* Start with CHECKSUM_NONE and by default csum_level = 0 */
> > -	skb->ip_summed = CHECKSUM_NONE;
> > -	skb_checksum_none_assert(skb);
> > -
> > -	/* check if Rx checksum is enabled */
> > -	if (!(ring->netdev->features & NETIF_F_RXCSUM))
> > -		return;
> > -
> >  	/* check if HW has decoded the packet and checksum */
> >  	if (!(rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_L3L4P_S)))
> > -		return;
> > +		return ICE_RX_CSUM_NONE;
> >  
> >  	if (!(decoded.known && decoded.outer_ip))
> > -		return;
> > +		return ICE_RX_CSUM_NONE;
> >  
> >  	ipv4 = (decoded.outer_ip == ICE_RX_PTYPE_OUTER_IP) &&
> >  	       (decoded.outer_ip_ver == ICE_RX_PTYPE_OUTER_IPV4);
> > @@ -146,43 +161,61 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
> >  
> >  	if (ipv4 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_IPE_S) |
> >  				   BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S))))
> > -		goto checksum_fail;
> > +		return ICE_RX_CSUM_FAIL;
> >  
> >  	if (ipv6 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_IPV6EXADD_S))))
> > -		goto checksum_fail;
> > +		return ICE_RX_CSUM_FAIL;
> >  
> >  	/* check for L4 errors and handle packets that were not able to be
> >  	 * checksummed due to arrival speed
> >  	 */
> >  	if (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_L4E_S))
> > -		goto checksum_fail;
> > +		return ICE_RX_CSUM_FAIL;
> >  
> >  	/* check for outer UDP checksum error in tunneled packets */
> >  	if ((rx_status1 & BIT(ICE_RX_FLEX_DESC_STATUS1_NAT_S)) &&
> >  	    (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EUDPE_S)))
> > -		goto checksum_fail;
> > -
> > -	/* If there is an outer header present that might contain a checksum
> > -	 * we need to bump the checksum level by 1 to reflect the fact that
> > -	 * we are indicating we validated the inner checksum.
> > -	 */
> > -	if (decoded.tunnel_type >= ICE_RX_PTYPE_TUNNEL_IP_GRENAT)
> > -		skb->csum_level = 1;
> > +		return ICE_RX_CSUM_FAIL;
> >  
> >  	/* Only report checksum unnecessary for TCP, UDP, or SCTP */
> >  	switch (decoded.inner_prot) {
> >  	case ICE_RX_PTYPE_INNER_PROT_TCP:
> >  	case ICE_RX_PTYPE_INNER_PROT_UDP:
> >  	case ICE_RX_PTYPE_INNER_PROT_SCTP:
> > -		skb->ip_summed = CHECKSUM_UNNECESSARY;
> > -		break;
> > -	default:
> > -		break;
> > +		/* If there is an outer header present that might contain
> > +		 * a checksum we need to bump the checksum level by 1 to reflect
> > +		 * the fact that we have validated the inner checksum.
> > +		 */
> > +		return decoded.tunnel_type >= ICE_RX_PTYPE_TUNNEL_IP_GRENAT ?
> > +		       ICE_RX_CSUM_LVL_1 : ICE_RX_CSUM_LVL_0;
> >  	}
> > -	return;
> >  
> > -checksum_fail:
> > -	ring->vsi->back->hw_csum_rx_error++;
> > +	return ICE_RX_CSUM_NONE;
> > +}
> > +
> > +/**
> > + * ice_rx_csum_into_skb - Indicate in skb if checksum is good
> > + * @ring: the ring we care about
> > + * @skb: skb currently being received and modified
> > + * @rx_desc: the receive descriptor
> > + * @ptype: the packet type decoded by hardware
> > + */
> > +static void
> > +ice_rx_csum_into_skb(struct ice_rx_ring *ring, struct sk_buff *skb,
> > +		     const union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
> > +{
> > +	enum ice_rx_csum_status csum_status;
> > +
> > +	/* check if Rx checksum is enabled */
> > +	if (!(ring->netdev->features & NETIF_F_RXCSUM))
> > +		return;
> > +
> > +	csum_status = ice_get_rx_csum_status(rx_desc, ptype);
> > +	if (csum_status & ICE_RX_CSUM_ERROR)
> > +		ring->vsi->back->hw_csum_rx_error++;
> > +
> > +	skb->ip_summed = ice_rx_csum_ip_summed(csum_status);
> > +	skb->csum_level = ice_rx_csum_lvl(csum_status);
> >  }
> >  
> >  /**
> > @@ -229,7 +262,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
> >  	/* modifies the skb - consumes the enet header */
> >  	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
> >  
> > -	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
> > +	ice_rx_csum_into_skb(rx_ring, skb, rx_desc, ptype);
> >  
> >  	if (rx_ring->ptp_rx)
> >  		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
> > -- 
> > 2.41.0
> > 


* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-09-04 15:32   ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-04 18:11     ` Larysa Zaremba
  2023-09-05 17:53       ` Maciej Fijalkowski
  0 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-04 18:11 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 05:32:14PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> > In order to use XDP hints via kfuncs we need to put
> > RX descriptor and ring pointers just next to xdp_buff.
> > Same as in hints implementations in other drivers, we achieve
> > this through putting xdp_buff into a child structure.
> 
> Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
> if i'm reading this right.
>

ice_xdp_buff is the child in terms of inheritance (a pointer to ice_xdp_buff
could replace a pointer to xdp_buff, but not the other way around).
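To spell out the direction (minimal sketch, names taken from the patch, the
local variables are just for illustration):

	struct ice_xdp_buff ixbuf;

	/* "upcast": always fine, xdp_buff is the first member */
	struct xdp_buff *xdp = &ixbuf.xdp_buff;

	/* "downcast": only valid when xdp really is embedded in an ice_xdp_buff */
	struct ice_xdp_buff *back = container_of(xdp, struct ice_xdp_buff, xdp_buff);

So in struct-embedding terms, ice_xdp_buff derives from (is a child of)
xdp_buff, even though memory-layout-wise it is the containing structure.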

> > 
> > Currently, xdp_buff is stored in the ring structure,
> > so replace it with union that includes child structure.
> > This way enough memory is available while existing XDP code
> > remains isolated from hints.
> > 
> > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > 64 bytes (single cache line). To place it at the start of a cache line,
> > move 'next' field from CL1 to CL3, as it isn't used often. This still
> > leaves 128 bits available in CL3 for packet context extensions.
> 
> I believe ice_xdp_buff will be beefed up in later patches, so what is the
> point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
> CL anyway.
>

It is to at least keep the xdp_buff and the descriptor pointer (used for every
hint) in a single CL; the other fields are situational.

> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
> >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
> >  3 files changed, 38 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > index 40f2f6dabb81..4e6546d9cf85 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> >   * @xdp_prog: XDP program to run
> >   * @xdp_ring: ring to be used for XDP_TX action
> >   * @rx_buf: Rx buffer to store the XDP action
> > + * @eop_desc: Last descriptor in packet to read metadata from
> >   *
> >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> >   */
> >  static void
> >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> >  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > -	    struct ice_rx_buf *rx_buf)
> > +	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> >  {
> >  	unsigned int ret = ICE_XDP_PASS;
> >  	u32 act;
> > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> >  	if (!xdp_prog)
> >  		goto exit;
> >  
> > +	ice_xdp_meta_set_desc(xdp, eop_desc);
> 
> I am currently not sure if for multi-buffer case HW repeats all the
> necessary info within each descriptor for every frag? IOW shouldn't you be
> using the ice_rx_ring::first_desc?
> 
> Would be good to test hints for mbuf case for sure.
>

In the skb path, we take metadata from the last descriptor only, so this should 
be fine. Really worth testing with mbuf though.

> > +
> >  	act = bpf_prog_run_xdp(xdp_prog, xdp);
> >  	switch (act) {
> >  	case XDP_PASS:
> > @@ -1240,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> >  		if (ice_is_non_eop(rx_ring, rx_desc))
> >  			continue;
> >  
> > -		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf);
> > +		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
> >  		if (rx_buf->act == ICE_XDP_PASS)
> >  			goto construct_skb;
> >  		total_rx_bytes += xdp_get_buff_len(xdp);
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
> > index 166413fc33f4..d0ab2c4c0c91 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
> > @@ -257,6 +257,18 @@ enum ice_rx_dtype {
> >  	ICE_RX_DTYPE_SPLIT_ALWAYS	= 2,
> >  };
> >  
> > +struct ice_pkt_ctx {
> > +	const union ice_32b_rx_flex_desc *eop_desc;
> > +};
> > +
> > +struct ice_xdp_buff {
> > +	struct xdp_buff xdp_buff;
> > +	struct ice_pkt_ctx pkt_ctx;
> > +};
> > +
> > +/* Required for compatibility with xdp_buffs from xsk_pool */
> > +static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0);
> > +
> >  /* indices into GLINT_ITR registers */
> >  #define ICE_RX_ITR	ICE_IDX_ITR0
> >  #define ICE_TX_ITR	ICE_IDX_ITR1
> > @@ -298,7 +310,6 @@ enum ice_dynamic_itr {
> >  /* descriptor ring, associated with a VSI */
> >  struct ice_rx_ring {
> >  	/* CL1 - 1st cacheline starts here */
> > -	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
> >  	void *desc;			/* Descriptor ring memory */
> >  	struct device *dev;		/* Used for DMA mapping */
> >  	struct net_device *netdev;	/* netdev ring maps to */
> > @@ -310,12 +321,19 @@ struct ice_rx_ring {
> >  	u16 count;			/* Number of descriptors */
> >  	u16 reg_idx;			/* HW register index of the ring */
> >  	u16 next_to_alloc;
> > -	/* CL2 - 2nd cacheline starts here */
> > +
> >  	union {
> >  		struct ice_rx_buf *rx_buf;
> >  		struct xdp_buff **xdp_buf;
> >  	};
> > -	struct xdp_buff xdp;
> > +	/* CL2 - 2nd cacheline starts here */
> > +	union {
> > +		struct ice_xdp_buff xdp_ext;
> > +		struct {
> > +			struct xdp_buff xdp;
> > +			struct ice_pkt_ctx pkt_ctx;
> > +		};
> > +	};
> >  	/* CL3 - 3rd cacheline starts here */
> >  	struct bpf_prog *xdp_prog;
> >  	u16 rx_offset;
> > @@ -325,6 +343,8 @@ struct ice_rx_ring {
> >  	u16 next_to_clean;
> >  	u16 first_desc;
> >  
> > +	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
> > +
> >  	/* stats structs */
> >  	struct ice_ring_stats *ring_stats;
> >  
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> > index e1d49e1235b3..145883eec129 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> > @@ -151,4 +151,14 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
> >  		       struct sk_buff *skb);
> >  void
> >  ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
> > +
> > +static inline void
> > +ice_xdp_meta_set_desc(struct xdp_buff *xdp,
> > +		      union ice_32b_rx_flex_desc *eop_desc)
> > +{
> > +	struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff,
> > +						    xdp_buff);
> > +
> > +	xdp_ext->pkt_ctx.eop_desc = eop_desc;
> > +}
> >  #endif /* !_ICE_TXRX_LIB_H_ */
> > -- 
> > 2.41.0
> > 


* Re: [RFC bpf-next 06/23] ice: Support HW timestamp hint
  2023-09-04 15:38   ` Maciej Fijalkowski
@ 2023-09-04 18:12     ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-04 18:12 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 05:38:45PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:45PM +0200, Larysa Zaremba wrote:
> > Use previously refactored code and create a function
> > that allows XDP code to read HW timestamp.
> > 
> > Also, move cached_phctime into packet context, this way this data still
> > stays in the ring structure, just at the different address.
> > 
> > HW timestamp is the first supported hint in the driver,
> > so also add xdp_metadata_ops.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice.h          |  2 ++
> >  drivers/net/ethernet/intel/ice/ice_ethtool.c  |  2 +-
> >  drivers/net/ethernet/intel/ice/ice_lib.c      |  2 +-
> >  drivers/net/ethernet/intel/ice/ice_main.c     |  1 +
> >  drivers/net/ethernet/intel/ice/ice_ptp.c      |  3 ++-
> >  drivers/net/ethernet/intel/ice/ice_txrx.h     |  2 +-
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 ++++++++++++++++++-
> >  7 files changed, 33 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> > index 5ac0ad12f9f1..34e4731b5d5f 100644
> > --- a/drivers/net/ethernet/intel/ice/ice.h
> > +++ b/drivers/net/ethernet/intel/ice/ice.h
> > @@ -951,4 +951,6 @@ static inline void ice_clear_rdma_cap(struct ice_pf *pf)
> >  	set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
> >  	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
> >  }
> > +
> > +extern const struct xdp_metadata_ops ice_xdp_md_ops;
> >  #endif /* _ICE_H_ */
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > index ad4d4702129f..f740e0ad0e3c 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > @@ -2846,7 +2846,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
> >  		/* clone ring and setup updated count */
> >  		rx_rings[i] = *vsi->rx_rings[i];
> >  		rx_rings[i].count = new_rx_cnt;
> > -		rx_rings[i].cached_phctime = pf->ptp.cached_phc_time;
> > +		rx_rings[i].pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
> >  		rx_rings[i].desc = NULL;
> >  		rx_rings[i].rx_buf = NULL;
> >  		/* this is to allow wr32 to have something to write to
> > diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
> > index 927518fcad51..12290defb730 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> > @@ -1445,7 +1445,7 @@ static int ice_vsi_alloc_rings(struct ice_vsi *vsi)
> >  		ring->netdev = vsi->netdev;
> >  		ring->dev = dev;
> >  		ring->count = vsi->num_rx_desc;
> > -		ring->cached_phctime = pf->ptp.cached_phc_time;
> > +		ring->pkt_ctx.cached_phctime = pf->ptp.cached_phc_time;
> >  		WRITE_ONCE(vsi->rx_rings[i], ring);
> >  	}
> >  
> > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > index 0f04347eda39..557c6326ff87 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > @@ -3395,6 +3395,7 @@ static void ice_set_ops(struct ice_vsi *vsi)
> >  
> >  	netdev->netdev_ops = &ice_netdev_ops;
> >  	netdev->udp_tunnel_nic_info = &pf->hw.udp_tunnel_nic;
> > +	netdev->xdp_metadata_ops = &ice_xdp_md_ops;
> >  	ice_set_ethtool_ops(netdev);
> >  
> >  	if (vsi->type != ICE_VSI_PF)
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > index a31333972c68..26fad7038996 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > @@ -1038,7 +1038,8 @@ static int ice_ptp_update_cached_phctime(struct ice_pf *pf)
> >  		ice_for_each_rxq(vsi, j) {
> >  			if (!vsi->rx_rings[j])
> >  				continue;
> > -			WRITE_ONCE(vsi->rx_rings[j]->cached_phctime, systime);
> > +			WRITE_ONCE(vsi->rx_rings[j]->pkt_ctx.cached_phctime,
> > +				   systime);
> >  		}
> >  	}
> >  	clear_bit(ICE_CFG_BUSY, pf->state);
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
> > index d0ab2c4c0c91..4237702a58a9 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
> > @@ -259,6 +259,7 @@ enum ice_rx_dtype {
> >  
> >  struct ice_pkt_ctx {
> >  	const union ice_32b_rx_flex_desc *eop_desc;
> > +	u64 cached_phctime;
> >  };
> >  
> >  struct ice_xdp_buff {
> > @@ -354,7 +355,6 @@ struct ice_rx_ring {
> >  	struct ice_tx_ring *xdp_ring;
> >  	struct xsk_buff_pool *xsk_pool;
> >  	dma_addr_t dma;			/* physical address of ring */
> > -	u64 cached_phctime;
> >  	u16 rx_buf_len;
> >  	u8 dcb_tc;			/* Traffic class of ring */
> >  	u8 ptp_rx;
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > index 07241f4229b7..463d9e5cbe05 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > @@ -233,7 +233,7 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
> >  {
> >  	u64 ts_ns, cached_time;
> >  
> > -	cached_time = READ_ONCE(rx_ring->cached_phctime);
> > +	cached_time = READ_ONCE(rx_ring->pkt_ctx.cached_phctime);
> >  	ts_ns = ice_ptp_get_rx_hwts(rx_desc, cached_time);
> >  
> >  	*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
> > @@ -546,3 +546,27 @@ void ice_finalize_xdp_rx(struct ice_tx_ring *xdp_ring, unsigned int xdp_res,
> >  			spin_unlock(&xdp_ring->tx_lock);
> >  	}
> >  }
> > +
> > +/**
> > + * ice_xdp_rx_hw_ts - HW timestamp XDP hint handler
> > + * @ctx: XDP buff pointer
> > + * @ts_ns: destination address
> > + *
> > + * Copy HW timestamp (if available) to the destination address.
> > + */
> > +static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
> > +{
> > +	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
> > +	u64 cached_time;
> > +
> > +	cached_time = READ_ONCE(xdp_ext->pkt_ctx.cached_phctime);
> > +	*ts_ns = ice_ptp_get_rx_hwts(xdp_ext->pkt_ctx.eop_desc, cached_time);
> 
> having cached_phctime within pkt_ctx doesn't stop skb side from using it
> right? so again, why note read it within ice_ptp_get_rx_hwts.
>

I have answered the related comment on the previous patch.

> > +	if (!*ts_ns)
> > +		return -ENODATA;
> > +
> > +	return 0;
> > +}
> > +
> > +const struct xdp_metadata_ops ice_xdp_md_ops = {
> > +	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
> > +};
> > -- 
> > 2.41.0
> > 
> > 
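
For reference, a minimal BPF-side sketch of how such a timestamp hint can be
consumed (illustrative only, not part of this series; the program and section
names are made up, only the kfunc itself is the existing one):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
					 __u64 *timestamp) __ksym;

SEC("xdp")
int rx_ts_example(struct xdp_md *ctx)
{
	__u64 ts;

	/* 0 on success, -ENODATA when the driver has no timestamp */
	if (!bpf_xdp_metadata_rx_timestamp(ctx, &ts))
		bpf_printk("HW RX timestamp: %llu ns", ts);

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";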

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 08/23] ice: Support XDP hints in AF_XDP ZC mode
  2023-09-04 15:42   ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-04 18:14     ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-04 18:14 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 05:42:59PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:47PM +0200, Larysa Zaremba wrote:
> > In AF_XDP ZC, xdp_buff is not stored on ring,
> > instead it is provided by xsk_pool.
> 
> xsk_buff_pool
>

Will correct.
 
> > Space for metadata sources right after such buffers was already reserved
> > in commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk").
> > This makes the implementation rather straightforward.
> > 
> > Update AF_XDP ZC packet processing to support XDP hints.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_xsk.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > index ef778b8e6d1b..fdeddad9b639 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > @@ -758,16 +758,25 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
> >   * @xdp: xdp_buff used as input to the XDP program
> >   * @xdp_prog: XDP program to run
> >   * @xdp_ring: ring to be used for XDP_TX action
> > + * @rx_desc: packet descriptor
> >   *
> >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> >   */
> >  static int
> >  ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > -	       struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring)
> > +	       struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > +	       union ice_32b_rx_flex_desc *rx_desc)
> >  {
> >  	int err, result = ICE_XDP_PASS;
> >  	u32 act;
> >  
> > +	/* We can safely convert xdp_buff_xsk to ice_xdp_buff,
> > +	 * because there are XSK_PRIV_MAX bytes reserved in xdp_buff_xsk
> > +	 * right after xdp_buff, for our private use.
> > +	 * Macro insures we do not go above the limit.
> 
> ensures?

Yes :D

> 
> > +	 */
> > +	XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
> > +	ice_xdp_meta_set_desc(xdp, rx_desc);
> >  	act = bpf_prog_run_xdp(xdp_prog, xdp);
> >  
> >  	if (likely(act == XDP_REDIRECT)) {
> > @@ -907,7 +916,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> >  		if (ice_is_non_eop(rx_ring, rx_desc))
> >  			continue;
> >  
> > -		xdp_res = ice_run_xdp_zc(rx_ring, first, xdp_prog, xdp_ring);
> > +		xdp_res = ice_run_xdp_zc(rx_ring, xdp, xdp_prog, xdp_ring,
> > +					 rx_desc);
> >  		if (likely(xdp_res & (ICE_XDP_TX | ICE_XDP_REDIR))) {
> >  			xdp_xmit |= xdp_res;
> >  		} else if (xdp_res == ICE_XDP_EXIT) {
> > -- 
> > 2.41.0
> > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-09-04 16:00   ` Maciej Fijalkowski
@ 2023-09-04 18:18     ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-04 18:18 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 06:00:34PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:49PM +0200, Larysa Zaremba wrote:
> > Implement .xmo_rx_vlan_tag callback to allow XDP code to read
> > packet's VLAN tag.
> > 
> > At the same time, use vlan_tci instead of vlan_tag in touched code,
> > because vlan_tag is misleading.
> 
> misleading...because? ;)
>

VLAN tag often refers to the VLAN proto and VLAN TCI combined, while in the
corrected code we clearly store only the VLAN TCI.

Will add the above to the commit message.
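
To illustrate the distinction (a sketch for terminology only, not taken from
the patch; the helper name is made up):

#include <linux/if_vlan.h>

/* The full 802.1Q "tag" on the wire is the TPID (VLAN protocol, e.g.
 * ETH_P_8021Q) followed by the 16-bit TCI, which packs PCP, DEI and the
 * VLAN ID. The hint in this patch stores only the TCI part.
 */
static u16 example_vid_from_tci(u16 vlan_tci)
{
	return vlan_tci & VLAN_VID_MASK;	/* low 12 bits: VLAN ID */
}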

> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_main.c     | 22 ++++++++++++++++
> >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 ++---
> >  drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 +
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 +++++++++++++++++++
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  4 +--
> >  drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 ++---
> >  6 files changed, 57 insertions(+), 8 deletions(-)
> > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 02/23] ice: make RX HW timestamp reading code more reusable
  2023-09-04 16:29     ` Larysa Zaremba
@ 2023-09-05 15:22       ` Maciej Fijalkowski
  0 siblings, 0 replies; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-05 15:22 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 06:29:03PM +0200, Larysa Zaremba wrote:
> On Mon, Sep 04, 2023 at 04:56:32PM +0200, Maciej Fijalkowski wrote:
> > On Thu, Aug 24, 2023 at 09:26:41PM +0200, Larysa Zaremba wrote:
> > > Previously, we only needed RX HW timestamp in skb path,
> > > hence all related code was written with skb in mind.
> > > But with the addition of XDP hints via kfuncs to the ice driver,
> > > the same logic will be needed in .xmo_() callbacks.
> > > 
> > > Put generic process of reading RX HW timestamp from a descriptor
> > > into a separate function.
> > > Move skb-related code into another source file.
> > > 
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > ---
> > >  drivers/net/ethernet/intel/ice/ice_ptp.c      | 24 ++++++------------
> > >  drivers/net/ethernet/intel/ice/ice_ptp.h      | 15 ++++++-----
> > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 25 ++++++++++++++++++-
> > >  3 files changed, 41 insertions(+), 23 deletions(-)
> > > 
> > >  

(...)

> > > +/**
> > > + * ice_ptp_rx_hwts_to_skb - Put RX timestamp into skb
> > > + * @rx_ring: Ring to get the VSI info
> > > + * @rx_desc: Receive descriptor
> > > + * @skb: Particular skb to send timestamp with
> > > + *
> > > + * The timestamp is in ns, so we must convert the result first.
> > > + */
> > > +static void
> > > +ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
> > > +		       const union ice_32b_rx_flex_desc *rx_desc,
> > > +		       struct sk_buff *skb)
> > > +{
> > > +	u64 ts_ns, cached_time;
> > > +
> > > +	cached_time = READ_ONCE(rx_ring->cached_phctime);
> > 
> > any reason for not reading cached_phctime within ice_ptp_get_rx_hwts?
> >
> 
> Not at this point, but later for hints, this is read from the xdp_buff tail 
> instead of ring.
> 
> But maybe it would be actually better to leave cached time where it used to be 
> for now and instead later in hint patch replace rx_ring with ice_pkt_ctx in 
> ice_ptp_get_rx_hwts(). I guess that would look better.

Yes, that's mostly what I was trying to say. Thanks.
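
One possible shape of that change, as a sketch (illustrative only, not the
final patch; the timestamp extension body is elided):

static u64
ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
		    const struct ice_pkt_ctx *pkt_ctx)
{
	u64 cached_time = READ_ONCE(pkt_ctx->cached_phctime);
	u64 ts_ns = 0;

	/* ...existing extension of the rx_desc timestamp against
	 * cached_time, unchanged apart from where cached_time comes from,
	 * filling ts_ns...
	 */
	return ts_ns;
}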

>  
> > > +	ts_ns = ice_ptp_get_rx_hwts(rx_desc, cached_time);
> > > +
> > > +	*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
> > > +		.hwtstamp	= ns_to_ktime(ts_ns),
> > > +	};
> > > +}
> > > +
> > >  /**
> > >   * ice_process_skb_fields - Populate skb header fields from Rx descriptor
> > >   * @rx_ring: Rx descriptor ring packet is being transacted on
> > > @@ -209,7 +232,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
> > >  	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
> > >  
> > >  	if (rx_ring->ptp_rx)
> > > -		ice_ptp_rx_hwtstamp(rx_ring, rx_desc, skb);
> > > +		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
> > >  }
> > >  
> > >  /**
> > > -- 
> > > 2.41.0
> > > 
> > > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 03/23] ice: make RX checksum checking code more reusable
  2023-09-04 18:01     ` Larysa Zaremba
@ 2023-09-05 15:37       ` Maciej Fijalkowski
  2023-09-05 16:53         ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-05 15:37 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 08:01:06PM +0200, Larysa Zaremba wrote:
> On Mon, Sep 04, 2023 at 05:02:40PM +0200, Maciej Fijalkowski wrote:
> > On Thu, Aug 24, 2023 at 09:26:42PM +0200, Larysa Zaremba wrote:
> > > Previously, we only needed RX checksum flags in skb path,
> > > hence all related code was written with skb in mind.
> > > But with the addition of XDP hints via kfuncs to the ice driver,
> > > the same logic will be needed in .xmo_() callbacks.
> > > 
> > > Put generic process of determining checksum status into
> > > a separate function.
> > > 
> > > Now we cannot operate directly on skb, when deducing
> > > checksum status, therefore introduce an intermediate enum for checksum
> > > status. Fortunately, in ice, we have only 4 possibilities: checksum
> > > validated at level 0, validated at level 1, no checksum, checksum error.
> > > Use 3 bits for more convenient conversion.
> > > 
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > ---
> > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 105 ++++++++++++------
> > >  1 file changed, 69 insertions(+), 36 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > index b2f241b73934..8b155a502b3b 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > @@ -102,18 +102,41 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
> > >  		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
> > >  }
> > >  
> > > +enum ice_rx_csum_status {
> > > +	ICE_RX_CSUM_LVL_0	= 0,
> > > +	ICE_RX_CSUM_LVL_1	= BIT(0),
> > > +	ICE_RX_CSUM_NONE	= BIT(1),
> > > +	ICE_RX_CSUM_ERROR	= BIT(2),
> > > +	ICE_RX_CSUM_FAIL	= ICE_RX_CSUM_NONE | ICE_RX_CSUM_ERROR,
> > > +};
> > > +
> > >  /**
> > > - * ice_rx_csum - Indicate in skb if checksum is good
> > > - * @ring: the ring we care about
> > > - * @skb: skb currently being received and modified
> > > + * ice_rx_csum_lvl - Get checksum level from status
> > > + * @status: driver-specific checksum status

nit: describe retval?

> > > + */
> > > +static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
> > > +{
> > > +	return status & ICE_RX_CSUM_LVL_1;
> > > +}
> > > +
> > > +/**
> > > + * ice_rx_csum_ip_summed - Checksum status from driver-specific to generic
> > > + * @status: driver-specific checksum status

ditto

> > > + */
> > > +static u8 ice_rx_csum_ip_summed(enum ice_rx_csum_status status)
> > > +{
> > > +	return status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> > 
> > 	return !(status & ICE_RX_CSUM_NONE);
> > 
> > ?
> 
> status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> 
> is immediately understandable and results in 3 asm operations (I have checked):
> 
> result = status >> 1;
> result ^= 1;
> result &= 1;
> 
> I do not think "!(status & ICE_RX_CSUM_NONE);" could produce less.

oh, nice. Just the fact that a branch was being added caught my eye.
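
For completeness, the equivalence spelled out (a sketch relying on the
in-tree values CHECKSUM_NONE == 0 and CHECKSUM_UNNECESSARY == 1, with
ICE_RX_CSUM_NONE == BIT(1) from the patch; the helper name is made up):

static inline u8 csum_ip_summed_branchless(u8 status)
{
	/* same result as the ternary above, no branch needed */
	return ((status >> 1) & 1) ^ 1;
}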

(...)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 07/23] ice: Support RX hash XDP hint
  2023-08-24 19:26 ` [RFC bpf-next 07/23] ice: Support RX hash XDP hint Larysa Zaremba
@ 2023-09-05 15:42   ` Maciej Fijalkowski
  2023-09-05 17:09     ` Larysa Zaremba
  2023-09-06 12:03     ` Alexander Lobakin
  2023-09-14 16:54   ` Alexander Lobakin
  1 sibling, 2 replies; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-05 15:42 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Aug 24, 2023 at 09:26:46PM +0200, Larysa Zaremba wrote:
> RX hash XDP hint requests both hash value and type.
> Type is XDP-specific, so we need a separate way to map
> these values to the hardware ptypes, so create a lookup table.
> 
> Instead of creating a new long list, reuse contents
> of ice_decode_rx_desc_ptype[] through preprocessor.
> 
> Current hash type enum does not contain ICMP packet type,
> but ice devices support it, so also add a new type into core code.
> 
> Then use previously refactored code and create a function
> that allows XDP code to read RX hash.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  .../net/ethernet/intel/ice/ice_lan_tx_rx.h    | 412 +++++++++---------
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c |  73 ++++
>  include/net/xdp.h                             |   3 +
>  3 files changed, 284 insertions(+), 204 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
> index 89f986a75cc8..d384ddfcb83e 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
> +++ b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
> @@ -673,6 +673,212 @@ struct ice_tlan_ctx {
>   *      Use the enum ice_rx_l2_ptype to decode the packet type
>   * ENDIF
>   */
> +#define ICE_PTYPES								\

ERROR: Macros with complex values should be enclosed in parentheses
#34: FILE: drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h:676:
+#define ICE_PTYPES                                                             \

(...)

> +	/* L2 Packet types */							\
> +	ICE_PTT_UNUSED_ENTRY(0),						\
> +	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),			\
> +	ICE_PTT_UNUSED_ENTRY(2),						\
> +	ICE_PTT_UNUSED_ENTRY(3),						\
> +	ICE_PTT_UNUSED_ENTRY(4),						\
> +	ICE_PTT_UNUSED_ENTRY(5),						\
> +	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),			\
> +	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),			\
> +	ICE_PTT_UNUSED_ENTRY(8),						\
> +	ICE_PTT_UNUSED_ENTRY(9),						\
> +	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),		\
> +	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),		\
> +	ICE_PTT_UNUSED_ENTRY(12),						\
> +	ICE_PTT_UNUSED_ENTRY(13),						\
> +	ICE_PTT_UNUSED_ENTRY(14),						\
> +	ICE_PTT_UNUSED_ENTRY(15),						\
> +	ICE_PTT_UNUSED_ENTRY(16),						\
> +	ICE_PTT_UNUSED_ENTRY(17),						\
> +	ICE_PTT_UNUSED_ENTRY(18),						\
> +	ICE_PTT_UNUSED_ENTRY(19),						\
> +	ICE_PTT_UNUSED_ENTRY(20),						\
> +	ICE_PTT_UNUSED_ENTRY(21),						\
> +										\
> +	/* Non Tunneled IPv4 */							\
> +	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),		\
> +	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),		\
> +	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(25),						\
> +	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP,  PAY4),		\
> +	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),		\
> +	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv4 --> IPv4 */							\
> +	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3),		\
> +	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3),		\
> +	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(32),						\
> +	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),		\
> +	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),		\
> +	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv4 --> IPv6 */							\
> +	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3),		\
> +	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3),		\
> +	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(39),						\
> +	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),		\
> +	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),		\
> +	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv4 --> GRE/NAT */							\
> +	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),		\
> +										\
> +	/* IPv4 --> GRE/NAT --> IPv4 */						\
> +	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),		\
> +	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),		\
> +	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(47),						\
> +	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),		\
> +	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),		\
> +	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv4 --> GRE/NAT --> IPv6 */						\
> +	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),		\
> +	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),		\
> +	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(54),						\
> +	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),		\
> +	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),		\
> +	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv4 --> GRE/NAT --> MAC */						\
> +	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),	\
> +										\
> +	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */					\
> +	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),	\
> +	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),	\
> +	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(62),						\
> +	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),	\
> +	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),	\
> +	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),	\
> +										\
> +	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */					\
> +	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),	\
> +	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),	\
> +	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(69),						\
> +	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),	\
> +	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),	\
> +	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),	\
> +										\
> +	/* IPv4 --> GRE/NAT --> MAC/VLAN */					\
> +	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),	\
> +										\
> +	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */				\
> +	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),	\
> +	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),	\
> +	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(77),						\
> +	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),	\
> +	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),	\
> +	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),	\
> +										\
> +	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */				\
> +	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),	\
> +	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),	\
> +	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(84),						\
> +	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),	\
> +	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),	\
> +	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),	\
> +										\
> +	/* Non Tunneled IPv6 */							\
> +	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),		\
> +	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),		\
> +	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(91),						\
> +	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP,  PAY4),		\
> +	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),		\
> +	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv6 --> IPv4 */							\
> +	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3),		\
> +	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3),		\
> +	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(98),						\
> +	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),		\
> +	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),		\
> +	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv6 --> IPv6 */							\
> +	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3),		\
> +	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3),		\
> +	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(105),						\
> +	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),		\
> +	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),		\
> +	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv6 --> GRE/NAT */							\
> +	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),		\
> +										\
> +	/* IPv6 --> GRE/NAT -> IPv4 */						\
> +	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),		\
> +	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),		\
> +	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(113),						\
> +	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),		\
> +	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),		\
> +	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv6 --> GRE/NAT -> IPv6 */						\
> +	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),		\
> +	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),		\
> +	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),		\
> +	ICE_PTT_UNUSED_ENTRY(120),						\
> +	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),		\
> +	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),		\
> +	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),		\
> +										\
> +	/* IPv6 --> GRE/NAT -> MAC */						\
> +	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),	\
> +										\
> +	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */					\
> +	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),	\
> +	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),	\
> +	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(128),						\
> +	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),	\
> +	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),	\
> +	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),	\
> +										\
> +	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */					\
> +	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),	\
> +	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),	\
> +	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(135),						\
> +	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),	\
> +	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),	\
> +	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),	\
> +										\
> +	/* IPv6 --> GRE/NAT -> MAC/VLAN */					\
> +	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),	\
> +										\
> +	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */				\
> +	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),	\
> +	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),	\
> +	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(143),						\
> +	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),	\
> +	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),	\
> +	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),	\
> +										\
> +	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */				\
> +	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),	\
> +	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),	\
> +	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),	\
> +	ICE_PTT_UNUSED_ENTRY(150),						\
> +	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),	\
> +	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),	\
> +	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
> +
> +#define ICE_NUM_DEFINED_PTYPES	154
>  
>  /* macro to make the table lines short, use explicit indexing with [PTYPE] */
>  #define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
> @@ -695,212 +901,10 @@ struct ice_tlan_ctx {
>  
>  /* Lookup table mapping in the 10-bit HW PTYPE to the bit field for decoding */
>  static const struct ice_rx_ptype_decoded ice_ptype_lkup[BIT(10)] = {
> -	/* L2 Packet types */
> -	ICE_PTT_UNUSED_ENTRY(0),
> -	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),
> -	ICE_PTT_UNUSED_ENTRY(2),
> -	ICE_PTT_UNUSED_ENTRY(3),
> -	ICE_PTT_UNUSED_ENTRY(4),
> -	ICE_PTT_UNUSED_ENTRY(5),
> -	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> -	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> -	ICE_PTT_UNUSED_ENTRY(8),
> -	ICE_PTT_UNUSED_ENTRY(9),
> -	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> -	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> -	ICE_PTT_UNUSED_ENTRY(12),
> -	ICE_PTT_UNUSED_ENTRY(13),
> -	ICE_PTT_UNUSED_ENTRY(14),
> -	ICE_PTT_UNUSED_ENTRY(15),
> -	ICE_PTT_UNUSED_ENTRY(16),
> -	ICE_PTT_UNUSED_ENTRY(17),
> -	ICE_PTT_UNUSED_ENTRY(18),
> -	ICE_PTT_UNUSED_ENTRY(19),
> -	ICE_PTT_UNUSED_ENTRY(20),
> -	ICE_PTT_UNUSED_ENTRY(21),
> -
> -	/* Non Tunneled IPv4 */
> -	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),
> -	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),
> -	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(25),
> -	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP,  PAY4),
> -	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),
> -	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),
> -
> -	/* IPv4 --> IPv4 */
> -	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(32),
> -	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv4 --> IPv6 */
> -	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(39),
> -	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
> -
> -	/* IPv4 --> GRE/NAT */
> -	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
> -
> -	/* IPv4 --> GRE/NAT --> IPv4 */
> -	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(47),
> -	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv4 --> GRE/NAT --> IPv6 */
> -	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(54),
> -	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
> -
> -	/* IPv4 --> GRE/NAT --> MAC */
> -	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
> -
> -	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */
> -	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(62),
> -	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */
> -	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(69),
> -	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
> -
> -	/* IPv4 --> GRE/NAT --> MAC/VLAN */
> -	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
> -
> -	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */
> -	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(77),
> -	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */
> -	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(84),
> -	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
> -
> -	/* Non Tunneled IPv6 */
> -	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),
> -	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),
> -	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(91),
> -	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP,  PAY4),
> -	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),
> -	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> IPv4 */
> -	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(98),
> -	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> IPv6 */
> -	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(105),
> -	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> GRE/NAT */
> -	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
> -
> -	/* IPv6 --> GRE/NAT -> IPv4 */
> -	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(113),
> -	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> GRE/NAT -> IPv6 */
> -	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(120),
> -	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> GRE/NAT -> MAC */
> -	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
> -
> -	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */
> -	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(128),
> -	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */
> -	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(135),
> -	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> GRE/NAT -> MAC/VLAN */
> -	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
> -
> -	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */
> -	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
> -	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
> -	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(143),
> -	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),
> -	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
> -	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
> -
> -	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */
> -	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
> -	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
> -	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),
> -	ICE_PTT_UNUSED_ENTRY(150),
> -	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),
> -	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
> -	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
> +	ICE_PTYPES
>  
>  	/* unused entries */
> -	[154 ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
> +	[ICE_NUM_DEFINED_PTYPES ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
>  };
>  
>  static inline struct ice_rx_ptype_decoded ice_decode_rx_desc_ptype(u16 ptype)
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index 463d9e5cbe05..b11cfaedb81c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -567,6 +567,79 @@ static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
>  	return 0;
>  }
>  
> +/* Define a ptype index -> XDP hash type lookup table.
> + * It uses the same ptype definitions as ice_decode_rx_desc_ptype[],
> + * avoiding possible copy-paste errors.
> + */
> +#undef ICE_PTT
> +#undef ICE_PTT_UNUSED_ENTRY
> +
> +#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
> +	[PTYPE] = XDP_RSS_L3_##OUTER_IP_VER | XDP_RSS_L4_##I | XDP_RSS_TYPE_##PL
> +
> +#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0

ERROR: space prohibited before open square bracket '['
#476: FILE: drivers/net/ethernet/intel/ice/ice_txrx_lib.c:580:
+#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0

total: 2 errors, 0 warnings, 0 checks, 525 lines checked

> +
> +/* A few supplementary definitions for when XDP hash types do not coincide
> + * with what can be generated from ptype definitions
> + * by means of preprocessor concatenation.
> + */
> +#define XDP_RSS_L3_NONE		XDP_RSS_TYPE_NONE
> +#define XDP_RSS_L4_NONE		XDP_RSS_TYPE_NONE
> +#define XDP_RSS_TYPE_PAY2	XDP_RSS_TYPE_L2
> +#define XDP_RSS_TYPE_PAY3	XDP_RSS_TYPE_NONE
> +#define XDP_RSS_TYPE_PAY4	XDP_RSS_L4
> +
> +static const enum xdp_rss_hash_type
> +ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
> +	ICE_PTYPES
> +};
> +
> +#undef XDP_RSS_L3_NONE
> +#undef XDP_RSS_L4_NONE
> +#undef XDP_RSS_TYPE_PAY2
> +#undef XDP_RSS_TYPE_PAY3
> +#undef XDP_RSS_TYPE_PAY4
> +
> +#undef ICE_PTT
> +#undef ICE_PTT_UNUSED_ENTRY
> +
> +/**
> + * ice_xdp_rx_hash_type - Get XDP-specific hash type from the RX descriptor
> + * @eop_desc: End of Packet descriptor
> + */
> +static enum xdp_rss_hash_type
> +ice_xdp_rx_hash_type(const union ice_32b_rx_flex_desc *eop_desc)
> +{
> +	u16 ptype = ice_get_ptype(eop_desc);
> +
> +	if (unlikely(ptype >= ICE_NUM_DEFINED_PTYPES))
> +		return 0;
> +
> +	return ice_ptype_to_xdp_hash[ptype];
> +}
> +
> +/**
> + * ice_xdp_rx_hash - RX hash XDP hint handler
> + * @ctx: XDP buff pointer
> + * @hash: hash destination address
> + * @rss_type: XDP hash type destination address
> + *
> + * Copy RX hash (if available) and its type to the destination address.
> + */
> +static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
> +			   enum xdp_rss_hash_type *rss_type)
> +{
> +	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
> +
> +	*hash = ice_get_rx_hash(xdp_ext->pkt_ctx.eop_desc);
> +	*rss_type = ice_xdp_rx_hash_type(xdp_ext->pkt_ctx.eop_desc);
> +	if (!likely(*hash))
> +		return -ENODATA;
> +
> +	return 0;
> +}
> +
>  const struct xdp_metadata_ops ice_xdp_md_ops = {
>  	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
> +	.xmo_rx_hash			= ice_xdp_rx_hash,
>  };
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index de08c8e0d134..1e9870d5f025 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -416,6 +416,7 @@ enum xdp_rss_hash_type {
>  	XDP_RSS_L4_UDP		= BIT(5),
>  	XDP_RSS_L4_SCTP		= BIT(6),
>  	XDP_RSS_L4_IPSEC	= BIT(7), /* L4 based hash include IPSEC SPI */
> +	XDP_RSS_L4_ICMP		= BIT(8),
>  
>  	/* Second part: RSS hash type combinations used for driver HW mapping */
>  	XDP_RSS_TYPE_NONE            = 0,
> @@ -431,11 +432,13 @@ enum xdp_rss_hash_type {
>  	XDP_RSS_TYPE_L4_IPV4_UDP     = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
>  	XDP_RSS_TYPE_L4_IPV4_SCTP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
>  	XDP_RSS_TYPE_L4_IPV4_IPSEC   = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
> +	XDP_RSS_TYPE_L4_IPV4_ICMP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,
>  
>  	XDP_RSS_TYPE_L4_IPV6_TCP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_TCP,
>  	XDP_RSS_TYPE_L4_IPV6_UDP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
>  	XDP_RSS_TYPE_L4_IPV6_SCTP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
>  	XDP_RSS_TYPE_L4_IPV6_IPSEC   = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
> +	XDP_RSS_TYPE_L4_IPV6_ICMP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,
>  
>  	XDP_RSS_TYPE_L4_IPV6_TCP_EX  = XDP_RSS_TYPE_L4_IPV6_TCP  | XDP_RSS_L3_DYNHDR,
>  	XDP_RSS_TYPE_L4_IPV6_UDP_EX  = XDP_RSS_TYPE_L4_IPV6_UDP  | XDP_RSS_L3_DYNHDR,
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 03/23] ice: make RX checksum checking code more reusable
  2023-09-05 15:37       ` Maciej Fijalkowski
@ 2023-09-05 16:53         ` Larysa Zaremba
  2023-09-05 17:44           ` Maciej Fijalkowski
  0 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-05 16:53 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Tue, Sep 05, 2023 at 05:37:27PM +0200, Maciej Fijalkowski wrote:
> On Mon, Sep 04, 2023 at 08:01:06PM +0200, Larysa Zaremba wrote:
> > On Mon, Sep 04, 2023 at 05:02:40PM +0200, Maciej Fijalkowski wrote:
> > > On Thu, Aug 24, 2023 at 09:26:42PM +0200, Larysa Zaremba wrote:
> > > > Previously, we only needed RX checksum flags in skb path,
> > > > hence all related code was written with skb in mind.
> > > > But with the addition of XDP hints via kfuncs to the ice driver,
> > > > the same logic will be needed in .xmo_() callbacks.
> > > > 
> > > > Put generic process of determining checksum status into
> > > > a separate function.
> > > > 
> > > > Now we cannot operate directly on skb, when deducing
> > > > checksum status, therefore introduce an intermediate enum for checksum
> > > > status. Fortunately, in ice, we have only 4 possibilities: checksum
> > > > validated at level 0, validated at level 1, no checksum, checksum error.
> > > > Use 3 bits for more convenient conversion.
> > > > 
> > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > ---
> > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 105 ++++++++++++------
> > > >  1 file changed, 69 insertions(+), 36 deletions(-)
> > > > 
> > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > index b2f241b73934..8b155a502b3b 100644
> > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > @@ -102,18 +102,41 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
> > > >  		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
> > > >  }
> > > >  
> > > > +enum ice_rx_csum_status {
> > > > +	ICE_RX_CSUM_LVL_0	= 0,
> > > > +	ICE_RX_CSUM_LVL_1	= BIT(0),
> > > > +	ICE_RX_CSUM_NONE	= BIT(1),
> > > > +	ICE_RX_CSUM_ERROR	= BIT(2),
> > > > +	ICE_RX_CSUM_FAIL	= ICE_RX_CSUM_NONE | ICE_RX_CSUM_ERROR,
> > > > +};
> > > > +
> > > >  /**
> > > > - * ice_rx_csum - Indicate in skb if checksum is good
> > > > - * @ring: the ring we care about
> > > > - * @skb: skb currently being received and modified
> > > > + * ice_rx_csum_lvl - Get checksum level from status
> > > > + * @status: driver-specific checksum status
> 
> nit: describe retval?
>

I think that kernel-doc is already too much for a one-liner.
Also, the checksum level is fully explained in the sk_buff documentation.

> > > > + */
> > > > +static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
> > > > +{
> > > > +	return status & ICE_RX_CSUM_LVL_1;
> > > > +}
> > > > +
> > > > +/**
> > > > + * ice_rx_csum_ip_summed - Checksum status from driver-specific to generic
> > > > + * @status: driver-specific checksum status
> 
> ditto

Same as above. Moreover, there are only 2 possible return values that anyone can 
easily look up. Describing them here would only balloon the file length.
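
(For reference, the kernel-doc form being asked for would be something like
the below; whether it is worth it for these one-liners is the question:)

/**
 * ice_rx_csum_lvl - Get checksum level from status
 * @status: driver-specific checksum status
 *
 * Return: 0 or 1, the csum_level value to be set in the skb.
 */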

> 
> > > > + */
> > > > +static u8 ice_rx_csum_ip_summed(enum ice_rx_csum_status status)
> > > > +{
> > > > +	return status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> > > 
> > > 	return !(status & ICE_RX_CSUM_NONE);
> > > 
> > > ?
> > 
> > status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> > 
> > is immediately understandable and results in 3 asm operations (I have checked):
> > 
> > result = status >> 1;
> > result ^= 1;
> > result &= 1;
> > 
> > I do not think "!(status & ICE_RX_CSUM_NONE);" could produce less.
> 
> oh, nice. Just the fact that a branch was being added caught my eye.
> 
> (...)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 07/23] ice: Support RX hash XDP hint
  2023-09-05 15:42   ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-05 17:09     ` Larysa Zaremba
  2023-09-06 12:03     ` Alexander Lobakin
  1 sibling, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-05 17:09 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Tue, Sep 05, 2023 at 05:42:04PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:46PM +0200, Larysa Zaremba wrote:
> > RX hash XDP hint requests both hash value and type.
> > Type is XDP-specific, so we need a separate way to map
> > these values to the hardware ptypes, so create a lookup table.
> > 
> > Instead of creating a new long list, reuse contents
> > of ice_decode_rx_desc_ptype[] through preprocessor.
> > 
> > Current hash type enum does not contain ICMP packet type,
> > but ice devices support it, so also add a new type into core code.
> > 
> > Then use previously refactored code and create a function
> > that allows XDP code to read RX hash.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  .../net/ethernet/intel/ice/ice_lan_tx_rx.h    | 412 +++++++++---------
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c |  73 ++++
> >  include/net/xdp.h                             |   3 +
> >  3 files changed, 284 insertions(+), 204 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
> > index 89f986a75cc8..d384ddfcb83e 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
> > @@ -673,6 +673,212 @@ struct ice_tlan_ctx {
> >   *      Use the enum ice_rx_l2_ptype to decode the packet type
> >   * ENDIF
> >   */
> > +#define ICE_PTYPES								\
> 
> ERROR: Macros with complex values should be enclosed in parentheses
> #34: FILE: drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h:676:
> +#define ICE_PTYPES                                                             \
>

If I remember correctly, I have tried to fix this by adding parentheses, but 
this would break the array definition.

Also XDP_METADATA_KFUNC_xxx is defined the same way.
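
A minimal demonstration of why (illustrative, not driver code):

/* A comma-separated initializer list can be pasted into an array
 * definition, but once it is wrapped in parentheses it becomes a single
 * (invalid) expression, so the checkpatch error cannot be silenced here.
 */
#define LIST_OK		[0] = 1, [1] = 2
#define LIST_BAD	([0] = 1, [1] = 2)

static const int ok_tbl[4] = { LIST_OK };	/* builds fine */
/* static const int bad_tbl[4] = { LIST_BAD };	-- would not compile */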

> (...)
> 
> > +	/* L2 Packet types */							\
> > +	ICE_PTT_UNUSED_ENTRY(0),						\
> > +	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),			\
> > +	ICE_PTT_UNUSED_ENTRY(2),						\
> > +	ICE_PTT_UNUSED_ENTRY(3),						\
> > +	ICE_PTT_UNUSED_ENTRY(4),						\
> > +	ICE_PTT_UNUSED_ENTRY(5),						\
> > +	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),			\
> > +	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),			\
> > +	ICE_PTT_UNUSED_ENTRY(8),						\
> > +	ICE_PTT_UNUSED_ENTRY(9),						\
> > +	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),		\
> > +	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),		\
> > +	ICE_PTT_UNUSED_ENTRY(12),						\
> > +	ICE_PTT_UNUSED_ENTRY(13),						\
> > +	ICE_PTT_UNUSED_ENTRY(14),						\
> > +	ICE_PTT_UNUSED_ENTRY(15),						\
> > +	ICE_PTT_UNUSED_ENTRY(16),						\
> > +	ICE_PTT_UNUSED_ENTRY(17),						\
> > +	ICE_PTT_UNUSED_ENTRY(18),						\
> > +	ICE_PTT_UNUSED_ENTRY(19),						\
> > +	ICE_PTT_UNUSED_ENTRY(20),						\
> > +	ICE_PTT_UNUSED_ENTRY(21),						\
> > +										\
> > +	/* Non Tunneled IPv4 */							\
> > +	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),		\
> > +	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),		\
> > +	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(25),						\
> > +	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv4 --> IPv4 */							\
> > +	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3),		\
> > +	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3),		\
> > +	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(32),						\
> > +	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv4 --> IPv6 */							\
> > +	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3),		\
> > +	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3),		\
> > +	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(39),						\
> > +	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv4 --> GRE/NAT */							\
> > +	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),		\
> > +										\
> > +	/* IPv4 --> GRE/NAT --> IPv4 */						\
> > +	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),		\
> > +	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),		\
> > +	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(47),						\
> > +	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv4 --> GRE/NAT --> IPv6 */						\
> > +	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),		\
> > +	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),		\
> > +	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(54),						\
> > +	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv4 --> GRE/NAT --> MAC */						\
> > +	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),	\
> > +										\
> > +	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */					\
> > +	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),	\
> > +	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),	\
> > +	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(62),						\
> > +	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),	\
> > +										\
> > +	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */					\
> > +	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),	\
> > +	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),	\
> > +	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(69),						\
> > +	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),	\
> > +										\
> > +	/* IPv4 --> GRE/NAT --> MAC/VLAN */					\
> > +	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),	\
> > +										\
> > +	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */				\
> > +	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),	\
> > +	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),	\
> > +	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(77),						\
> > +	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),	\
> > +										\
> > +	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */				\
> > +	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),	\
> > +	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),	\
> > +	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(84),						\
> > +	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),	\
> > +										\
> > +	/* Non Tunneled IPv6 */							\
> > +	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),		\
> > +	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),		\
> > +	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(91),						\
> > +	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv6 --> IPv4 */							\
> > +	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3),		\
> > +	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3),		\
> > +	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(98),						\
> > +	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv6 --> IPv6 */							\
> > +	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3),		\
> > +	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3),		\
> > +	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(105),						\
> > +	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv6 --> GRE/NAT */							\
> > +	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),		\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> IPv4 */						\
> > +	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),		\
> > +	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),		\
> > +	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(113),						\
> > +	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> IPv6 */						\
> > +	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),		\
> > +	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),		\
> > +	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),		\
> > +	ICE_PTT_UNUSED_ENTRY(120),						\
> > +	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),		\
> > +	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),		\
> > +	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),		\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> MAC */						\
> > +	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),	\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */					\
> > +	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),	\
> > +	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),	\
> > +	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(128),						\
> > +	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),	\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */					\
> > +	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),	\
> > +	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),	\
> > +	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(135),						\
> > +	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),	\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> MAC/VLAN */					\
> > +	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),	\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */				\
> > +	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),	\
> > +	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),	\
> > +	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(143),						\
> > +	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),	\
> > +										\
> > +	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */				\
> > +	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),	\
> > +	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),	\
> > +	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),	\
> > +	ICE_PTT_UNUSED_ENTRY(150),						\
> > +	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),	\
> > +	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),	\
> > +	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
> > +
> > +#define ICE_NUM_DEFINED_PTYPES	154
> >  
> >  /* macro to make the table lines short, use explicit indexing with [PTYPE] */
> >  #define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
> > @@ -695,212 +901,10 @@ struct ice_tlan_ctx {
> >  
> >  /* Lookup table mapping in the 10-bit HW PTYPE to the bit field for decoding */
> >  static const struct ice_rx_ptype_decoded ice_ptype_lkup[BIT(10)] = {
> > -	/* L2 Packet types */
> > -	ICE_PTT_UNUSED_ENTRY(0),
> > -	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),
> > -	ICE_PTT_UNUSED_ENTRY(2),
> > -	ICE_PTT_UNUSED_ENTRY(3),
> > -	ICE_PTT_UNUSED_ENTRY(4),
> > -	ICE_PTT_UNUSED_ENTRY(5),
> > -	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> > -	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> > -	ICE_PTT_UNUSED_ENTRY(8),
> > -	ICE_PTT_UNUSED_ENTRY(9),
> > -	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> > -	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
> > -	ICE_PTT_UNUSED_ENTRY(12),
> > -	ICE_PTT_UNUSED_ENTRY(13),
> > -	ICE_PTT_UNUSED_ENTRY(14),
> > -	ICE_PTT_UNUSED_ENTRY(15),
> > -	ICE_PTT_UNUSED_ENTRY(16),
> > -	ICE_PTT_UNUSED_ENTRY(17),
> > -	ICE_PTT_UNUSED_ENTRY(18),
> > -	ICE_PTT_UNUSED_ENTRY(19),
> > -	ICE_PTT_UNUSED_ENTRY(20),
> > -	ICE_PTT_UNUSED_ENTRY(21),
> > -
> > -	/* Non Tunneled IPv4 */
> > -	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),
> > -	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),
> > -	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(25),
> > -	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP,  PAY4),
> > -	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),
> > -	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 --> IPv4 */
> > -	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(32),
> > -	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 --> IPv6 */
> > -	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(39),
> > -	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 --> GRE/NAT */
> > -	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
> > -
> > -	/* IPv4 --> GRE/NAT --> IPv4 */
> > -	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(47),
> > -	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 --> GRE/NAT --> IPv6 */
> > -	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(54),
> > -	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 --> GRE/NAT --> MAC */
> > -	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
> > -
> > -	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */
> > -	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(62),
> > -	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */
> > -	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(69),
> > -	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 --> GRE/NAT --> MAC/VLAN */
> > -	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
> > -
> > -	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */
> > -	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(77),
> > -	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */
> > -	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(84),
> > -	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
> > -
> > -	/* Non Tunneled IPv6 */
> > -	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),
> > -	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),
> > -	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(91),
> > -	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP,  PAY4),
> > -	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),
> > -	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> IPv4 */
> > -	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(98),
> > -	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> IPv6 */
> > -	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(105),
> > -	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> GRE/NAT */
> > -	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
> > -
> > -	/* IPv6 --> GRE/NAT -> IPv4 */
> > -	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(113),
> > -	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> GRE/NAT -> IPv6 */
> > -	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(120),
> > -	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> GRE/NAT -> MAC */
> > -	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
> > -
> > -	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */
> > -	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(128),
> > -	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */
> > -	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(135),
> > -	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> GRE/NAT -> MAC/VLAN */
> > -	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
> > -
> > -	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */
> > -	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
> > -	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
> > -	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(143),
> > -	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),
> > -	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
> > -	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
> > -
> > -	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */
> > -	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
> > -	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
> > -	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),
> > -	ICE_PTT_UNUSED_ENTRY(150),
> > -	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),
> > -	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
> > -	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
> > +	ICE_PTYPES
> >  
> >  	/* unused entries */
> > -	[154 ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
> > +	[ICE_NUM_DEFINED_PTYPES ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
> >  };
> >  
> >  static inline struct ice_rx_ptype_decoded ice_decode_rx_desc_ptype(u16 ptype)
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > index 463d9e5cbe05..b11cfaedb81c 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > @@ -567,6 +567,79 @@ static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
> >  	return 0;
> >  }
> >  
> > +/* Define a ptype index -> XDP hash type lookup table.
> > + * It uses the same ptype definitions as ice_decode_rx_desc_ptype[],
> > + * avoiding possible copy-paste errors.
> > + */
> > +#undef ICE_PTT
> > +#undef ICE_PTT_UNUSED_ENTRY
> > +
> > +#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
> > +	[PTYPE] = XDP_RSS_L3_##OUTER_IP_VER | XDP_RSS_L4_##I | XDP_RSS_TYPE_##PL
> > +
> > +#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0
> 
> ERROR: space prohibited before open square bracket '['
> #476: FILE: drivers/net/ethernet/intel/ice/ice_txrx_lib.c:580:
> +#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0
> 
> total: 2 errors, 0 warnings, 0 checks, 525 lines checked

Now, this one is a genuine false positive. It seems checkpatch would stop
complaining if I did:

#define ICE_PTT_UNUSED_ENTRY(PTYPE)\
	[PTYPE] = 0

But is it worth it?

> 
> > +
> > +/* A few supplementary definitions for when XDP hash types do not coincide
> > + * with what can be generated from ptype definitions
> > + * by means of preprocessor concatenation.
> > + */
> > +#define XDP_RSS_L3_NONE		XDP_RSS_TYPE_NONE
> > +#define XDP_RSS_L4_NONE		XDP_RSS_TYPE_NONE
> > +#define XDP_RSS_TYPE_PAY2	XDP_RSS_TYPE_L2
> > +#define XDP_RSS_TYPE_PAY3	XDP_RSS_TYPE_NONE
> > +#define XDP_RSS_TYPE_PAY4	XDP_RSS_L4
> > +
> > +static const enum xdp_rss_hash_type
> > +ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
> > +	ICE_PTYPES
> > +};
> > +
> > +#undef XDP_RSS_L3_NONE
> > +#undef XDP_RSS_L4_NONE
> > +#undef XDP_RSS_TYPE_PAY2
> > +#undef XDP_RSS_TYPE_PAY3
> > +#undef XDP_RSS_TYPE_PAY4
> > +
> > +#undef ICE_PTT
> > +#undef ICE_PTT_UNUSED_ENTRY
> > +
> > +/**
> > + * ice_xdp_rx_hash_type - Get XDP-specific hash type from the RX descriptor
> > + * @eop_desc: End of Packet descriptor
> > + */
> > +static enum xdp_rss_hash_type
> > +ice_xdp_rx_hash_type(const union ice_32b_rx_flex_desc *eop_desc)
> > +{
> > +	u16 ptype = ice_get_ptype(eop_desc);
> > +
> > +	if (unlikely(ptype >= ICE_NUM_DEFINED_PTYPES))
> > +		return 0;
> > +
> > +	return ice_ptype_to_xdp_hash[ptype];
> > +}
> > +
> > +/**
> > + * ice_xdp_rx_hash - RX hash XDP hint handler
> > + * @ctx: XDP buff pointer
> > + * @hash: hash destination address
> > + * @rss_type: XDP hash type destination address
> > + *
> > + * Copy RX hash (if available) and its type to the destination address.
> > + */
> > +static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
> > +			   enum xdp_rss_hash_type *rss_type)
> > +{
> > +	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
> > +
> > +	*hash = ice_get_rx_hash(xdp_ext->pkt_ctx.eop_desc);
> > +	*rss_type = ice_xdp_rx_hash_type(xdp_ext->pkt_ctx.eop_desc);
> > +	if (!likely(*hash))
> > +		return -ENODATA;
> > +
> > +	return 0;
> > +}
> > +
> >  const struct xdp_metadata_ops ice_xdp_md_ops = {
> >  	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
> > +	.xmo_rx_hash			= ice_xdp_rx_hash,
> >  };
> > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > index de08c8e0d134..1e9870d5f025 100644
> > --- a/include/net/xdp.h
> > +++ b/include/net/xdp.h
> > @@ -416,6 +416,7 @@ enum xdp_rss_hash_type {
> >  	XDP_RSS_L4_UDP		= BIT(5),
> >  	XDP_RSS_L4_SCTP		= BIT(6),
> >  	XDP_RSS_L4_IPSEC	= BIT(7), /* L4 based hash include IPSEC SPI */
> > +	XDP_RSS_L4_ICMP		= BIT(8),
> >  
> >  	/* Second part: RSS hash type combinations used for driver HW mapping */
> >  	XDP_RSS_TYPE_NONE            = 0,
> > @@ -431,11 +432,13 @@ enum xdp_rss_hash_type {
> >  	XDP_RSS_TYPE_L4_IPV4_UDP     = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
> >  	XDP_RSS_TYPE_L4_IPV4_SCTP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
> >  	XDP_RSS_TYPE_L4_IPV4_IPSEC   = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
> > +	XDP_RSS_TYPE_L4_IPV4_ICMP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,
> >  
> >  	XDP_RSS_TYPE_L4_IPV6_TCP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_TCP,
> >  	XDP_RSS_TYPE_L4_IPV6_UDP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
> >  	XDP_RSS_TYPE_L4_IPV6_SCTP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
> >  	XDP_RSS_TYPE_L4_IPV6_IPSEC   = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
> > +	XDP_RSS_TYPE_L4_IPV6_ICMP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,
> >  
> >  	XDP_RSS_TYPE_L4_IPV6_TCP_EX  = XDP_RSS_TYPE_L4_IPV6_TCP  | XDP_RSS_L3_DYNHDR,
> >  	XDP_RSS_TYPE_L4_IPV6_UDP_EX  = XDP_RSS_TYPE_L4_IPV6_UDP  | XDP_RSS_L3_DYNHDR,
> > -- 
> > 2.41.0
> > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 03/23] ice: make RX checksum checking code more reusable
  2023-09-05 16:53         ` Larysa Zaremba
@ 2023-09-05 17:44           ` Maciej Fijalkowski
  2023-09-06  9:28             ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-05 17:44 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Tue, Sep 05, 2023 at 06:53:37PM +0200, Larysa Zaremba wrote:
> On Tue, Sep 05, 2023 at 05:37:27PM +0200, Maciej Fijalkowski wrote:
> > On Mon, Sep 04, 2023 at 08:01:06PM +0200, Larysa Zaremba wrote:
> > > On Mon, Sep 04, 2023 at 05:02:40PM +0200, Maciej Fijalkowski wrote:
> > > > On Thu, Aug 24, 2023 at 09:26:42PM +0200, Larysa Zaremba wrote:
> > > > > Previously, we only needed RX checksum flags in skb path,
> > > > > hence all related code was written with skb in mind.
> > > > > But with the addition of XDP hints via kfuncs to the ice driver,
> > > > > the same logic will be needed in .xmo_() callbacks.
> > > > > 
> > > > > Put generic process of determining checksum status into
> > > > > a separate function.
> > > > > 
> > > > > Now we cannot operate directly on skb, when deducing
> > > > > checksum status, therefore introduce an intermediate enum for checksum
> > > > > status. Fortunately, in ice, we have only 4 possibilities: checksum
> > > > > validated at level 0, validated at level 1, no checksum, checksum error.
> > > > > Use 3 bits for more convenient conversion.
> > > > > 
> > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > ---
> > > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 105 ++++++++++++------
> > > > >  1 file changed, 69 insertions(+), 36 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > > index b2f241b73934..8b155a502b3b 100644
> > > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > > @@ -102,18 +102,41 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
> > > > >  		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
> > > > >  }
> > > > >  
> > > > > +enum ice_rx_csum_status {
> > > > > +	ICE_RX_CSUM_LVL_0	= 0,
> > > > > +	ICE_RX_CSUM_LVL_1	= BIT(0),
> > > > > +	ICE_RX_CSUM_NONE	= BIT(1),
> > > > > +	ICE_RX_CSUM_ERROR	= BIT(2),
> > > > > +	ICE_RX_CSUM_FAIL	= ICE_RX_CSUM_NONE | ICE_RX_CSUM_ERROR,
> > > > > +};
> > > > > +
> > > > >  /**
> > > > > - * ice_rx_csum - Indicate in skb if checksum is good
> > > > > - * @ring: the ring we care about
> > > > > - * @skb: skb currently being received and modified
> > > > > + * ice_rx_csum_lvl - Get checksum level from status
> > > > > + * @status: driver-specific checksum status
> > 
> > nit: describe retval?
> >
> 
> I think that kernel-doc is already too much for a one-liner.
> Also, checksum level is fully explained in sk_buff documentation.
> 
> > > > > + */
> > > > > +static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
> > > > > +{
> > > > > +	return status & ICE_RX_CSUM_LVL_1;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * ice_rx_csum_ip_summed - Checksum status from driver-specific to generic
> > > > > + * @status: driver-specific checksum status
> > 
> > ditto
> 
> Same as above. Moreover, there are only 2 possible return values that anyone can 
> easily look up. Describing them here would only balloon the file length.

You really think 5 additional lines would balloon the file length? :D

I am not sure what to say here. We have many pretty pointless kdoc retval
descriptions like 'returns 0 on success, error otherwise', but to me this
is following the guidelines from Documentation/doc-guide/kernel-doc.rst.
If I generate kdoc, I don't want to have to open up the source code just
to look up retvals.
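
What I'd expect is roughly the below (just an illustration, the exact
wording is of course up to you):

	/**
	 * ice_rx_csum_lvl - Get checksum level from status
	 * @status: driver-specific checksum status
	 *
	 * Return: checksum level (see csum_level in the sk_buff documentation)
	 */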

Just my $0.02, not a thing I'd like to keep arguing about :)

> 
> > 
> > > > > + */
> > > > > +static u8 ice_rx_csum_ip_summed(enum ice_rx_csum_status status)
> > > > > +{
> > > > > +	return status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> > > > 
> > > > 	return !(status & ICE_RX_CSUM_NONE);
> > > > 
> > > > ?
> > > 
> > > status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> > > 
> > > is immediately understandable and results in 3 asm operations (I have checked):
> > > 
> > > result = status >> 1;
> > > result ^= 1;
> > > result &= 1;
> > > 
> > > I do not think "!(status & ICE_RX_CSUM_NONE);" could produce less.
> > 
> > oh, nice. Just the fact that branch being added caught my eye.
> > 
> > (...)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-09-04 18:11     ` Larysa Zaremba
@ 2023-09-05 17:53       ` Maciej Fijalkowski
  2023-09-07 14:21         ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-05 17:53 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 08:11:09PM +0200, Larysa Zaremba wrote:
> On Mon, Sep 04, 2023 at 05:32:14PM +0200, Maciej Fijalkowski wrote:
> > On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> > > In order to use XDP hints via kfuncs we need to put
> > > RX descriptor and ring pointers just next to xdp_buff.
> > > Same as in hints implementations in other drivers, we achieve
> > > this through putting xdp_buff into a child structure.
> > 
> > Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
> > if i'm reading this right.
> >
> 
> ice_xdp_buff is a child in terms of inheritance (pointer to ice_xdp_buff could 
> replace pointer to xdp_buff, but not in reverse).
> 
> > > 
> > > Currently, xdp_buff is stored in the ring structure,
> > > so replace it with union that includes child structure.
> > > This way enough memory is available while existing XDP code
> > > remains isolated from hints.
> > > 
> > > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > > 64 bytes (single cache line). To place it at the start of a cache line,
> > > move 'next' field from CL1 to CL3, as it isn't used often. This still
> > > leaves 128 bits available in CL3 for packet context extensions.
> > 
> > I believe ice_xdp_buff will be beefed up in later patches, so what is the
> > point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
> > CL anyway.
> >
> 
> It is to at least keep xdp_buff and descriptor pointer (used for every hint) in 
> a single CL, other fields are situational.

Right, something must be moved... still, it would be good to see perf
numbers before/after :)

> 
> > > 
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > ---
> > >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
> > >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
> > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
> > >  3 files changed, 38 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > index 40f2f6dabb81..4e6546d9cf85 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> > >   * @xdp_prog: XDP program to run
> > >   * @xdp_ring: ring to be used for XDP_TX action
> > >   * @rx_buf: Rx buffer to store the XDP action
> > > + * @eop_desc: Last descriptor in packet to read metadata from
> > >   *
> > >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> > >   */
> > >  static void
> > >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > >  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > > -	    struct ice_rx_buf *rx_buf)
> > > +	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > >  {
> > >  	unsigned int ret = ICE_XDP_PASS;
> > >  	u32 act;
> > > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > >  	if (!xdp_prog)
> > >  		goto exit;
> > >  
> > > +	ice_xdp_meta_set_desc(xdp, eop_desc);
> > 
> > I am currently not sure if for multi-buffer case HW repeats all the
> > necessary info within each descriptor for every frag? IOW shouldn't you be
> > using the ice_rx_ring::first_desc?
> > 
> > Would be good to test hints for mbuf case for sure.
> >
> 
> In the skb path, we take metadata from the last descriptor only, so this should 
> be fine. Really worth testing with mbuf though.

Ok, thanks!


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 03/23] ice: make RX checksum checking code more reusable
  2023-09-05 17:44           ` Maciej Fijalkowski
@ 2023-09-06  9:28             ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-06  9:28 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Tue, Sep 05, 2023 at 07:44:32PM +0200, Maciej Fijalkowski wrote:
> On Tue, Sep 05, 2023 at 06:53:37PM +0200, Larysa Zaremba wrote:
> > On Tue, Sep 05, 2023 at 05:37:27PM +0200, Maciej Fijalkowski wrote:
> > > On Mon, Sep 04, 2023 at 08:01:06PM +0200, Larysa Zaremba wrote:
> > > > On Mon, Sep 04, 2023 at 05:02:40PM +0200, Maciej Fijalkowski wrote:
> > > > > On Thu, Aug 24, 2023 at 09:26:42PM +0200, Larysa Zaremba wrote:
> > > > > > Previously, we only needed RX checksum flags in skb path,
> > > > > > hence all related code was written with skb in mind.
> > > > > > But with the addition of XDP hints via kfuncs to the ice driver,
> > > > > > the same logic will be needed in .xmo_() callbacks.
> > > > > > 
> > > > > > Put generic process of determining checksum status into
> > > > > > a separate function.
> > > > > > 
> > > > > > Now we cannot operate directly on skb, when deducing
> > > > > > checksum status, therefore introduce an intermediate enum for checksum
> > > > > > status. Fortunately, in ice, we have only 4 possibilities: checksum
> > > > > > validated at level 0, validated at level 1, no checksum, checksum error.
> > > > > > Use 3 bits for more convenient conversion.
> > > > > > 
> > > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > > ---
> > > > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 105 ++++++++++++------
> > > > > >  1 file changed, 69 insertions(+), 36 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > > > index b2f241b73934..8b155a502b3b 100644
> > > > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > > > > > @@ -102,18 +102,41 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
> > > > > >  		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
> > > > > >  }
> > > > > >  
> > > > > > +enum ice_rx_csum_status {
> > > > > > +	ICE_RX_CSUM_LVL_0	= 0,
> > > > > > +	ICE_RX_CSUM_LVL_1	= BIT(0),
> > > > > > +	ICE_RX_CSUM_NONE	= BIT(1),
> > > > > > +	ICE_RX_CSUM_ERROR	= BIT(2),
> > > > > > +	ICE_RX_CSUM_FAIL	= ICE_RX_CSUM_NONE | ICE_RX_CSUM_ERROR,
> > > > > > +};
> > > > > > +
> > > > > >  /**
> > > > > > - * ice_rx_csum - Indicate in skb if checksum is good
> > > > > > - * @ring: the ring we care about
> > > > > > - * @skb: skb currently being received and modified
> > > > > > + * ice_rx_csum_lvl - Get checksum level from status
> > > > > > + * @status: driver-specific checksum status
> > > 
> > > nit: describe retval?
> > >
> > 
> > I think that kernel-doc is already too much for a one-liner.
> > Also, checksum level is fully explained in sk_buff documentation.
> > 
> > > > > > + */
> > > > > > +static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
> > > > > > +{
> > > > > > +	return status & ICE_RX_CSUM_LVL_1;
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * ice_rx_csum_ip_summed - Checksum status from driver-specific to generic
> > > > > > + * @status: driver-specific checksum status
> > > 
> > > ditto
> > 
> > Same as above. Moreover, there are only 2 possible return values that anyone can 
> > easily look up. Describing them here would only balloon the file length.
> 
> You really think 5 additional lines would balloon the file length? :D
> 
> I am not sure what to say here. We have many pretty pointless kdoc retval
> descriptions like 'returns 0 on success, error otherwise' but to me this
> is following the guidelines from Documentation/doc-guide/kernel-doc.rst.
> If i generate kdoc I don't want to open up the source code to easily look
> up retvals.
> 
> Just my 0.02$, not a thing that I'd like to keep on arguing on :)
>

I have consulted with the team and we came to the conclusion that removing
the kernel-doc for these functions may be the best solution. In ice, functions
in source files are always documented, but I do not think this rule is set in
stone.
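
Just to illustrate, the one-liners would then keep a plain comment instead
of a kdoc header, e.g. (sketch based on the code quoted above):

	/* Get checksum level from the driver-specific status */
	static u8 ice_rx_csum_lvl(enum ice_rx_csum_status status)
	{
		return status & ICE_RX_CSUM_LVL_1;
	}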

Sorry if I was being rude; I had been traumatized by documentation
requirements at my previous job (automotive) :D

> > 
> > > 
> > > > > > + */
> > > > > > +static u8 ice_rx_csum_ip_summed(enum ice_rx_csum_status status)
> > > > > > +{
> > > > > > +	return status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> > > > > 
> > > > > 	return !(status & ICE_RX_CSUM_NONE);
> > > > > 
> > > > > ?
> > > > 
> > > > status & ICE_RX_CSUM_NONE ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
> > > > 
> > > > is immediately understandable and results in 3 asm operations (I have checked):
> > > > 
> > > > result = status >> 1;
> > > > result ^= 1;
> > > > result &= 1;
> > > > 
> > > > I do not think "!(status & ICE_RX_CSUM_NONE);" could produce less.
> > > 
> > > oh, nice. Just the fact that branch being added caught my eye.
> > > 
> > > (...)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 07/23] ice: Support RX hash XDP hint
  2023-09-05 15:42   ` [xdp-hints] " Maciej Fijalkowski
  2023-09-05 17:09     ` Larysa Zaremba
@ 2023-09-06 12:03     ` Alexander Lobakin
  1 sibling, 0 replies; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-06 12:03 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: Larysa Zaremba, bpf, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Tue, 5 Sep 2023 17:42:04 +0200

> On Thu, Aug 24, 2023 at 09:26:46PM +0200, Larysa Zaremba wrote:

[...]

>>   */
>> +#define ICE_PTYPES								\
> 
> ERROR: Macros with complex values should be enclosed in parentheses
> #34: FILE: drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h:676:
> +#define ICE_PTYPES                                                             \

[...]

> ERROR: space prohibited before open square bracket '['
> #476: FILE: drivers/net/ethernet/intel/ice/ice_txrx_lib.c:580:
> +#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0
> 
> total: 2 errors, 0 warnings, 0 checks, 525 lines checked

Those are all false positives. The same "errors" are present in libie.
checkpatch doesn't parse the code the way e.g. sparse does, so it's not
able to understand all the #define black magic we're able to write :D

> 
>> +
>> +/* A few supplementary definitions for when XDP hash types do not coincide
>> + * with what can be generated from ptype definitions
>> + * by means of preprocessor concatenation.
>> + */
>> +#define XDP_RSS_L3_NONE		XDP_RSS_TYPE_NONE
>> +#define XDP_RSS_L4_NONE		XDP_RSS_TYPE_NONE
>> +#define XDP_RSS_TYPE_PAY2	XDP_RSS_TYPE_L2
>> +#define XDP_RSS_TYPE_PAY3	XDP_RSS_TYPE_NONE
>> +#define XDP_RSS_TYPE_PAY4	XDP_RSS_L4
>> +
>> +static const enum xdp_rss_hash_type
>> +ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
>> +	ICE_PTYPES
>> +};

[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 01/23] ice: make RX hash reading code more reusable
  2023-09-04 14:37   ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-06 12:23     ` Alexander Lobakin
  0 siblings, 0 replies; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-06 12:23 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: Larysa Zaremba, bpf, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Mon, 4 Sep 2023 16:37:45 +0200

> On Thu, Aug 24, 2023 at 09:26:40PM +0200, Larysa Zaremba wrote:
>> Previously, we only needed RX hash in skb path,
>> hence all related code was written with skb in mind.
>> But with the addition of XDP hints via kfuncs to the ice driver,
>> the same logic will be needed in .xmo_() callbacks.

[...]

>> -	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
>> -	hash = le32_to_cpu(nic_mdid->rss_hash);
>> -	skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
>> +	hash = ice_get_rx_hash(rx_desc);
>> +	if (likely(hash))
>> +		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
> 
> Looks like a behavior change as you wouldn't be setting l4_hash and
> sw_hash from skb in case !hash ? When can we get hash == 0 ?

I do the same in libie. hash == 0 makes no sense at all, no matter whether
you set sw or l4, esp. for GRO and other stack pieces.
BTW, sw_hash is never set by drivers; it's meant to be set only from the
core networking hashing functions (when the hash is computed by the CPU
with SipHash with the help of the Flow Dissector). So we only care about
l4_hash.
A valid L4 frame having hash == 0 is vanishingly unlikely even for XOR,
not to speak of Toeplitz / CRC (have you ever seen MD5 == 0? :D).
If the frame is not L4, the kernel doesn't treat your hash as something
meaningful and falls back to SipHash anyway. But the probability of
hash == 0 for an L3-only frame is not any higher :D
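
For reference, this is roughly what skb_set_hash() boils down to (a
from-memory sketch of include/linux/skbuff.h, not a verbatim copy):

	static inline void __skb_set_hash(struct sk_buff *skb, __u32 hash,
					  bool is_sw, bool is_l4)
	{
		skb->l4_hash = is_l4;
		skb->sw_hash = is_sw;
		skb->hash = hash;
	}

	/* used by drivers to report a HW-computed hash */
	static inline void skb_set_hash(struct sk_buff *skb, __u32 hash,
					enum pkt_hash_types type)
	{
		__skb_set_hash(skb, hash, false, type == PKT_HASH_TYPE_L4);
	}

So on the driver path sw_hash is always false and only the L4 bit matters.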

> 
>>  }
>>  
>>  /**
>> @@ -186,7 +201,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
>>  		       union ice_32b_rx_flex_desc *rx_desc,
>>  		       struct sk_buff *skb, u16 ptype)
>>  {
>> -	ice_rx_hash(rx_ring, rx_desc, skb, ptype);
>> +	ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype);
>>  
>>  	/* modifies the skb - consumes the enet header */
>>  	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
>> -- 
>> 2.41.0
>>

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5
  2023-09-04 16:06 ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-06 14:09   ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-06 14:09 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Mon, Sep 04, 2023 at 06:06:51PM +0200, Maciej Fijalkowski wrote:
> On Thu, Aug 24, 2023 at 09:26:39PM +0200, Larysa Zaremba wrote:
> > Alexei has requested an implementation of VLAN and checksum XDP hints
> > for one more driver [0].
> > 
> > This series is exactly the v5 of "XDP metadata via kfuncs for ice" [1]
> > with 2 additional patches for mlx5.
> > 
> > Firstly, there is a VLAN hint implementation. I am pretty sure this
> > one works and would not object adding it to the main series, if someone
> > from nvidia ACKs it.
> > 
> > The second patch is a checksum hint implementation and it is very rough.
> > There is logic duplication and some missing features, but I am sure it
> > captures the main points of the potential end implementation.
> > 
> > I think it is unrealistic for me to provide a fully working mlx5 checksum
> > hint implementation (complex logic, no HW), so would much rather prefer
> > not having it in my main series. My main intension with this RFC is
> > to prove proposed hints functions are suitable for non-intel HW.
> 
> I went through ice patches mostly, can you provide performance numbers for
> XDP workloads without metadata in picture? I'd like to see whether
> standard 64b traffic gets affected or not since you're modifying
> ice_rx_ring layout.

Thank you for the review; I will send the next version with performance numbers.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-09-05 17:53       ` Maciej Fijalkowski
@ 2023-09-07 14:21         ` Larysa Zaremba
  2023-09-07 16:33           ` Stanislav Fomichev
  0 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-07 14:21 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Tue, Sep 05, 2023 at 07:53:03PM +0200, Maciej Fijalkowski wrote:
> On Mon, Sep 04, 2023 at 08:11:09PM +0200, Larysa Zaremba wrote:
> > On Mon, Sep 04, 2023 at 05:32:14PM +0200, Maciej Fijalkowski wrote:
> > > On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> > > > In order to use XDP hints via kfuncs we need to put
> > > > RX descriptor and ring pointers just next to xdp_buff.
> > > > Same as in hints implementations in other drivers, we achieve
> > > > this through putting xdp_buff into a child structure.
> > > 
> > > Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
> > > if i'm reading this right.
> > >
> > 
> > ice_xdp_buff is a child in terms of inheritance (pointer to ice_xdp_buff could 
> > replace pointer to xdp_buff, but not in reverse).
> > 
> > > > 
> > > > Currently, xdp_buff is stored in the ring structure,
> > > > so replace it with union that includes child structure.
> > > > This way enough memory is available while existing XDP code
> > > > remains isolated from hints.
> > > > 
> > > > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > > > 64 bytes (single cache line). To place it at the start of a cache line,
> > > > move 'next' field from CL1 to CL3, as it isn't used often. This still
> > > > leaves 128 bits available in CL3 for packet context extensions.
> > > 
> > > I believe ice_xdp_buff will be beefed up in later patches, so what is the
> > > point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
> > > CL anyway.
> > >
> > 
> > It is to at least keep xdp_buff and descriptor pointer (used for every hint) in 
> > a single CL, other fields are situational.
> 
> Right, something must be moved...still, would be good to see perf
> before/after :)
> 
> > 
> > > > 
> > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > ---
> > > >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
> > > >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
> > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
> > > >  3 files changed, 38 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > index 40f2f6dabb81..4e6546d9cf85 100644
> > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> > > >   * @xdp_prog: XDP program to run
> > > >   * @xdp_ring: ring to be used for XDP_TX action
> > > >   * @rx_buf: Rx buffer to store the XDP action
> > > > + * @eop_desc: Last descriptor in packet to read metadata from
> > > >   *
> > > >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> > > >   */
> > > >  static void
> > > >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > >  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > > > -	    struct ice_rx_buf *rx_buf)
> > > > +	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > > >  {
> > > >  	unsigned int ret = ICE_XDP_PASS;
> > > >  	u32 act;
> > > > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > >  	if (!xdp_prog)
> > > >  		goto exit;
> > > >  
> > > > +	ice_xdp_meta_set_desc(xdp, eop_desc);
> > > 
> > > I am currently not sure if for multi-buffer case HW repeats all the
> > > necessary info within each descriptor for every frag? IOW shouldn't you be
> > > using the ice_rx_ring::first_desc?
> > > 
> > > Would be good to test hints for mbuf case for sure.
> > >
> > 
> > In the skb path, we take metadata from the last descriptor only, so this should 
> > be fine. Really worth testing with mbuf though.

I retract my promise to test this with mbuf, as for now hints and mbuf are not 
supposed to go together [0].

Making sure they can co-exist peacefully can be a topic for another series.
For now I can just say with high confidence that in the case of multi-buffer
frames, we do have all the supported metadata in the EoP descriptor.

[0] https://elixir.bootlin.com/linux/v6.5.2/source/kernel/bpf/offload.c#L234

> 
> Ok, thanks!
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-09-07 14:21         ` Larysa Zaremba
@ 2023-09-07 16:33           ` Stanislav Fomichev
  2023-09-07 16:42             ` Maciej Fijalkowski
  0 siblings, 1 reply; 72+ messages in thread
From: Stanislav Fomichev @ 2023-09-07 16:33 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: Maciej Fijalkowski, bpf, ast, daniel, andrii, martin.lau, song,
	yhs, john.fastabend, kpsingh, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

On Thu, Sep 7, 2023 at 7:27 AM Larysa Zaremba <larysa.zaremba@intel.com> wrote:
>
> On Tue, Sep 05, 2023 at 07:53:03PM +0200, Maciej Fijalkowski wrote:
> > On Mon, Sep 04, 2023 at 08:11:09PM +0200, Larysa Zaremba wrote:
> > > On Mon, Sep 04, 2023 at 05:32:14PM +0200, Maciej Fijalkowski wrote:
> > > > On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> > > > > In order to use XDP hints via kfuncs we need to put
> > > > > RX descriptor and ring pointers just next to xdp_buff.
> > > > > Same as in hints implementations in other drivers, we achieve
> > > > > this through putting xdp_buff into a child structure.
> > > >
> > > > Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
> > > > if i'm reading this right.
> > > >
> > >
> > > ice_xdp_buff is a child in terms of inheritance (pointer to ice_xdp_buff could
> > > replace pointer to xdp_buff, but not in reverse).
> > >
> > > > >
> > > > > Currently, xdp_buff is stored in the ring structure,
> > > > > so replace it with union that includes child structure.
> > > > > This way enough memory is available while existing XDP code
> > > > > remains isolated from hints.
> > > > >
> > > > > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > > > > 64 bytes (single cache line). To place it at the start of a cache line,
> > > > > move 'next' field from CL1 to CL3, as it isn't used often. This still
> > > > > leaves 128 bits available in CL3 for packet context extensions.
> > > >
> > > > I believe ice_xdp_buff will be beefed up in later patches, so what is the
> > > > point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
> > > > CL anyway.
> > > >
> > >
> > > It is to at least keep xdp_buff and descriptor pointer (used for every hint) in
> > > a single CL, other fields are situational.
> >
> > Right, something must be moved...still, would be good to see perf
> > before/after :)
> >
> > >
> > > > >
> > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > ---
> > > > >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
> > > > >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
> > > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
> > > > >  3 files changed, 38 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > index 40f2f6dabb81..4e6546d9cf85 100644
> > > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> > > > >   * @xdp_prog: XDP program to run
> > > > >   * @xdp_ring: ring to be used for XDP_TX action
> > > > >   * @rx_buf: Rx buffer to store the XDP action
> > > > > + * @eop_desc: Last descriptor in packet to read metadata from
> > > > >   *
> > > > >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> > > > >   */
> > > > >  static void
> > > > >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > >             struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > > > > -           struct ice_rx_buf *rx_buf)
> > > > > +           struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > > > >  {
> > > > >         unsigned int ret = ICE_XDP_PASS;
> > > > >         u32 act;
> > > > > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > >         if (!xdp_prog)
> > > > >                 goto exit;
> > > > >
> > > > > +       ice_xdp_meta_set_desc(xdp, eop_desc);
> > > >
> > > > I am currently not sure if for multi-buffer case HW repeats all the
> > > > necessary info within each descriptor for every frag? IOW shouldn't you be
> > > > using the ice_rx_ring::first_desc?
> > > >
> > > > Would be good to test hints for mbuf case for sure.
> > > >
> > >
> > > In the skb path, we take metadata from the last descriptor only, so this should
> > > be fine. Really worth testing with mbuf though.
>
> I retract my promise to test this with mbuf, as for now hints and mbuf are not
> supposed to go together [0].

Hm, I don't think it's intentional. I don't see why mbuf and hints
can't coexist.
Does anything come to mind? Otherwise, we can change that mask to be
~(BPF_F_XDP_DEV_BOUND_ONLY|BPF_F_XDP_HAS_FRAGS) as part of the series
(or separately, up to you).
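
Roughly this, presumably in bpf_prog_dev_bound_init() (untested sketch of
the check Larysa linked, the exact context may differ):

	/* kernel/bpf/offload.c; the current mask is ~BPF_F_XDP_DEV_BOUND_ONLY */
	if (attr->prog_flags & ~(BPF_F_XDP_DEV_BOUND_ONLY |
				 BPF_F_XDP_HAS_FRAGS))
		return -EINVAL;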

> Making sure they can co-exist peacefully can be a topic for another series.
> For now I just can just say with high confidence that in case of multi-buffer
> frames, we do have all the supported metadata in the EoP descriptor.
>
> [0] https://elixir.bootlin.com/linux/v6.5.2/source/kernel/bpf/offload.c#L234
>
> >
> > Ok, thanks!
> >

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-09-07 16:33           ` Stanislav Fomichev
@ 2023-09-07 16:42             ` Maciej Fijalkowski
  2023-09-07 16:43               ` Maciej Fijalkowski
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-07 16:42 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Larysa Zaremba, bpf, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

On Thu, Sep 07, 2023 at 09:33:14AM -0700, Stanislav Fomichev wrote:
> On Thu, Sep 7, 2023 at 7:27 AM Larysa Zaremba <larysa.zaremba@intel.com> wrote:
> >
> > On Tue, Sep 05, 2023 at 07:53:03PM +0200, Maciej Fijalkowski wrote:
> > > On Mon, Sep 04, 2023 at 08:11:09PM +0200, Larysa Zaremba wrote:
> > > > On Mon, Sep 04, 2023 at 05:32:14PM +0200, Maciej Fijalkowski wrote:
> > > > > On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> > > > > > In order to use XDP hints via kfuncs we need to put
> > > > > > RX descriptor and ring pointers just next to xdp_buff.
> > > > > > Same as in hints implementations in other drivers, we achieve
> > > > > > this through putting xdp_buff into a child structure.
> > > > >
> > > > > Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
> > > > > if i'm reading this right.
> > > > >
> > > >
> > > > ice_xdp_buff is a child in terms of inheritance (pointer to ice_xdp_buff could
> > > > replace pointer to xdp_buff, but not in reverse).
> > > >
> > > > > >
> > > > > > Currently, xdp_buff is stored in the ring structure,
> > > > > > so replace it with union that includes child structure.
> > > > > > This way enough memory is available while existing XDP code
> > > > > > remains isolated from hints.
> > > > > >
> > > > > > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > > > > > 64 bytes (single cache line). To place it at the start of a cache line,
> > > > > > move 'next' field from CL1 to CL3, as it isn't used often. This still
> > > > > > leaves 128 bits available in CL3 for packet context extensions.
> > > > >
> > > > > I believe ice_xdp_buff will be beefed up in later patches, so what is the
> > > > > point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
> > > > > CL anyway.
> > > > >
> > > >
> > > > It is to at least keep xdp_buff and descriptor pointer (used for every hint) in
> > > > a single CL, other fields are situational.
> > >
> > > Right, something must be moved...still, would be good to see perf
> > > before/after :)
> > >
> > > >
> > > > > >
> > > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > > ---
> > > > > >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
> > > > > >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
> > > > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
> > > > > >  3 files changed, 38 insertions(+), 5 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > index 40f2f6dabb81..4e6546d9cf85 100644
> > > > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> > > > > >   * @xdp_prog: XDP program to run
> > > > > >   * @xdp_ring: ring to be used for XDP_TX action
> > > > > >   * @rx_buf: Rx buffer to store the XDP action
> > > > > > + * @eop_desc: Last descriptor in packet to read metadata from
> > > > > >   *
> > > > > >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> > > > > >   */
> > > > > >  static void
> > > > > >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > > >             struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > > > > > -           struct ice_rx_buf *rx_buf)
> > > > > > +           struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > > > > >  {
> > > > > >         unsigned int ret = ICE_XDP_PASS;
> > > > > >         u32 act;
> > > > > > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > > >         if (!xdp_prog)
> > > > > >                 goto exit;
> > > > > >
> > > > > > +       ice_xdp_meta_set_desc(xdp, eop_desc);
> > > > >
> > > > > I am currently not sure if for multi-buffer case HW repeats all the
> > > > > necessary info within each descriptor for every frag? IOW shouldn't you be
> > > > > using the ice_rx_ring::first_desc?
> > > > >
> > > > > Would be good to test hints for mbuf case for sure.
> > > > >
> > > >
> > > > In the skb path, we take metadata from the last descriptor only, so this should
> > > > be fine. Really worth testing with mbuf though.
> >
> > I retract my promise to test this with mbuf, as for now hints and mbuf are not
> > supposed to go together [0].
> 
> Hm, I don't think it's intentional. I don't see why mbuf and hints
> can't coexist.

They should coexist; xdp mbuf support is an integral part of the driver as we
know :)

> Anything pops into your mind? Otherwise, can change that mask to be
> ~(BPF_F_XDP_DEV_BOUND_ONLY|BPF_F_XDP_HAS_FRAGS) as part of the series
> (or separately, up to you).

+1

> 
> > Making sure they can co-exist peacefully can be a topic for another series.
> > For now I can just say with high confidence that, in the case of multi-buffer
> > frames, we do have all the supported metadata in the EoP descriptor.
> >
> > [0] https://elixir.bootlin.com/linux/v6.5.2/source/kernel/bpf/offload.c#L234
> >
> > >
> > > Ok, thanks!
> > >
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-09-07 16:42             ` Maciej Fijalkowski
@ 2023-09-07 16:43               ` Maciej Fijalkowski
  2023-09-13 15:40                 ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Maciej Fijalkowski @ 2023-09-07 16:43 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Larysa Zaremba, bpf, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

On Thu, Sep 07, 2023 at 06:42:33PM +0200, Maciej Fijalkowski wrote:
> On Thu, Sep 07, 2023 at 09:33:14AM -0700, Stanislav Fomichev wrote:
> > On Thu, Sep 7, 2023 at 7:27 AM Larysa Zaremba <larysa.zaremba@intel.com> wrote:
> > >
> > > On Tue, Sep 05, 2023 at 07:53:03PM +0200, Maciej Fijalkowski wrote:
> > > > On Mon, Sep 04, 2023 at 08:11:09PM +0200, Larysa Zaremba wrote:
> > > > > On Mon, Sep 04, 2023 at 05:32:14PM +0200, Maciej Fijalkowski wrote:
> > > > > > On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> > > > > > > In order to use XDP hints via kfuncs we need to put
> > > > > > > RX descriptor and ring pointers just next to xdp_buff.
> > > > > > > Same as in hints implementations in other drivers, we achieve
> > > > > > > this through putting xdp_buff into a child structure.
> > > > > >
> > > > > > Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
> > > > > > if i'm reading this right.
> > > > > >
> > > > >
> > > > > ice_xdp_buff is a child in terms of inheritance (pointer to ice_xdp_buff could
> > > > > replace pointer to xdp_buff, but not in reverse).
> > > > >
> > > > > > >
> > > > > > > Currently, xdp_buff is stored in the ring structure,
> > > > > > > so replace it with union that includes child structure.
> > > > > > > This way enough memory is available while existing XDP code
> > > > > > > remains isolated from hints.
> > > > > > >
> > > > > > > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > > > > > > 64 bytes (single cache line). To place it at the start of a cache line,
> > > > > > > move 'next' field from CL1 to CL3, as it isn't used often. This still
> > > > > > > leaves 128 bits available in CL3 for packet context extensions.
> > > > > >
> > > > > > I believe ice_xdp_buff will be beefed up in later patches, so what is the
> > > > > > point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
> > > > > > CL anyway.
> > > > > >
> > > > >
> > > > > It is to at least keep xdp_buff and descriptor pointer (used for every hint) in
> > > > > a single CL, other fields are situational.
> > > >
> > > > Right, something must be moved...still, would be good to see perf
> > > > before/after :)
> > > >
> > > > >
> > > > > > >
> > > > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > > > ---
> > > > > > >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
> > > > > > >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
> > > > > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
> > > > > > >  3 files changed, 38 insertions(+), 5 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > > index 40f2f6dabb81..4e6546d9cf85 100644
> > > > > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> > > > > > >   * @xdp_prog: XDP program to run
> > > > > > >   * @xdp_ring: ring to be used for XDP_TX action
> > > > > > >   * @rx_buf: Rx buffer to store the XDP action
> > > > > > > + * @eop_desc: Last descriptor in packet to read metadata from
> > > > > > >   *
> > > > > > >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> > > > > > >   */
> > > > > > >  static void
> > > > > > >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > > > >             struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > > > > > > -           struct ice_rx_buf *rx_buf)
> > > > > > > +           struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > > > > > >  {
> > > > > > >         unsigned int ret = ICE_XDP_PASS;
> > > > > > >         u32 act;
> > > > > > > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > > > >         if (!xdp_prog)
> > > > > > >                 goto exit;
> > > > > > >
> > > > > > > +       ice_xdp_meta_set_desc(xdp, eop_desc);
> > > > > >
> > > > > > I am currently not sure if for multi-buffer case HW repeats all the
> > > > > > necessary info within each descriptor for every frag? IOW shouldn't you be
> > > > > > using the ice_rx_ring::first_desc?
> > > > > >
> > > > > > Would be good to test hints for mbuf case for sure.
> > > > > >
> > > > >
> > > > > In the skb path, we take metadata from the last descriptor only, so this should
> > > > > be fine. Really worth testing with mbuf though.
> > >
> > > I retract my promise to test this with mbuf, as for now hints and mbuf are not
> > > supposed to go together [0].
> > 
> > Hm, I don't think it's intentional. I don't see why mbuf and hints
> > can't coexist.
> 
> They should coexist; xdp mbuf support is an integral part of the driver as we
> know :)
> 
> > Anything pops into your mind? Otherwise, can change that mask to be
> > ~(BPF_F_XDP_DEV_BOUND_ONLY|BPF_F_XDP_HAS_FRAGS) as part of the series
> > (or separately, up to you).
> 
> +1

IMHO that should be a standalone patch.

> 
> > 
> > > Making sure they can co-exist peacefully can be a topic for another series.
> > > For now I can just say with high confidence that, in the case of multi-buffer
> > > frames, we do have all the supported metadata in the EoP descriptor.
> > >
> > > [0] https://elixir.bootlin.com/linux/v6.5.2/source/kernel/bpf/offload.c#L234
> > >
> > > >
> > > > Ok, thanks!
> > > >
> > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff
  2023-09-07 16:43               ` Maciej Fijalkowski
@ 2023-09-13 15:40                 ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-13 15:40 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: Stanislav Fomichev, bpf, ast, daniel, andrii, martin.lau, song,
	yhs, john.fastabend, kpsingh, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Simon Horman, Tariq Toukan, Saeed Mahameed

On Thu, Sep 07, 2023 at 06:43:58PM +0200, Maciej Fijalkowski wrote:
> On Thu, Sep 07, 2023 at 06:42:33PM +0200, Maciej Fijalkowski wrote:
> > On Thu, Sep 07, 2023 at 09:33:14AM -0700, Stanislav Fomichev wrote:
> > > On Thu, Sep 7, 2023 at 7:27 AM Larysa Zaremba <larysa.zaremba@intel.com> wrote:
> > > >
> > > > On Tue, Sep 05, 2023 at 07:53:03PM +0200, Maciej Fijalkowski wrote:
> > > > > On Mon, Sep 04, 2023 at 08:11:09PM +0200, Larysa Zaremba wrote:
> > > > > > On Mon, Sep 04, 2023 at 05:32:14PM +0200, Maciej Fijalkowski wrote:
> > > > > > > On Thu, Aug 24, 2023 at 09:26:44PM +0200, Larysa Zaremba wrote:
> > > > > > > > In order to use XDP hints via kfuncs we need to put
> > > > > > > > RX descriptor and ring pointers just next to xdp_buff.
> > > > > > > > Same as in hints implementations in other drivers, we achieve
> > > > > > > > this through putting xdp_buff into a child structure.
> > > > > > >
> > > > > > > Don't you mean a parent struct? xdp_buff will be 'child' of ice_xdp_buff
> > > > > > > if i'm reading this right.
> > > > > > >
> > > > > >
> > > > > > ice_xdp_buff is a child in terms of inheritance (pointer to ice_xdp_buff could
> > > > > > replace pointer to xdp_buff, but not in reverse).
> > > > > >
> > > > > > > >
> > > > > > > > Currently, xdp_buff is stored in the ring structure,
> > > > > > > > so replace it with union that includes child structure.
> > > > > > > > This way enough memory is available while existing XDP code
> > > > > > > > remains isolated from hints.
> > > > > > > >
> > > > > > > > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > > > > > > > 64 bytes (single cache line). To place it at the start of a cache line,
> > > > > > > > move 'next' field from CL1 to CL3, as it isn't used often. This still
> > > > > > > > leaves 128 bits available in CL3 for packet context extensions.
> > > > > > >
> > > > > > > I believe ice_xdp_buff will be beefed up in later patches, so what is the
> > > > > > > point of moving 'next' ? We won't be able to keep ice_xdp_buff in a single
> > > > > > > CL anyway.
> > > > > > >
> > > > > >
> > > > > > It is to at least keep xdp_buff and descriptor pointer (used for every hint) in
> > > > > > a single CL, other fields are situational.
> > > > >
> > > > > Right, something must be moved...still, would be good to see perf
> > > > > before/after :)
> > > > >
> > > > > >
> > > > > > > >
> > > > > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > > > > ---
> > > > > > > >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++--
> > > > > > > >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 26 ++++++++++++++++---
> > > > > > > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 +++++++
> > > > > > > >  3 files changed, 38 insertions(+), 5 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > > > index 40f2f6dabb81..4e6546d9cf85 100644
> > > > > > > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > > > > > > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> > > > > > > >   * @xdp_prog: XDP program to run
> > > > > > > >   * @xdp_ring: ring to be used for XDP_TX action
> > > > > > > >   * @rx_buf: Rx buffer to store the XDP action
> > > > > > > > + * @eop_desc: Last descriptor in packet to read metadata from
> > > > > > > >   *
> > > > > > > >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> > > > > > > >   */
> > > > > > > >  static void
> > > > > > > >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > > > > >             struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > > > > > > > -           struct ice_rx_buf *rx_buf)
> > > > > > > > +           struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > > > > > > >  {
> > > > > > > >         unsigned int ret = ICE_XDP_PASS;
> > > > > > > >         u32 act;
> > > > > > > > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > > > > > >         if (!xdp_prog)
> > > > > > > >                 goto exit;
> > > > > > > >
> > > > > > > > +       ice_xdp_meta_set_desc(xdp, eop_desc);
> > > > > > >
> > > > > > > I am currently not sure if for multi-buffer case HW repeats all the
> > > > > > > necessary info within each descriptor for every frag? IOW shouldn't you be
> > > > > > > using the ice_rx_ring::first_desc?
> > > > > > >
> > > > > > > Would be good to test hints for mbuf case for sure.
> > > > > > >
> > > > > >
> > > > > > In the skb path, we take metadata from the last descriptor only, so this should
> > > > > > be fine. Really worth testing with mbuf though.
> > > >
> > > > I retract my promise to test this with mbuf, as for now hints and mbuf are not
> > > > supposed to go together [0].
> > > 
> > > Hm, I don't think it's intentional. I don't see why mbuf and hints
> > > can't coexist.
> > 
> > They should coexist; xdp mbuf support is an integral part of the driver as we
> > know :)
> > 
> > > Anything pops into your mind? Otherwise, can change that mask to be
> > > ~(BPF_F_XDP_DEV_BOUND_ONLY|BPF_F_XDP_HAS_FRAGS) as part of the series
> > > (or separately, up to you).
> > 
> > +1
> 
> IMHO that should be a standalone patch.
>

Sorry for not answering; I was stuck in testing and debugging and wanted to come
back with a definitive answer. Fortunately, the problems were not caused by
hints and mbuf clashing on some fundamental level. Everything works now, so I
will send the patch that allows combining them tomorrow.

> > 
> > > 
> > > > Making sure they can co-exist peacefully can be a topic for another series.
> > > > For now I can just say with high confidence that, in the case of multi-buffer
> > > > frames, we do have all the supported metadata in the EoP descriptor.
> > > >
> > > > [0] https://elixir.bootlin.com/linux/v6.5.2/source/kernel/bpf/offload.c#L234
> > > >
> > > > >
> > > > > Ok, thanks!
> > > > >
> > > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 01/23] ice: make RX hash reading code more reusable
  2023-08-24 19:26 ` [RFC bpf-next 01/23] ice: make RX hash reading code more reusable Larysa Zaremba
  2023-09-04 14:37   ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-14 16:12   ` Alexander Lobakin
  2023-09-14 16:15     ` Larysa Zaremba
  1 sibling, 1 reply; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-14 16:12 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

From: Larysa Zaremba <larysa.zaremba@intel.com>
Date: Thu, 24 Aug 2023 21:26:40 +0200

> Previously, we only needed RX hash in skb path,
> hence all related code was written with skb in mind.
> But with the addition of XDP hints via kfuncs to the ice driver,
> the same logic will be needed in .xmo_() callbacks.
> 
> Separate generic process of reading RX hash from a descriptor
> into a separate function.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>

I like the patch, except three minors above,

> ---
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 37 +++++++++++++------
>  1 file changed, 26 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index c8322fb6f2b3..8f7f6d78f7bf 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -63,28 +63,43 @@ static enum pkt_hash_types ice_ptype_to_htype(u16 ptype)
>  }
>  
>  /**
> - * ice_rx_hash - set the hash value in the skb
> + * ice_get_rx_hash - get RX hash value from descriptor
> + * @rx_desc: specific descriptor
> + *
> + * Returns hash, if present, 0 otherwise.
> + */
> +static u32
> +ice_get_rx_hash(const union ice_32b_rx_flex_desc *rx_desc)

The whole declaration could easily fit into one line :>

> +{
> +	const struct ice_32b_rx_flex_desc_nic *nic_mdid;
> +
> +	if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)

Not really related: have you tried to measure branch hit/miss here?
Can't it be a candidate for unlikely()?

> +		return 0;
> +
> +	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
> +	return le32_to_cpu(nic_mdid->rss_hash);

I think the common convention in the kernel is to separate the last
return from the main body with a newline.
To not leave the cast above alone, you can embed it into the declaration.

	const struct ice_32b_rx_flex_desc_nic *mdid = (typeof(mdid))rx_desc;

This is a compile-time cast w/o any maths anyway, so doing it before
checking for the descriptor type doesn't hurt in any way.

	if (!= FLEX)
		return 0;

	return le32_ ...

(or via a ternary)
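
E.g. (untested, using the names from above):

	return rx_desc->wb.rxdid == ICE_RXDID_FLEX_NIC ?
	       le32_to_cpu(mdid->rss_hash) : 0;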

> +}

[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 01/23] ice: make RX hash reading code more reusable
  2023-09-14 16:12   ` Alexander Lobakin
@ 2023-09-14 16:15     ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-14 16:15 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Sep 14, 2023 at 06:12:23PM +0200, Alexander Lobakin wrote:
> From: Larysa Zaremba <larysa.zaremba@intel.com>
> Date: Thu, 24 Aug 2023 21:26:40 +0200
> 
> > Previously, we only needed RX hash in skb path,
> > hence all related code was written with skb in mind.
> > But with the addition of XDP hints via kfuncs to the ice driver,
> > the same logic will be needed in .xmo_() callbacks.
> > 
> > Separate generic process of reading RX hash from a descriptor
> > into a separate function.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> 
> I like the patch, except three minors above,
> 
> > ---
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 37 +++++++++++++------
> >  1 file changed, 26 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > index c8322fb6f2b3..8f7f6d78f7bf 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > @@ -63,28 +63,43 @@ static enum pkt_hash_types ice_ptype_to_htype(u16 ptype)
> >  }
> >  
> >  /**
> > - * ice_rx_hash - set the hash value in the skb
> > + * ice_get_rx_hash - get RX hash value from descriptor
> > + * @rx_desc: specific descriptor
> > + *
> > + * Returns hash, if present, 0 otherwise.
> > + */
> > +static u32
> > +ice_get_rx_hash(const union ice_32b_rx_flex_desc *rx_desc)
> 
> The whole declaration could easily fit into one line :>
>

I agree
 
> > +{
> > +	const struct ice_32b_rx_flex_desc_nic *nic_mdid;
> > +
> > +	if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)
> 
> Not really related: have you tried to measure branch hit/miss here?
> Can't it be a candidate for unlikely()?

I have not measured this, but at least in my test setup, I have never seen any 
other rxdid, so unlikely() is a good idea. If it harms some particular 
applications, we can always remove this later on request :D
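
So something like this then (untested sketch, same check as above, just wrapped):

	if (unlikely(rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC))
		return 0;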

> > +		return 0;
> > +
> > +	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
> > +	return le32_to_cpu(nic_mdid->rss_hash);
> 
> I think the common convention in the kernel is to separate the last
> return from the main body with a newline.
> To not leave the cast above alone, you can embed it into the declaration.
> 

I am fine with leaving the cast alone.

> 	const struct ice_32b_rx_flex_desc_nic *mdid = (typeof(mdid))rx_desc;
> 
> This is a compile-time cast w/o any maths anyway, so doing it before
> checking for the descriptor type doesn't hurt in any way.
> 
> 	if (!= FLEX)
> 		return 0;
> 
> 	return le32_ ...
> 
> (or via a ternary)
> 
> > +}
> 
> [...]
> 
> Thanks,
> Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 09/23] xdp: Add VLAN tag hint
  2023-08-24 19:26 ` [RFC bpf-next 09/23] xdp: Add VLAN tag hint Larysa Zaremba
  2023-08-24 22:02   ` kernel test robot
@ 2023-09-14 16:18   ` Alexander Lobakin
  2023-09-14 16:21     ` Larysa Zaremba
  1 sibling, 1 reply; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-14 16:18 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

From: Larysa Zaremba <larysa.zaremba@intel.com>
Date: Thu, 24 Aug 2023 21:26:48 +0200

> Implement functionality that enables drivers to expose VLAN tag
> to XDP code.

I'd leave a couple more words here. Mention that it exports both tag and
protocol, for example. That TCI is host-Endian and proto is BE (just
like how skb stores them and it's fine).
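
Maybe even a tiny consumer fragment in the docs/commit message to make the
endianness obvious. Roughly (just a sketch, not taken from the patch;
bpf_printk()/bpf_ntohs() only for illustration):

	__u16 tci;
	__be16 proto;

	if (!bpf_xdp_metadata_rx_vlan_tag(ctx, &tci, &proto))
		/* tci is host-endian, proto is big-endian, as in the skb */
		bpf_printk("vlan %u proto 0x%x", tci, bpf_ntohs(proto));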

> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  Documentation/networking/xdp-rx-metadata.rst |  8 ++++-
>  include/net/xdp.h                            |  4 +++
>  kernel/bpf/offload.c                         |  2 ++
>  net/core/xdp.c                               | 34 ++++++++++++++++++++
>  4 files changed, 47 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
> index 25ce72af81c2..ea6dd79a21d3 100644
> --- a/Documentation/networking/xdp-rx-metadata.rst
> +++ b/Documentation/networking/xdp-rx-metadata.rst
> @@ -18,7 +18,13 @@ Currently, the following kfuncs are supported. In the future, as more
>  metadata is supported, this set will grow:
>  
>  .. kernel-doc:: net/core/xdp.c
> -   :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
> +   :identifiers: bpf_xdp_metadata_rx_timestamp
> +
> +.. kernel-doc:: net/core/xdp.c
> +   :identifiers: bpf_xdp_metadata_rx_hash
> +
> +.. kernel-doc:: net/core/xdp.c
> +   :identifiers: bpf_xdp_metadata_rx_vlan_tag
>  
>  An XDP program can use these kfuncs to read the metadata into stack
>  variables for its own consumption. Or, to pass the metadata on to other
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index 1e9870d5f025..8bb64fc76498 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -388,6 +388,8 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
>  			   bpf_xdp_metadata_rx_timestamp) \
>  	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_HASH, \
>  			   bpf_xdp_metadata_rx_hash) \
> +	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
> +			   bpf_xdp_metadata_rx_vlan_tag) \
>  
>  enum {
>  #define XDP_METADATA_KFUNC(name, _) name,
> @@ -449,6 +451,8 @@ struct xdp_metadata_ops {
>  	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
>  	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
>  			       enum xdp_rss_hash_type *rss_type);
> +	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, u16 *vlan_tci,
> +				   __be16 *vlan_proto);

Was "TCI first, proto second" aligned with something or I can ask "why
not proto first, TCI second"?

>  };
>  
>  #ifdef CONFIG_NET

[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 09/23] xdp: Add VLAN tag hint
  2023-09-14 16:18   ` Alexander Lobakin
@ 2023-09-14 16:21     ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-14 16:21 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Sep 14, 2023 at 06:18:40PM +0200, Alexander Lobakin wrote:
> From: Larysa Zaremba <larysa.zaremba@intel.com>
> Date: Thu, 24 Aug 2023 21:26:48 +0200
> 
> > Implement functionality that enables drivers to expose VLAN tag
> > to XDP code.
> 
> I'd leave a couple more words here. Mention that it exports both tag and
> protocol, for example. That TCI is host-Endian and proto is BE (just
> like how skb stores them and it's fine).
>

OK

> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  Documentation/networking/xdp-rx-metadata.rst |  8 ++++-
> >  include/net/xdp.h                            |  4 +++
> >  kernel/bpf/offload.c                         |  2 ++
> >  net/core/xdp.c                               | 34 ++++++++++++++++++++
> >  4 files changed, 47 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
> > index 25ce72af81c2..ea6dd79a21d3 100644
> > --- a/Documentation/networking/xdp-rx-metadata.rst
> > +++ b/Documentation/networking/xdp-rx-metadata.rst
> > @@ -18,7 +18,13 @@ Currently, the following kfuncs are supported. In the future, as more
> >  metadata is supported, this set will grow:
> >  
> >  .. kernel-doc:: net/core/xdp.c
> > -   :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
> > +   :identifiers: bpf_xdp_metadata_rx_timestamp
> > +
> > +.. kernel-doc:: net/core/xdp.c
> > +   :identifiers: bpf_xdp_metadata_rx_hash
> > +
> > +.. kernel-doc:: net/core/xdp.c
> > +   :identifiers: bpf_xdp_metadata_rx_vlan_tag
> >  
> >  An XDP program can use these kfuncs to read the metadata into stack
> >  variables for its own consumption. Or, to pass the metadata on to other
> > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > index 1e9870d5f025..8bb64fc76498 100644
> > --- a/include/net/xdp.h
> > +++ b/include/net/xdp.h
> > @@ -388,6 +388,8 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
> >  			   bpf_xdp_metadata_rx_timestamp) \
> >  	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_HASH, \
> >  			   bpf_xdp_metadata_rx_hash) \
> > +	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
> > +			   bpf_xdp_metadata_rx_vlan_tag) \
> >  
> >  enum {
> >  #define XDP_METADATA_KFUNC(name, _) name,
> > @@ -449,6 +451,8 @@ struct xdp_metadata_ops {
> >  	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
> >  	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
> >  			       enum xdp_rss_hash_type *rss_type);
> > +	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, u16 *vlan_tci,
> > +				   __be16 *vlan_proto);
> 
> Was "TCI first, proto second" aligned with something or I can ask "why
> not proto first, TCI second"?

No particular reason. Now that I have looked it up, it is the other way around in
all places >_<. I probably do need to switch this. Time to put my regular
expression skills to the test.

> 
> >  };
> >  
> >  #ifdef CONFIG_NET
> 
> [...]
> 
> Thanks,
> Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-08-24 19:26 ` [RFC bpf-next 10/23] ice: Implement " Larysa Zaremba
  2023-09-04 16:00   ` Maciej Fijalkowski
@ 2023-09-14 16:25   ` Alexander Lobakin
  2023-09-14 16:28     ` Larysa Zaremba
  1 sibling, 1 reply; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-14 16:25 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

From: Larysa Zaremba <larysa.zaremba@intel.com>
Date: Thu, 24 Aug 2023 21:26:49 +0200

> Implement .xmo_rx_vlan_tag callback to allow XDP code to read
> packet's VLAN tag.
> 
> At the same time, use vlan_tci instead of vlan_tag in touched code,
> because vlan_tag is misleading.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_main.c     | 22 ++++++++++++++++
>  drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 ++---
>  drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 +
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 +++++++++++++++++++
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  4 +--
>  drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 ++---
>  6 files changed, 57 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 557c6326ff87..aff4fa1a75f8 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -6007,6 +6007,23 @@ ice_fix_features(struct net_device *netdev, netdev_features_t features)
>  	return features;
>  }
>  
> +/**
> + * ice_set_rx_rings_vlan_proto - update rings with new stripped VLAN proto
> + * @vsi: PF's VSI
> + * @vlan_ethertype: VLAN ethertype (802.1Q or 802.1ad) in network byte order
> + *
> + * Store current stripped VLAN proto in ring packet context,
> + * so it can be accessed more efficiently by packet processing code.
> + */
> +static void
> +ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype)

@vsi can be const (I hope).
Line can be broken on arguments, not type (I hope).

> +{
> +	u16 i;
> +
> +	ice_for_each_alloc_rxq(vsi, i)
> +		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
> +}
> +
>  /**
>   * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI
>   * @vsi: PF's VSI
> @@ -6049,6 +6066,11 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features)
>  	if (strip_err || insert_err)
>  		return -EIO;
>  
> +	if (enable_stripping)
> +		ice_set_rx_rings_vlan_proto(vsi, htons(vlan_ethertype));
> +	else
> +		ice_set_rx_rings_vlan_proto(vsi, 0);

Ternary?

> +
>  	return 0;
>  }
>  
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 4e6546d9cf85..4fd7614f243d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -1183,7 +1183,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  		struct sk_buff *skb;
>  		unsigned int size;
>  		u16 stat_err_bits;
> -		u16 vlan_tag = 0;
> +		u16 vlan_tci;
>  
>  		/* get the Rx desc from Rx ring based on 'next_to_clean' */
>  		rx_desc = ICE_RX_DESC(rx_ring, ntc);
> @@ -1278,7 +1278,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  			continue;
>  		}
>  
> -		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
> +		vlan_tci = ice_get_vlan_tci(rx_desc);

Unrelated: I never was a fan of scattering rx_desc parsing across
several files, I remember I moved it to process_skb_fields() in both ice
(Hints series) and iavf (libie), maybe do that here as well? Or way too
out of context?

>  
>  		/* pad the skb if needed, to make a valid ethernet frame */
>  		if (eth_skb_pad(skb))

[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-09-14 16:25   ` [xdp-hints] " Alexander Lobakin
@ 2023-09-14 16:28     ` Larysa Zaremba
  2023-09-14 16:38       ` Alexander Lobakin
  0 siblings, 1 reply; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-14 16:28 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Sep 14, 2023 at 06:25:04PM +0200, Alexander Lobakin wrote:
> From: Larysa Zaremba <larysa.zaremba@intel.com>
> Date: Thu, 24 Aug 2023 21:26:49 +0200
> 
> > Implement .xmo_rx_vlan_tag callback to allow XDP code to read
> > packet's VLAN tag.
> > 
> > At the same time, use vlan_tci instead of vlan_tag in touched code,
> > because vlan_tag is misleading.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_main.c     | 22 ++++++++++++++++
> >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 ++---
> >  drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 +
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 +++++++++++++++++++
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  4 +--
> >  drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 ++---
> >  6 files changed, 57 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > index 557c6326ff87..aff4fa1a75f8 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > @@ -6007,6 +6007,23 @@ ice_fix_features(struct net_device *netdev, netdev_features_t features)
> >  	return features;
> >  }
> >  
> > +/**
> > + * ice_set_rx_rings_vlan_proto - update rings with new stripped VLAN proto
> > + * @vsi: PF's VSI
> > + * @vlan_ethertype: VLAN ethertype (802.1Q or 802.1ad) in network byte order
> > + *
> > + * Store current stripped VLAN proto in ring packet context,
> > + * so it can be accessed more efficiently by packet processing code.
> > + */
> > +static void
> > +ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype)
> 
> @vsi can be const (I hope).

I will try to make it const.

> Line can be broken on arguments, not type (I hope).
> 

This is how we break the lines everywhere in this file though :/

> > +{
> > +	u16 i;
> > +
> > +	ice_for_each_alloc_rxq(vsi, i)
> > +		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
> > +}
> > +
> >  /**
> >   * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI
> >   * @vsi: PF's VSI
> > @@ -6049,6 +6066,11 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features)
> >  	if (strip_err || insert_err)
> >  		return -EIO;
> >  
> > +	if (enable_stripping)
> > +		ice_set_rx_rings_vlan_proto(vsi, htons(vlan_ethertype));
> > +	else
> > +		ice_set_rx_rings_vlan_proto(vsi, 0);
> 
> Ternary?

It would look ugly in this particular case, I think: the expressions are too long
and there are no return values.

> 
> > +
> >  	return 0;
> >  }
> >  
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > index 4e6546d9cf85..4fd7614f243d 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > @@ -1183,7 +1183,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> >  		struct sk_buff *skb;
> >  		unsigned int size;
> >  		u16 stat_err_bits;
> > -		u16 vlan_tag = 0;
> > +		u16 vlan_tci;
> >  
> >  		/* get the Rx desc from Rx ring based on 'next_to_clean' */
> >  		rx_desc = ICE_RX_DESC(rx_ring, ntc);
> > @@ -1278,7 +1278,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> >  			continue;
> >  		}
> >  
> > -		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
> > +		vlan_tci = ice_get_vlan_tci(rx_desc);
> 
> Unrelated: I never was a fan of scattering rx_desc parsing across
> several files, I remember I moved it to process_skb_fields() in both ice
> (Hints series) and iavf (libie), maybe do that here as well? Or way too
> out of context?

A little bit too unrelated to the purpose of the series, but a thing we must do 
in the future.

> 
> >  
> >  		/* pad the skb if needed, to make a valid ethernet frame */
> >  		if (eth_skb_pad(skb))
> 
> [...]
> 
> Thanks,
> Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 11/23] ice: use VLAN proto from ring packet context in skb path
  2023-09-14 16:30   ` Alexander Lobakin
@ 2023-09-14 16:30     ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-14 16:30 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Sep 14, 2023 at 06:30:32PM +0200, Alexander Lobakin wrote:
> From: Larysa Zaremba <larysa.zaremba@intel.com>
> Date: Thu, 24 Aug 2023 21:26:50 +0200
> 
> > VLAN proto, used in ice XDP hints implementation is stored in ring packet
> > context. Utilize this value in skb VLAN processing too instead of checking
> > netdev features.
> > 
> > At the same time, use vlan_tci instead of vlan_tag in touched code,
> > because vlan_tag is misleading.
> 
> [...]
> 
> >  void
> > -ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag)
> > +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci)
> >  {
> > -	netdev_features_t features = rx_ring->netdev->features;
> > -	bool non_zero_vlan = !!(vlan_tag & VLAN_VID_MASK);
> > -
> > -	if ((features & NETIF_F_HW_VLAN_CTAG_RX) && non_zero_vlan)
> > -		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);
> > -	else if ((features & NETIF_F_HW_VLAN_STAG_RX) && non_zero_vlan)
> > -		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021AD), vlan_tag);
> > +	if (vlan_tci & VLAN_VID_MASK && rx_ring->pkt_ctx.vlan_proto)
> 
> I'd wrap the first expression into ()s to make it more readable (and no
> questions like "shouldn't these be three &&?").
>

OK
 
> > +		__vlan_hwaccel_put_tag(skb, rx_ring->pkt_ctx.vlan_proto,
> > +				       vlan_tci);
> >  
> >  	napi_gro_receive(&rx_ring->q_vector->napi, skb);
> >  }
> 
> [...]
> 
> Thanks,
> Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 11/23] ice: use VLAN proto from ring packet context in skb path
  2023-08-24 19:26 ` [RFC bpf-next 11/23] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba
@ 2023-09-14 16:30   ` Alexander Lobakin
  2023-09-14 16:30     ` Larysa Zaremba
  0 siblings, 1 reply; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-14 16:30 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

From: Larysa Zaremba <larysa.zaremba@intel.com>
Date: Thu, 24 Aug 2023 21:26:50 +0200

> VLAN proto, used in ice XDP hints implementation is stored in ring packet
> context. Utilize this value in skb VLAN processing too instead of checking
> netdev features.
> 
> At the same time, use vlan_tci instead of vlan_tag in touched code,
> because vlan_tag is misleading.

[...]

>  void
> -ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag)
> +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci)
>  {
> -	netdev_features_t features = rx_ring->netdev->features;
> -	bool non_zero_vlan = !!(vlan_tag & VLAN_VID_MASK);
> -
> -	if ((features & NETIF_F_HW_VLAN_CTAG_RX) && non_zero_vlan)
> -		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);
> -	else if ((features & NETIF_F_HW_VLAN_STAG_RX) && non_zero_vlan)
> -		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021AD), vlan_tag);
> +	if (vlan_tci & VLAN_VID_MASK && rx_ring->pkt_ctx.vlan_proto)

I'd wrap the first expression into ()s to make it more readable (and no
questions like "shouldn't these be three &&?").
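
I.e. simply:

	if ((vlan_tci & VLAN_VID_MASK) && rx_ring->pkt_ctx.vlan_proto)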

> +		__vlan_hwaccel_put_tag(skb, rx_ring->pkt_ctx.vlan_proto,
> +				       vlan_tci);
>  
>  	napi_gro_receive(&rx_ring->q_vector->napi, skb);
>  }

[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 12/23] xdp: Add checksum hint
  2023-08-24 19:26 ` [RFC bpf-next 12/23] xdp: Add checksum hint Larysa Zaremba
  2023-08-24 22:56   ` kernel test robot
@ 2023-09-14 16:34   ` Alexander Lobakin
  1 sibling, 0 replies; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-14 16:34 UTC (permalink / raw)
  To: Larysa Zaremba, bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

From: Larysa Zaremba <larysa.zaremba@intel.com>
Date: Thu, 24 Aug 2023 21:26:51 +0200

> Implement functionality that enables drivers to expose to XDP code checksum
> information that consists of:
> 
> > - Checksum status - 2 non-exclusive flags:
>   - XDP_CHECKSUM_VERIFIED indicating HW has validated the checksum
>     (corresponding to CHECKSUM_UNNECESSARY in sk_buff)
>   - XDP_CHECKSUM_COMPLETE signifies the validity of the second argument
>     (corresponding to CHECKSUM_COMPLETE in sk_buff)
> - Checksum, calculated over the entire packet, valid if the second flag is
>   set
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>

Only one stupid thing from me: when a line starts from '-' in the commit
message, some editors/viewers paint it red thinking it's a diff already
:z (same for '+')
Not something important, you just may want to prefer "neutral" '*', up
to you :D
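
Also, purely for my own understanding of the consumer side, I'd expect something
roughly like the below (the kfunc and enum names are my guesses following the
bpf_xdp_metadata_rx_* pattern, not taken from the patch):

	enum xdp_csum_status status;
	__wsum csum;

	if (!bpf_xdp_metadata_rx_csum(ctx, &status, &csum)) {
		if (status & XDP_CHECKSUM_VERIFIED)
			bpf_printk("HW verified the checksum");
		if (status & XDP_CHECKSUM_COMPLETE)
			bpf_printk("full-packet csum: 0x%x", csum);
	}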

> ---
>  Documentation/networking/xdp-rx-metadata.rst |  3 +++
>  include/net/xdp.h                            | 15 +++++++++++++
>  kernel/bpf/offload.c                         |  2 ++
>  net/core/xdp.c                               | 23 ++++++++++++++++++++
>  4 files changed, 43 insertions(+)

[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-09-14 16:28     ` Larysa Zaremba
@ 2023-09-14 16:38       ` Alexander Lobakin
  2023-09-14 17:02         ` Larysa Zaremba
  2023-09-18 14:07         ` Larysa Zaremba
  0 siblings, 2 replies; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-14 16:38 UTC (permalink / raw)
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

From: Larysa Zaremba <larysa.zaremba@intel.com>
Date: Thu, 14 Sep 2023 18:28:07 +0200

> On Thu, Sep 14, 2023 at 06:25:04PM +0200, Alexander Lobakin wrote:
>> From: Larysa Zaremba <larysa.zaremba@intel.com>
>> Date: Thu, 24 Aug 2023 21:26:49 +0200

[...]

>>> +static void
>>> +ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype)
>>
>> @vsi can be const (I hope).
> 
> I will try to make it const.
> 
>> Line can be broken on arguments, not type (I hope).
>>
> 
> This is how we break the lines everywhere in this file though :/

I know, and would really like us to at least stop adding new such
occurrences when not needed :s

> 
>>> +{
>>> +	u16 i;
>>> +
>>> +	ice_for_each_alloc_rxq(vsi, i)
>>> +		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
>>> +}
>>> +
>>>  /**
>>>   * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI
>>>   * @vsi: PF's VSI
>>> @@ -6049,6 +6066,11 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features)
>>>  	if (strip_err || insert_err)
>>>  		return -EIO;
>>>  
>>> +	if (enable_stripping)
>>> +		ice_set_rx_rings_vlan_proto(vsi, htons(vlan_ethertype));
>>> +	else
>>> +		ice_set_rx_rings_vlan_proto(vsi, 0);
>>
>> Ternary?
> 
> It would look ugly in this particular case, I think: the expressions are too long
> and there are no return values.

	ice_set_rx_rings_vlan_proto(vsi, strip ? htons(vlan_ethertype) : 0);

?

[...]

>>> -		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
>>> +		vlan_tci = ice_get_vlan_tci(rx_desc);
>>
>> Unrelated: I never was a fan of scattering rx_desc parsing across
>> several files, I remember I moved it to process_skb_fields() in both ice
>> (Hints series) and iavf (libie), maybe do that here as well? Or way too
>> out of context?
> 
> A little bit too unrelated to the purpose of the series, but a thing we must do 
> in the future.

Sure, +

> 
>>
>>>  
>>>  		/* pad the skb if needed, to make a valid ethernet frame */
>>>  		if (eth_skb_pad(skb))
>>
>> [...]
>>
>> Thanks,
>> Olek

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 07/23] ice: Support RX hash XDP hint
  2023-08-24 19:26 ` [RFC bpf-next 07/23] ice: Support RX hash XDP hint Larysa Zaremba
  2023-09-05 15:42   ` [xdp-hints] " Maciej Fijalkowski
@ 2023-09-14 16:54   ` Alexander Lobakin
  2023-09-14 16:59     ` Larysa Zaremba
  1 sibling, 1 reply; 72+ messages in thread
From: Alexander Lobakin @ 2023-09-14 16:54 UTC (permalink / raw)
  To: Larysa Zaremba, bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

From: Larysa Zaremba <larysa.zaremba@intel.com>
Date: Thu, 24 Aug 2023 21:26:46 +0200

> RX hash XDP hint requests both hash value and type.
> Type is XDP-specific, so we need a separate way to map
> these values to the hardware ptypes, so create a lookup table.
> 
> Instead of creating a new long list, reuse contents
> of ice_decode_rx_desc_ptype[] through preprocessor.
> 
> Current hash type enum does not contain ICMP packet type,
> but ice devices support it, so also add a new type into core code.
> 
> Then use previously refactored code and create a function
> that allows XDP code to read RX hash.
> 
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>

[...]

>  	/* unused entries */
> -	[154 ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
> +	[ICE_NUM_DEFINED_PTYPES ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
>  };
>  
>  static inline struct ice_rx_ptype_decoded ice_decode_rx_desc_ptype(u16 ptype)
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> index 463d9e5cbe05..b11cfaedb81c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> @@ -567,6 +567,79 @@ static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
>  	return 0;
>  }
>  
> +/* Define a ptype index -> XDP hash type lookup table.
> + * It uses the same ptype definitions as ice_decode_rx_desc_ptype[],
> + * avoiding possible copy-paste errors.
> + */
> +#undef ICE_PTT
> +#undef ICE_PTT_UNUSED_ENTRY
> +
> +#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
> +	[PTYPE] = XDP_RSS_L3_##OUTER_IP_VER | XDP_RSS_L4_##I | XDP_RSS_TYPE_##PL
> +
> +#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0
> +
> +/* A few supplementary definitions for when XDP hash types do not coincide
> + * with what can be generated from ptype definitions
> + * by means of preprocessor concatenation.
> + */
> +#define XDP_RSS_L3_NONE		XDP_RSS_TYPE_NONE
> +#define XDP_RSS_L4_NONE		XDP_RSS_TYPE_NONE
> +#define XDP_RSS_TYPE_PAY2	XDP_RSS_TYPE_L2
> +#define XDP_RSS_TYPE_PAY3	XDP_RSS_TYPE_NONE
> +#define XDP_RSS_TYPE_PAY4	XDP_RSS_L4
> +
> +static const enum xdp_rss_hash_type
> +ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
> +	ICE_PTYPES
> +};

Is there a big win in performance with this 600-byte static table
compared to having several instructions which would do
to_parsed_ptype() and then build a return enum according to its fields?
I believe that would cost only several instructions. Not that it's a
disaster to consume 600 more bytes of rodata, but still.
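
Roughly what I have in mind (completely untested and written from memory, so
the helper name is arbitrary and the ice_rx_ptype_decoded field names and
ICE_RX_PTYPE_* constants below are assumptions rather than copy-paste):

	static enum xdp_rss_hash_type ice_ptype_to_xdp_hash_type(u16 ptype)
	{
		const struct ice_rx_ptype_decoded pt =
			ice_decode_rx_desc_ptype(ptype);
		u32 type = XDP_RSS_TYPE_NONE;

		if (!pt.known)
			return XDP_RSS_TYPE_NONE;

		if (pt.outer_ip_ver == ICE_RX_PTYPE_OUTER_IPV4)
			type |= XDP_RSS_L3_IPV4;
		else if (pt.outer_ip_ver == ICE_RX_PTYPE_OUTER_IPV6)
			type |= XDP_RSS_L3_IPV6;

		switch (pt.inner_prot) {
		case ICE_RX_PTYPE_INNER_PROT_TCP:
			type |= XDP_RSS_L4_TCP;
			break;
		case ICE_RX_PTYPE_INNER_PROT_UDP:
			type |= XDP_RSS_L4_UDP;
			break;
		case ICE_RX_PTYPE_INNER_PROT_SCTP:
			type |= XDP_RSS_L4_SCTP;
			break;
		}

		return type;
	}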

Alternatively, you can look at how parsed ptype is compressed to 16 bit
in libie and use those saved bits to encode complete XDP RSS hash enum
directly there, so that ice_ptype_lkup[] would have both parsed ptype
and XDP hash return value :D

> +
> +#undef XDP_RSS_L3_NONE
> +#undef XDP_RSS_L4_NONE
> +#undef XDP_RSS_TYPE_PAY2
> +#undef XDP_RSS_TYPE_PAY3
> +#undef XDP_RSS_TYPE_PAY4

[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC bpf-next 07/23] ice: Support RX hash XDP hint
  2023-09-14 16:54   ` Alexander Lobakin
@ 2023-09-14 16:59     ` Larysa Zaremba
  0 siblings, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-14 16:59 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Sep 14, 2023 at 06:54:21PM +0200, Alexander Lobakin wrote:
> From: Larysa Zaremba <larysa.zaremba@intel.com>
> Date: Thu, 24 Aug 2023 21:26:46 +0200
> 
> > RX hash XDP hint requests both hash value and type.
> > Type is XDP-specific, so we need a separate way to map
> > these values to the hardware ptypes, so create a lookup table.
> > 
> > Instead of creating a new long list, reuse contents
> > of ice_decode_rx_desc_ptype[] through preprocessor.
> > 
> > Current hash type enum does not contain ICMP packet type,
> > but ice devices support it, so also add a new type into core code.
> > 
> > Then use previously refactored code and create a function
> > that allows XDP code to read RX hash.
> > 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> 
> [...]
> 
> >  	/* unused entries */
> > -	[154 ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
> > +	[ICE_NUM_DEFINED_PTYPES ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
> >  };
> >  
> >  static inline struct ice_rx_ptype_decoded ice_decode_rx_desc_ptype(u16 ptype)
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > index 463d9e5cbe05..b11cfaedb81c 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
> > @@ -567,6 +567,79 @@ static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
> >  	return 0;
> >  }
> >  
> > +/* Define a ptype index -> XDP hash type lookup table.
> > + * It uses the same ptype definitions as ice_decode_rx_desc_ptype[],
> > + * avoiding possible copy-paste errors.
> > + */
> > +#undef ICE_PTT
> > +#undef ICE_PTT_UNUSED_ENTRY
> > +
> > +#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
> > +	[PTYPE] = XDP_RSS_L3_##OUTER_IP_VER | XDP_RSS_L4_##I | XDP_RSS_TYPE_##PL
> > +
> > +#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0
> > +
> > +/* A few supplementary definitions for when XDP hash types do not coincide
> > + * with what can be generated from ptype definitions
> > + * by means of preprocessor concatenation.
> > + */
> > +#define XDP_RSS_L3_NONE		XDP_RSS_TYPE_NONE
> > +#define XDP_RSS_L4_NONE		XDP_RSS_TYPE_NONE
> > +#define XDP_RSS_TYPE_PAY2	XDP_RSS_TYPE_L2
> > +#define XDP_RSS_TYPE_PAY3	XDP_RSS_TYPE_NONE
> > +#define XDP_RSS_TYPE_PAY4	XDP_RSS_L4
> > +
> > +static const enum xdp_rss_hash_type
> > +ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
> > +	ICE_PTYPES
> > +};
> 
> Is there a big win in performance with this 600-byte static table
> compared to having several instructions which would do
> to_parsed_ptype() and then build a return enum according to its fields?
> I believe that would cost only several instructions. Not that it's a
> disaster to consume 600 more bytes of rodata, but still.
>

It is not disastrous either way. I have added this table after a discussion
with team members and would rather not throw it away now.

> Alternatively, you can look at how parsed ptype is compressed to 16 bit
> in libie and use those saved bits to encode complete XDP RSS hash enum
> directly there, so that ice_ptype_lkup[] would have both parsed ptype
> and XDP hash return value :D
> 
> > +
> > +#undef XDP_RSS_L3_NONE
> > +#undef XDP_RSS_L4_NONE
> > +#undef XDP_RSS_TYPE_PAY2
> > +#undef XDP_RSS_TYPE_PAY3
> > +#undef XDP_RSS_TYPE_PAY4
> 
> [...]
> 
> Thanks,
> Olek

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [xdp-hints] [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-09-14 16:38       ` Alexander Lobakin
@ 2023-09-14 17:02         ` Larysa Zaremba
  2023-09-18 14:07         ` Larysa Zaremba
  1 sibling, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-14 17:02 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Sep 14, 2023 at 06:38:04PM +0200, Alexander Lobakin wrote:
> From: Larysa Zaremba <larysa.zaremba@intel.com>
> Date: Thu, 14 Sep 2023 18:28:07 +0200
> 
> > On Thu, Sep 14, 2023 at 06:25:04PM +0200, Alexander Lobakin wrote:
> >> From: Larysa Zaremba <larysa.zaremba@intel.com>
> >> Date: Thu, 24 Aug 2023 21:26:49 +0200
> 
> [...]
> 
> >>> +static void
> >>> +ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype)
> >>
> >> @vsi can be const (I hope).
> > 
> > I will try to make it const.
> > 
> >> Line can be broken on arguments, not type (I hope).
> >>
> > 
> > This is how we break the lines everywhere in this file though :/
> 
> I know and would really like us stop at least adding new such
> occurrences when not needed :s

I think with minor stuff like this, it is more important to keep the style
consistent within a file.
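
Just so we mean the same thing, I take it the variant you are asking for
(constified and broken on arguments) would be something along these lines,
untested:

static void ice_set_rx_rings_vlan_proto(const struct ice_vsi *vsi,
					__be16 vlan_ethertype)
{
	u16 i;

	ice_for_each_alloc_rxq(vsi, i)
		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
}

If constifying @vsi works out, I can at least do that part regardless of how
the line is broken.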

> 
> > 
> >>> +{
> >>> +	u16 i;
> >>> +
> >>> +	ice_for_each_alloc_rxq(vsi, i)
> >>> +		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
> >>> +}
> >>> +
> >>>  /**
> >>>   * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI
> >>>   * @vsi: PF's VSI
> >>> @@ -6049,6 +6066,11 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features)
> >>>  	if (strip_err || insert_err)
> >>>  		return -EIO;
> >>>  
> >>> +	if (enable_stripping)
> >>> +		ice_set_rx_rings_vlan_proto(vsi, htons(vlan_ethertype));
> >>> +	else
> >>> +		ice_set_rx_rings_vlan_proto(vsi, 0);
> >>
> >> Ternary?
> > 
> > Would look ugly in this particular case, I think, too long expressions and no 
> > return values.
> 
> 	ice_set_rx_rings_vlan_proto(vsi, strip ? htons(vlan_ethertype) : 0);
> 
> ?
> 
> [...]
> 
> >>> -		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
> >>> +		vlan_tci = ice_get_vlan_tci(rx_desc);
> >>
> >> Unrelated: I never was a fan of scattering rx_desc parsing across
> >> several files, I remember I moved it to process_skb_fields() in both ice
> >> (Hints series) and iavf (libie), maybe do that here as well? Or way too
> >> out of context?
> > 
> > A little bit too unrelated to the purpose of the series, but a thing we must do 
> > in the future.
> 
> Sure, +
> 
> > 
> >>
> >>>  
> >>>  		/* pad the skb if needed, to make a valid ethernet frame */
> >>>  		if (eth_skb_pad(skb))
> >>
> >> [...]
> >>
> >> Thanks,
> >> Olek
> 
> Thanks,
> Olek


* Re: [xdp-hints] [RFC bpf-next 10/23] ice: Implement VLAN tag hint
  2023-09-14 16:38       ` Alexander Lobakin
  2023-09-14 17:02         ` Larysa Zaremba
@ 2023-09-18 14:07         ` Larysa Zaremba
  1 sibling, 0 replies; 72+ messages in thread
From: Larysa Zaremba @ 2023-09-18 14:07 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Simon Horman,
	Tariq Toukan, Saeed Mahameed

On Thu, Sep 14, 2023 at 06:38:04PM +0200, Alexander Lobakin wrote:
> From: Larysa Zaremba <larysa.zaremba@intel.com>
> Date: Thu, 14 Sep 2023 18:28:07 +0200
> 
> > On Thu, Sep 14, 2023 at 06:25:04PM +0200, Alexander Lobakin wrote:
> >> From: Larysa Zaremba <larysa.zaremba@intel.com>
> >> Date: Thu, 24 Aug 2023 21:26:49 +0200
> 
> [...]
> 
> >>> +static void
> >>> +ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype)
> >>
> >> @vsi can be const (I hope).
> > 
> > I will try to make it const.
> > 
> >> Line can be broken on arguments, not type (I hope).
> >>
> > 
> > This is how we break the lines everywhere in this file though :/
> 
> I know and would really like us stop at least adding new such
> occurrences when not needed :s
> 
> > 
> >>> +{
> >>> +	u16 i;
> >>> +
> >>> +	ice_for_each_alloc_rxq(vsi, i)
> >>> +		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
> >>> +}
> >>> +
> >>>  /**
> >>>   * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI
> >>>   * @vsi: PF's VSI
> >>> @@ -6049,6 +6066,11 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features)
> >>>  	if (strip_err || insert_err)
> >>>  		return -EIO;
> >>>  
> >>> +	if (enable_stripping)
> >>> +		ice_set_rx_rings_vlan_proto(vsi, htons(vlan_ethertype));
> >>> +	else
> >>> +		ice_set_rx_rings_vlan_proto(vsi, 0);
> >>
> >> Ternary?
> > 
> > Would look ugly in this particular case, I think, too long expressions and no 
> > return values.
> 
> 	ice_set_rx_rings_vlan_proto(vsi, strip ? htons(vlan_ethertype) : 0);
> 
> ?

I missed this one the first time, sorry. It makes sense this way :D
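
So, unless I am missing something, the hunk would simply become:

	if (strip_err || insert_err)
		return -EIO;

	ice_set_rx_rings_vlan_proto(vsi, enable_stripping ?
				    htons(vlan_ethertype) : 0);

(keeping enable_stripping from the existing context rather than the shortened
"strip" name).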

> 
> [...]
> 
> >>> -		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
> >>> +		vlan_tci = ice_get_vlan_tci(rx_desc);
> >>
> >> Unrelated: I never was a fan of scattering rx_desc parsing across
> >> several files, I remember I moved it to process_skb_fields() in both ice
> >> (Hints series) and iavf (libie), maybe do that here as well? Or way too
> >> out of context?
> > 
> > A little bit too unrelated to the purpose of the series, but a thing we must do 
> > in the future.
> 
> Sure, +
> 
> > 
> >>
> >>>  
> >>>  		/* pad the skb if needed, to make a valid ethernet frame */
> >>>  		if (eth_skb_pad(skb))
> >>
> >> [...]
> >>
> >> Thanks,
> >> Olek
> 
> Thanks,
> Olek


end of thread

Thread overview: 72+ messages
2023-08-24 19:26 [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 01/23] ice: make RX hash reading code more reusable Larysa Zaremba
2023-09-04 14:37   ` [xdp-hints] " Maciej Fijalkowski
2023-09-06 12:23     ` Alexander Lobakin
2023-09-14 16:12   ` Alexander Lobakin
2023-09-14 16:15     ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 02/23] ice: make RX HW timestamp " Larysa Zaremba
2023-09-04 14:56   ` Maciej Fijalkowski
2023-09-04 16:29     ` Larysa Zaremba
2023-09-05 15:22       ` Maciej Fijalkowski
2023-08-24 19:26 ` [RFC bpf-next 03/23] ice: make RX checksum checking " Larysa Zaremba
2023-09-04 15:02   ` [xdp-hints] " Maciej Fijalkowski
2023-09-04 18:01     ` Larysa Zaremba
2023-09-05 15:37       ` Maciej Fijalkowski
2023-09-05 16:53         ` Larysa Zaremba
2023-09-05 17:44           ` Maciej Fijalkowski
2023-09-06  9:28             ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 04/23] ice: Make ptype internal to descriptor info processing Larysa Zaremba
2023-09-04 15:04   ` Maciej Fijalkowski
2023-08-24 19:26 ` [RFC bpf-next 05/23] ice: Introduce ice_xdp_buff Larysa Zaremba
2023-09-04 15:32   ` [xdp-hints] " Maciej Fijalkowski
2023-09-04 18:11     ` Larysa Zaremba
2023-09-05 17:53       ` Maciej Fijalkowski
2023-09-07 14:21         ` Larysa Zaremba
2023-09-07 16:33           ` Stanislav Fomichev
2023-09-07 16:42             ` Maciej Fijalkowski
2023-09-07 16:43               ` Maciej Fijalkowski
2023-09-13 15:40                 ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 06/23] ice: Support HW timestamp hint Larysa Zaremba
2023-09-04 15:38   ` Maciej Fijalkowski
2023-09-04 18:12     ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 07/23] ice: Support RX hash XDP hint Larysa Zaremba
2023-09-05 15:42   ` [xdp-hints] " Maciej Fijalkowski
2023-09-05 17:09     ` Larysa Zaremba
2023-09-06 12:03     ` Alexander Lobakin
2023-09-14 16:54   ` Alexander Lobakin
2023-09-14 16:59     ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 08/23] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
2023-09-04 15:42   ` [xdp-hints] " Maciej Fijalkowski
2023-09-04 18:14     ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 09/23] xdp: Add VLAN tag hint Larysa Zaremba
2023-08-24 22:02   ` kernel test robot
2023-09-14 16:18   ` Alexander Lobakin
2023-09-14 16:21     ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 10/23] ice: Implement " Larysa Zaremba
2023-09-04 16:00   ` Maciej Fijalkowski
2023-09-04 18:18     ` Larysa Zaremba
2023-09-14 16:25   ` [xdp-hints] " Alexander Lobakin
2023-09-14 16:28     ` Larysa Zaremba
2023-09-14 16:38       ` Alexander Lobakin
2023-09-14 17:02         ` Larysa Zaremba
2023-09-18 14:07         ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 11/23] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba
2023-09-14 16:30   ` Alexander Lobakin
2023-09-14 16:30     ` Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 12/23] xdp: Add checksum hint Larysa Zaremba
2023-08-24 22:56   ` kernel test robot
2023-09-14 16:34   ` Alexander Lobakin
2023-08-24 19:26 ` [RFC bpf-next 13/23] ice: Implement " Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 14/23] selftests/bpf: Allow VLAN packets in xdp_hw_metadata Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 15/23] net, xdp: allow metadata > 32 Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 16/23] selftests/bpf: Add flags and new hints to xdp_hw_metadata Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 17/23] veth: Implement VLAN tag and checksum XDP hint Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 18/23] net: make vlan_get_tag() return -ENODATA instead of -EINVAL Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 19/23] selftests/bpf: Use AF_INET for TX in xdp_metadata Larysa Zaremba
2023-08-24 19:26 ` [RFC bpf-next 20/23] selftests/bpf: Check VLAN tag and proto " Larysa Zaremba
2023-08-24 19:27 ` [RFC bpf-next 21/23] selftests/bpf: check checksum state " Larysa Zaremba
2023-08-24 19:27 ` [RFC bpf-next 22/23] mlx5: implement VLAN tag XDP hint Larysa Zaremba
2023-08-24 19:27 ` [RFC bpf-next 23/23] mlx5: implement RX checksum " Larysa Zaremba
2023-08-31 14:50 ` [RFC bpf-next 00/23] XDP metadata via kfuncs for ice + mlx5 Larysa Zaremba
2023-09-04 16:06 ` [xdp-hints] " Maciej Fijalkowski
2023-09-06 14:09   ` Larysa Zaremba
