All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF
@ 2022-06-28 16:30 Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 1/9] i40e: Refactor i40e_ptp_rx_hwtstamp Jesper Dangaard Brouer
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:30 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

This patchset expose the traditional hardware offload hints to XDP and
rely on BTF to expose the layout to users.  More advanced use-case with
driver specific offloads will likely be in followup patches.

The users/consumers are (as described in [1]):
 - XDP BPF-progs
 - XDP to SKB conversion gaining HW offloads
 - AF_XDP can consume BTF info in userspace
 - Chained BPF-progs can communicate state via metadata

This is still RFC as the following features are missing:
 - Exposing XDP-hints indication in AF_XDP descriptor
 - Exporting what XDP-hints structs are avail per driver

Two drivers i40e and mvneta gets XDP-hints in this patchset.

[1] https://github.com/xdp-project/xdp-project/blob/master/conference/LLC2022/xdp_hints_hw_metadata-final.pdf

---

Jesper Dangaard Brouer (8):
      i40e: Refactor i40e_ptp_rx_hwtstamp
      bpf: export btf functions for modules
      net: create xdp_hints_common and set functions
      net: add net_device feature flag for XDP-hints
      xdp: controlling XDP-hints from BPF-prog via helper
      i40e: refactor i40e_rx_checksum with helper
      i40e: add XDP-hints handling
      net: use XDP-hints in xdp_frame to SKB conversion

Lorenzo Bianconi (1):
      mvneta: add XDP-hints support


 drivers/net/ethernet/intel/i40e/i40e.h      |   1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |  34 +++
 drivers/net/ethernet/intel/i40e/i40e_ptp.c  |  36 +++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 222 ++++++++++++++++----
 drivers/net/ethernet/marvell/mvneta.c       |  61 +++++-
 include/linux/btf.h                         |   2 +
 include/linux/netdev_features.h             |   3 +-
 include/net/xdp.h                           | 181 +++++++++++++++-
 include/uapi/linux/bpf.h                    |  43 ++++
 kernel/bpf/btf.c                            |  13 +-
 net/core/filter.c                           |  45 ++++
 net/core/xdp.c                              |  73 ++++++-
 net/ethtool/common.c                        |   1 +
 13 files changed, 644 insertions(+), 71 deletions(-)

--


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 1/9] i40e: Refactor i40e_ptp_rx_hwtstamp
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
@ 2022-06-28 16:30 ` Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 2/9] bpf: export btf functions for modules Jesper Dangaard Brouer
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:30 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

No functional change, this is in preparation for later patches.

Introduce i40e_ptp_rx_hwtstamp_raw() that doesn't depend on skb pointer
as input. Keep i40e_ptp_rx_hwtstamp with same semantics as before.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h     |    1 +
 drivers/net/ethernet/intel/i40e/i40e_ptp.c |   36 +++++++++++++++++++++-------
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 57f4ec4f8d2f..9eb6ea92eafe 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -1246,6 +1246,7 @@ void i40e_ptp_rx_hang(struct i40e_pf *pf);
 void i40e_ptp_tx_hang(struct i40e_pf *pf);
 void i40e_ptp_tx_hwtstamp(struct i40e_pf *pf);
 void i40e_ptp_rx_hwtstamp(struct i40e_pf *pf, struct sk_buff *skb, u8 index);
+u64  i40e_ptp_rx_hwtstamp_raw(struct i40e_pf *pf, u8 index);
 void i40e_ptp_set_increment(struct i40e_pf *pf);
 int i40e_ptp_set_ts_config(struct i40e_pf *pf, struct ifreq *ifr);
 int i40e_ptp_get_ts_config(struct i40e_pf *pf, struct ifreq *ifr);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index 61e5789d78db..8906e4bbf291 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -816,18 +816,16 @@ void i40e_ptp_tx_hwtstamp(struct i40e_pf *pf)
 }
 
 /**
- * i40e_ptp_rx_hwtstamp - Utility function which checks for an Rx timestamp
+ * i40e_ptp_rx_hwtstamp_raw - Utility function which checks for an Rx timestamp
  * @pf: Board private structure
- * @skb: Particular skb to send timestamp with
  * @index: Index into the receive timestamp registers for the timestamp
  *
  * The XL710 receives a notification in the receive descriptor with an offset
- * into the set of RXTIME registers where the timestamp is for that skb. This
+ * into the set of RXTIME registers where the timestamp is for that pkt. This
  * function goes and fetches the receive timestamp from that offset, if a valid
- * one exists. The RXTIME registers are in ns, so we must convert the result
- * first.
+ * one exists, else zero is returned.
  **/
-void i40e_ptp_rx_hwtstamp(struct i40e_pf *pf, struct sk_buff *skb, u8 index)
+u64 i40e_ptp_rx_hwtstamp_raw(struct i40e_pf *pf, u8 index)
 {
 	u32 prttsyn_stat, hi, lo;
 	struct i40e_hw *hw;
@@ -837,7 +835,7 @@ void i40e_ptp_rx_hwtstamp(struct i40e_pf *pf, struct sk_buff *skb, u8 index)
 	 * doing Tx timestamping, check if Rx timestamping is configured.
 	 */
 	if (!(pf->flags & I40E_FLAG_PTP) || !pf->ptp_rx)
-		return;
+		return 0;
 
 	hw = &pf->hw;
 
@@ -849,7 +847,7 @@ void i40e_ptp_rx_hwtstamp(struct i40e_pf *pf, struct sk_buff *skb, u8 index)
 	/* TODO: Should we warn about missing Rx timestamp event? */
 	if (!(prttsyn_stat & BIT(index))) {
 		spin_unlock_bh(&pf->ptp_rx_lock);
-		return;
+		return 0;
 	}
 
 	/* Clear the latched event since we're about to read its register */
@@ -862,7 +860,27 @@ void i40e_ptp_rx_hwtstamp(struct i40e_pf *pf, struct sk_buff *skb, u8 index)
 
 	ns = (((u64)hi) << 32) | lo;
 
-	i40e_ptp_convert_to_hwtstamp(skb_hwtstamps(skb), ns);
+	return ns;
+}
+
+/**
+ * i40e_ptp_rx_hwtstamp - Utility function which checks for an Rx timestamp
+ * @pf: Board private structure
+ * @skb: Particular skb to send timestamp with
+ * @index: Index into the receive timestamp registers for the timestamp
+ *
+ * The XL710 receives a notification in the receive descriptor with an offset
+ * into the set of RXTIME registers where the timestamp is for that skb. This
+ * function goes and fetches the receive timestamp from that offset, if a valid
+ * one exists. The RXTIME registers are in ns, so we must convert the result
+ * first.
+ **/
+void i40e_ptp_rx_hwtstamp(struct i40e_pf *pf, struct sk_buff *skb, u8 index)
+{
+	u64 ns = i40e_ptp_rx_hwtstamp_raw(pf, index);
+
+	if (ns)
+		i40e_ptp_convert_to_hwtstamp(skb_hwtstamps(skb), ns);
 }
 
 /**



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 2/9] bpf: export btf functions for modules
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 1/9] i40e: Refactor i40e_ptp_rx_hwtstamp Jesper Dangaard Brouer
@ 2022-06-28 16:30 ` Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 3/9] net: create xdp_hints_common and set functions Jesper Dangaard Brouer
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:30 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

This allows modules to lookup their own module BTF info.

These are get and set operations that bump the refcount.
Thus, modules can use this to control the lifetime.

Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/btf.h |    2 ++
 kernel/bpf/btf.c    |   13 ++++++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 1bfed7fa0428..df9776018059 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -126,6 +126,8 @@ u32 btf_obj_id(const struct btf *btf);
 bool btf_is_kernel(const struct btf *btf);
 bool btf_is_module(const struct btf *btf);
 struct module *btf_try_get_module(const struct btf *btf);
+struct btf *btf_get_module_btf(const struct module *module);
+void btf_put_module_btf(struct btf *btf);
 u32 btf_nr_types(const struct btf *btf);
 bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
 			   const struct btf_member *m,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 2e2066d6af94..96fd5e469b42 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -534,6 +534,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
 
 	return -ENOENT;
 }
+EXPORT_SYMBOL_GPL(btf_find_by_name_kind);
 
 static s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
 {
@@ -1674,6 +1675,15 @@ void btf_put(struct btf *btf)
 	}
 }
 
+void btf_put_module_btf(struct btf *btf)
+{
+	if (!btf_is_module(btf))
+		return;
+
+	btf_put(btf);
+}
+EXPORT_SYMBOL_GPL(btf_put_module_btf);
+
 static int env_resolve_init(struct btf_verifier_env *env)
 {
 	struct btf *btf = env->btf;
@@ -7022,7 +7032,7 @@ struct module *btf_try_get_module(const struct btf *btf)
 /* Returns struct btf corresponding to the struct module.
  * This function can return NULL or ERR_PTR.
  */
-static struct btf *btf_get_module_btf(const struct module *module)
+struct btf *btf_get_module_btf(const struct module *module)
 {
 #ifdef CONFIG_DEBUG_INFO_BTF_MODULES
 	struct btf_module *btf_mod, *tmp;
@@ -7051,6 +7061,7 @@ static struct btf *btf_get_module_btf(const struct module *module)
 
 	return btf;
 }
+EXPORT_SYMBOL_GPL(btf_get_module_btf);
 
 BPF_CALL_4(bpf_btf_find_by_name_kind, char *, name, int, name_sz, u32, kind, int, flags)
 {



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 3/9] net: create xdp_hints_common and set functions
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 1/9] i40e: Refactor i40e_ptp_rx_hwtstamp Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 2/9] bpf: export btf functions for modules Jesper Dangaard Brouer
@ 2022-06-28 16:30 ` Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 4/9] net: add net_device feature flag for XDP-hints Jesper Dangaard Brouer
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:30 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

XDP-hints via BTF are about giving drivers the ability to extend the
common set of hardware offload hints in a flexible way.

This patch start out with defining the common set, based on what is
used available in the SKB. Having this as a common struct in core
vmlinux makes it easier to implement xdp_frame to SKB conversion
routines as normal C-code, see later patches.

Drivers can redefine the layout of the entire metadata area, but are
encouraged to use this common struct as the base, on which they can
extend on top for their extra hardware offload hints. When doing so,
drivers can mark the xdp_buff (and xdp_frame) with flags indicating
this it compatible with the common struct.

Patch also provides XDP-hints driver helper functions for updating the
common struct. Helpers gets inlined and are defined for maximum
performance, which does require some extra care in drivers, e.g. to
keep track of flags to reduce data dependencies, see code DOC.

Userspace and BPF-prog's MUST not consider the common struct UAPI.
The common struct (and enum flags) are only exposed via BTF, which
implies consumers must read and decode this BTF before using/consuming
data layout.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/net/xdp.h |  133 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/core/xdp.c    |    5 ++
 2 files changed, 138 insertions(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 04c852c7a77f..5b77fc8fe5ce 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -8,6 +8,137 @@
 
 #include <linux/skbuff.h> /* skb_shared_info */
 
+/**
+ * DOC: XDP hints
+ *
+ * Drivers should likely include &struct xdp_hints_common as part of the driver
+ * specific xdp_hints struct's, but at the end-of their struct given XDP
+ * metadata area grows backwards.
+ *
+ * The &enum xdp_hints_flags have reserved the first 16 bits for common flags
+ * and drivers can introduce use their own flags bits from BIT(16). For
+ * BPF-progs to find these flags (via BTF) drivers should define an enum
+ * xdp_hints_flags_driver.
+ */
+struct xdp_hints_common {
+	union {
+		__wsum		csum;
+		struct {
+			__u16	csum_start;
+			__u16	csum_offset;
+		};
+	};
+	u16 rx_queue;
+	u16 vlan_tci;
+	u32 rx_hash32;
+	u32 xdp_hints_flags;
+	u32 btf_id;
+} __attribute__((aligned(4))) __attribute__((packed));
+
+enum xdp_hints_flags {
+	HINT_FLAG_CSUM_TYPE_BIT0  = BIT(0),
+	HINT_FLAG_CSUM_TYPE_BIT1  = BIT(1),
+	HINT_FLAG_CSUM_TYPE_MASK  = 0x3,
+
+	HINT_FLAG_CSUM_LEVEL_BIT0 = BIT(2),
+	HINT_FLAG_CSUM_LEVEL_BIT1 = BIT(3),
+	HINT_FLAG_CSUM_LEVEL_MASK = 0xC,
+	HINT_FLAG_CSUM_LEVEL_SHIFT = 2,
+
+	HINT_FLAG_RX_HASH_TYPE_BIT0 = BIT(4),
+	HINT_FLAG_RX_HASH_TYPE_BIT1 = BIT(5),
+	HINT_FLAG_RX_HASH_TYPE_MASK = 0x30,
+	HINT_FLAG_RX_HASH_TYPE_SHIFT = 0x4,
+
+	HINT_FLAG_RX_QUEUE = BIT(7),
+
+	HINT_FLAG_VLAN_PRESENT            = BIT(8),
+	HINT_FLAG_VLAN_PROTO_ETH_P_8021Q  = BIT(9),
+	HINT_FLAG_VLAN_PROTO_ETH_P_8021AD = BIT(10),
+	/* Flags from BIT(16) can be used by drivers */
+};
+
+/** enum xdp_hints_csum_type - BTF exposing checksum defines
+ *
+ * This enum is primarily for BTF exposing ``CHECKSUM_*`` defines (as
+ * an enum) used by &struct skb->ip_summed.
+ *
+ * These values are stored in &enum xdp_hints_flags as bit locations
+ * ``HINT_FLAG_CSUM_TYPE_BIT*``
+
+Maps to skbuff.h skb->ipsummed values. Stored in xdp_hints_flags in
+ * HINT_FLAG_CSUM_TYPE_*
+ */
+enum xdp_hints_csum_type {
+	HINT_CHECKSUM_NONE        = CHECKSUM_NONE,
+	HINT_CHECKSUM_UNNECESSARY = CHECKSUM_UNNECESSARY,
+	HINT_CHECKSUM_COMPLETE    = CHECKSUM_COMPLETE,
+	HINT_CHECKSUM_PARTIAL     = CHECKSUM_PARTIAL,
+};
+
+/** DOC: XDP hints driver helpers
+ *
+ * Helpers for drivers updating struct xdp_hints_common.
+ *
+ * Avoid creating a data dependency on xdp_hints_flags via returning the flags
+ * that need to be set.  Drivers MUST update the xdp_hints_flags member
+ * themselves, which allows drivers to construct code with less data dependency
+ * between instructions by OR'ing the final flags together.
+ */
+
+/* Drivers please use this simple helper to ease changes across drives */
+static __always_inline void xdp_hints_set_flags(struct xdp_hints_common *hints,
+						u32 flags)
+{
+	hints->xdp_hints_flags = flags;
+}
+
+static __always_inline u32 xdp_hints_set_rx_csum(
+	struct xdp_hints_common *hints,
+	u16 type, u16 level)
+{
+	u32 flags;
+
+	flags = type & HINT_FLAG_CSUM_TYPE_MASK;
+	flags |= (level << HINT_FLAG_CSUM_LEVEL_SHIFT)
+		& HINT_FLAG_CSUM_LEVEL_MASK;
+
+	// TODO: handle CHECKSUM_PARTIAL and COMPLETE (needs updating *hints)
+	return flags;
+}
+
+/* @type	Must be &enum enum pkt_hash_types (PKT_HASH_TYPE_*) */
+static __always_inline u32 xdp_hints_set_rx_hash(
+	struct xdp_hints_common *hints,
+	u32 hash, u32 type)
+{
+	hints->rx_hash32 = hash;
+	return (type << HINT_FLAG_RX_HASH_TYPE_SHIFT) &
+		HINT_FLAG_RX_HASH_TYPE_MASK;
+}
+
+static __always_inline u32 xdp_hints_set_rxq(struct xdp_hints_common *hints,
+					     u16 q_idx)
+{
+	hints->rx_queue = q_idx;
+	return HINT_FLAG_RX_QUEUE;
+}
+
+/* @proto	Must be ETH_P_8021Q or ETH_P_8021AD in network order */
+static __always_inline u32 xdp_hints_set_vlan(struct xdp_hints_common *hints,
+					      u16 vlan_tag, const u16 proto)
+{
+	u32 flags = HINT_FLAG_VLAN_PRESENT;
+
+	hints->vlan_tci = vlan_tag;
+	if (proto == htons(ETH_P_8021Q))
+		flags |= HINT_FLAG_VLAN_PROTO_ETH_P_8021Q;
+	if (proto == htons(ETH_P_8021AD))
+		flags |= HINT_FLAG_VLAN_PROTO_ETH_P_8021AD;
+
+	return flags;
+}
+
 /**
  * DOC: XDP RX-queue information
  *
@@ -72,6 +203,8 @@ enum xdp_buff_flags {
 	XDP_FLAGS_FRAGS_PF_MEMALLOC	= BIT(1), /* xdp paged memory is under
 						   * pressure
 						   */
+	XDP_FLAGS_HAS_HINTS		= BIT(2),
+	XDP_FLAGS_HINTS_COMPAT_COMMON	= BIT(3),
 };
 
 struct xdp_buff {
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 24420209bf0e..a57bd5278b47 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -33,6 +33,11 @@ static int mem_id_next = MEM_ID_MIN;
 static bool mem_id_init; /* false */
 static struct rhashtable *mem_id_ht;
 
+/* Make xdp_hints part of core vmlinux BTF */
+struct xdp_hints_common  xdp_hints_common;
+enum xdp_hints_flags     xdp_hints_flags;
+enum xdp_hints_csum_type xdp_hints_csum_type;
+
 static u32 xdp_mem_id_hashfn(const void *data, u32 len, u32 seed)
 {
 	const u32 *k = data;



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 4/9] net: add net_device feature flag for XDP-hints
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
                   ` (2 preceding siblings ...)
  2022-06-28 16:30 ` [PATCH RFC bpf-next 3/9] net: create xdp_hints_common and set functions Jesper Dangaard Brouer
@ 2022-06-28 16:30 ` Jesper Dangaard Brouer
  2022-06-28 16:30 ` [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Jesper Dangaard Brouer
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:30 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

Make it possible to turnoff XDP-hints for a given net_device.

It is recommended that drivers default turn on XDP-hints as the
overhead is generally low, extracting these hardware hints, and the
benefit is usually higher than this small overhead e.g. getting HW to
do RX checksumming are usually a higher gain.

Some XDP use-case are not ready to take this small overhead. Thus, the
possibility to turn off XDP-hints is need to keep performance of these
use-cases intact.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/netdev_features.h |    3 ++-
 net/ethtool/common.c            |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 7c2d77d75a88..713f04eab497 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -14,7 +14,7 @@ typedef u64 netdev_features_t;
 enum {
 	NETIF_F_SG_BIT,			/* Scatter/gather IO. */
 	NETIF_F_IP_CSUM_BIT,		/* Can checksum TCP/UDP over IPv4. */
-	__UNUSED_NETIF_F_1,
+	NETIF_F_XDP_HINTS_BIT,		/* Populates XDP-hints metadata */
 	NETIF_F_HW_CSUM_BIT,		/* Can checksum all the packets. */
 	NETIF_F_IPV6_CSUM_BIT,		/* Can checksum TCP/UDP over IPV6 */
 	NETIF_F_HIGHDMA_BIT,		/* Can DMA to high memory. */
@@ -168,6 +168,7 @@ enum {
 #define NETIF_F_HW_HSR_TAG_RM	__NETIF_F(HW_HSR_TAG_RM)
 #define NETIF_F_HW_HSR_FWD	__NETIF_F(HW_HSR_FWD)
 #define NETIF_F_HW_HSR_DUP	__NETIF_F(HW_HSR_DUP)
+#define NETIF_F_XDP_HINTS	__NETIF_F(XDP_HINTS)
 
 /* Finds the next feature with the highest number of the range of start-1 till 0.
  */
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 566adf85e658..a9c62482220f 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -11,6 +11,7 @@
 const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN] = {
 	[NETIF_F_SG_BIT] =               "tx-scatter-gather",
 	[NETIF_F_IP_CSUM_BIT] =          "tx-checksum-ipv4",
+	[NETIF_F_XDP_HINTS_BIT] =        "xdp-hints",
 	[NETIF_F_HW_CSUM_BIT] =          "tx-checksum-ip-generic",
 	[NETIF_F_IPV6_CSUM_BIT] =        "tx-checksum-ipv6",
 	[NETIF_F_HIGHDMA_BIT] =          "highdma",



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
                   ` (3 preceding siblings ...)
  2022-06-28 16:30 ` [PATCH RFC bpf-next 4/9] net: add net_device feature flag for XDP-hints Jesper Dangaard Brouer
@ 2022-06-28 16:30 ` Jesper Dangaard Brouer
  2022-06-29 14:20   ` [xdp-hints] " Toke Høiland-Jørgensen
  2022-06-28 16:31 ` [PATCH RFC bpf-next 6/9] i40e: refactor i40e_rx_checksum with helper Jesper Dangaard Brouer
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:30 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

XDP BPF-prog's need a way to interact with the XDP-hints. This patch
introduces a BPF-helper function, that allow XDP BPF-prog's to interact
with the XDP-hints.

BPF-prog can query if any XDP-hints have been setup and if this is
compatible with the xdp_hints_common struct. If XDP-hints are available
the BPF "origin" is returned (see enum xdp_hints_btf_origin) as BTF can
come from different sources or origins e.g. vmlinux, module or local.

Remember that XDP-hints are setup by the driver prior to calling the
XDP BPF-prog, which is useful for adjusting the HW provided XDP-hints
in-case of HW issues or missing HW features, for use-case like xdp2skb
or AF_XDP.

The BPP-prog might also prefer to use metadata area for other things,
either disabling XDP-hints or updating with another XDP-hints layout
that might still be compatible with common struct. Thus, helper have
"update" and "delete" mode flags.

RFC/TODO: Improve patch: Can verifier validate provided BTF on "update"
and detect if compatible with common struct???

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/net/xdp.h        |   42 ++++++++++++++++++++++++++++++++++++++----
 include/uapi/linux/bpf.h |   43 +++++++++++++++++++++++++++++++++++++++++++
 net/core/filter.c        |   45 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 126 insertions(+), 4 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 5b77fc8fe5ce..710d145a26f9 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -199,14 +199,22 @@ struct xdp_txq_info {
 };
 
 enum xdp_buff_flags {
-	XDP_FLAGS_HAS_FRAGS		= BIT(0), /* non-linear xdp buff */
-	XDP_FLAGS_FRAGS_PF_MEMALLOC	= BIT(1), /* xdp paged memory is under
+	XDP_FLAGS_HINTS_ORIGIN_BIT0	= BIT(0),/* enum xdp_hints_btf_origin */
+	XDP_FLAGS_HINTS_ORIGIN_BIT1	= BIT(1),
+#define	XDP_FLAGS_HINTS_COMPAT_COMMON_	  BIT(3) /* HINTS_BTF_COMPAT_COMMON */
+	XDP_FLAGS_HINTS_COMPAT_COMMON	= XDP_FLAGS_HINTS_COMPAT_COMMON_,
+
+	XDP_FLAGS_HAS_FRAGS		= BIT(4), /* non-linear xdp buff */
+	XDP_FLAGS_FRAGS_PF_MEMALLOC	= BIT(5), /* xdp paged memory is under
 						   * pressure
 						   */
-	XDP_FLAGS_HAS_HINTS		= BIT(2),
-	XDP_FLAGS_HINTS_COMPAT_COMMON	= BIT(3),
 };
 
+#define XDP_FLAGS_HINTS_ORIGIN_MASK	(XDP_FLAGS_HINTS_ORIGIN_BIT0 |	\
+					 XDP_FLAGS_HINTS_ORIGIN_BIT1)
+#define XDP_FLAGS_HINTS_RETURN_MASK	(XDP_FLAGS_HINTS_ORIGIN_MASK |	\
+					 XDP_FLAGS_HINTS_COMPAT_COMMON)
+
 struct xdp_buff {
 	void *data;
 	void *data_end;
@@ -243,6 +251,32 @@ static __always_inline void xdp_buff_set_frag_pfmemalloc(struct xdp_buff *xdp)
 	xdp->flags |= XDP_FLAGS_FRAGS_PF_MEMALLOC;
 }
 
+static __always_inline bool xdp_buff_has_hints(struct xdp_buff *xdp)
+{
+	return !!(xdp->flags & XDP_FLAGS_HINTS_ORIGIN_MASK);
+}
+
+static __always_inline bool xdp_buff_has_hints_compat(struct xdp_buff *xdp)
+{
+	u32 flags = xdp->flags;
+
+	if (!(flags & XDP_FLAGS_HINTS_COMPAT_COMMON))
+		return false;
+
+	return !!(flags & XDP_FLAGS_HINTS_ORIGIN_MASK);
+}
+
+static __always_inline void xdp_buff_set_hints(struct xdp_buff *xdp,
+					       u32 btf_origin,
+					       bool is_compat_common)
+{
+	u32 common = is_compat_common ? XDP_FLAGS_HINTS_COMPAT_COMMON : 0;
+
+	/* enum xdp_hints_btf_origin */
+	btf_origin &= XDP_FLAGS_HINTS_ORIGIN_MASK;
+	xdp->flags |= btf_origin | common;
+}
+
 static __always_inline void
 xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)
 {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e81362891596..1c3780c02239 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5325,6 +5325,31 @@ union bpf_attr {
  *		**-EACCES** if the SYN cookie is not valid.
  *
  *		**-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
+ *
+ * long xdp_hints_btf(struct xdp_buff *xdp_md, u32 btf_origin, u64 flags)
+ *     Description
+ *             Update and get info on XDP hints BTF type and origin.
+ *
+ *             Drivers can provide XDP-hints information via the metadata area,
+ *             which defines the layout of this area via BTF. The BTF ID is
+ *             available as the last member. The BTF ID can originate from
+ *             different sources, e.g. vmlinux, module or local BTF-object.
+ *
+ *             In-case a BPF-prog want to redefine the layout of this area it
+ *             should update the BTF ID (last-member) and MUST call this helper
+ *             to specify the origin for the BTF ID.
+ *
+ *             If updating the BTF ID then caller can request that the layout
+ *             is compatible with kernels xdp_hints_common. This is then
+ *             validated (TODO HOW?!?) before kernel side trust this can be
+ *             used for e.g. populating SKB fields.
+ *
+ *             The **flags** are used to control the mode of the helper.
+ *
+ *     Return
+ *             Returns indications on whether XDP-hints were populated by
+ *             driver via an 'origin' value and whether this layout is
+ *             compatible with kernels xdp_hints_common.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5535,6 +5560,7 @@ union bpf_attr {
 	FN(tcp_raw_gen_syncookie_ipv6),	\
 	FN(tcp_raw_check_syncookie_ipv4),	\
 	FN(tcp_raw_check_syncookie_ipv6),	\
+	FN(xdp_hints_btf),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
@@ -5946,6 +5972,23 @@ struct xdp_md {
 	__u32 egress_ifindex;  /* txq->dev->ifindex */
 };
 
+/* Mode flags for BPF_FUNC_xdp_hints_btf helper. */
+enum xdp_hints_btf_mode_flags {
+	HINTS_BTF_QUERY_ONLY    = (1U << 0),
+	HINTS_BTF_UPDATE        = (1U << 1),
+	HINTS_BTF_DISABLE       = (1U << 2),
+	HINTS_BTF_COMPAT_COMMON = (1U << 3),
+};
+
+/* BTF can come from different sources e.g. vmlinux, module or local */
+enum xdp_hints_btf_origin {
+	BTF_ORIGIN_NONE    = 0,
+	BTF_ORIGIN_VMLINUX = 1,
+	BTF_ORIGIN_MODULE  = 2,
+	BTF_ORIGIN_LOCAL   = 3,
+	BTF_ORIGIN_MASK    = 0x3,
+};
+
 /* DEVMAP map-value layout
  *
  * The struct data-layout of map-value is a configuration interface.
diff --git a/net/core/filter.c b/net/core/filter.c
index 151aa4756bd6..614054f89fdc 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6129,6 +6129,49 @@ static const struct bpf_func_proto bpf_xdp_check_mtu_proto = {
 	.arg5_type      = ARG_ANYTHING,
 };
 
+/* btf_origin type &enum xdp_hints_btf_origin
+ * flags      type &enum xdp_hints_btf_mode_flags
+ */
+BPF_CALL_3(bpf_xdp_hints_btf, struct xdp_buff *, xdp, u32, btf_origin, u64, flags)
+{
+	s64 ret = BTF_ORIGIN_NONE;
+	bool is_compat_common;
+
+	if (flags & HINTS_BTF_QUERY_ONLY) {
+		BUILD_BUG_ON(HINTS_BTF_COMPAT_COMMON != XDP_FLAGS_HINTS_COMPAT_COMMON_);
+		ret = xdp->flags & XDP_FLAGS_HINTS_RETURN_MASK;
+		goto out;
+	}
+	if (flags & HINTS_BTF_DISABLE) {
+		xdp_buff_set_hints(xdp, BTF_ORIGIN_NONE, false);
+		goto out;
+	}
+	if (flags & HINTS_BTF_UPDATE) {
+		is_compat_common = !!(flags & HINTS_BTF_COMPAT_COMMON);
+	/* TODO: Can kernel validate if hints are BTF compat with common? */
+	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to prove compat_common ? */
+	/* TODO: Validate if metadata size is >= sizeof(xdp_hints_common) */
+		btf_origin &= BTF_ORIGIN_MASK;
+	/* TODO: Validate if module BTF_ID is large than vmlinux base */
+		xdp_buff_set_hints(xdp, btf_origin, is_compat_common);
+		BUILD_BUG_ON(BTF_ORIGIN_MASK != XDP_FLAGS_HINTS_ORIGIN_MASK);
+		ret = xdp->flags & XDP_FLAGS_HINTS_RETURN_MASK;
+		goto out;
+	}
+
+ out:
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_xdp_hints_btf_proto = {
+	.func		= bpf_xdp_hints_btf,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING,
+};
+
 #if IS_ENABLED(CONFIG_IPV6_SEG6_BPF)
 static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len)
 {
@@ -7959,6 +8002,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_fib_lookup_proto;
 	case BPF_FUNC_check_mtu:
 		return &bpf_xdp_check_mtu_proto;
+	case BPF_FUNC_xdp_hints_btf:
+		return &bpf_xdp_hints_btf_proto;
 #ifdef CONFIG_INET
 	case BPF_FUNC_sk_lookup_udp:
 		return &bpf_xdp_sk_lookup_udp_proto;



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 6/9] i40e: refactor i40e_rx_checksum with helper
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
                   ` (4 preceding siblings ...)
  2022-06-28 16:30 ` [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Jesper Dangaard Brouer
@ 2022-06-28 16:31 ` Jesper Dangaard Brouer
  2022-06-28 16:31 ` [PATCH RFC bpf-next 7/9] i40e: add XDP-hints handling Jesper Dangaard Brouer
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:31 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

No functional change, this is in preparation for later patches.

The helper function does not depend on skb, which will be used in later
patches.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c |   66 ++++++++++++++++-----------
 1 file changed, 40 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index b7967105a549..26459d7f68cc 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1754,45 +1754,38 @@ bool i40e_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count)
 	return true;
 }
 
-/**
- * i40e_rx_checksum - Indicate in skb if hw indicated a good cksum
- * @vsi: the VSI we care about
- * @skb: skb currently being received and modified
- * @rx_desc: the receive descriptor
- **/
-static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
-				    struct sk_buff *skb,
-				    union i40e_rx_desc *rx_desc)
+struct i40e_rx_checksum_ret {
+	u16 ip_summed;
+	u16 csum_level;
+};
+
+static inline struct i40e_rx_checksum_ret
+_i40e_rx_checksum(struct i40e_vsi *vsi,
+		  u64 qword,
+		  struct i40e_rx_ptype_decoded decoded)
 {
-	struct i40e_rx_ptype_decoded decoded;
+	struct i40e_rx_checksum_ret ret = {};
 	u32 rx_error, rx_status;
 	bool ipv4, ipv6;
-	u8 ptype;
-	u64 qword;
 
-	qword = le64_to_cpu(rx_desc->wb.qword1.status_error_len);
-	ptype = (qword & I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT;
 	rx_error = (qword & I40E_RXD_QW1_ERROR_MASK) >>
 		   I40E_RXD_QW1_ERROR_SHIFT;
 	rx_status = (qword & I40E_RXD_QW1_STATUS_MASK) >>
 		    I40E_RXD_QW1_STATUS_SHIFT;
-	decoded = decode_rx_desc_ptype(ptype);
 
-	skb->ip_summed = CHECKSUM_NONE;
-
-	skb_checksum_none_assert(skb);
+	ret.ip_summed = CHECKSUM_NONE;
 
 	/* Rx csum enabled and ip headers found? */
 	if (!(vsi->netdev->features & NETIF_F_RXCSUM))
-		return;
+		return ret;
 
 	/* did the hardware decode the packet and checksum? */
 	if (!(rx_status & BIT(I40E_RX_DESC_STATUS_L3L4P_SHIFT)))
-		return;
+		return ret;
 
 	/* both known and outer_ip must be set for the below code to work */
 	if (!(decoded.known && decoded.outer_ip))
-		return;
+		return ret;
 
 	ipv4 = (decoded.outer_ip == I40E_RX_PTYPE_OUTER_IP) &&
 	       (decoded.outer_ip_ver == I40E_RX_PTYPE_OUTER_IPV4);
@@ -1808,7 +1801,7 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
 	if (ipv6 &&
 	    rx_status & BIT(I40E_RX_DESC_STATUS_IPV6EXADD_SHIFT))
 		/* don't increment checksum err here, non-fatal err */
-		return;
+		return ret;
 
 	/* there was some L4 error, count error and punt packet to the stack */
 	if (rx_error & BIT(I40E_RX_DESC_ERROR_L4E_SHIFT))
@@ -1819,30 +1812,51 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
 	 * the csum.
 	 */
 	if (rx_error & BIT(I40E_RX_DESC_ERROR_PPRS_SHIFT))
-		return;
+		return ret;
 
 	/* If there is an outer header present that might contain a checksum
 	 * we need to bump the checksum level by 1 to reflect the fact that
 	 * we are indicating we validated the inner checksum.
 	 */
 	if (decoded.tunnel_type >= I40E_RX_PTYPE_TUNNEL_IP_GRENAT)
-		skb->csum_level = 1;
+		ret.csum_level = 1;
 
 	/* Only report checksum unnecessary for TCP, UDP, or SCTP */
 	switch (decoded.inner_prot) {
 	case I40E_RX_PTYPE_INNER_PROT_TCP:
 	case I40E_RX_PTYPE_INNER_PROT_UDP:
 	case I40E_RX_PTYPE_INNER_PROT_SCTP:
-		skb->ip_summed = CHECKSUM_UNNECESSARY;
+		ret.ip_summed = CHECKSUM_UNNECESSARY;
 		fallthrough;
 	default:
 		break;
 	}
 
-	return;
+	return ret;
 
 checksum_fail:
 	vsi->back->hw_csum_rx_error++;
+	return ret;
+}
+
+/**
+ * i40e_rx_checksum - Indicate in skb if hw indicated a good cksum
+ * @vsi: the VSI we care about
+ * @skb: skb currently being received and modified
+ * @rx_desc: the receive descriptor
+ **/
+static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
+				    struct sk_buff *skb,
+				    union i40e_rx_desc *rx_desc)
+{
+	struct i40e_rx_checksum_ret ret;
+	u64 qword = le64_to_cpu(rx_desc->wb.qword1.status_error_len);
+	u8 ptype = (qword & I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT;
+	struct i40e_rx_ptype_decoded decoded = decode_rx_desc_ptype(ptype);
+
+	ret = _i40e_rx_checksum(vsi, qword, decoded);
+	skb->ip_summed  = ret.ip_summed;
+	skb->csum_level = ret.csum_level;
 }
 
 /**



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 7/9] i40e: add XDP-hints handling
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
                   ` (5 preceding siblings ...)
  2022-06-28 16:31 ` [PATCH RFC bpf-next 6/9] i40e: refactor i40e_rx_checksum with helper Jesper Dangaard Brouer
@ 2022-06-28 16:31 ` Jesper Dangaard Brouer
  2022-07-18  9:38   ` Maryam Tahhan
  2022-06-28 16:31 ` [PATCH RFC bpf-next 8/9] net: use XDP-hints in xdp_frame to SKB conversion Jesper Dangaard Brouer
  2022-06-28 16:31 ` [PATCH RFC bpf-next 9/9] mvneta: add XDP-hints support Jesper Dangaard Brouer
  8 siblings, 1 reply; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:31 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c |   34 ++++++
 drivers/net/ethernet/intel/i40e/i40e_txrx.c |  156 ++++++++++++++++++++++++---
 2 files changed, 173 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 83e0cf475ebd..2ef5b3e49ba4 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5,6 +5,7 @@
 #include <linux/of_net.h>
 #include <linux/pci.h>
 #include <linux/bpf.h>
+#include <linux/btf.h>
 #include <generated/utsrelease.h>
 #include <linux/crash_dump.h>
 
@@ -27,6 +28,10 @@ static const char i40e_driver_string[] =
 
 static const char i40e_copyright[] = "Copyright (c) 2013 - 2019 Intel Corporation.";
 
+static struct btf *this_module_btf;
+extern s32 btf_id_xdp_hints_i40e;
+extern s32 btf_id_xdp_hints_i40e_timestamp;
+
 /* a bit of forward declarations */
 static void i40e_vsi_reinit_locked(struct i40e_vsi *vsi);
 static void i40e_handle_reset_warning(struct i40e_pf *pf, bool lock_acquired);
@@ -13589,6 +13594,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 			  NETIF_F_SCTP_CRC		|
 			  NETIF_F_RXHASH		|
 			  NETIF_F_RXCSUM		|
+			  NETIF_F_XDP_HINTS		|
 			  0;
 
 	if (!(pf->hw_features & I40E_HW_OUTER_UDP_CSUM_CAPABLE))
@@ -13633,6 +13639,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 	netdev->hw_features |= hw_features;
 
 	netdev->features |= hw_features | NETIF_F_HW_VLAN_CTAG_FILTER;
+	netdev->features |= NETIF_F_XDP_HINTS;
 	netdev->hw_enc_features |= NETIF_F_TSO_MANGLEID;
 
 	netdev->features &= ~NETIF_F_HW_TC;
@@ -16545,6 +16552,28 @@ static struct pci_driver i40e_driver = {
 	.sriov_configure = i40e_pci_sriov_configure,
 };
 
+static s32 find_btf_id(struct btf *btf, const char *name)
+{
+	s32 btf_id;
+
+	if (!btf)
+		return -EFAULT;
+
+	btf_id = btf_find_by_name_kind(btf, name, BTF_KIND_STRUCT);
+	if (btf_id < 0) {
+		pr_warn("%s: BTF cannot find struct %s", i40e_driver_name, name);
+		return 0;
+	}
+	pr_info("%s: BTF id %d for struct %s", i40e_driver_name, btf_id, name);
+	return btf_id;
+}
+
+static void i40e_this_module_btf_lookups(struct btf *btf)
+{
+	btf_id_xdp_hints_i40e = find_btf_id(btf, "xdp_hints_i40e");
+	btf_id_xdp_hints_i40e_timestamp = find_btf_id(btf, "xdp_hints_i40e_timestamp");
+}
+
 /**
  * i40e_init_module - Driver registration routine
  *
@@ -16553,6 +16582,10 @@ static struct pci_driver i40e_driver = {
  **/
 static int __init i40e_init_module(void)
 {
+	this_module_btf = btf_get_module_btf(THIS_MODULE);
+	if (this_module_btf)
+		i40e_this_module_btf_lookups(this_module_btf);
+
 	pr_info("%s: %s\n", i40e_driver_name, i40e_driver_string);
 	pr_info("%s: %s\n", i40e_driver_name, i40e_copyright);
 
@@ -16586,5 +16619,6 @@ static void __exit i40e_exit_module(void)
 	destroy_workqueue(i40e_wq);
 	ida_destroy(&i40e_client_ida);
 	i40e_dbg_exit();
+	btf_put_module_btf(this_module_btf);
 }
 module_exit(i40e_exit_module);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 26459d7f68cc..d19331d034ee 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1822,15 +1822,10 @@ _i40e_rx_checksum(struct i40e_vsi *vsi,
 		ret.csum_level = 1;
 
 	/* Only report checksum unnecessary for TCP, UDP, or SCTP */
-	switch (decoded.inner_prot) {
-	case I40E_RX_PTYPE_INNER_PROT_TCP:
-	case I40E_RX_PTYPE_INNER_PROT_UDP:
-	case I40E_RX_PTYPE_INNER_PROT_SCTP:
+	if (likely((decoded.inner_prot == I40E_RX_PTYPE_INNER_PROT_TCP) ||
+		   (decoded.inner_prot == I40E_RX_PTYPE_INNER_PROT_UDP) ||
+		   (decoded.inner_prot == I40E_RX_PTYPE_INNER_PROT_SCTP)))
 		ret.ip_summed = CHECKSUM_UNNECESSARY;
-		fallthrough;
-	default:
-		break;
-	}
 
 	return ret;
 
@@ -1861,19 +1856,17 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
 
 /**
  * i40e_ptype_to_htype - get a hash type
- * @ptype: the ptype value from the descriptor
+ * @ptype: the decoded ptype value from the descriptor
  *
  * Returns a hash type to be used by skb_set_hash
  **/
-static inline int i40e_ptype_to_htype(u8 ptype)
+static inline int i40e_ptype_to_htype(struct i40e_rx_ptype_decoded decoded)
 {
-	struct i40e_rx_ptype_decoded decoded = decode_rx_desc_ptype(ptype);
-
-	if (!decoded.known)
+	if (unlikely(!decoded.known))
 		return PKT_HASH_TYPE_NONE;
 
-	if (decoded.outer_ip == I40E_RX_PTYPE_OUTER_IP &&
-	    decoded.payload_layer == I40E_RX_PTYPE_PAYLOAD_LAYER_PAY4)
+	if (likely(decoded.outer_ip == I40E_RX_PTYPE_OUTER_IP &&
+		   decoded.payload_layer == I40E_RX_PTYPE_PAYLOAD_LAYER_PAY4))
 		return PKT_HASH_TYPE_L4;
 	else if (decoded.outer_ip == I40E_RX_PTYPE_OUTER_IP &&
 		 decoded.payload_layer == I40E_RX_PTYPE_PAYLOAD_LAYER_PAY3)
@@ -1903,8 +1896,11 @@ static inline void i40e_rx_hash(struct i40e_ring *ring,
 		return;
 
 	if ((rx_desc->wb.qword1.status_error_len & rss_mask) == rss_mask) {
+		struct i40e_rx_ptype_decoded ptype;
+
+		ptype = decode_rx_desc_ptype(rx_ptype);
 		hash = le32_to_cpu(rx_desc->wb.qword0.hi_dword.rss);
-		skb_set_hash(skb, hash, i40e_ptype_to_htype(rx_ptype));
+		skb_set_hash(skb, hash, i40e_ptype_to_htype(ptype));
 	}
 }
 
@@ -1950,6 +1946,130 @@ void i40e_process_skb_fields(struct i40e_ring *rx_ring,
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
 }
 
+struct xdp_hints_i40e {
+	struct i40e_rx_ptype_decoded i40e_hash_ptype;
+	struct xdp_hints_common common;
+};
+
+struct xdp_hints_i40e_timestamp {
+	u64 rx_timestamp;
+	struct xdp_hints_i40e base;
+};
+
+/* Extending xdp_hints_flags */
+enum xdp_hints_flags_driver {
+	HINT_FLAG_RX_TIMESTAMP = BIT(16),
+};
+
+/* BTF IDs gets looked up on driver i40e_init_module */
+s32 btf_id_xdp_hints_i40e;
+s32 btf_id_xdp_hints_i40e_timestamp;
+
+static inline u32 i40e_rx_checksum_xdp(struct i40e_vsi *vsi, u64 qword1,
+				       struct xdp_hints_i40e *xdp_hints,
+				       struct i40e_rx_ptype_decoded ptype)
+{
+	struct i40e_rx_checksum_ret ret;
+
+	ret = _i40e_rx_checksum(vsi, qword1, ptype);
+	return xdp_hints_set_rx_csum(&xdp_hints->common, ret.ip_summed, ret.csum_level);
+}
+
+static inline u32 i40e_rx_hash_xdp(struct i40e_ring *ring,
+				   union i40e_rx_desc *rx_desc,
+				   struct xdp_buff *xdp,
+				   u64 rx_desc_qword1,
+				   struct xdp_hints_i40e *xdp_hints,
+				   struct i40e_rx_ptype_decoded ptype
+	)
+{
+	const u64 rss_mask = (u64)I40E_RX_DESC_FLTSTAT_RSS_HASH <<
+				I40E_RX_DESC_STATUS_FLTSTAT_SHIFT;
+	u32 flags = 0;
+
+	if (unlikely(!(ring->netdev->features & NETIF_F_RXHASH))) {
+		struct i40e_rx_ptype_decoded zero = {};
+		xdp_hints->i40e_hash_ptype = zero;
+		return 0;
+	}
+
+	if (likely((rx_desc_qword1 & rss_mask) == rss_mask)) {
+		u32 hash = le32_to_cpu(rx_desc->wb.qword0.hi_dword.rss);
+		u32 htype;
+
+		/* i40e provide extra information about protocol type  */
+		xdp_hints->i40e_hash_ptype = ptype;
+		htype = i40e_ptype_to_htype(ptype);
+		flags = xdp_hints_set_rx_hash(&xdp_hints->common, hash, htype);
+
+	}
+	return flags;
+}
+
+static inline void i40e_process_xdp_hints(struct i40e_ring *rx_ring,
+					  union i40e_rx_desc *rx_desc,
+					  struct xdp_buff *xdp,
+					  u64 qword)
+{
+	u32 rx_status = (qword & I40E_RXD_QW1_STATUS_MASK) >>
+			I40E_RXD_QW1_STATUS_SHIFT;
+	u32 tsynvalid = rx_status & I40E_RXD_QW1_STATUS_TSYNVALID_MASK;
+	u32 tsyn = (rx_status & I40E_RXD_QW1_STATUS_TSYNINDX_MASK) >>
+		   I40E_RXD_QW1_STATUS_TSYNINDX_SHIFT;
+	u64 tsyn_ts;
+
+	struct i40e_rx_ptype_decoded ptype;
+	struct xdp_hints_i40e *xdp_hints;
+	struct xdp_hints_common *common;
+	u32 btf_id = btf_id_xdp_hints_i40e;
+	u32 btf_sz = sizeof(*xdp_hints);
+	u32 f1 = 0, f2, f3, f4, f5 = 0;
+	u8 rx_ptype;
+
+	if (!(rx_ring->netdev->features & NETIF_F_XDP_HINTS))
+		return;
+
+	/* Driver have xdp headroom when using build_skb */
+	if (unlikely(!ring_uses_build_skb(rx_ring)))
+		return;
+
+	xdp_hints = xdp->data - btf_sz;
+	common = &xdp_hints->common;
+
+	if (unlikely(tsynvalid)) {
+		struct xdp_hints_i40e_timestamp *hints;
+
+		tsyn_ts = i40e_ptp_rx_hwtstamp_raw(rx_ring->vsi->back, tsyn);
+		btf_id = btf_id_xdp_hints_i40e_timestamp;
+		btf_sz = sizeof(*hints);
+		hints = xdp->data - btf_sz;
+		hints->rx_timestamp = ns_to_ktime(tsyn_ts);
+		f1 = HINT_FLAG_RX_TIMESTAMP;
+	}
+
+	/* ptype needed by both hash and checksum code */
+	rx_ptype = (qword & I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT;
+	ptype = decode_rx_desc_ptype(rx_ptype);
+
+	f2 = i40e_rx_hash_xdp(rx_ring, rx_desc, xdp, qword, xdp_hints, ptype);
+	f3 = i40e_rx_checksum_xdp(rx_ring->vsi, qword, xdp_hints, ptype);
+	f4 = xdp_hints_set_rxq(common, rx_ring->queue_index);
+
+	if (unlikely(qword & BIT(I40E_RX_DESC_STATUS_L2TAG1P_SHIFT))) {
+		__le16 vlan_tag = rx_desc->wb.qword0.lo_dword.l2tag1;
+
+		f5 = xdp_hints_set_vlan(common, le16_to_cpu(vlan_tag),
+				   htons(ETH_P_8021Q));
+	}
+
+	xdp_hints_set_flags(common, (f1 | f2 | f3 | f4 | f5));
+	common->btf_id = btf_id;
+	xdp->data_meta = xdp->data - btf_sz;
+
+	xdp_buff_set_hints(xdp, BTF_ORIGIN_MODULE, true);
+}
+
+
 /**
  * i40e_cleanup_headers - Correct empty headers
  * @rx_ring: rx descriptor ring packet is being transacted on
@@ -2497,7 +2617,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 		 */
 		dma_rmb();
 
-		if (i40e_rx_is_programming_status(qword)) {
+		if (unlikely(i40e_rx_is_programming_status(qword))) {
 			i40e_clean_programming_status(rx_ring,
 						      rx_desc->raw.qword[0],
 						      qword);
@@ -2524,6 +2644,8 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 				     rx_buffer->page_offset - offset;
 			xdp_prepare_buff(&xdp, hard_start, offset, size, true);
 			xdp_buff_clear_frags_flag(&xdp);
+			prefetchw(xdp.data - 8); /* xdp.data_meta cacheline */
+			i40e_process_xdp_hints(rx_ring, rx_desc, &xdp, qword);
 #if (PAGE_SIZE > 4096)
 			/* At larger PAGE_SIZE, frame_sz depend on len size */
 			xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, size);



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 8/9] net: use XDP-hints in xdp_frame to SKB conversion
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
                   ` (6 preceding siblings ...)
  2022-06-28 16:31 ` [PATCH RFC bpf-next 7/9] i40e: add XDP-hints handling Jesper Dangaard Brouer
@ 2022-06-28 16:31 ` Jesper Dangaard Brouer
  2022-06-28 16:31 ` [PATCH RFC bpf-next 9/9] mvneta: add XDP-hints support Jesper Dangaard Brouer
  8 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:31 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

This patch makes the net/core/xdp function __xdp_build_skb_from_frame()
consume HW offloads provided via XDP-hints when creating an SKB based
on an xdp_frame. This is an initial step towards SKB less drivers that
moves SKB handing to net/core.

Current users that already benefit from this are: Redirect into veth
and cpumap. XDP_PASS action in bpf_test_run_xdp_live and driver
ethernet/aquantia/atlantic/.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/net/xdp.h |   10 ++++++++
 net/core/xdp.c    |   68 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 710d145a26f9..9917fa1a2d39 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -351,6 +351,16 @@ static __always_inline bool xdp_frame_is_frag_pfmemalloc(struct xdp_frame *frame
 	return !!(frame->flags & XDP_FLAGS_FRAGS_PF_MEMALLOC);
 }
 
+static __always_inline bool xdp_frame_has_hints_compat(struct xdp_frame *xdpf)
+{
+	u32 flags = xdpf->flags;
+
+	if (!(flags & XDP_FLAGS_HINTS_COMPAT_COMMON))
+		return false;
+
+	return !!(flags & XDP_FLAGS_HINTS_ORIGIN_MASK);
+}
+
 #define XDP_BULK_QUEUE_SIZE	16
 struct xdp_frame_bulk {
 	int count;
diff --git a/net/core/xdp.c b/net/core/xdp.c
index a57bd5278b47..c60e66982da0 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -618,11 +618,60 @@ int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp)
 }
 EXPORT_SYMBOL_GPL(xdp_alloc_skb_bulk);
 
+static void xdp_hint_skb_record_rx_queue(struct sk_buff *skb,
+					 struct xdp_hints_common *hints)
+{
+	if (hints->xdp_hints_flags & HINT_FLAG_RX_QUEUE)
+		skb_record_rx_queue(skb, hints->rx_queue);
+}
+
+static void xdp_hint_skb_set_hash(struct sk_buff *skb,
+				  struct xdp_hints_common *hints)
+{
+	u32 hash_type = hints->xdp_hints_flags & HINT_FLAG_RX_HASH_TYPE_MASK;
+
+	if (hash_type) {
+		hash_type = hash_type >> HINT_FLAG_RX_HASH_TYPE_SHIFT;
+		skb_set_hash(skb, hints->rx_hash32, hash_type);
+	}
+}
+
+static void xdp_hint_skb_checksum(struct sk_buff *skb,
+				  struct xdp_hints_common *hints)
+{
+	u32 csum_type = hints->xdp_hints_flags & HINT_FLAG_CSUM_TYPE_MASK;
+	u32 csum_level = hints->xdp_hints_flags & HINT_FLAG_CSUM_LEVEL_MASK;
+
+	if (csum_type == CHECKSUM_UNNECESSARY)
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+	if (csum_level)
+		skb->csum_level = csum_level >> HINT_FLAG_CSUM_LEVEL_SHIFT;
+
+	/* TODO: First driver implementing CHECKSUM_PARTIAL or CHECKSUM_COMPLETE
+	 *  need to implement handling here.
+	 */
+}
+
+static void xdp_hint_skb_vlan_hw_tag(struct sk_buff *skb,
+				     struct xdp_hints_common *hints)
+{
+	u32 flags = hints->xdp_hints_flags;
+	__be16 proto = htons(ETH_P_8021Q);
+
+	if (flags & HINT_FLAG_VLAN_PROTO_ETH_P_8021AD)
+		proto = htons(ETH_P_8021AD);
+
+	if (flags & HINT_FLAG_VLAN_PRESENT)
+		__vlan_hwaccel_put_tag(skb, hints->vlan_tci, proto);
+}
+
 struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					   struct sk_buff *skb,
 					   struct net_device *dev)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_frame(xdpf);
+	struct xdp_hints_common *xdp_hints = NULL;
 	unsigned int headroom, frame_size;
 	void *hard_start;
 	u8 nr_frags;
@@ -640,14 +689,17 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	frame_size = xdpf->frame_sz;
 
 	hard_start = xdpf->data - headroom;
+	prefetch(xdpf->data); /* cache-line for eth_type_trans */
 	skb = build_skb_around(skb, hard_start, frame_size);
 	if (unlikely(!skb))
 		return NULL;
 
 	skb_reserve(skb, headroom);
 	__skb_put(skb, xdpf->len);
-	if (xdpf->metasize)
+	if (xdpf->metasize) {
 		skb_metadata_set(skb, xdpf->metasize);
+		prefetch(xdpf->data - sizeof(*xdp_hints));
+	}
 
 	if (unlikely(xdp_frame_has_frags(xdpf)))
 		xdp_update_skb_shared_info(skb, nr_frags,
@@ -658,11 +710,15 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	/* Essential SKB info: protocol and skb->dev */
 	skb->protocol = eth_type_trans(skb, dev);
 
-	/* Optional SKB info, currently missing:
-	 * - HW checksum info		(skb->ip_summed)
-	 * - HW RX hash			(skb_set_hash)
-	 * - RX ring dev queue index	(skb_record_rx_queue)
-	 */
+	/* Populate (optional) HW offload hints in SKB via XDP-hints */
+	if (xdp_frame_has_hints_compat(xdpf)
+	    && xdpf->metasize >= sizeof(*xdp_hints)) {
+		xdp_hints = xdpf->data - sizeof(*xdp_hints);
+		xdp_hint_skb_record_rx_queue(skb, xdp_hints);
+		xdp_hint_skb_set_hash(skb, xdp_hints);
+		xdp_hint_skb_checksum(skb, xdp_hints);
+		xdp_hint_skb_vlan_hw_tag(skb, xdp_hints);
+	}
 
 	/* Until page_pool get SKB return path, release DMA here */
 	xdp_release_frame(xdpf);



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 9/9] mvneta: add XDP-hints support
  2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
                   ` (7 preceding siblings ...)
  2022-06-28 16:31 ` [PATCH RFC bpf-next 8/9] net: use XDP-hints in xdp_frame to SKB conversion Jesper Dangaard Brouer
@ 2022-06-28 16:31 ` Jesper Dangaard Brouer
  8 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-06-28 16:31 UTC (permalink / raw)
  To: bpf; +Cc: xdp-hints, Jesper Dangaard Brouer

From: Lorenzo Bianconi <lorenzo@kernel.org>

Similar to i40e driver, add xdp hw-hints support for mvneta driver in
order to report rx csum offload for xdp_redirect.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/marvell/mvneta.c |   61 ++++++++++++++++++++++++++++-----
 1 file changed, 52 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 384f5a16753d..4f73ba905741 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -40,6 +40,7 @@
 #include <net/page_pool.h>
 #include <net/pkt_cls.h>
 #include <linux/bpf_trace.h>
+#include <linux/btf.h>
 
 /* Registers */
 #define MVNETA_RXQ_CONFIG_REG(q)                (0x1400 + ((q) << 2))
@@ -371,6 +372,9 @@
 #define MVNETA_RX_GET_BM_POOL_ID(rxd) \
 	(((rxd)->status & MVNETA_RXD_BM_POOL_MASK) >> MVNETA_RXD_BM_POOL_SHIFT)
 
+static struct btf *mvneta_btf;
+static s32 btf_id_xdp_hints;
+
 enum {
 	ETHTOOL_STAT_EEE_WAKEUP,
 	ETHTOOL_STAT_SKB_ALLOC_ERR,
@@ -2308,12 +2312,15 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 		     struct mvneta_rx_desc *rx_desc,
 		     struct mvneta_rx_queue *rxq,
 		     struct xdp_buff *xdp, int *size,
-		     struct page *page)
+		     struct page *page, u32 status)
 {
 	unsigned char *data = page_address(page);
 	int data_len = -MVNETA_MH_SIZE, len;
+	struct xdp_hints_common *xdp_hints;
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
+	u32 xdp_hints_flags;
+	u16 cksum;
 
 	if (*size > MVNETA_MAX_RX_BUF_SIZE) {
 		len = MVNETA_MAX_RX_BUF_SIZE;
@@ -2336,6 +2343,18 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 	xdp_buff_clear_frags_flag(xdp);
 	xdp_prepare_buff(xdp, data, pp->rx_offset_correction + MVNETA_MH_SIZE,
 			 data_len, false);
+
+	if (!(pp->dev->features & NETIF_F_XDP_HINTS))
+		return;
+
+	xdp_hints = xdp->data - sizeof(*xdp_hints);
+	cksum = mvneta_rx_csum(pp, status);
+	xdp_hints_flags = xdp_hints_set_rx_csum(xdp_hints, cksum, 0);
+	xdp_hints_set_flags(xdp_hints, xdp_hints_flags);
+	xdp_hints->btf_id = btf_id_xdp_hints;
+	xdp->data_meta = xdp->data - sizeof(*xdp_hints);
+
+	xdp_buff_set_hints(xdp, BTF_ORIGIN_VMLINUX, true);
 }
 
 static void
@@ -2385,9 +2404,24 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 	*size -= len;
 }
 
+static void
+mvneta_set_skb_cksum_from_xdp(struct xdp_buff *xdp, struct sk_buff *skb)
+{
+	struct xdp_hints_common *xdp_hints;
+
+	if (!(xdp_buff_has_hints_compat(xdp)))
+		return;
+
+	if (xdp->data - xdp->data_meta < sizeof(*xdp_hints))
+		return;
+
+	xdp_hints = xdp->data - sizeof(*xdp_hints);
+	skb->ip_summed = xdp_hints->xdp_hints_flags & HINT_FLAG_CSUM_TYPE_MASK;
+}
+
 static struct sk_buff *
 mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
-		      struct xdp_buff *xdp, u32 desc_status)
+		      struct xdp_buff *xdp)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
 	struct sk_buff *skb;
@@ -2404,7 +2438,7 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
 
 	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	skb_put(skb, xdp->data_end - xdp->data);
-	skb->ip_summed = mvneta_rx_csum(pp, desc_status);
+	mvneta_set_skb_cksum_from_xdp(xdp, skb);
 
 	if (unlikely(xdp_buff_has_frags(xdp)))
 		xdp_update_skb_shared_info(skb, num_frags,
@@ -2424,8 +2458,8 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 	struct net_device *dev = pp->dev;
 	struct mvneta_stats ps = {};
 	struct bpf_prog *xdp_prog;
-	u32 desc_status, frame_sz;
 	struct xdp_buff xdp_buf;
+	u32 frame_sz;
 
 	xdp_init_buff(&xdp_buf, PAGE_SIZE, &rxq->xdp_rxq);
 	xdp_buf.data_hard_start = NULL;
@@ -2458,10 +2492,8 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 
 			size = rx_desc->data_size;
 			frame_sz = size - ETH_FCS_LEN;
-			desc_status = rx_status;
-
 			mvneta_swbm_rx_frame(pp, rx_desc, rxq, &xdp_buf,
-					     &size, page);
+					     &size, page, rx_status);
 		} else {
 			if (unlikely(!xdp_buf.data_hard_start)) {
 				rx_desc->buf_phys_addr = 0;
@@ -2487,7 +2519,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 		    mvneta_run_xdp(pp, rxq, xdp_prog, &xdp_buf, frame_sz, &ps))
 			goto next;
 
-		skb = mvneta_swbm_build_skb(pp, rxq->page_pool, &xdp_buf, desc_status);
+		skb = mvneta_swbm_build_skb(pp, rxq->page_pool, &xdp_buf);
 		if (IS_ERR(skb)) {
 			struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 
@@ -5613,7 +5645,7 @@ static int mvneta_probe(struct platform_device *pdev)
 	}
 
 	dev->features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
-			NETIF_F_TSO | NETIF_F_RXCSUM;
+			NETIF_F_TSO | NETIF_F_RXCSUM | NETIF_F_XDP_HINTS;
 	dev->hw_features |= dev->features;
 	dev->vlan_features |= dev->features;
 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
@@ -5817,6 +5849,16 @@ static int __init mvneta_driver_init(void)
 {
 	int ret;
 
+	mvneta_btf = btf_get_module_btf(THIS_MODULE);
+	if (mvneta_btf) {
+		btf_id_xdp_hints = btf_find_by_name_kind(mvneta_btf,
+							 "xdp_hints_common",
+							 BTF_KIND_STRUCT);
+		if (btf_id_xdp_hints < 0)
+			pr_warn("%s: BTF cannot find struct xdp_hints_common",
+				MVNETA_DRIVER_NAME);
+	}
+
 	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "net/mvneta:online",
 				      mvneta_cpu_online,
 				      mvneta_cpu_down_prepare);
@@ -5844,6 +5886,7 @@ module_init(mvneta_driver_init);
 
 static void __exit mvneta_driver_exit(void)
 {
+	btf_put_module_btf(mvneta_btf);
 	platform_driver_unregister(&mvneta_driver);
 	cpuhp_remove_multi_state(CPUHP_NET_MVNETA_DEAD);
 	cpuhp_remove_multi_state(online_hpstate);



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [xdp-hints] [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-06-28 16:30 ` [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Jesper Dangaard Brouer
@ 2022-06-29 14:20   ` Toke Høiland-Jørgensen
  2022-07-01  9:10     ` [xdp-hints] " Jesper Dangaard Brouer
  0 siblings, 1 reply; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-06-29 14:20 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf; +Cc: xdp-hints

Jesper Dangaard Brouer <brouer@redhat.com> writes:

> XDP BPF-prog's need a way to interact with the XDP-hints. This patch
> introduces a BPF-helper function, that allow XDP BPF-prog's to interact
> with the XDP-hints.
>
> BPF-prog can query if any XDP-hints have been setup and if this is
> compatible with the xdp_hints_common struct. If XDP-hints are available
> the BPF "origin" is returned (see enum xdp_hints_btf_origin) as BTF can
> come from different sources or origins e.g. vmlinux, module or local.

I'm not sure I quite understand what this origin is supposed to be good
for? What is a BPF (or AF_XDP) program supposed to do with the
information "this XDP hints struct came from a module?" without knowing
which module that was? Ultimately, the origin is useful for a consumer
to check that the metadata is in the format that it's expecting it to be
in (so it can just load the data from the appropriate offsets). But to
answer this, we really need a unique identifier; so I think the approach
in Alexander's series of encoding the ID of the BTF structure itself
into the next 32 bits is better? That way we'll have a unique "pointer"
to the actual struct that's in the metadata area and can act on this.

> RFC/TODO: Improve patch: Can verifier validate provided BTF on "update"
> and detect if compatible with common struct???

If we have the unique ID as mentioned above, I think the kernel probably
could resolve this automatically: whenever a module is loaded, the
kernel could walk the BTF information from that module an simply inspect
all the metadata structs and see if they contain the embedded
xdp_hints_common struct. The IDs of any metadata structs that do contain
the common struct can then be kept in a central lookup table and the
consumption code can then simply compare the BTF ID to this table when
building an SKB?

As for the validation on the BPF side:n

> +	if (flags & HINTS_BTF_UPDATE) {
> +		is_compat_common = !!(flags & HINTS_BTF_COMPAT_COMMON);
> +	/* TODO: Can kernel validate if hints are BTF compat with common? */
> +	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to prove compat_common ? */

If we use the "global ID + lookup table" approach above, we don't really
need to validate anything here: if the program says it's writing
metadata with a format given by a specific ID, that implies
compatibility (or not) as given by the ID. We could sanity-check the
metadata area size, but the consumption code has to do that anyway, so
I'm not sure it's worth the runtime overhead to have an additional check
here?

As for safety of the metadata content itself, I don't really think we
can do anything to guarantee this: in any case the BPF program can pass
a valid BTF ID and still write garbage values into the actual fields, so
the consumption code has to do enough validation that this won't crash
the kernel anyway. But this is no different from the packet data itself:
XDP is basically in a position to be a MITM attacker of the network
stack itself, which is why loading XDP programs is a privileged
operation...

-Toke


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-06-29 14:20   ` [xdp-hints] " Toke Høiland-Jørgensen
@ 2022-07-01  9:10     ` Jesper Dangaard Brouer
  2022-07-01 12:09       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2022-07-01  9:10 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, bpf
  Cc: xdp-hints, Alexander Lobakin, Alexander Lobakin



On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
> Jesper Dangaard Brouer <brouer@redhat.com> writes:
> 
>> XDP BPF-prog's need a way to interact with the XDP-hints. This patch
>> introduces a BPF-helper function, that allow XDP BPF-prog's to interact
>> with the XDP-hints.
>>
>> BPF-prog can query if any XDP-hints have been setup and if this is
>> compatible with the xdp_hints_common struct. If XDP-hints are available
>> the BPF "origin" is returned (see enum xdp_hints_btf_origin) as BTF can
>> come from different sources or origins e.g. vmlinux, module or local.
> 
> I'm not sure I quite understand what this origin is supposed to be good
> for? 

Some background info on BTF is needed here: BTF_ID numbers are not
globally unique identifiers, thus we need to know where it originate
from, to make it unique (as we store this BTF_ID in XDP-hints).

There is a connection between origin "vmlinux" and "module", which is
that vmlinux will start at ID=1 and end at a max ID number.  Modules
refer to ID's in "vmlinux", and for this to work, they will shift their
own numbering to start after ID=max-vmlinux-id.

Origin "local" is for BTF information stored in the BPF-ELF object file.
Their numbering starts at ID=1.  The use-case is that a BPF-prog want to
extend the kernel drivers BTF-layout, and e.g. add a RX-timestamp like
[1].  Then BPF-prog can check if it knows module's BTF_ID and then
extend via bpf_xdp_adjust_meta, and update BTF_ID in XDP-hints and call
the helper (I introduced) marking this as origin "local" for kernel to
know this is no-longer origin "module".

  [1] 
https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-interaction/

> What is a BPF (or AF_XDP) program supposed to do with the
> information "this XDP hints struct came from a module?" without knowing
> which module that was? 

For AF_XDP my claim is the userspace program will already know that
driver it is are talking to because it need to "bind" to a specific
interface (and attach to XDP-prog to ifindex). See sample code[2] for
get_driver_name from ifindex.
Thus, part of using XDP-hints already involves (resolving and) opening
/sys/kernel/btf/driver_name.  So the origin "module" is enough for the
API end-user to make the BTF_ID unique.

Runtime the BPF-prog and kernel can find out what net_device the origin
"module" refers to via xdp_buff->rxq->dev.  When an end-user/program
attach XDP they also need to know the ifindex, again giving them
knowledge that make origin "module" BTF_ID's unique for them,

Same way the origin "local" have meaning to the BPF-prog and user
loading it.

  [2] 
https://github.com/torvalds/linux/blob/v5.18/samples/bpf/xdp_sample_user.c#L1602


> Ultimately, the origin is useful for a consumer
> to check that the metadata is in the format that it's expecting it to be
> in (so it can just load the data from the appropriate offsets). But to
> answer this, we really need a unique identifier; so I think the approach
> in Alexander's series of encoding the ID of the BTF structure itself
> into the next 32 bits is better? That way we'll have a unique "pointer"
> to the actual struct that's in the metadata area and can act on this.

I would really like an explanation from Alexander, how his approach
creates unique identifier across all kernel modules.  I don't get it
from reading the code.  To me it looks like some extra BTF "type"
information about the BTF_ID.

E.g. how do BTF "local" BPF-ELF object's get a unique identifier, that
doesn't overlap with e.g. kernel modules?

>> RFC/TODO: Improve patch: Can verifier validate provided BTF on "update"
>> and detect if compatible with common struct???
> 
> If we have the unique ID as mentioned above, I think the kernel probably
> could resolve this automatically: whenever a module is loaded, the
> kernel could walk the BTF information from that module an simply inspect
> all the metadata structs and see if they contain the embedded
> xdp_hints_common struct. The IDs of any metadata structs that do contain
> the common struct can then be kept in a central lookup table and the
> consumption code can then simply compare the BTF ID to this table when
> building an SKB?

I'm not against the idea for the kernel to keep track of these structs.
I just don't like the idea of checking this runtime, especially as this
approach for walking all other modules BTF struct's doesn't scale.


> As for the validation on the BPF side:n
> 
>> +	if (flags & HINTS_BTF_UPDATE) {
>> +		is_compat_common = !!(flags & HINTS_BTF_COMPAT_COMMON);
>> +	/* TODO: Can kernel validate if hints are BTF compat with common? */
>> +	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to prove compat_common ? */
> 
> If we use the "global ID + lookup table" approach above, we don't really
> need to validate anything here: if the program says it's writing
> metadata with a format given by a specific ID, that implies
> compatibility (or not) as given by the ID. We could sanity-check the
> metadata area size, but the consumption code has to do that anyway, so
> I'm not sure it's worth the runtime overhead to have an additional check
> here?

As you know I hate "runtime checks", and try hard to push checks to
"setup time".  Maybe we could have verifier (or libbpf) do the check at
setup/load time, by identifying the helper call and check if provided
BTF do match COMPAT_COMMON claim.

For this to work, the verifier need to be able to resolve origin
"module", which happens at BPF load-time, so we would need to set the
ifindex (curr used for XDP-hardware-offload) at BPF load-time.


> As for safety of the metadata content itself, I don't really think we
> can do anything to guarantee this: in any case the BPF program can pass
> a valid BTF ID and still write garbage values into the actual fields, so
> the consumption code has to do enough validation that this won't crash
> the kernel anyway. But this is no different from the packet data itself:
> XDP is basically in a position to be a MITM attacker of the network
> stack itself, which is why loading XDP programs is a privileged
> operation...

I agree, that we cannot stop the end-user from screwing up their
BPF-prog to provide garbage in the fields, as long as it doesn't crash
the kernel.  I do think it would improve usability for end-users if we
can detect and report that their BPF-prog have gotten out of sync with
the running kernel and their claim that their BTF layout are
COMPAT_COMMON isn't actually true.  But I guess it is shouldn't block
the code, as it's only an extra usability help.

--Jesper


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [xdp-hints] Re: [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper
  2022-07-01  9:10     ` [xdp-hints] " Jesper Dangaard Brouer
@ 2022-07-01 12:09       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-07-01 12:09 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf
  Cc: xdp-hints, Alexander Lobakin, Alexander Lobakin

Jesper Dangaard Brouer <jbrouer@redhat.com> writes:

> On 29/06/2022 16.20, Toke Høiland-Jørgensen wrote:
>> Jesper Dangaard Brouer <brouer@redhat.com> writes:
>> 
>>> XDP BPF-prog's need a way to interact with the XDP-hints. This patch
>>> introduces a BPF-helper function, that allow XDP BPF-prog's to interact
>>> with the XDP-hints.
>>>
>>> BPF-prog can query if any XDP-hints have been setup and if this is
>>> compatible with the xdp_hints_common struct. If XDP-hints are available
>>> the BPF "origin" is returned (see enum xdp_hints_btf_origin) as BTF can
>>> come from different sources or origins e.g. vmlinux, module or local.
>> 
>> I'm not sure I quite understand what this origin is supposed to be good
>> for? 
>
> Some background info on BTF is needed here: BTF_ID numbers are not
> globally unique identifiers, thus we need to know where it originate
> from, to make it unique (as we store this BTF_ID in XDP-hints).
>
> There is a connection between origin "vmlinux" and "module", which is
> that vmlinux will start at ID=1 and end at a max ID number.  Modules
> refer to ID's in "vmlinux", and for this to work, they will shift their
> own numbering to start after ID=max-vmlinux-id.
>
> Origin "local" is for BTF information stored in the BPF-ELF object file.
> Their numbering starts at ID=1.  The use-case is that a BPF-prog want to
> extend the kernel drivers BTF-layout, and e.g. add a RX-timestamp like
> [1].  Then BPF-prog can check if it knows module's BTF_ID and then
> extend via bpf_xdp_adjust_meta, and update BTF_ID in XDP-hints and call
> the helper (I introduced) marking this as origin "local" for kernel to
> know this is no-longer origin "module".

Right, I realise that :)

My point was that just knowing "this is a BTF ID coming from a module"
is not terribly useful; you could already figure that out by just
looking at the ID and seeing if it's larger than the maximum ID in
vmlinux BTF.

Rather, what we need is a way to identify *which* module the BTF ID
comes from; and luckily, the kernel assigns a unique ID to every BTF
*object* as well as to each type ID within that object. These can be
dumped by bpftool:

# bpftool btf
bpftool btf      
[sudo] password for alrua: 
1: name [vmlinux]  size 4800187B
2: name [serio]  size 2588B
3: name [i8042]  size 11786B
4: name [rng_core]  size 8184B
[...]
2062: name <anon>  size 36965B
	pids bpftool(547298)

IDs 2-4 are module BTF objects, and that last one is the ID of a BTF
object loaded along with a BPF program by bpftool itself... So we *do*
in fact have a unique ID, by combining the BTF object ID with the type
ID; this is what Alexander is proposing to put into the xdp-hints struct
as well (combining the two IDs into a single u64).

>> What is a BPF (or AF_XDP) program supposed to do with the
>> information "this XDP hints struct came from a module?" without knowing
>> which module that was? 
>
> For AF_XDP my claim is the userspace program will already know that
> driver it is are talking to because it need to "bind" to a specific
> interface (and attach to XDP-prog to ifindex). See sample code[2] for
> get_driver_name from ifindex.
> Thus, part of using XDP-hints already involves (resolving and) opening
> /sys/kernel/btf/driver_name.  So the origin "module" is enough for the
> API end-user to make the BTF_ID unique.

This will probably work in the most common cases, but offers no way to
verify that this "offline" resolution of module ID is actually correct.
Explicitly encoding the full unique ID will be more robust.

> Runtime the BPF-prog and kernel can find out what net_device the origin
> "module" refers to via xdp_buff->rxq->dev.  When an end-user/program
> attach XDP they also need to know the ifindex, again giving them
> knowledge that make origin "module" BTF_ID's unique for them,

Right, but then the BPF program needs to keep its own lookup table from
ifindex to BTF ID? If we just encode the full ID in the packet, it's a
simple check, and we can likely create a "magic" CO-RE macro that turns
a struct definition into the right ID check at load time...

>> Ultimately, the origin is useful for a consumer
>> to check that the metadata is in the format that it's expecting it to be
>> in (so it can just load the data from the appropriate offsets). But to
>> answer this, we really need a unique identifier; so I think the approach
>> in Alexander's series of encoding the ID of the BTF structure itself
>> into the next 32 bits is better? That way we'll have a unique "pointer"
>> to the actual struct that's in the metadata area and can act on this.
>
> I would really like an explanation from Alexander, how his approach
> creates unique identifier across all kernel modules.  I don't get it
> from reading the code.  To me it looks like some extra BTF "type"
> information about the BTF_ID.
>
> E.g. how do BTF "local" BPF-ELF object's get a unique identifier, that
> doesn't overlap with e.g. kernel modules?

See above: the kernel generates a unique (until the next reboot) ID for
every BTF object when it's loaded into the kernel.

>>> RFC/TODO: Improve patch: Can verifier validate provided BTF on "update"
>>> and detect if compatible with common struct???
>> 
>> If we have the unique ID as mentioned above, I think the kernel probably
>> could resolve this automatically: whenever a module is loaded, the
>> kernel could walk the BTF information from that module an simply inspect
>> all the metadata structs and see if they contain the embedded
>> xdp_hints_common struct. The IDs of any metadata structs that do contain
>> the common struct can then be kept in a central lookup table and the
>> consumption code can then simply compare the BTF ID to this table when
>> building an SKB?
>
> I'm not against the idea for the kernel to keep track of these structs.
> I just don't like the idea of checking this runtime, especially as this
> approach for walking all other modules BTF struct's doesn't scale.

Yeah, we should optimise this. See below...

>> As for the validation on the BPF side:n
>> 
>>> +	if (flags & HINTS_BTF_UPDATE) {
>>> +		is_compat_common = !!(flags & HINTS_BTF_COMPAT_COMMON);
>>> +	/* TODO: Can kernel validate if hints are BTF compat with common? */
>>> +	/* TODO: Could BPF prog provide BTF as ARG_PTR_TO_BTF_ID to prove compat_common ? */
>> 
>> If we use the "global ID + lookup table" approach above, we don't really
>> need to validate anything here: if the program says it's writing
>> metadata with a format given by a specific ID, that implies
>> compatibility (or not) as given by the ID. We could sanity-check the
>> metadata area size, but the consumption code has to do that anyway, so
>> I'm not sure it's worth the runtime overhead to have an additional check
>> here?
>
> As you know I hate "runtime checks", and try hard to push checks to
> "setup time".  Maybe we could have verifier (or libbpf) do the check at
> setup/load time, by identifying the helper call and check if provided
> BTF do match COMPAT_COMMON claim.
>
> For this to work, the verifier need to be able to resolve origin
> "module", which happens at BPF load-time, so we would need to set the
> ifindex (curr used for XDP-hardware-offload) at BPF load-time.

If we make the UAPI on the BPF side just accept a full BTF object+type
ID, and also require that the value being passed to the helper is a
compile-time constant (so it is visible to the verifier at verification
time), it is straight-forward for the verifier to just lookup the BTF
type, check if it contains the "hints_common" struct and if it does,
rewrite the helper call to set the right value of the "compat_common"
flag without exposing the flag itself as UAPI.

The driver code would probably still have to set this flag "manually",
but that's internal kernel API, so that's probably fine...

>> As for safety of the metadata content itself, I don't really think we
>> can do anything to guarantee this: in any case the BPF program can pass
>> a valid BTF ID and still write garbage values into the actual fields, so
>> the consumption code has to do enough validation that this won't crash
>> the kernel anyway. But this is no different from the packet data itself:
>> XDP is basically in a position to be a MITM attacker of the network
>> stack itself, which is why loading XDP programs is a privileged
>> operation...
>
> I agree, that we cannot stop the end-user from screwing up their
> BPF-prog to provide garbage in the fields, as long as it doesn't crash
> the kernel.  I do think it would improve usability for end-users if we
> can detect and report that their BPF-prog have gotten out of sync with
> the running kernel and their claim that their BTF layout are
> COMPAT_COMMON isn't actually true.  But I guess it is shouldn't block
> the code, as it's only an extra usability help.

Yeah, I agree this could be error prone; which is why I think not
exposing the flag itself as UAPI is a better solution ;)

-Toke


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 7/9] i40e: add XDP-hints handling
  2022-06-28 16:31 ` [PATCH RFC bpf-next 7/9] i40e: add XDP-hints handling Jesper Dangaard Brouer
@ 2022-07-18  9:38   ` Maryam Tahhan
  2022-07-18 10:27     ` Maryam Tahhan
  0 siblings, 1 reply; 15+ messages in thread
From: Maryam Tahhan @ 2022-07-18  9:38 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf; +Cc: xdp-hints

On 28/06/2022 17:31, Jesper Dangaard Brouer wrote:

<snip>

> +
> +static inline void i40e_process_xdp_hints(struct i40e_ring *rx_ring,
> +					  union i40e_rx_desc *rx_desc,
> +					  struct xdp_buff *xdp,
> +					  u64 qword)
> +{
> +	u32 rx_status = (qword & I40E_RXD_QW1_STATUS_MASK) >>
> +			I40E_RXD_QW1_STATUS_SHIFT;
> +	u32 tsynvalid = rx_status & I40E_RXD_QW1_STATUS_TSYNVALID_MASK;
> +	u32 tsyn = (rx_status & I40E_RXD_QW1_STATUS_TSYNINDX_MASK) >>
> +		   I40E_RXD_QW1_STATUS_TSYNINDX_SHIFT;
> +	u64 tsyn_ts;
> +
> +	struct i40e_rx_ptype_decoded ptype;
> +	struct xdp_hints_i40e *xdp_hints;
> +	struct xdp_hints_common *common;
> +	u32 btf_id = btf_id_xdp_hints_i40e;
> +	u32 btf_sz = sizeof(*xdp_hints);
> +	u32 f1 = 0, f2, f3, f4, f5 = 0;
> +	u8 rx_ptype;
> +
> +	if (!(rx_ring->netdev->features & NETIF_F_XDP_HINTS))
> +		return;
> +
> +	/* Driver have xdp headroom when using build_skb */
> +	if (unlikely(!ring_uses_build_skb(rx_ring)))
> +		return;
> +
> +	xdp_hints = xdp->data - btf_sz;
> +	common = &xdp_hints->common;
> +
> +	if (unlikely(tsynvalid)) {
> +		struct xdp_hints_i40e_timestamp *hints;
> +
> +		tsyn_ts = i40e_ptp_rx_hwtstamp_raw(rx_ring->vsi->back, tsyn);
> +		btf_id = btf_id_xdp_hints_i40e_timestamp;
> +		btf_sz = sizeof(*hints);
> +		hints = xdp->data - btf_sz;
> +		hints->rx_timestamp = ns_to_ktime(tsyn_ts);
> +		f1 = HINT_FLAG_RX_TIMESTAMP;
> +	}
> +
> +	/* ptype needed by both hash and checksum code */
> +	rx_ptype = (qword & I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT;
> +	ptype = decode_rx_desc_ptype(rx_ptype);
> +
> +	f2 = i40e_rx_hash_xdp(rx_ring, rx_desc, xdp, qword, xdp_hints, ptype);
> +	f3 = i40e_rx_checksum_xdp(rx_ring->vsi, qword, xdp_hints, ptype);
> +	f4 = xdp_hints_set_rxq(common, rx_ring->queue_index);
> +
> +	if (unlikely(qword & BIT(I40E_RX_DESC_STATUS_L2TAG1P_SHIFT))) {
> +		__le16 vlan_tag = rx_desc->wb.qword0.lo_dword.l2tag1;
> +
> +		f5 = xdp_hints_set_vlan(common, le16_to_cpu(vlan_tag),
> +				   htons(ETH_P_8021Q));
> +	}
> +
> +	xdp_hints_set_flags(common, (f1 | f2 | f3 | f4 | f5));
> +	common->btf_id = btf_id;
> +	xdp->data_meta = xdp->data - btf_sz;

I think it might be worth considering leaving a predefined size space in 
the headroom before the data (but after the metadata) for encapsulation 
headers that may be applied to the packet as it transitions to it's 
final destination through a host. In other words starting the metadata 
further up so that the BTF id resides at a known offset from data.

Say for example a bpf program that inserts vlan/vxlan tags/headers on 
received packets on a host should have enough space to apply that vlan 
tag without having to copy the metadata and shift it before it does that.

Or maybe there was something that was already accounting for this in the 
design and I missed it. Would really appreciate a pointer in that case.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 7/9] i40e: add XDP-hints handling
  2022-07-18  9:38   ` Maryam Tahhan
@ 2022-07-18 10:27     ` Maryam Tahhan
  0 siblings, 0 replies; 15+ messages in thread
From: Maryam Tahhan @ 2022-07-18 10:27 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, bpf; +Cc: xdp-hints

On 18/07/2022 10:38, Maryam Tahhan wrote:
> On 28/06/2022 17:31, Jesper Dangaard Brouer wrote:
> 
> <snip>
> 
>> +
>> +static inline void i40e_process_xdp_hints(struct i40e_ring *rx_ring,
>> +                      union i40e_rx_desc *rx_desc,
>> +                      struct xdp_buff *xdp,
>> +                      u64 qword)
>> +{
>> +    u32 rx_status = (qword & I40E_RXD_QW1_STATUS_MASK) >>
>> +            I40E_RXD_QW1_STATUS_SHIFT;
>> +    u32 tsynvalid = rx_status & I40E_RXD_QW1_STATUS_TSYNVALID_MASK;
>> +    u32 tsyn = (rx_status & I40E_RXD_QW1_STATUS_TSYNINDX_MASK) >>
>> +           I40E_RXD_QW1_STATUS_TSYNINDX_SHIFT;
>> +    u64 tsyn_ts;
>> +
>> +    struct i40e_rx_ptype_decoded ptype;
>> +    struct xdp_hints_i40e *xdp_hints;
>> +    struct xdp_hints_common *common;
>> +    u32 btf_id = btf_id_xdp_hints_i40e;
>> +    u32 btf_sz = sizeof(*xdp_hints);
>> +    u32 f1 = 0, f2, f3, f4, f5 = 0;
>> +    u8 rx_ptype;
>> +
>> +    if (!(rx_ring->netdev->features & NETIF_F_XDP_HINTS))
>> +        return;
>> +
>> +    /* Driver have xdp headroom when using build_skb */
>> +    if (unlikely(!ring_uses_build_skb(rx_ring)))
>> +        return;
>> +
>> +    xdp_hints = xdp->data - btf_sz;
>> +    common = &xdp_hints->common;
>> +
>> +    if (unlikely(tsynvalid)) {
>> +        struct xdp_hints_i40e_timestamp *hints;
>> +
>> +        tsyn_ts = i40e_ptp_rx_hwtstamp_raw(rx_ring->vsi->back, tsyn);
>> +        btf_id = btf_id_xdp_hints_i40e_timestamp;
>> +        btf_sz = sizeof(*hints);
>> +        hints = xdp->data - btf_sz;
>> +        hints->rx_timestamp = ns_to_ktime(tsyn_ts);
>> +        f1 = HINT_FLAG_RX_TIMESTAMP;
>> +    }
>> +
>> +    /* ptype needed by both hash and checksum code */
>> +    rx_ptype = (qword & I40E_RXD_QW1_PTYPE_MASK) >> 
>> I40E_RXD_QW1_PTYPE_SHIFT;
>> +    ptype = decode_rx_desc_ptype(rx_ptype);
>> +
>> +    f2 = i40e_rx_hash_xdp(rx_ring, rx_desc, xdp, qword, xdp_hints, 
>> ptype);
>> +    f3 = i40e_rx_checksum_xdp(rx_ring->vsi, qword, xdp_hints, ptype);
>> +    f4 = xdp_hints_set_rxq(common, rx_ring->queue_index);
>> +
>> +    if (unlikely(qword & BIT(I40E_RX_DESC_STATUS_L2TAG1P_SHIFT))) {
>> +        __le16 vlan_tag = rx_desc->wb.qword0.lo_dword.l2tag1;
>> +
>> +        f5 = xdp_hints_set_vlan(common, le16_to_cpu(vlan_tag),
>> +                   htons(ETH_P_8021Q));
>> +    }
>> +
>> +    xdp_hints_set_flags(common, (f1 | f2 | f3 | f4 | f5));
>> +    common->btf_id = btf_id;
>> +    xdp->data_meta = xdp->data - btf_sz;
> 
> I think it might be worth considering leaving a predefined size space in 
> the headroom before the data (but after the metadata) for encapsulation 
> headers that may be applied to the packet as it transitions to it's 
> final destination through a host. In other words starting the metadata 
I meant xdp_hints rather than metadata here...

> further up so that the BTF id resides at a known offset from data.
> 
> Say for example a bpf program that inserts vlan/vxlan tags/headers on 
> received packets on a host should have enough space to apply that vlan 
> tag without having to copy the metadata and shift it before it does that.

  xdp_hints here also

> 
> Or maybe there was something that was already accounting for this in the 
> design and I missed it. Would really appreciate a pointer in that case.
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-07-18 10:27 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-28 16:30 [PATCH RFC bpf-next 0/9] Introduce XDP-hints via BTF Jesper Dangaard Brouer
2022-06-28 16:30 ` [PATCH RFC bpf-next 1/9] i40e: Refactor i40e_ptp_rx_hwtstamp Jesper Dangaard Brouer
2022-06-28 16:30 ` [PATCH RFC bpf-next 2/9] bpf: export btf functions for modules Jesper Dangaard Brouer
2022-06-28 16:30 ` [PATCH RFC bpf-next 3/9] net: create xdp_hints_common and set functions Jesper Dangaard Brouer
2022-06-28 16:30 ` [PATCH RFC bpf-next 4/9] net: add net_device feature flag for XDP-hints Jesper Dangaard Brouer
2022-06-28 16:30 ` [PATCH RFC bpf-next 5/9] xdp: controlling XDP-hints from BPF-prog via helper Jesper Dangaard Brouer
2022-06-29 14:20   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-07-01  9:10     ` [xdp-hints] " Jesper Dangaard Brouer
2022-07-01 12:09       ` Toke Høiland-Jørgensen
2022-06-28 16:31 ` [PATCH RFC bpf-next 6/9] i40e: refactor i40e_rx_checksum with helper Jesper Dangaard Brouer
2022-06-28 16:31 ` [PATCH RFC bpf-next 7/9] i40e: add XDP-hints handling Jesper Dangaard Brouer
2022-07-18  9:38   ` Maryam Tahhan
2022-07-18 10:27     ` Maryam Tahhan
2022-06-28 16:31 ` [PATCH RFC bpf-next 8/9] net: use XDP-hints in xdp_frame to SKB conversion Jesper Dangaard Brouer
2022-06-28 16:31 ` [PATCH RFC bpf-next 9/9] mvneta: add XDP-hints support Jesper Dangaard Brouer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.