bpf.vger.kernel.org archive mirror
* [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support
@ 2021-08-03  1:03 Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 01/16] bpf: add btf register/unregister API Ederson de Souza
                   ` (16 more replies)
  0 siblings, 17 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC
  To: xdp-hints; +Cc: bpf

While work is ongoing on different aspects of XDP hints, I'd like to present
this patch series and ask for comments on it.

XDP hints/metadata is a way for the driver to transmit information regarding a
specific XDP frame along with the frame. Following current discussions and
based on top of Saeed's early patches, this series provides the XDP hints with
one (or two, depending on how you view it) use case: RX/TX timestamps for the
igc driver.

In keeping with Saeed's patches, XDP hints must first be enabled with
bpftool:

  bpftool net xdp set dev <iface> md_btf on

From the driver perspective, support for XDP hints is achieved by:

 - Adding support for XDP_SETUP_MD_BTF operation, where it can register the BTF.

 - Adding support for XDP_QUERY_MD_BTF so user space can retrieve the BTF id.

 - Adding the relevant data to the metadata area of the XDP frame.

    - One piece of this data is the BTF id of the BTF in use.

In order to make use of the BPF CO-RE mechanism, this series has the driver
name its XDP hints struct `xdp_hints___<driver_name>` (in this series, as I'm
using the igc driver, it becomes `xdp_hints___igc`). This should help BPF
programs, as they can simply refer to the struct as `xdp_hints`: CO-RE ignores
the `___<suffix>` part of a type name when matching types.

A common issue is how to standardize the names of the fields in the BTF. Here,
a series of macros, all prefixed with `XDP_GENERIC_`, is provided in
`include/net/xdp.h`. Among them is the `btf_id` field, which needs to be
positioned at the end of the struct. Also added are the `rx_timestamp` and
`tx_timestamp` fields, as I believe they're generic as well. The macros also
provide `u32` and `u64` types. I also ended up adding a `valid_map` field. It
should help whoever is using the XDP hints to know which fields of those hints
are valid. It also keeps the driver simple, as it just uses a single struct and
marks fields as valid as it fills them.
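
As a rough user space illustration of the layout described above (not the
struct from the series): the field names `rx_timestamp`, `tx_timestamp`,
`valid_map` and `btf_id` come from this cover letter, while the struct name,
the bit values and both helpers are hypothetical. It assumes the hints sit
immediately before the packet data, so that a `btf_id` placed last can be found
at a fixed distance back from the data pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical validity bits; the values used by the series may differ. */
#define HINT_VALID_RX_TIMESTAMP (1u << 0)
#define HINT_VALID_TX_TIMESTAMP (1u << 1)

/* Illustrative layout only; 24 bytes with no tail padding. */
struct example_hints {
	uint64_t rx_timestamp;
	uint64_t tx_timestamp;
	uint32_t valid_map;
	uint32_t btf_id;	/* last member, adjacent to the packet data */
};

/* Read the BTF id from the 4 bytes just before the packet data. */
static uint32_t hints_btf_id(const void *data)
{
	uint32_t id;

	memcpy(&id, (const uint8_t *)data - sizeof(id), sizeof(id));
	return id;
}

/* Return 0 and store the RX timestamp only if the driver marked it valid. */
static int hints_rx_timestamp(const struct example_hints *h, uint64_t *ts)
{
	if (!(h->valid_map & HINT_VALID_RX_TIMESTAMP))
		return -1;
	*ts = h->rx_timestamp;
	return 0;
}
```

With a layout like this, a consumer can check `btf_id` first and only then
decide how to interpret the bytes that precede it.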

The BPF sample `xdp_sample_pkts` was modified to demonstrate the usage of XDP
hints in BPF programs. It's a very simple example, but it shows some nice
properties. For instance, instead of obtaining the struct definition
beforehand, it uses CO-RE to simply name the XDP hint field it's interested in
and read it using `BPF_CORE_READ`. (I also tried to use `bpf_core_field_exists`
to make it even more dynamic, but couldn't get it to build. I mention why in
the example.)

Also, as much of my interest lies in the user space side, that is, in AF_XDP
applications, a few additional things were done to support it.

Firstly, a new piece of "driver info" is provided, to be obtained via
`ioctl(SIOCETHTOOL)`: "xdp_headroom". This is how much XDP headroom the driver
requires. While not really important for the RX path (as the driver already
applies that headroom to the XDP frame), it's important for the TX path, where
it's the application's responsibility to factor in the XDP headroom area. (Note
that the TX timestamp is obtained from the XDP frame of the transmitted packet,
when that frame goes back to the completion queue.)
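
One way to picture the TX side is the arithmetic below. It is only a sketch
under stated assumptions: `tx_place_packet` is a hypothetical helper, not one
of the helpers added by this series, and it assumes the application writes the
packet right after the headroom inside a single UMEM frame.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: place a packet after the driver's XDP headroom inside
 * one UMEM frame, leaving the headroom free for metadata.
 * Returns 0 and sets *data_addr, or -1 if the packet would not fit.
 */
static int tx_place_packet(uint64_t frame_addr, uint32_t frame_size,
			   uint32_t xdp_headroom, uint32_t pkt_len,
			   uint64_t *data_addr)
{
	if (xdp_headroom > frame_size || pkt_len > frame_size - xdp_headroom)
		return -1;

	*data_addr = frame_addr + xdp_headroom;
	return 0;
}
```

The point of the check is that the usable payload of a frame shrinks by
`xdp_headroom` bytes once the application accounts for the metadata area.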

A series of helpers was also added to libbpf to help manage this headroom
area. They go by the prefix `xsk_umem__adjust_`, and adjust consumer and
producer data and metadata.

In order to read the XDP hints from memory, another series of helpers was
added. They look up the BTF from the BTF id and create a hashmap of the offsets
and sizes of the fields, which is then used to actually retrieve the data.
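
The idea of reading a hint by name through an offset/size table can be sketched
as follows. This is not the libbpf code from the series: the names
`hint_field`, `hint_find` and `hint_read` are hypothetical, the table is filled
by hand rather than from BTF, and a linear search stands in for the hashmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry per hint field: where it lives in the metadata and how big it is. */
struct hint_field {
	const char *name;
	uint32_t offset;
	uint32_t size;
};

/* Linear search standing in for the hashmap lookup. */
static const struct hint_field *hint_find(const struct hint_field *tbl,
					  size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return &tbl[i];
	return NULL;
}

/* Read a 4- or 8-byte field by name; returns 0 on success, -1 otherwise. */
static int hint_read(const void *md, const struct hint_field *tbl, size_t n,
		     const char *name, uint64_t *out)
{
	const struct hint_field *f = hint_find(tbl, n, name);
	uint32_t v32;

	if (!f)
		return -1;

	if (f->size == sizeof(uint64_t)) {
		memcpy(out, (const uint8_t *)md + f->offset, sizeof(*out));
	} else if (f->size == sizeof(uint32_t)) {
		memcpy(&v32, (const uint8_t *)md + f->offset, sizeof(v32));
		*out = v32;
	} else {
		return -1;	/* sizes other than u32/u64 are not handled here */
	}
	return 0;
}
```

Because the application only names the field, the same code keeps working when
a driver lays its hints struct out differently, as long as the table is rebuilt
from that driver's BTF.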

I modified the "xdpsock" example to show the use of XDP hints in the AF_XDP
world, along with the proposed API.

Finally, I know that Michal and Alexandr (and probably others that I don't
know) are working on this same front. This RFC is not meant to race any other
work; instead, I hope it can help in the discussion of the best solution for
XDP hints, and I truly think it brings value, specifically for the AF_XDP
use cases.

Andre Guedes (1):
  igc: Fix race condition in PTP Tx code

Ederson de Souza (8):
  net/xdp: Support for generic XDP hints
  igc: XDP packet RX timestamp
  igc: XDP packet TX timestamp
  ethtool,igc: Add "xdp_headroom" driver info
  libbpf: Helpers to access XDP frame metadata
  libbpf: Helpers to access XDP hints based on BTF definitions
  samples/bpf: XDP hints AF_XDP example
  samples/bpf: Show XDP hints usage

Saeed Mahameed (4):
  bpf: add btf register/unregister API
  net/core: XDP metadata BTF netlink API
  tools/bpf: Query XDP metadata BTF ID
  tools/bpf: Add xdp set command for md btf

Vinicius Costa Gomes (3):
  igc: Retrieve the TX timestamp directly (instead of in a interrupt)
  igc: Add support for multiple in-flight TX timestamps
  igc: Use irq safe locks for timestamping

 drivers/net/ethernet/intel/igc/igc.h         |  49 ++-
 drivers/net/ethernet/intel/igc/igc_base.h    |   3 +
 drivers/net/ethernet/intel/igc/igc_defines.h |   7 +
 drivers/net/ethernet/intel/igc/igc_ethtool.c |   2 +
 drivers/net/ethernet/intel/igc/igc_main.c    | 207 +++++++++++--
 drivers/net/ethernet/intel/igc/igc_ptp.c     | 239 +++++++++-----
 drivers/net/ethernet/intel/igc/igc_regs.h    |  12 +
 drivers/net/ethernet/intel/igc/igc_xdp.c     |  93 ++++++
 drivers/net/ethernet/intel/igc/igc_xdp.h     |  11 +
 include/linux/btf.h                          |   9 +
 include/linux/netdevice.h                    |  15 +-
 include/net/xdp.h                            |  62 ++++
 include/uapi/linux/ethtool.h                 |   3 +
 include/uapi/linux/if_link.h                 |   2 +
 include/uapi/linux/if_xdp.h                  |   3 +
 kernel/bpf/btf.c                             | 155 ++++++++--
 net/core/dev.c                               |  54 ++++
 net/core/rtnetlink.c                         |  18 +-
 samples/bpf/xdp_sample_pkts_kern.c           |  21 ++
 samples/bpf/xdp_sample_pkts_user.c           |   4 +-
 samples/bpf/xdpsock_user.c                   | 146 ++++++++-
 tools/bpf/bpftool/main.h                     |   3 +-
 tools/bpf/bpftool/net.c                      |   7 +-
 tools/bpf/bpftool/netlink_dumper.c           |  21 +-
 tools/bpf/bpftool/xdp.c                      | 310 +++++++++++++++++++
 tools/include/uapi/linux/if_link.h           |   2 +
 tools/lib/bpf/libbpf.h                       |   2 +
 tools/lib/bpf/libbpf.map                     |  10 +
 tools/lib/bpf/netlink.c                      |  49 +++
 tools/lib/bpf/xsk.c                          | 203 ++++++++++++
 tools/lib/bpf/xsk.h                          |  22 ++
 31 files changed, 1579 insertions(+), 165 deletions(-)
 create mode 100644 tools/bpf/bpftool/xdp.c

-- 
2.32.0



* [[RFC xdp-hints] 01/16] bpf: add btf register/unregister API
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 02/16] net/core: XDP metadata BTF netlink API Ederson de Souza
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC
  To: xdp-hints; +Cc: bpf, Saeed Mahameed

From: Saeed Mahameed <saeedm@mellanox.com>

A device driver can register its own BTF-format buffers with the kernel.
This will be used in downstream patches by the mlx5 XDP driver to advertise
and populate XDP metadata.

Issue: 2114293
Change-Id: I37cc1b18dadd7cf22aa67d2f14d811deae7525b4
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 include/linux/btf.h |   9 +++
 kernel/bpf/btf.c    | 155 +++++++++++++++++++++++++++++++++++---------
 2 files changed, 135 insertions(+), 29 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 214fde93214b..d48e6fcf46dc 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -225,6 +225,8 @@ const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
 const char *btf_name_by_offset(const struct btf *btf, u32 offset);
 struct btf *btf_parse_vmlinux(void);
 struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog);
+struct btf *btf_register(void *data, u32 data_size);
+void btf_unregister(struct btf *btf);
 #else
 static inline const struct btf_type *btf_type_by_id(const struct btf *btf,
 						    u32 type_id)
@@ -236,6 +238,13 @@ static inline const char *btf_name_by_offset(const struct btf *btf,
 {
 	return NULL;
 }
+
+static inline struct btf *btf_register(void *data, u32 data_size)
+{
+	return NULL;
+}
+
+static inline void btf_unregister(struct btf *btf) { }
 #endif
 
 #endif
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 7780131f710e..88d8cd02d282 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -1509,11 +1509,31 @@ static void btf_free_id(struct btf *btf)
 	spin_unlock_irqrestore(&btf_idr_lock, flags);
 }
 
+static struct btf *btf_alloc(u32 data_size)
+{
+	struct btf *btf;
+
+	btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
+	if (!btf)
+		return NULL;
+
+	btf->data = kvmalloc(data_size, GFP_KERNEL | __GFP_NOWARN);
+	if (!btf->data) {
+		kfree(btf);
+		return NULL;
+	}
+
+	btf->data_size = data_size;
+	return btf;
+}
+
 static void btf_free(struct btf *btf)
 {
 	kvfree(btf->types);
 	kvfree(btf->resolved_sizes);
 	kvfree(btf->resolved_ids);
+
+	/* stuff allocated via btf_alloc */
 	kvfree(btf->data);
 	kfree(btf);
 }
@@ -1574,6 +1594,13 @@ static int env_resolve_init(struct btf_verifier_env *env)
 	return -ENOMEM;
 }
 
+static struct btf_verifier_env *btf_verifier_env_alloc(void)
+{
+	gfp_t flags = GFP_KERNEL | __GFP_NOWARN;
+
+	return kzalloc(sizeof(struct btf_verifier_env), flags);
+}
+
 static void btf_verifier_env_free(struct btf_verifier_env *env)
 {
 	kvfree(env->visit_states);
@@ -4306,19 +4333,37 @@ static int btf_parse_hdr(struct btf_verifier_env *env)
 	return 0;
 }
 
-static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size,
-			     u32 log_level, char __user *log_ubuf, u32 log_size)
+static int btf_parse(struct btf_verifier_env *env)
+{
+	struct btf *btf = env->btf;
+	int err;
+
+	err = btf_parse_hdr(env);
+	if (err)
+		return err;
+
+	btf->nohdr_data = btf->data + btf->hdr.hdr_len;
+
+	err = btf_parse_str_sec(env);
+	if (err)
+		return err;
+
+	return btf_parse_type_sec(env);
+}
+
+static struct btf *
+btf_parse_user(bpfptr_t btf_data, u32 btf_data_size,
+	       u32 log_level, char __user *log_ubuf, u32 log_size)
 {
 	struct btf_verifier_env *env = NULL;
 	struct bpf_verifier_log *log;
 	struct btf *btf = NULL;
-	u8 *data;
 	int err;
 
 	if (btf_data_size > BTF_MAX_SIZE)
 		return ERR_PTR(-E2BIG);
 
-	env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
+	env = btf_verifier_env_alloc();
 	if (!env)
 		return ERR_PTR(-ENOMEM);
 
@@ -4339,45 +4384,66 @@ static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size,
 		}
 	}
 
-	btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
+	btf = btf_alloc(btf_data_size);
 	if (!btf) {
 		err = -ENOMEM;
 		goto errout;
 	}
-	env->btf = btf;
 
-	data = kvmalloc(btf_data_size, GFP_KERNEL | __GFP_NOWARN);
-	if (!data) {
-		err = -ENOMEM;
+	if (copy_from_bpfptr(btf->data, btf_data, btf_data_size)) {
+		err = -EFAULT;
 		goto errout;
 	}
 
-	btf->data = data;
-	btf->data_size = btf_data_size;
+	env->btf = btf;
+	err = btf_parse(env);
+	if (err)
+		goto errout;
 
-	if (copy_from_bpfptr(data, btf_data, btf_data_size)) {
-		err = -EFAULT;
+	if (log->level && bpf_verifier_log_full(log)) {
+		err = -ENOSPC;
 		goto errout;
 	}
 
-	err = btf_parse_hdr(env);
-	if (err)
-		goto errout;
+	btf_verifier_env_free(env);
+	refcount_set(&btf->refcnt, 1);
+	return btf;
 
-	btf->nohdr_data = btf->data + btf->hdr.hdr_len;
+errout:
+	btf_verifier_env_free(env);
+	if (btf)
+		btf_free(btf);
+	return ERR_PTR(err);
+}
 
-	err = btf_parse_str_sec(env);
-	if (err)
-		goto errout;
+static struct btf *btf_parse_raw(void *btf_data, u32 btf_data_size)
+{
+	struct btf_verifier_env *env = NULL;
+	struct btf *btf = NULL;
+	int err;
 
-	err = btf_parse_type_sec(env);
-	if (err)
-		goto errout;
+	if (btf_data_size > BTF_MAX_SIZE)
+		return ERR_PTR(-E2BIG);
 
-	if (log->level && bpf_verifier_log_full(log)) {
-		err = -ENOSPC;
+	env = btf_verifier_env_alloc();
+	if (!env)
+		return ERR_PTR(-ENOMEM);
+
+	/* force log to go to kernel trace buffer */
+	env->log.level = BPF_LOG_KERNEL;
+
+	btf = btf_alloc(btf_data_size);
+	if (!btf) {
+		err = -ENOMEM;
 		goto errout;
 	}
+	memcpy(btf->data, btf_data, btf_data_size);
+
+	env->btf = btf;
+
+	err = btf_parse(env);
+	if (err)
+		goto errout;
 
 	btf_verifier_env_free(env);
 	refcount_set(&btf->refcnt, 1);
@@ -4388,6 +4454,7 @@ static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size,
 	if (btf)
 		btf_free(btf);
 	return ERR_PTR(err);
+
 }
 
 extern char __weak __start_BTF[];
@@ -5846,10 +5913,10 @@ int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr)
 	struct btf *btf;
 	int ret;
 
-	btf = btf_parse(make_bpfptr(attr->btf, uattr.is_kernel),
-			attr->btf_size, attr->btf_log_level,
-			u64_to_user_ptr(attr->btf_log_buf),
-			attr->btf_log_size);
+	btf = btf_parse_user(make_bpfptr(attr->btf, uattr.is_kernel),
+			     attr->btf_size, attr->btf_log_level,
+			     u64_to_user_ptr(attr->btf_log_buf),
+			     attr->btf_log_size);
 	if (IS_ERR(btf))
 		return PTR_ERR(btf);
 
@@ -5872,6 +5939,35 @@ int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr)
 	return ret;
 }
 
+struct btf *btf_register(void *data, u32 data_size)
+{
+	struct btf *btf;
+	int ret;
+
+	btf = btf_parse_raw(data, data_size);
+	if (IS_ERR(btf))
+		return btf;
+
+	ret = btf_alloc_id(btf);
+	if (ret) {
+		btf_free(btf);
+		return ERR_PTR(ret);
+	}
+
+	return btf;
+}
+EXPORT_SYMBOL(btf_register);
+
+void btf_unregister(struct btf *btf)
+{
+	if (IS_ERR(btf))
+		return;
+
+	/* btf_put since btf might be held by user */
+	btf_put(btf);
+}
+EXPORT_SYMBOL(btf_unregister);
+
 struct btf *btf_get_by_fd(int fd)
 {
 	struct btf *btf;
@@ -5979,6 +6075,7 @@ u32 btf_obj_id(const struct btf *btf)
 {
 	return btf->id;
 }
+EXPORT_SYMBOL(btf_obj_id);
 
 bool btf_is_kernel(const struct btf *btf)
 {
-- 
2.32.0



* [[RFC xdp-hints] 02/16] net/core: XDP metadata BTF netlink API
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 01/16] bpf: add btf register/unregister API Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 03/16] tools/bpf: Query XDP metadata BTF ID Ederson de Souza
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC
  To: xdp-hints; +Cc: bpf, Saeed Mahameed

From: Saeed Mahameed <saeedm@mellanox.com>

Add new netlink (IFLA_XDP) attributes to be used to query or set up the XDP
metadata BTF state.

IFLA_XDP_MD_BTF_ID: type NLA_U32.
IFLA_XDP_MD_BTF_STATE: type NLA_U8.

On XDP query, the driver reports the currently loaded BTF ID and whether it
is active.

On XDP setup, the driver will use these attributes to activate/deactivate
a specific BTF ID.
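
For readers less familiar with the wire format, the sketch below encodes a
single NLA_U8 attribute the way netlink lays attributes out: a 4-byte header
whose length covers header plus payload, then padding to a 4-byte boundary.
Everything prefixed with `ex_`/`EX_` is hypothetical; in particular the
attribute number is a placeholder assumed for illustration, and a real request
would nest this attribute inside IFLA_XDP in an RTM_SETLINK message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the netlink attribute header layout (struct nlattr). */
struct ex_nlattr {
	uint16_t nla_len;	/* header + payload, excluding padding */
	uint16_t nla_type;
};

/* Attributes are padded to a 4-byte boundary. */
#define EX_NLA_ALIGN(len) (((len) + 3u) & ~3u)

/* Placeholder for IFLA_XDP_MD_BTF_STATE; value assumed for illustration. */
#define EX_IFLA_XDP_MD_BTF_STATE 10

/* Write one u8 attribute into buf; returns bytes used, or 0 if it won't fit. */
static size_t ex_put_u8(uint8_t *buf, size_t cap, uint16_t type, uint8_t val)
{
	struct ex_nlattr hdr = {
		.nla_len = (uint16_t)(sizeof(hdr) + sizeof(val)),
		.nla_type = type,
	};
	size_t total = EX_NLA_ALIGN(hdr.nla_len);

	if (total > cap)
		return 0;

	memset(buf, 0, total);			/* zero the padding bytes */
	memcpy(buf, &hdr, sizeof(hdr));
	buf[sizeof(hdr)] = val;
	return total;
}
```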

Issue: 2114293
Change-Id: I14d57cc104231f970c5ce709c0b21f7ff1711ff1
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 include/linux/netdevice.h    | 15 +++++++++-
 include/uapi/linux/if_link.h |  2 ++
 net/core/dev.c               | 54 ++++++++++++++++++++++++++++++++++++
 net/core/rtnetlink.c         | 18 +++++++++++-
 4 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c871dc223dfa..79a794711cd6 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -938,6 +938,9 @@ enum bpf_netdev_command {
 	 */
 	XDP_SETUP_PROG,
 	XDP_SETUP_PROG_HW,
+	/* Setup/query XDP Meta Data BTF */
+	XDP_SETUP_MD_BTF,
+	XDP_QUERY_MD_BTF,
 	/* BPF program for offload callbacks, invoked at program load time. */
 	BPF_OFFLOAD_MAP_ALLOC,
 	BPF_OFFLOAD_MAP_FREE,
@@ -948,6 +951,7 @@ struct bpf_prog_offload_ops;
 struct netlink_ext_ack;
 struct xdp_umem;
 struct xdp_dev_bulk_queue;
+struct btf;
 struct bpf_xdp_link;
 
 enum bpf_xdp_mode {
@@ -969,7 +973,11 @@ struct netdev_bpf {
 		struct {
 			u32 flags;
 			struct bpf_prog *prog;
-			struct netlink_ext_ack *extack;
+		};
+		/* XDP_{SETUP/QUERY}_MD_BTF */
+		struct {
+			u8 btf_enable; /* only enable/disable for now */
+			u32 btf_id;
 		};
 		/* BPF_OFFLOAD_MAP_ALLOC, BPF_OFFLOAD_MAP_FREE */
 		struct {
@@ -981,6 +989,7 @@ struct netdev_bpf {
 			u16 queue_id;
 		} xsk;
 	};
+	struct netlink_ext_ack *extack;
 };
 
 /* Flags for ndo_xsk_wakeup. */
@@ -4067,6 +4076,10 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
 u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode);
 
+int dev_xdp_setup_md_btf(struct net_device *dev, struct netlink_ext_ack *extack,
+			 u8 enable);
+u32 dev_xdp_query_md_btf(struct net_device *dev, u8 *enabled);
+
 int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
 int dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
 int dev_forward_skb_nomtu(struct net_device *dev, struct sk_buff *skb);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 4882e81514b6..6879a33b63ed 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1204,6 +1204,8 @@ enum {
 	IFLA_XDP_SKB_PROG_ID,
 	IFLA_XDP_HW_PROG_ID,
 	IFLA_XDP_EXPECTED_FD,
+	IFLA_XDP_MD_BTF_ID,
+	IFLA_XDP_MD_BTF_STATE,
 	__IFLA_XDP_MAX,
 };
 
diff --git a/net/core/dev.c b/net/core/dev.c
index fb5d12a3d52d..792bc356582b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9462,6 +9462,60 @@ static void dev_xdp_uninstall(struct net_device *dev)
 	}
 }
 
+
+/**
+ *	dev_xdp_query_md_btf - Query meta data btf of a device
+ *	@dev: device
+ *	@enabled: 1 if enabled, 0 otherwise
+ *
+ *	Returns btf id > 0 if valid
+ */
+u32 dev_xdp_query_md_btf(struct net_device *dev, u8 *enabled)
+{
+	struct netdev_bpf xdp;
+	bpf_op_t ndo_bpf;
+
+	ndo_bpf = dev->netdev_ops->ndo_bpf;
+	if (!ndo_bpf)
+		return 0;
+
+	memset(&xdp, 0, sizeof(xdp));
+	xdp.command = XDP_QUERY_MD_BTF;
+
+	if (ndo_bpf(dev, &xdp))
+		return 0; /* 0 is an invalid btf id */
+
+	*enabled = xdp.btf_enable;
+	return xdp.btf_id;
+}
+
+/**
+ *	dev_xdp_setup_md_btf - enable or disable meta data btf for a device
+ *	@dev: device
+ *	@extack: netlink extended ack
+ *	@enable: 1 to enable, 0 to disable
+ *
+ *	Returns 0 on success
+ */
+int dev_xdp_setup_md_btf(struct net_device *dev, struct netlink_ext_ack *extack,
+			 u8 enable)
+{
+	struct netdev_bpf xdp;
+	bpf_op_t ndo_bpf;
+
+	ndo_bpf = dev->netdev_ops->ndo_bpf;
+	if (!ndo_bpf)
+		return -EOPNOTSUPP;
+
+	memset(&xdp, 0, sizeof(xdp));
+
+	xdp.command = XDP_SETUP_MD_BTF;
+	xdp.btf_enable = enable;
+	xdp.extack = extack;
+
+	return ndo_bpf(dev, &xdp);
+}
+
 static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack,
 			  struct bpf_xdp_link *link, struct bpf_prog *new_prog,
 			  struct bpf_prog *old_prog, u32 flags)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index e79aaf1f7139..961cd7e98054 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1453,8 +1453,9 @@ static int rtnl_xdp_report_one(struct sk_buff *skb, struct net_device *dev,
 
 static int rtnl_xdp_fill(struct sk_buff *skb, struct net_device *dev)
 {
+	u32 prog_id, md_btf_id;
+	u8 md_btf_enabled = 0;
 	struct nlattr *xdp;
-	u32 prog_id;
 	int err;
 	u8 mode;
 
@@ -1487,6 +1488,10 @@ static int rtnl_xdp_fill(struct sk_buff *skb, struct net_device *dev)
 			goto err_cancel;
 	}
 
+	md_btf_id = dev_xdp_query_md_btf(dev, &md_btf_enabled);
+	nla_put_u32(skb, IFLA_XDP_MD_BTF_ID, md_btf_id);
+	nla_put_u8(skb, IFLA_XDP_MD_BTF_STATE, md_btf_enabled);
+
 	nla_nest_end(skb, xdp);
 	return 0;
 
@@ -1931,6 +1936,8 @@ static const struct nla_policy ifla_xdp_policy[IFLA_XDP_MAX + 1] = {
 	[IFLA_XDP_ATTACHED]	= { .type = NLA_U8 },
 	[IFLA_XDP_FLAGS]	= { .type = NLA_U32 },
 	[IFLA_XDP_PROG_ID]	= { .type = NLA_U32 },
+	[IFLA_XDP_MD_BTF_ID]	= { .type = NLA_U32 },
+	[IFLA_XDP_MD_BTF_STATE] = { .type = NLA_U8 },
 };
 
 static const struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla)
@@ -2927,6 +2934,15 @@ static int do_setlink(const struct sk_buff *skb,
 				goto errout;
 			status |= DO_SETLINK_NOTIFY;
 		}
+
+		if (xdp[IFLA_XDP_MD_BTF_STATE]) {
+			u8 enable = nla_get_u8(xdp[IFLA_XDP_MD_BTF_STATE]);
+
+			err = dev_xdp_setup_md_btf(dev, extack, enable);
+			if (err)
+				goto errout;
+			status |= DO_SETLINK_NOTIFY;
+		}
 	}
 
 errout:
-- 
2.32.0



* [[RFC xdp-hints] 03/16] tools/bpf: Query XDP metadata BTF ID
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 01/16] bpf: add btf register/unregister API Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 02/16] net/core: XDP metadata BTF netlink API Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 04/16] tools/bpf: Add xdp set command for md btf Ederson de Souza
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC
  To: xdp-hints; +Cc: bpf, Saeed Mahameed

From: Saeed Mahameed <saeedm@mellanox.com>

When dumping bpf net information, also query XDP MD BTF attributes:

$ /usr/local/sbin/bpftool net
xdp:
mlx0(3) md_btf_id(1) md_btf_enabled(0)

tc:

flow_dissector:

Issue: 2114293
Change-Id: Ifef542ecf3defe4204947618c07cc3eac45b39f9
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 tools/bpf/bpftool/netlink_dumper.c | 21 +++++++++++++++++----
 tools/include/uapi/linux/if_link.h |  2 ++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/bpf/bpftool/netlink_dumper.c b/tools/bpf/bpftool/netlink_dumper.c
index 5f65140b003b..17807a3312ff 100644
--- a/tools/bpf/bpftool/netlink_dumper.c
+++ b/tools/bpf/bpftool/netlink_dumper.c
@@ -29,23 +29,36 @@ static void xdp_dump_prog_id(struct nlattr **tb, int attr,
 static int do_xdp_dump_one(struct nlattr *attr, unsigned int ifindex,
 			   const char *name)
 {
+	unsigned char mode = XDP_ATTACHED_NONE;
 	struct nlattr *tb[IFLA_XDP_MAX + 1];
-	unsigned char mode;
+	unsigned char md_btf_enabled = 0;
+	unsigned int md_btf_id = 0;
+	bool attached;
 
 	if (libbpf_nla_parse_nested(tb, IFLA_XDP_MAX, attr, NULL) < 0)
 		return -1;
 
-	if (!tb[IFLA_XDP_ATTACHED])
+	if (!tb[IFLA_XDP_ATTACHED] && !tb[IFLA_XDP_MD_BTF_ID])
 		return 0;
 
-	mode = libbpf_nla_getattr_u8(tb[IFLA_XDP_ATTACHED]);
-	if (mode == XDP_ATTACHED_NONE)
+	if (tb[IFLA_XDP_ATTACHED])
+		mode = libbpf_nla_getattr_u8(tb[IFLA_XDP_ATTACHED]);
+
+	if (tb[IFLA_XDP_MD_BTF_ID]) {
+		md_btf_id = libbpf_nla_getattr_u32(tb[IFLA_XDP_MD_BTF_ID]);
+		md_btf_enabled = libbpf_nla_getattr_u8(tb[IFLA_XDP_MD_BTF_STATE]);
+	}
+
+	attached = (mode != XDP_ATTACHED_NONE);
+	if (!attached && !md_btf_id)
 		return 0;
 
 	NET_START_OBJECT;
 	if (name)
 		NET_DUMP_STR("devname", "%s", name);
 	NET_DUMP_UINT("ifindex", "(%d)", ifindex);
+	NET_DUMP_UINT("md_btf_id", " md_btf_id(%d)", md_btf_id);
+	NET_DUMP_UINT("md_btf_enabled", " md_btf_enabled(%d)", md_btf_enabled);
 
 	if (mode == XDP_ATTACHED_MULTI) {
 		if (json_output) {
diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
index d208b2af697f..9b45bb3327c2 100644
--- a/tools/include/uapi/linux/if_link.h
+++ b/tools/include/uapi/linux/if_link.h
@@ -992,6 +992,8 @@ enum {
 	IFLA_XDP_SKB_PROG_ID,
 	IFLA_XDP_HW_PROG_ID,
 	IFLA_XDP_EXPECTED_FD,
+	IFLA_XDP_MD_BTF_ID,
+	IFLA_XDP_MD_BTF_STATE,
 	__IFLA_XDP_MAX,
 };
 
-- 
2.32.0



* [[RFC xdp-hints] 04/16] tools/bpf: Add xdp set command for md btf
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (2 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 03/16] tools/bpf: Query XDP metadata BTF ID Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 05/16] igc: Fix race condition in PTP Tx code Ederson de Souza
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC
  To: xdp-hints; +Cc: bpf, Saeed Mahameed

From: Saeed Mahameed <saeedm@mellanox.com>

Introduce a new bpftool net subcommand and use it to report and set XDP
attributes:

$ /usr/local/sbin/bpftool net xdp help
Usage: /usr/local/sbin/bpftool xdp xdp { show | list | set | md_btf} [dev <devname>]
       /usr/local/sbin/bpftool xdp help

$ /usr/local/sbin/bpftool net xdp set dev mlx0 md_btf on

$ /usr/local/sbin/bpftool net xdp show
xdp:
mlx0(3) md_btf_id(1) md_btf_enabled(1)

Issue: 2114293
Change-Id: Id6abe633209852b4957001fcbee6e8b1ae248e4b
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 tools/bpf/bpftool/main.h |   3 +-
 tools/bpf/bpftool/net.c  |   7 +-
 tools/bpf/bpftool/xdp.c  | 310 +++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.h   |   2 +
 tools/lib/bpf/libbpf.map |   1 +
 tools/lib/bpf/netlink.c  |  49 +++++++
 6 files changed, 368 insertions(+), 4 deletions(-)
 create mode 100644 tools/bpf/bpftool/xdp.c

diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index c1cf29798b99..cb2ff2083000 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -168,7 +168,6 @@ int mount_bpffs_for_pin(const char *name);
 int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(int *, char ***));
 int do_pin_fd(int fd, const char *name);
 
-/* commands available in bootstrap mode */
 int do_gen(int argc, char **argv);
 int do_btf(int argc, char **argv);
 
@@ -180,6 +179,7 @@ int do_event_pipe(int argc, char **argv) __weak;
 int do_cgroup(int argc, char **arg) __weak;
 int do_perf(int argc, char **arg) __weak;
 int do_net(int argc, char **arg) __weak;
+int do_xdp(int argc, char **arg) __weak;
 int do_tracelog(int argc, char **arg) __weak;
 int do_feature(int argc, char **argv) __weak;
 int do_struct_ops(int argc, char **argv) __weak;
@@ -257,6 +257,7 @@ struct tcmsg;
 int do_xdp_dump(struct ifinfomsg *ifinfo, struct nlattr **tb);
 int do_filter_dump(struct tcmsg *ifinfo, struct nlattr **tb, const char *kind,
 		   const char *devname, int ifindex);
+int xdp_dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb);
 
 int print_all_levels(__maybe_unused enum libbpf_print_level level,
 		     const char *format, va_list args);
diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index f836d115d7d6..264350a3ca3b 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -349,7 +349,7 @@ static int netlink_get_link(int sock, unsigned int nl_pid,
 			    dump_link_nlmsg, cookie);
 }
 
-static int dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb)
+int xdp_dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb)
 {
 	struct bpf_netdev_t *netinfo = cookie;
 	struct ifinfomsg *ifinfo = msg;
@@ -680,7 +680,7 @@ static int do_show(int argc, char **argv)
 		jsonw_start_array(json_wtr);
 	NET_START_OBJECT;
 	NET_START_ARRAY("xdp", "%s:\n");
-	ret = netlink_get_link(sock, nl_pid, dump_link_nlmsg, &dev_array);
+	ret = netlink_get_link(sock, nl_pid, xdp_dump_link_nlmsg, &dev_array);
 	NET_END_ARRAY("\n");
 
 	if (!ret) {
@@ -722,7 +722,7 @@ static int do_help(int argc, char **argv)
 	}
 
 	fprintf(stderr,
-		"Usage: %1$s %2$s { show | list } [dev <devname>]\n"
+		"Usage: %1$s %2$s { show | list | xdp } [dev <devname>]\n"
 		"       %1$s %2$s attach ATTACH_TYPE PROG dev <devname> [ overwrite ]\n"
 		"       %1$s %2$s detach ATTACH_TYPE dev <devname>\n"
 		"       %1$s %2$s help\n"
@@ -746,6 +746,7 @@ static const struct cmd cmds[] = {
 	{ "list",	do_show },
 	{ "attach",	do_attach },
 	{ "detach",	do_detach },
+	{ "xdp",	do_xdp },
 	{ "help",	do_help },
 	{ 0 }
 };
diff --git a/tools/bpf/bpftool/xdp.c b/tools/bpf/bpftool/xdp.c
new file mode 100644
index 000000000000..f38d692d187c
--- /dev/null
+++ b/tools/bpf/bpftool/xdp.c
@@ -0,0 +1,310 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (C) 2019 Mellanox.
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <time.h>
+#include <net/if.h>
+#include <linux/if.h>
+#include <linux/rtnetlink.h>
+#include <sys/socket.h>
+
+#include "bpf/nlattr.h"
+#include "main.h"
+#include "netlink_dumper.h"
+
+
+/* TODO: reuse from net.c */
+#ifndef SOL_NETLINK
+#define SOL_NETLINK 270
+#endif
+
+static int netlink_open(__u32 *nl_pid)
+{
+	struct sockaddr_nl sa;
+	socklen_t addrlen;
+	int one = 1, ret;
+	int sock;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.nl_family = AF_NETLINK;
+
+	sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
+	if (sock < 0)
+		return -errno;
+
+	if (setsockopt(sock, SOL_NETLINK, NETLINK_EXT_ACK,
+		       &one, sizeof(one)) < 0) {
+		p_err("Netlink error reporting not supported");
+	}
+
+	if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
+		ret = -errno;
+		goto cleanup;
+	}
+
+	addrlen = sizeof(sa);
+	if (getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0) {
+		ret = -errno;
+		goto cleanup;
+	}
+
+	if (addrlen != sizeof(sa)) {
+		ret = -LIBBPF_ERRNO__INTERNAL;
+		goto cleanup;
+	}
+
+	*nl_pid = sa.nl_pid;
+	return sock;
+
+cleanup:
+	close(sock);
+	return ret;
+}
+
+typedef int (*dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
+
+typedef int (*__dump_nlmsg_t)(struct nlmsghdr *nlmsg, dump_nlmsg_t, void *cookie);
+
+static int netlink_recv(int sock, __u32 nl_pid, __u32 seq,
+			__dump_nlmsg_t _fn, dump_nlmsg_t fn,
+			void *cookie)
+{
+	bool multipart = true;
+	struct nlmsgerr *err;
+	struct nlmsghdr *nh;
+	char buf[4096];
+	int len, ret;
+
+	while (multipart) {
+		multipart = false;
+		len = recv(sock, buf, sizeof(buf), 0);
+		if (len < 0) {
+			ret = -errno;
+			goto done;
+		}
+
+		if (len == 0)
+			break;
+
+		for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
+		     nh = NLMSG_NEXT(nh, len)) {
+			if (nh->nlmsg_pid != nl_pid) {
+				ret = -LIBBPF_ERRNO__WRNGPID;
+				goto done;
+			}
+			if (nh->nlmsg_seq != seq) {
+				ret = -LIBBPF_ERRNO__INVSEQ;
+				goto done;
+			}
+			if (nh->nlmsg_flags & NLM_F_MULTI)
+				multipart = true;
+			switch (nh->nlmsg_type) {
+			case NLMSG_ERROR:
+				err = (struct nlmsgerr *)NLMSG_DATA(nh);
+				if (!err->error)
+					continue;
+				ret = err->error;
+				libbpf_nla_dump_errormsg(nh);
+				goto done;
+			case NLMSG_DONE:
+				return 0;
+			default:
+				break;
+			}
+			if (_fn) {
+				ret = _fn(nh, fn, cookie);
+				if (ret)
+					return ret;
+			}
+		}
+	}
+	ret = 0;
+done:
+	return ret;
+}
+
+
+static int __dump_link_nlmsg(struct nlmsghdr *nlh,
+			     dump_nlmsg_t dump_link_nlmsg, void *cookie)
+{
+	struct nlattr *tb[IFLA_MAX + 1], *attr;
+	struct ifinfomsg *ifi = NLMSG_DATA(nlh);
+	int len;
+
+	len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*ifi));
+	attr = (struct nlattr *) ((void *) ifi + NLMSG_ALIGN(sizeof(*ifi)));
+	if (libbpf_nla_parse(tb, IFLA_MAX, attr, len, NULL) != 0)
+		return -LIBBPF_ERRNO__NLPARSE;
+
+	return dump_link_nlmsg(cookie, ifi, tb);
+}
+
+static int netlink_get_link(int sock, unsigned int nl_pid,
+			    dump_nlmsg_t dump_link_nlmsg, void *cookie)
+{
+	struct {
+		struct nlmsghdr nlh;
+		struct ifinfomsg ifm;
+	} req = {
+		.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
+		.nlh.nlmsg_type = RTM_GETLINK,
+		.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
+		.ifm.ifi_family = AF_PACKET,
+	};
+	int seq = time(NULL);
+
+	req.nlh.nlmsg_seq = seq;
+	if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0)
+		return -errno;
+
+	return netlink_recv(sock, nl_pid, seq, __dump_link_nlmsg,
+			    dump_link_nlmsg, cookie);
+}
+
+struct ip_devname_ifindex {
+	char	devname[64];
+	int	ifindex;
+};
+
+struct bpf_netdev_t {
+	struct ip_devname_ifindex *devices;
+	int	used_len;
+	int	array_len;
+	int	filter_idx;
+};
+
+static int do_show(int argc, char **argv)
+{
+	int sock, ret, filter_idx = -1;
+	struct bpf_netdev_t dev_array;
+	unsigned int nl_pid = 0;
+	char err_buf[256];
+
+	if (argc == 2) {
+		if (strcmp(argv[0], "dev") != 0)
+			usage();
+		filter_idx = if_nametoindex(argv[1]);
+		if (filter_idx == 0) {
+			fprintf(stderr, "invalid dev name %s\n", argv[1]);
+			return -1;
+		}
+	} else if (argc != 0) {
+		usage();
+	}
+
+	sock = netlink_open(&nl_pid);
+	if (sock < 0) {
+		fprintf(stderr, "failed to open netlink sock\n");
+		return -1;
+	}
+
+	dev_array.devices = NULL;
+	dev_array.used_len = 0;
+	dev_array.array_len = 0;
+	dev_array.filter_idx = filter_idx;
+
+	if (json_output)
+		jsonw_start_array(json_wtr);
+	NET_START_OBJECT;
+	NET_START_ARRAY("xdp", "%s:\n");
+	ret = netlink_get_link(sock, nl_pid, xdp_dump_link_nlmsg, &dev_array);
+	NET_END_ARRAY("\n");
+
+	NET_END_OBJECT;
+	if (json_output)
+		jsonw_end_array(json_wtr);
+
+	if (ret) {
+		if (json_output)
+			jsonw_null(json_wtr);
+		libbpf_strerror(ret, err_buf, sizeof(err_buf));
+		fprintf(stderr, "Error: %s\n", err_buf);
+	}
+	free(dev_array.devices);
+	close(sock);
+	return ret;
+}
+
+static int set_usage(void)
+{
+	fprintf(stderr,
+		"Usage: %s net xdp set dev <devname> {md_btf {on|off}}\n"
+		"       %s net xdp set help\n"
+		"       md_btf {on|off}: enable/disable meta data btf\n",
+		bin_name, bin_name);
+
+	return -1;
+}
+
+static int xdp_set_md_btf(int ifindex, char *arg)
+{
+	__u8 enable = (strcmp(arg, "on") == 0) ? 1 : 0;
+	int ret;
+
+	ret = bpf_set_link_xdp_md_btf(ifindex, enable);
+	if (ret)
+		fprintf(stderr, "Failed to setup xdp md, err=%d\n", ret);
+
+	return -ret;
+}
+
+static int do_set(int argc, char **argv)
+{
+	char *set_cmd, *set_arg;
+	int dev_idx = -1;
+
+	if (argc < 4)
+		return set_usage();
+
+	if (strcmp(argv[0], "dev") != 0)
+		return set_usage();
+
+	dev_idx = if_nametoindex(argv[1]);
+	if (dev_idx == 0) {
+		fprintf(stderr, "invalid dev name %s\n", argv[1]);
+		return -1;
+	}
+
+	set_cmd = argv[2];
+	set_arg = argv[3];
+
+	if (strcmp(set_cmd, "md_btf") != 0)
+		return set_usage();
+
+	if (strcmp(set_arg, "on") != 0 && strcmp(set_arg, "off") != 0)
+		return set_usage();
+
+	return xdp_set_md_btf(dev_idx, set_arg);
+}
+
+static int do_help(int argc, char **argv)
+{
+	if (json_output) {
+		jsonw_null(json_wtr);
+		return 0;
+	}
+
+	fprintf(stderr,
+		"Usage: %s %s xdp { show | list | set } [dev <devname>]\n"
+		"       %s %s help\n",
+		bin_name, argv[-2], bin_name, argv[-2]);
+
+	return 0;
+}
+
+static const struct cmd cmds[] = {
+	{ "show",        do_show },
+	{ "list",        do_show },
+	{ "set",         do_set  },
+	{ "help",        do_help },
+	{ 0 }
+};
+
+int do_xdp(int argc, char **argv)
+{
+	return cmd_select(cmds, argc, argv, do_help);
+}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 6e61342ba56c..5075cf9fd509 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -520,6 +520,7 @@ LIBBPF_API int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags);
 LIBBPF_API int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
 					const struct bpf_xdp_set_link_opts *opts);
 LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags);
+
 LIBBPF_API int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
 				     size_t info_size, __u32 flags);
 
@@ -607,6 +608,7 @@ struct perf_buffer_opts {
 LIBBPF_API struct perf_buffer *
 perf_buffer__new(int map_fd, size_t page_cnt,
 		 const struct perf_buffer_opts *opts);
+LIBBPF_API int bpf_set_link_xdp_md_btf(int ifindex, __u8 enable);
 
 enum bpf_perf_event_ret {
 	LIBBPF_PERF_EVENT_DONE	= 0,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 944c99d1ded3..492db50a4cd7 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -135,6 +135,7 @@ LIBBPF_0.0.2 {
 		bpf_object__btf;
 		bpf_object__find_map_fd_by_name;
 		bpf_get_link_xdp_id;
+		bpf_set_link_xdp_md_btf;
 		btf__dedup;
 		btf__get_map_kv_tids;
 		btf__get_nr_types;
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
index 39f25e09b51e..4f79972943e4 100644
--- a/tools/lib/bpf/netlink.c
+++ b/tools/lib/bpf/netlink.c
@@ -242,6 +242,55 @@ int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
 	return libbpf_err(ret);
 }
 
+int bpf_set_link_xdp_md_btf(int ifindex, __u8  enable)
+{
+	struct nlattr *nla, *nla_xdp;
+	int sock, seq = 0, ret;
+	__u32 nl_pid;
+	struct {
+		struct nlmsghdr  nh;
+		struct ifinfomsg ifinfo;
+		char             attrbuf[64];
+	} req;
+
+	sock = libbpf_netlink_open(&nl_pid);
+	if (sock < 0)
+		return sock;
+
+	memset(&req, 0, sizeof(req));
+	req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
+	req.nh.nlmsg_type = RTM_SETLINK;
+	req.nh.nlmsg_pid = 0;
+	req.nh.nlmsg_seq = ++seq;
+	req.ifinfo.ifi_family = AF_UNSPEC;
+	req.ifinfo.ifi_index = ifindex;
+
+	/* started nested attribute for XDP */
+	nla = (struct nlattr *)(((char *)&req)
+				+ NLMSG_ALIGN(req.nh.nlmsg_len));
+	nla->nla_type = NLA_F_NESTED | IFLA_XDP;
+	nla->nla_len = NLA_HDRLEN;
+	/* add XDP MD setup */
+	nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
+	nla_xdp->nla_type = IFLA_XDP_MD_BTF_STATE;
+	nla_xdp->nla_len = NLA_HDRLEN + sizeof(__u8);
+	memcpy((char *)nla_xdp + NLA_HDRLEN, &enable, sizeof(__u8));
+	nla->nla_len += nla_xdp->nla_len;
+
+	req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);
+
+	if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) {
+		ret = -errno;
+		goto cleanup;
+	}
+	ret = libbpf_netlink_recv(sock, nl_pid, seq, NULL, NULL, NULL);
+
+cleanup:
+	close(sock);
+	return ret;
+}
+
 static int __dump_link_nlmsg(struct nlmsghdr *nlh,
 			     libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
 {
-- 
2.32.0



* [[RFC xdp-hints] 05/16] igc: Fix race condition in PTP Tx code
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (3 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 04/16] tools/bpf: Add xdp set command for md btf Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 06/16] igc: Retrieve the TX timestamp directly (instead of in a interrupt) Ederson de Souza
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf, Andre Guedes

From: Andre Guedes <andre.guedes@intel.com>

Currently, the igc driver supports timestamping only one Tx packet at a
time. During the transmission flow, the skb that requires hardware
timestamping is saved in adapter->ptp_tx_skb. Once hardware has the
timestamp, an interrupt is delivered, and adapter->ptp_tx_work is
scheduled. In igc_ptp_tx_work(), we read the timestamp register, update
adapter->ptp_tx_skb, and notify the network stack.

While the thread executing the transmission flow (the user process
running in kernel mode) and the thread executing ptp_tx_work don't
access adapter->ptp_tx_skb concurrently, there are two other places
where adapter->ptp_tx_skb is accessed: igc_ptp_tx_hang() and
igc_ptp_suspend().

igc_ptp_tx_hang() is executed by the adapter->watchdog_task worker
thread, which runs periodically, so two threads can end up accessing
ptp_tx_skb at the same time. Consider the following scenario: right
after __IGC_PTP_TX_IN_PROGRESS is set in igc_xmit_frame_ring(),
igc_ptp_tx_hang() is executed. Since adapter->ptp_tx_start hasn't been
written yet, the check is made against a stale value, the request is
considered timed out, and adapter->ptp_tx_skb is cleaned up.

This patch fixes the issue described above by adding ptp_tx_lock to
protect access to the ptp_tx_skb and ptp_tx_start fields of igc_adapter.
Since igc_xmit_frame_ring() is called in atomic context by the
networking stack, ptp_tx_lock is defined as a spinlock.

With ptp_tx_lock in place, the __IGC_PTP_TX_IN_PROGRESS flag no longer
serves a purpose, so this patch removes it.
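
As a rough illustration of the locking scheme described above, here is a
minimal userspace sketch. It is not driver code: the struct, the function
names, and the C11 atomic_flag spinlock are hypothetical stand-ins for
igc_adapter, the transmit and watchdog paths, and ptp_tx_lock. What it
tries to show is that the slot and its start time are written under one
lock, so the watchdog cannot see a claimed slot paired with a stale
start time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the igc driver. */
struct tx_ts_state {
	atomic_flag lock;	/* stand-in for adapter->ptp_tx_lock */
	const void *skb;	/* stand-in for adapter->ptp_tx_skb */
	unsigned long start;	/* stand-in for adapter->ptp_tx_start */
	unsigned long timeouts;	/* stand-in for tx_hwtstamp_timeouts */
};

void tx_ts_lock(struct tx_ts_state *s)
{
	/* Spin until the flag was clear before our test-and-set. */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;
}

void tx_ts_unlock(struct tx_ts_state *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

void tx_ts_init(struct tx_ts_state *s)
{
	atomic_flag_clear(&s->lock);
	s->skb = NULL;
	s->start = 0;
	s->timeouts = 0;
}

/* Transmit path: claim the slot and record the start time in one
 * critical section. Returns false if a request is already pending.
 */
bool tx_ts_request(struct tx_ts_state *s, const void *skb, unsigned long now)
{
	bool claimed = false;

	assert(skb != NULL);

	tx_ts_lock(s);
	if (!s->skb) {
		s->skb = skb;
		s->start = now;
		claimed = true;
	}
	tx_ts_unlock(s);

	return claimed;
}

/* Watchdog path: drop a pending request once 'timeout' ticks have
 * elapsed since it was claimed. Returns true if a request was dropped.
 */
bool tx_ts_check_hang(struct tx_ts_state *s, unsigned long now,
		      unsigned long timeout)
{
	bool timed_out = false;

	tx_ts_lock(s);
	if (s->skb && now - s->start >= timeout) {
		s->skb = NULL;
		s->start = 0;
		s->timeouts++;
		timed_out = true;
	}
	tx_ts_unlock(s);

	return timed_out;
}
```

In this model, a caller would run tx_ts_request() from the transmit path
and tx_ts_check_hang() from a periodic task. Because both take the same
lock, the watchdog either sees no pending request or sees one whose
start time has already been set.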

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |  5 ++-
 drivers/net/ethernet/intel/igc/igc_main.c |  7 +++-
 drivers/net/ethernet/intel/igc/igc_ptp.c  | 49 ++++++++++++++---------
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index a0ecfe5a4078..10635588263e 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -217,6 +217,10 @@ struct igc_adapter {
 	struct ptp_clock *ptp_clock;
 	struct ptp_clock_info ptp_caps;
 	struct work_struct ptp_tx_work;
+	/* Access to ptp_tx_skb and ptp_tx_start is protected by the
+	 * ptp_tx_lock.
+	 */
+	spinlock_t ptp_tx_lock;
 	struct sk_buff *ptp_tx_skb;
 	struct hwtstamp_config tstamp_config;
 	unsigned long ptp_tx_start;
@@ -387,7 +391,6 @@ enum igc_state_t {
 	__IGC_TESTING,
 	__IGC_RESETTING,
 	__IGC_DOWN,
-	__IGC_PTP_TX_IN_PROGRESS,
 };
 
 enum igc_tx_flags {
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 5c95bf82eaf7..ae6ceb0790d8 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1439,13 +1439,14 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
 		struct igc_adapter *adapter = netdev_priv(tx_ring->netdev);
 
+		spin_lock(&adapter->ptp_tx_lock);
+
 		/* FIXME: add support for retrieving timestamps from
 		 * the other timer registers before skipping the
 		 * timestamping request.
 		 */
 		if (adapter->tstamp_config.tx_type == HWTSTAMP_TX_ON &&
-		    !test_and_set_bit_lock(__IGC_PTP_TX_IN_PROGRESS,
-					   &adapter->state)) {
+		    !adapter->ptp_tx_skb) {
 			skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 			tx_flags |= IGC_TX_FLAGS_TSTAMP;
 
@@ -1454,6 +1455,8 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 		} else {
 			adapter->tx_hwtstamp_skipped++;
 		}
+
+		spin_unlock(&adapter->ptp_tx_lock);
 	}
 
 	if (skb_vlan_tag_present(skb)) {
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 69617d2c1be2..92ed2760485b 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -598,35 +598,35 @@ static int igc_ptp_set_timestamp_mode(struct igc_adapter *adapter,
 	return 0;
 }
 
+/* Requires adapter->ptp_tx_lock held by caller. */
 static void igc_ptp_tx_timeout(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
 
 	dev_kfree_skb_any(adapter->ptp_tx_skb);
 	adapter->ptp_tx_skb = NULL;
+	adapter->ptp_tx_start = 0;
 	adapter->tx_hwtstamp_timeouts++;
-	clear_bit_unlock(__IGC_PTP_TX_IN_PROGRESS, &adapter->state);
 	/* Clear the tx valid bit in TSYNCTXCTL register to enable interrupt. */
 	rd32(IGC_TXSTMPH);
+
 	netdev_warn(adapter->netdev, "Tx timestamp timeout\n");
 }
 
 void igc_ptp_tx_hang(struct igc_adapter *adapter)
 {
-	bool timeout = time_is_before_jiffies(adapter->ptp_tx_start +
-					      IGC_PTP_TX_TIMEOUT);
+	spin_lock(&adapter->ptp_tx_lock);
 
-	if (!test_bit(__IGC_PTP_TX_IN_PROGRESS, &adapter->state))
-		return;
+	if (!adapter->ptp_tx_skb)
+		goto unlock;
 
-	/* If we haven't received a timestamp within the timeout, it is
-	 * reasonable to assume that it will never occur, so we can unlock the
-	 * timestamp bit when this occurs.
-	 */
-	if (timeout) {
-		cancel_work_sync(&adapter->ptp_tx_work);
-		igc_ptp_tx_timeout(adapter);
-	}
+	if (time_is_after_jiffies(adapter->ptp_tx_start + IGC_PTP_TX_TIMEOUT))
+		goto unlock;
+
+	igc_ptp_tx_timeout(adapter);
+
+unlock:
+	spin_unlock(&adapter->ptp_tx_lock);
 }
 
 /**
@@ -636,6 +636,8 @@ void igc_ptp_tx_hang(struct igc_adapter *adapter)
  * If we were asked to do hardware stamping and such a time stamp is
  * available, then it must have been for this skb here because we only
  * allow only one such packet into the queue.
+ *
+ * Context: Expects adapter->ptp_tx_lock to be held by caller.
  */
 static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
 {
@@ -676,7 +678,7 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
 	 * while we're notifying the stack.
 	 */
 	adapter->ptp_tx_skb = NULL;
-	clear_bit_unlock(__IGC_PTP_TX_IN_PROGRESS, &adapter->state);
+	adapter->ptp_tx_start = 0;
 
 	/* Notify the stack and free the skb after we've unlocked */
 	skb_tstamp_tx(skb, &shhwtstamps);
@@ -697,14 +699,19 @@ static void igc_ptp_tx_work(struct work_struct *work)
 	struct igc_hw *hw = &adapter->hw;
 	u32 tsynctxctl;
 
-	if (!test_bit(__IGC_PTP_TX_IN_PROGRESS, &adapter->state))
-		return;
+	spin_lock(&adapter->ptp_tx_lock);
+
+	if (!adapter->ptp_tx_skb)
+		goto unlock;
 
 	tsynctxctl = rd32(IGC_TSYNCTXCTL);
 	if (WARN_ON_ONCE(!(tsynctxctl & IGC_TSYNCTXCTL_TXTT_0)))
-		return;
+		goto unlock;
 
 	igc_ptp_tx_hwtstamp(adapter);
+
+unlock:
+	spin_unlock(&adapter->ptp_tx_lock);
 }
 
 /**
@@ -795,6 +802,7 @@ void igc_ptp_init(struct igc_adapter *adapter)
 	}
 
 	spin_lock_init(&adapter->tmreg_lock);
+	spin_lock_init(&adapter->ptp_tx_lock);
 	INIT_WORK(&adapter->ptp_tx_work, igc_ptp_tx_work);
 
 	adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
@@ -845,9 +853,14 @@ void igc_ptp_suspend(struct igc_adapter *adapter)
 		return;
 
 	cancel_work_sync(&adapter->ptp_tx_work);
+
+	spin_lock(&adapter->ptp_tx_lock);
+
 	dev_kfree_skb_any(adapter->ptp_tx_skb);
 	adapter->ptp_tx_skb = NULL;
-	clear_bit_unlock(__IGC_PTP_TX_IN_PROGRESS, &adapter->state);
+	adapter->ptp_tx_start = 0;
+
+	spin_unlock(&adapter->ptp_tx_lock);
 
 	igc_ptp_time_save(adapter);
 }
-- 
2.32.0



* [[RFC xdp-hints] 06/16] igc: Retrieve the TX timestamp directly (instead of in a interrupt)
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (4 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 05/16] igc: Fix race condition in PTP Tx code Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 07/16] igc: Add support for multiple in-flight TX timestamps Ederson de Souza
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf, Vinicius Costa Gomes

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>

Handling the TX timestamp interrupt is simple enough to be done
directly in interrupt context without causing issues. Doing so removes
the deferred work item, which simplifies the processing and is
potentially more performant.

This patch is inspired by the approach used in the i40e driver.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |  2 +-
 drivers/net/ethernet/intel/igc/igc_main.c |  6 +++-
 drivers/net/ethernet/intel/igc/igc_ptp.c  | 41 +++++------------------
 3 files changed, 14 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 10635588263e..2b07a9dd29bb 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -216,7 +216,6 @@ struct igc_adapter {
 
 	struct ptp_clock *ptp_clock;
 	struct ptp_clock_info ptp_caps;
-	struct work_struct ptp_tx_work;
 	/* Access to ptp_tx_skb and ptp_tx_start is protected by the
 	 * ptp_tx_lock.
 	 */
@@ -619,6 +618,7 @@ void igc_ptp_reset(struct igc_adapter *adapter);
 void igc_ptp_suspend(struct igc_adapter *adapter);
 void igc_ptp_stop(struct igc_adapter *adapter);
 ktime_t igc_ptp_rx_pktstamp(struct igc_adapter *adapter, __le32 *buf);
+void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter);
 int igc_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
 int igc_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
 void igc_ptp_tx_hang(struct igc_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index ae6ceb0790d8..400b9de51475 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4999,8 +4999,12 @@ static void igc_tsync_interrupt(struct igc_adapter *adapter)
 	}
 
 	if (tsicr & IGC_TSICR_TXTS) {
+		u32 tsynctxctl = rd32(IGC_TSYNCTXCTL);
+
 		/* retrieve hardware timestamp */
-		schedule_work(&adapter->ptp_tx_work);
+		if (tsynctxctl & IGC_TSYNCTXCTL_TXTT_0)
+			igc_ptp_tx_hwtstamp(adapter);
+
 		ack |= IGC_TSICR_TXTS;
 	}
 
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 92ed2760485b..3ec0baa8451a 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -639,16 +639,19 @@ void igc_ptp_tx_hang(struct igc_adapter *adapter)
  *
  * Context: Expects adapter->ptp_tx_lock to be held by caller.
  */
-static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
+void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
 {
-	struct sk_buff *skb = adapter->ptp_tx_skb;
 	struct skb_shared_hwtstamps shhwtstamps;
 	struct igc_hw *hw = &adapter->hw;
+	struct sk_buff *skb;
 	int adjust = 0;
 	u64 regval;
 
+	spin_lock(&adapter->ptp_tx_lock);
+	skb = adapter->ptp_tx_skb;
+
 	if (WARN_ON_ONCE(!skb))
-		return;
+		goto done;
 
 	regval = rd32(IGC_TXSTMPL);
 	regval |= (u64)rd32(IGC_TXSTMPH) << 32;
@@ -683,35 +686,10 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
 	/* Notify the stack and free the skb after we've unlocked */
 	skb_tstamp_tx(skb, &shhwtstamps);
 	dev_kfree_skb_any(skb);
-}
 
-/**
- * igc_ptp_tx_work
- * @work: pointer to work struct
- *
- * This work function polls the TSYNCTXCTL valid bit to determine when a
- * timestamp has been taken for the current stored skb.
- */
-static void igc_ptp_tx_work(struct work_struct *work)
-{
-	struct igc_adapter *adapter = container_of(work, struct igc_adapter,
-						   ptp_tx_work);
-	struct igc_hw *hw = &adapter->hw;
-	u32 tsynctxctl;
-
-	spin_lock(&adapter->ptp_tx_lock);
-
-	if (!adapter->ptp_tx_skb)
-		goto unlock;
-
-	tsynctxctl = rd32(IGC_TSYNCTXCTL);
-	if (WARN_ON_ONCE(!(tsynctxctl & IGC_TSYNCTXCTL_TXTT_0)))
-		goto unlock;
-
-	igc_ptp_tx_hwtstamp(adapter);
-
-unlock:
+done:
 	spin_unlock(&adapter->ptp_tx_lock);
+
 }
 
 /**
@@ -803,7 +781,6 @@ void igc_ptp_init(struct igc_adapter *adapter)
 
 	spin_lock_init(&adapter->tmreg_lock);
 	spin_lock_init(&adapter->ptp_tx_lock);
-	INIT_WORK(&adapter->ptp_tx_work, igc_ptp_tx_work);
 
 	adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
 	adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF;
@@ -852,8 +829,6 @@ void igc_ptp_suspend(struct igc_adapter *adapter)
 	if (!(adapter->ptp_flags & IGC_PTP_ENABLED))
 		return;
 
-	cancel_work_sync(&adapter->ptp_tx_work);
-
 	spin_lock(&adapter->ptp_tx_lock);
 
 	dev_kfree_skb_any(adapter->ptp_tx_skb);
-- 
2.32.0



* [[RFC xdp-hints] 07/16] igc: Add support for multiple in-flight TX timestamps
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (5 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 06/16] igc: Retrieve the TX timestamp directly (instead of in a interrupt) Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 08/16] igc: Use irq safe locks for timestamping Ederson de Souza
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf, Vinicius Costa Gomes

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>

Add support for using the four sets of timestamping registers that the
i225 has available for TX.

In some TSN workloads, where multiple applications request hardware
transmission timestamps, some of those requests could be denied because
the single register in use was already occupied.
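
To make the idea concrete, here is a small userspace sketch of handing
out one of four timestamp slots per request and completing them from a
"valid" bitmask. All names in it (tstamp_pool, pool_request(),
pool_complete(), and so on) are hypothetical and only loosely mirror
what the patch below does. It leaves out locking, the hardware
registers, and skb reference counting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TX_TSTAMP_SLOTS 4

/* Hypothetical model of one timestamp register set. */
struct tstamp_slot {
	const void *pkt;	/* NULL means the slot is free */
	uint32_t valid_bit;	/* bit set in the mask when a timestamp is ready */
};

struct tstamp_pool {
	struct tstamp_slot slot[MAX_TX_TSTAMP_SLOTS];
	unsigned int skipped;	/* requests refused because all slots were busy */
};

void pool_init(struct tstamp_pool *p)
{
	int i;

	for (i = 0; i < MAX_TX_TSTAMP_SLOTS; i++) {
		p->slot[i].pkt = NULL;
		p->slot[i].valid_bit = UINT32_C(1) << i;
	}
	p->skipped = 0;
}

/* Claim the first free slot for 'pkt'. Returns the slot index, or -1
 * (and counts a skipped request) when every slot is occupied.
 */
int pool_request(struct tstamp_pool *p, const void *pkt)
{
	int i;

	assert(pkt != NULL);

	for (i = 0; i < MAX_TX_TSTAMP_SLOTS; i++) {
		if (p->slot[i].pkt)
			continue;
		p->slot[i].pkt = pkt;
		return i;
	}

	p->skipped++;
	return -1;
}

/* Release every busy slot whose valid bit is set in 'mask'.
 * Returns how many slots were completed.
 */
int pool_complete(struct tstamp_pool *p, uint32_t mask)
{
	int i, done = 0;

	for (i = 0; i < MAX_TX_TSTAMP_SLOTS; i++) {
		if (!(mask & p->slot[i].valid_bit) || !p->slot[i].pkt)
			continue;
		p->slot[i].pkt = NULL;
		done++;
	}

	return done;
}
```

With a single slot, the second concurrent request would always be
refused. With four slots, a request is only skipped once all four are
pending at the same time.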

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h         |  20 ++-
 drivers/net/ethernet/intel/igc/igc_base.h    |   3 +
 drivers/net/ethernet/intel/igc/igc_defines.h |   7 +
 drivers/net/ethernet/intel/igc/igc_main.c    |  45 +++--
 drivers/net/ethernet/intel/igc/igc_ptp.c     | 172 +++++++++++++------
 drivers/net/ethernet/intel/igc/igc_regs.h    |  12 ++
 6 files changed, 192 insertions(+), 67 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 2b07a9dd29bb..6fd5901f07c7 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -67,6 +67,17 @@ struct igc_rx_packet_stats {
 	u64 other_packets;
 };
 
+#define IGC_MAX_TX_TSTAMP_TIMERS	4
+
+struct igc_tx_timestamp_request {
+	struct sk_buff *skb;
+	unsigned long start;
+	u32 mask;
+	u32 regl;
+	u32 regh;
+	u32 flags;
+};
+
 struct igc_ring_container {
 	struct igc_ring *ring;          /* pointer to linked list of rings */
 	unsigned int total_bytes;       /* total bytes processed this int */
@@ -220,9 +231,8 @@ struct igc_adapter {
 	 * ptp_tx_lock.
 	 */
 	spinlock_t ptp_tx_lock;
-	struct sk_buff *ptp_tx_skb;
+	struct igc_tx_timestamp_request tx_tstamp[IGC_MAX_TX_TSTAMP_TIMERS];
 	struct hwtstamp_config tstamp_config;
-	unsigned long ptp_tx_start;
 	unsigned int ptp_flags;
 	/* System time value lock */
 	spinlock_t tmreg_lock;
@@ -401,6 +411,10 @@ enum igc_tx_flags {
 	/* olinfo flags */
 	IGC_TX_FLAGS_IPV4	= 0x10,
 	IGC_TX_FLAGS_CSUM	= 0x20,
+
+	IGC_TX_FLAGS_TSTAMP_1	= 0x100,
+	IGC_TX_FLAGS_TSTAMP_2	= 0x200,
+	IGC_TX_FLAGS_TSTAMP_3	= 0x400,
 };
 
 enum igc_boards {
@@ -618,7 +632,7 @@ void igc_ptp_reset(struct igc_adapter *adapter);
 void igc_ptp_suspend(struct igc_adapter *adapter);
 void igc_ptp_stop(struct igc_adapter *adapter);
 ktime_t igc_ptp_rx_pktstamp(struct igc_adapter *adapter, __le32 *buf);
-void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter);
+void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter, u32 mask);
 int igc_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
 int igc_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
 void igc_ptp_tx_hang(struct igc_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/igc/igc_base.h b/drivers/net/ethernet/intel/igc/igc_base.h
index ce530f5fd7bd..0d2b4482cb2f 100644
--- a/drivers/net/ethernet/intel/igc/igc_base.h
+++ b/drivers/net/ethernet/intel/igc/igc_base.h
@@ -32,6 +32,9 @@ struct igc_adv_tx_context_desc {
 
 /* Adv Transmit Descriptor Config Masks */
 #define IGC_ADVTXD_MAC_TSTAMP	0x00080000 /* IEEE1588 Timestamp packet */
+#define IGC_ADVTXD_TSTAMP_REG_1	0x00010000 /* Use Tx timestamp register 1 */
+#define IGC_ADVTXD_TSTAMP_REG_2	0x00020000 /* Use Tx timestamp register 2 */
+#define IGC_ADVTXD_TSTAMP_REG_3	0x00030000 /* Use Tx timestamp register 3 */
 #define IGC_ADVTXD_DTYP_CTXT	0x00200000 /* Advanced Context Descriptor */
 #define IGC_ADVTXD_DTYP_DATA	0x00300000 /* Advanced Data Descriptor */
 #define IGC_ADVTXD_DCMD_EOP	0x01000000 /* End of Packet */
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index c6315690e20f..04759c2c81eb 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -446,6 +446,9 @@
 
 /* Time Sync Transmit Control bit definitions */
 #define IGC_TSYNCTXCTL_TXTT_0			0x00000001  /* Tx timestamp reg 0 valid */
+#define IGC_TSYNCTXCTL_TXTT_1			0x00000002  /* Tx timestamp reg 1 valid */
+#define IGC_TSYNCTXCTL_TXTT_2			0x00000004  /* Tx timestamp reg 2 valid */
+#define IGC_TSYNCTXCTL_TXTT_3			0x00000008  /* Tx timestamp reg 3 valid */
 #define IGC_TSYNCTXCTL_ENABLED			0x00000010  /* enable Tx timestamping */
 #define IGC_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK	0x0000F000  /* max delay */
 #define IGC_TSYNCTXCTL_SYNC_COMP_ERR		0x20000000  /* sync err */
@@ -453,6 +456,10 @@
 #define IGC_TSYNCTXCTL_START_SYNC		0x80000000  /* initiate sync */
 #define IGC_TSYNCTXCTL_TXSYNSIG			0x00000020  /* Sample TX tstamp in PHY sop */
 
+#define IGC_TSYNCTXCTL_TXTT_ANY ( \
+		IGC_TSYNCTXCTL_TXTT_0 | IGC_TSYNCTXCTL_TXTT_1 | \
+		IGC_TSYNCTXCTL_TXTT_2 | IGC_TSYNCTXCTL_TXTT_3)
+
 /* Timer selection bits */
 #define IGC_AUX_IO_TIMER_SEL_SYSTIM0	(0u << 30) /* Select SYSTIM0 for auxiliary time stamp */
 #define IGC_AUX_IO_TIMER_SEL_SYSTIM1	(1u << 30) /* Select SYSTIM1 for auxiliary time stamp */
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 400b9de51475..a2e0b71d1f4e 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1146,6 +1146,15 @@ static u32 igc_tx_cmd_type(struct sk_buff *skb, u32 tx_flags)
 	cmd_type |= IGC_SET_FLAG(tx_flags, IGC_TX_FLAGS_TSTAMP,
 				 (IGC_ADVTXD_MAC_TSTAMP));
 
+	cmd_type |= IGC_SET_FLAG(tx_flags, IGC_TX_FLAGS_TSTAMP_1,
+				 (IGC_ADVTXD_TSTAMP_REG_1));
+
+	cmd_type |= IGC_SET_FLAG(tx_flags, IGC_TX_FLAGS_TSTAMP_2,
+				 (IGC_ADVTXD_TSTAMP_REG_2));
+
+	cmd_type |= IGC_SET_FLAG(tx_flags, IGC_TX_FLAGS_TSTAMP_3,
+				 (IGC_ADVTXD_TSTAMP_REG_3));
+
 	/* insert frame checksum */
 	cmd_type ^= IGC_SET_FLAG(skb->no_fcs, 1, IGC_ADVTXD_DCMD_IFCS);
 
@@ -1403,6 +1412,26 @@ static int igc_tso(struct igc_ring *tx_ring,
 	return 1;
 }
 
+static bool igc_request_tx_tstamp(struct igc_adapter *adapter, struct sk_buff *skb, u32 *flags)
+{
+	int i;
+
+	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
+		struct igc_tx_timestamp_request *tstamp = &adapter->tx_tstamp[i];
+
+		if (tstamp->skb)
+			continue;
+
+		tstamp->skb = skb_get(skb);
+		tstamp->start = jiffies;
+		*flags = tstamp->flags;
+
+		return true;
+	}
+
+	return false;
+}
+
 static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 				       struct igc_ring *tx_ring)
 {
@@ -1438,20 +1467,14 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 
 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
 		struct igc_adapter *adapter = netdev_priv(tx_ring->netdev);
+		u32 tstamp_flags;
 
 		spin_lock(&adapter->ptp_tx_lock);
 
-		/* FIXME: add support for retrieving timestamps from
-		 * the other timer registers before skipping the
-		 * timestamping request.
-		 */
 		if (adapter->tstamp_config.tx_type == HWTSTAMP_TX_ON &&
-		    !adapter->ptp_tx_skb) {
+		    igc_request_tx_tstamp(adapter, skb, &tstamp_flags)) {
 			skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
-			tx_flags |= IGC_TX_FLAGS_TSTAMP;
-
-			adapter->ptp_tx_skb = skb_get(skb);
-			adapter->ptp_tx_start = jiffies;
+			tx_flags |= IGC_TX_FLAGS_TSTAMP | tstamp_flags;
 		} else {
 			adapter->tx_hwtstamp_skipped++;
 		}
@@ -5001,9 +5024,7 @@ static void igc_tsync_interrupt(struct igc_adapter *adapter)
 	if (tsicr & IGC_TSICR_TXTS) {
 		u32 tsynctxctl = rd32(IGC_TSYNCTXCTL);
 
-		/* retrieve hardware timestamp */
-		if (tsynctxctl & IGC_TSYNCTXCTL_TXTT_0)
-			igc_ptp_tx_hwtstamp(adapter);
+		igc_ptp_tx_hwtstamp(adapter, tsynctxctl & IGC_TSYNCTXCTL_TXTT_ANY);
 
 		ack |= IGC_TSICR_TXTS;
 	}
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 3ec0baa8451a..e286b0341575 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -541,8 +541,17 @@ static void igc_ptp_enable_tx_timestamp(struct igc_adapter *adapter)
 	wr32(IGC_TSYNCTXCTL, IGC_TSYNCTXCTL_ENABLED | IGC_TSYNCTXCTL_TXSYNSIG);
 
 	/* Read TXSTMP registers to discard any timestamp previously stored. */
-	rd32(IGC_TXSTMPL);
-	rd32(IGC_TXSTMPH);
+	rd32(IGC_TXSTMPL_0);
+	rd32(IGC_TXSTMPH_0);
+
+	rd32(IGC_TXSTMPL_1);
+	rd32(IGC_TXSTMPH_1);
+
+	rd32(IGC_TXSTMPL_2);
+	rd32(IGC_TXSTMPH_2);
+
+	rd32(IGC_TXSTMPL_3);
+	rd32(IGC_TXSTMPH_3);
 }
 
 /**
@@ -599,33 +608,40 @@ static int igc_ptp_set_timestamp_mode(struct igc_adapter *adapter,
 }
 
 /* Requires adapter->ptp_tx_lock held by caller. */
-static void igc_ptp_tx_timeout(struct igc_adapter *adapter)
+static void igc_ptp_tx_timeout(struct igc_adapter *adapter,
+			       struct igc_tx_timestamp_request *tstamp)
 {
 	struct igc_hw *hw = &adapter->hw;
 
-	dev_kfree_skb_any(adapter->ptp_tx_skb);
-	adapter->ptp_tx_skb = NULL;
-	adapter->ptp_tx_start = 0;
+	dev_kfree_skb_any(tstamp->skb);
+	tstamp->skb = NULL;
+	tstamp->start = 0;
 	adapter->tx_hwtstamp_timeouts++;
 	/* Clear the tx valid bit in TSYNCTXCTL register to enable interrupt. */
-	rd32(IGC_TXSTMPH);
+	rd32(tstamp->regh);
 
 	netdev_warn(adapter->netdev, "Tx timestamp timeout\n");
 }
 
 void igc_ptp_tx_hang(struct igc_adapter *adapter)
 {
+	struct igc_tx_timestamp_request *tstamp;
+	int i;
+
 	spin_lock(&adapter->ptp_tx_lock);
 
-	if (!adapter->ptp_tx_skb)
-		goto unlock;
+	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
+		tstamp = &adapter->tx_tstamp[i];
 
-	if (time_is_after_jiffies(adapter->ptp_tx_start + IGC_PTP_TX_TIMEOUT))
-		goto unlock;
+		if (!tstamp->skb)
+			continue;
 
-	igc_ptp_tx_timeout(adapter);
+		if (time_is_after_jiffies(tstamp->start + IGC_PTP_TX_TIMEOUT))
+			continue;
+
+		igc_ptp_tx_timeout(adapter, tstamp);
+	}
 
-unlock:
 	spin_unlock(&adapter->ptp_tx_lock);
 }
 
@@ -639,57 +655,73 @@ void igc_ptp_tx_hang(struct igc_adapter *adapter)
  *
  * Context: Expects adapter->ptp_tx_lock to be held by caller.
  */
-void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
+void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter, u32 mask)
 {
 	struct skb_shared_hwtstamps shhwtstamps;
 	struct igc_hw *hw = &adapter->hw;
 	struct sk_buff *skb;
 	int adjust = 0;
 	u64 regval;
+	int i;
 
+again:
 	spin_lock(&adapter->ptp_tx_lock);
-	skb = adapter->ptp_tx_skb;
-
-	if (WARN_ON_ONCE(!skb))
-		goto done;
 
-	regval = rd32(IGC_TXSTMPL);
-	regval |= (u64)rd32(IGC_TXSTMPH) << 32;
-	igc_ptp_systim_to_hwtstamp(adapter, &shhwtstamps, regval);
-
-	switch (adapter->link_speed) {
-	case SPEED_10:
-		adjust = IGC_I225_TX_LATENCY_10;
-		break;
-	case SPEED_100:
-		adjust = IGC_I225_TX_LATENCY_100;
-		break;
-	case SPEED_1000:
-		adjust = IGC_I225_TX_LATENCY_1000;
-		break;
-	case SPEED_2500:
-		adjust = IGC_I225_TX_LATENCY_2500;
-		break;
-	}
+	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
+		struct igc_tx_timestamp_request *tstamp = &adapter->tx_tstamp[i];
+
+		if (!(mask & tstamp->mask))
+			continue;
+
+		skb = tstamp->skb;
+		if (!skb)
+			continue;
+
+		regval = rd32(tstamp->regl);
+		regval |= (u64)rd32(tstamp->regh) << 32;
+		igc_ptp_systim_to_hwtstamp(adapter, &shhwtstamps, regval);
+
+		switch (adapter->link_speed) {
+		case SPEED_10:
+			adjust = IGC_I225_TX_LATENCY_10;
+			break;
+		case SPEED_100:
+			adjust = IGC_I225_TX_LATENCY_100;
+			break;
+		case SPEED_1000:
+			adjust = IGC_I225_TX_LATENCY_1000;
+			break;
+		case SPEED_2500:
+			adjust = IGC_I225_TX_LATENCY_2500;
+			break;
+		}
 
-	shhwtstamps.hwtstamp =
-		ktime_add_ns(shhwtstamps.hwtstamp, adjust);
+		shhwtstamps.hwtstamp =
+			ktime_add_ns(shhwtstamps.hwtstamp, adjust);
 
-	/* Clear the lock early before calling skb_tstamp_tx so that
-	 * applications are not woken up before the lock bit is clear. We use
-	 * a copy of the skb pointer to ensure other threads can't change it
-	 * while we're notifying the stack.
-	 */
-	adapter->ptp_tx_skb = NULL;
-	adapter->ptp_tx_start = 0;
+		/* Clear the lock early before calling skb_tstamp_tx so that
+		 * applications are not woken up before the lock bit is clear. We use
+		 * a copy of the skb pointer to ensure other threads can't change it
+		 * while we're notifying the stack.
+		 */
+		tstamp->skb = NULL;
+		tstamp->start = 0;
 
-	/* Notify the stack and free the skb after we've unlocked */
-	skb_tstamp_tx(skb, &shhwtstamps);
-	dev_kfree_skb_any(skb);
+		/* Notify the stack and free the skb after we've unlocked */
+		skb_tstamp_tx(skb, &shhwtstamps);
+		dev_kfree_skb_any(skb);
+	}
 
-done:
 	spin_unlock(&adapter->ptp_tx_lock);
 
+	mask = rd32(IGC_TSYNCTXCTL) & IGC_TSYNCTXCTL_TXTT_ANY;
+	if (mask) {
+		/* Some timestamps arrived while we were handling the
+		 * previous ones
+		 */
+		goto again;
+	}
+
 }
 
 /**
@@ -747,9 +779,34 @@ int igc_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr)
 void igc_ptp_init(struct igc_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
+	struct igc_tx_timestamp_request *tstamp;
 	struct igc_hw *hw = &adapter->hw;
 	int i;
 
+	tstamp = &adapter->tx_tstamp[0];
+	tstamp->mask = IGC_TSYNCTXCTL_TXTT_0;
+	tstamp->regl = IGC_TXSTMPL_0;
+	tstamp->regh = IGC_TXSTMPH_0;
+	tstamp->flags = 0;
+
+	tstamp = &adapter->tx_tstamp[1];
+	tstamp->mask = IGC_TSYNCTXCTL_TXTT_1;
+	tstamp->regl = IGC_TXSTMPL_1;
+	tstamp->regh = IGC_TXSTMPH_1;
+	tstamp->flags = IGC_TX_FLAGS_TSTAMP_1;
+
+	tstamp = &adapter->tx_tstamp[2];
+	tstamp->mask = IGC_TSYNCTXCTL_TXTT_2;
+	tstamp->regl = IGC_TXSTMPL_2;
+	tstamp->regh = IGC_TXSTMPH_2;
+	tstamp->flags = IGC_TX_FLAGS_TSTAMP_2;
+
+	tstamp = &adapter->tx_tstamp[3];
+	tstamp->mask = IGC_TSYNCTXCTL_TXTT_3;
+	tstamp->regl = IGC_TXSTMPL_3;
+	tstamp->regh = IGC_TXSTMPH_3;
+	tstamp->flags = IGC_TX_FLAGS_TSTAMP_3;
+
 	switch (hw->mac.type) {
 	case igc_i225:
 		for (i = 0; i < IGC_N_SDP; i++) {
@@ -817,6 +874,19 @@ static void igc_ptp_time_restore(struct igc_adapter *adapter)
 	igc_ptp_write_i225(adapter, &ts);
 }
 
+static void igc_tx_tstamp_clear(struct igc_adapter *adapter)
+{
+	int i;
+
+	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
+		struct igc_tx_timestamp_request *tstamp = &adapter->tx_tstamp[i];
+
+		dev_kfree_skb_any(tstamp->skb);
+		tstamp->skb = NULL;
+		tstamp->start = 0;
+	}
+}
+
 /**
  * igc_ptp_suspend - Disable PTP work items and prepare for suspend
  * @adapter: Board private structure
@@ -831,9 +901,7 @@ void igc_ptp_suspend(struct igc_adapter *adapter)
 
 	spin_lock(&adapter->ptp_tx_lock);
 
-	dev_kfree_skb_any(adapter->ptp_tx_skb);
-	adapter->ptp_tx_skb = NULL;
-	adapter->ptp_tx_start = 0;
+	igc_tx_tstamp_clear(adapter);
 
 	spin_unlock(&adapter->ptp_tx_lock);
 
diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h
index 828c3501c448..313b28e33165 100644
--- a/drivers/net/ethernet/intel/igc/igc_regs.h
+++ b/drivers/net/ethernet/intel/igc/igc_regs.h
@@ -242,6 +242,18 @@
 #define IGC_SYSTIMR	0x0B6F8  /* System time register Residue */
 #define IGC_TIMINCA	0x0B608  /* Increment attributes register - RW */
 
+/* TX Timestamp Low */
+#define IGC_TXSTMPL_0		0x0B618
+#define IGC_TXSTMPL_1		0x0B698
+#define IGC_TXSTMPL_2		0x0B6B8
+#define IGC_TXSTMPL_3		0x0B6D8
+
+/* TX Timestamp High */
+#define IGC_TXSTMPH_0		0x0B61C
+#define IGC_TXSTMPH_1		0x0B69C
+#define IGC_TXSTMPH_2		0x0B6BC
+#define IGC_TXSTMPH_3		0x0B6DC
+
 #define IGC_TXSTMPL	0x0B618  /* Tx timestamp value Low - RO */
 #define IGC_TXSTMPH	0x0B61C  /* Tx timestamp value High - RO */
 
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [[RFC xdp-hints] 08/16] igc: Use irq safe locks for timestamping
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (6 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 07/16] igc: Add support for multiple in-flight TX timestamps Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 09/16] net/xdp: Support for generic XDP hints Ederson de Souza
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf, Vinicius Costa Gomes

From: Vinicius Costa Gomes <vinicius.gomes@intel.com>

Now that timestamping is done in interrupt context, protect the
timestamp state against concurrent access with IRQ-safe locks
(spin_lock_irqsave()/spin_unlock_irqrestore()).

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c |  5 +++--
 drivers/net/ethernet/intel/igc/igc_ptp.c  | 16 ++++++++++------
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index a2e0b71d1f4e..fe3619c25c05 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1467,9 +1467,10 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 
 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
 		struct igc_adapter *adapter = netdev_priv(tx_ring->netdev);
+		unsigned long flags;
 		u32 tstamp_flags;
 
-		spin_lock(&adapter->ptp_tx_lock);
+		spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
 
 		if (adapter->tstamp_config.tx_type == HWTSTAMP_TX_ON &&
 		    igc_request_tx_tstamp(adapter, skb, &tstamp_flags)) {
@@ -1479,7 +1480,7 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 			adapter->tx_hwtstamp_skipped++;
 		}
 
-		spin_unlock(&adapter->ptp_tx_lock);
+		spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
 	}
 
 	if (skb_vlan_tag_present(skb)) {
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index e286b0341575..911c36a909a4 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -626,9 +626,10 @@ static void igc_ptp_tx_timeout(struct igc_adapter *adapter,
 void igc_ptp_tx_hang(struct igc_adapter *adapter)
 {
 	struct igc_tx_timestamp_request *tstamp;
+	unsigned long flags;
 	int i;
 
-	spin_lock(&adapter->ptp_tx_lock);
+	spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
 
 	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
 		tstamp = &adapter->tx_tstamp[i];
@@ -642,7 +643,7 @@ void igc_ptp_tx_hang(struct igc_adapter *adapter)
 		igc_ptp_tx_timeout(adapter, tstamp);
 	}
 
-	spin_unlock(&adapter->ptp_tx_lock);
+	spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
 }
 
 /**
@@ -659,13 +660,14 @@ void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter, u32 mask)
 {
 	struct skb_shared_hwtstamps shhwtstamps;
 	struct igc_hw *hw = &adapter->hw;
+	unsigned long flags;
 	struct sk_buff *skb;
 	int adjust = 0;
 	u64 regval;
 	int i;
 
 again:
-	spin_lock(&adapter->ptp_tx_lock);
+	spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
 
 	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
 		struct igc_tx_timestamp_request *tstamp = &adapter->tx_tstamp[i];
@@ -712,7 +714,7 @@ void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter, u32 mask)
 		dev_kfree_skb_any(skb);
 	}
 
-	spin_unlock(&adapter->ptp_tx_lock);
+	spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
 
 	mask = rd32(IGC_TSYNCTXCTL) & IGC_TSYNCTXCTL_TXTT_ANY;
 	if (mask) {
@@ -896,14 +898,16 @@ static void igc_tx_tstamp_clear(struct igc_adapter *adapter)
  */
 void igc_ptp_suspend(struct igc_adapter *adapter)
 {
+	unsigned long flags;
+
 	if (!(adapter->ptp_flags & IGC_PTP_ENABLED))
 		return;
 
-	spin_lock(&adapter->ptp_tx_lock);
+	spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
 
 	igc_tx_tstamp_clear(adapter);
 
-	spin_unlock(&adapter->ptp_tx_lock);
+	spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
 
 	igc_ptp_time_save(adapter);
 }
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [[RFC xdp-hints] 09/16] net/xdp: Support for generic XDP hints
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (7 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 08/16] igc: Use irq safe locks for timestamping Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 10/16] igc: XDP packet RX timestamp Ederson de Souza
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

XDP hints are metadata about an XDP packet. This patch provides
macros that define a base set of hints, which drivers can use to
implement XDP hints support and extend with their own fields.

A future patch will show these macros being used by the igc driver.
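
As a sanity check on the hard-coded name offsets used by the macros
in this patch (1, 11, 15, 19, 32, 45, 55 and the total of 62), here is
a minimal user-space sketch that recomputes them from the string
section. The helpers btf_name_offset() and hints_names_len() are
hypothetical names made up for this illustration; they are not part of
the series or of any kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy of the string section defined by XDP_GENERIC_HINTS_NAMES. */
static const char hints_names[] = "\0xdp_hints\0u32\0u64\0rx_timestamp\0"
				  "tx_timestamp\0valid_map\0btf_id\0";

/* Length of the section without the implicit NUL the compiler appends. */
static size_t hints_names_len(void)
{
	return sizeof(hints_names) - 1;
}

/*
 * Hypothetical helper: walk the NUL-separated names and return the byte
 * offset of the entry equal to `name`, or -1 if there is no such entry.
 */
static int btf_name_offset(const char *sec, size_t sec_len, const char *name)
{
	size_t off = 0;

	assert(sec && name);

	while (off < sec_len) {
		if (strcmp(sec + off, name) == 0)
			return (int)off;
		/* Skip this entry and its terminating NUL. */
		off += strlen(sec + off) + 1;
	}

	return -1;
}
```

With this, "u64" resolves to offset 15 and "u32" to offset 11, which
are the name offsets passed to BTF_TYPE_INT_ENC(), and the section
length matches XDP_GENERIC_HINTS_NAME_OFFSET.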

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
---
 include/net/xdp.h           | 62 +++++++++++++++++++++++++++++++++++++
 include/uapi/linux/if_xdp.h |  3 ++
 2 files changed, 65 insertions(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index ad5b02dcb6f4..59a0b91e1975 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -76,6 +76,68 @@ struct xdp_buff {
 	u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
 };
 
+/*
+ * This is what the generic xdp hints struct looks like:
+ *
+ * struct xdp_hints {
+ *	u64 rx_timestamp;
+ *	u64 tx_timestamp;
+ *	u64 valid_map;
+ *	u32 btf_id;
+ * };
+ */
+
+/* New fields need to be added at the beginning, not at the end,
+ * as the known position on the metadata is the end (just before
+ * the data starts) */
+
+#define XDP_GENERIC_HINTS_STRUCT_MEMBERS \
+	u64 rx_timestamp; \
+	u64 tx_timestamp; \
+	u64 valid_map; \
+	u32 btf_id;
+
+#define BTF_INFO_ENC(kind, kind_flag, vlen) \
+	((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
+
+#define BTF_TYPE_ENC(name, info, size_or_type) \
+	(name), (info), (size_or_type)
+
+#define BTF_INT_ENC(encoding, bits_offset, nr_bits) \
+	((encoding) << 24 | (bits_offset) << 16 | (nr_bits))
+
+#define BTF_TYPE_INT_ENC(name, encoding, bits_offset, bits, sz) \
+	BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_INT, 0, 0), sz),       \
+	BTF_INT_ENC(encoding, bits_offset, bits)
+
+#define BTF_STRUCT_ENC(name, nr_elems, sz)      \
+	BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_STRUCT, 1, nr_elems), sz)
+
+#define BTF_MEMBER_ENC(name, type, bits_offset) \
+	(name), (type), (bits_offset)
+
+#define XDP_GENERIC_MD_SUPPORTED_HINTS_NUM_MMBRS 4
+#define XDP_GENERIC_HINTS_NAME_OFFSET 62
+
+#define XDP_GENERIC_HINTS_NAMES "\0xdp_hints\0u32\0u64\0rx_timestamp\0" \
+				"tx_timestamp\0valid_map\0btf_id\0"
+
+#define XDP_GENERIC_HINTS_TYPES \
+	BTF_TYPE_INT_ENC(15, 0, 0, 64, 8), \
+	BTF_TYPE_INT_ENC(11, 0, 0, 32, 4)
+
+#define XDP_GENERIC_HINTS_STRUCT(nr_elems, sz) \
+	BTF_STRUCT_ENC(1, XDP_GENERIC_MD_SUPPORTED_HINTS_NUM_MMBRS + nr_elems, \
+		8 + 8 + 8 + 4 + sz)
+
+#define XDP_GENERIC_HINTS_MEMBERS(offset) \
+	BTF_MEMBER_ENC(19, 1, offset + 0), \
+	BTF_MEMBER_ENC(32, 1, offset + 64), \
+	BTF_MEMBER_ENC(45, 1, offset + 128), \
+	BTF_MEMBER_ENC(55, 2, offset + 192)
+
+#define XDP_GENERIC_HINTS_BIT_MAX   XDP_GENERIC_HINTS_TX_TIMESTAMP
+
 static __always_inline void
 xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)
 {
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index a78a8096f4ce..345c757b8c3e 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -12,6 +12,9 @@
 
 #include <linux/types.h>
 
+#define XDP_GENERIC_HINTS_RX_TIMESTAMP (1 << 0)
+#define XDP_GENERIC_HINTS_TX_TIMESTAMP (1 << 1)
+
 /* Options for the sxdp_flags field */
 #define XDP_SHARED_UMEM	(1 << 0)
 #define XDP_COPY	(1 << 1) /* Force copy-mode */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [[RFC xdp-hints] 10/16] igc: XDP packet RX timestamp
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (8 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 09/16] net/xdp: Support for generic XDP hints Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 11/16] igc: XDP packet TX timestamp Ederson de Souza
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

Using XDP hints, the driver adds the PTP timestamp of when a packet was
received by the i225 NIC.
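
To illustrate the layout a consumer would see, here is a minimal
user-space sketch, assuming the hints struct ends exactly where the
packet data begins (as the driver code in this patch arranges). The
struct mirrors the packed generic layout from this series; the helpers
hints_fill_rx() and hints_rx_timestamp() are hypothetical and exist only
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XDP_GENERIC_HINTS_RX_TIMESTAMP (1 << 0)
#define XDP_GENERIC_HINTS_TX_TIMESTAMP (1 << 1)

/* Mirrors the packed generic hints layout from this series. */
struct xdp_hints {
	uint64_t rx_timestamp;
	uint64_t tx_timestamp;
	uint64_t valid_map;
	uint32_t btf_id;
} __attribute__((packed));

/* Hypothetical stand-in for the driver side: place hints right before data. */
static void hints_fill_rx(uint8_t *data, uint64_t ts, uint32_t btf_id)
{
	struct xdp_hints h = {
		.rx_timestamp = ts,
		.valid_map = XDP_GENERIC_HINTS_RX_TIMESTAMP,
		.btf_id = btf_id,
	};

	memcpy(data - sizeof(h), &h, sizeof(h));
}

/*
 * Hypothetical consumer helper: returns 1 and stores the RX timestamp in
 * *ts if the metadata is large enough, carries the expected BTF id and
 * flags the RX timestamp as valid; returns 0 otherwise.
 */
static int hints_rx_timestamp(const uint8_t *data, size_t meta_len,
			      uint32_t expected_btf_id, uint64_t *ts)
{
	struct xdp_hints h;

	assert(data && ts);

	if (meta_len < sizeof(h))
		return 0;

	/* memcpy avoids unaligned access to the packed struct. */
	memcpy(&h, data - sizeof(h), sizeof(h));

	if (h.btf_id != expected_btf_id)
		return 0;
	if (!(h.valid_map & XDP_GENERIC_HINTS_RX_TIMESTAMP))
		return 0;

	*ts = h.rx_timestamp;
	return 1;
}
```

Checking btf_id before trusting any field is the reason it sits at a
known position at the end of the struct.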

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |  4 +
 drivers/net/ethernet/intel/igc/igc_main.c | 40 ++++++++--
 drivers/net/ethernet/intel/igc/igc_xdp.c  | 93 +++++++++++++++++++++++
 drivers/net/ethernet/intel/igc/igc_xdp.h  | 11 +++
 4 files changed, 143 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 6fd5901f07c7..84e5f3c97351 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -13,6 +13,7 @@
 #include <linux/ptp_clock_kernel.h>
 #include <linux/timecounter.h>
 #include <linux/net_tstamp.h>
+#include <linux/if_xdp.h>
 
 #include "igc_hw.h"
 
@@ -252,6 +253,9 @@ struct igc_adapter {
 		struct timespec64 start;
 		struct timespec64 period;
 	} perout[IGC_N_PEROUT];
+
+	struct btf *btf;
+	u8 btf_enabled;
 };
 
 void igc_up(struct igc_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index fe3619c25c05..82e9b493cad6 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -13,6 +13,7 @@
 #include <linux/bpf_trace.h>
 #include <net/xdp_sock_drv.h>
 #include <net/ipv6.h>
+#include <linux/btf.h>
 
 #include "igc.h"
 #include "igc_hw.h"
@@ -2374,8 +2375,20 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 
 		if (!skb) {
 			xdp_init_buff(&xdp, truesize, &rx_ring->xdp_rxq);
+
 			xdp_prepare_buff(&xdp, pktbuf - igc_rx_offset(rx_ring),
-					 igc_rx_offset(rx_ring) + pkt_offset, size, false);
+					 igc_rx_offset(rx_ring) + pkt_offset, size,
+					 adapter->btf_enabled);
+
+			if (adapter->btf_enabled) {
+				struct xdp_hints___igc *hints;
+
+				hints = xdp.data - sizeof(*hints);
+				xdp.data_meta = hints;
+				hints->rx_timestamp = timestamp;
+				hints->valid_map = XDP_GENERIC_HINTS_RX_TIMESTAMP;
+				hints->btf_id = btf_obj_id(adapter->btf);
+			}
 
 			skb = igc_xdp_run_prog(adapter, &xdp);
 		}
@@ -2539,12 +2552,18 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
 							bi->xdp->data);
 
 			bi->xdp->data += IGC_TS_HDR_LEN;
-
-			/* HW timestamp has been copied into local variable. Metadata
-			 * length when XDP program is called should be 0.
-			 */
 			bi->xdp->data_meta += IGC_TS_HDR_LEN;
 			size -= IGC_TS_HDR_LEN;
+
+			if (adapter->btf_enabled) {
+				struct xdp_hints___igc *hints;
+
+				hints = bi->xdp->data - sizeof(*hints);
+				bi->xdp->data_meta = hints;
+				hints->rx_timestamp = timestamp;
+				hints->valid_map = XDP_GENERIC_HINTS_RX_TIMESTAMP;
+				hints->btf_id = btf_obj_id(adapter->btf);
+			}
 		}
 
 		bi->xdp->data_end = bi->xdp->data + size;
@@ -5949,6 +5968,12 @@ static int igc_bpf(struct net_device *dev, struct netdev_bpf *bpf)
 	case XDP_SETUP_XSK_POOL:
 		return igc_xdp_setup_pool(adapter, bpf->xsk.pool,
 					  bpf->xsk.queue_id);
+	case XDP_SETUP_MD_BTF:
+		return igc_xdp_set_btf_md(dev, bpf->btf_enable);
+	case XDP_QUERY_MD_BTF:
+		bpf->btf_id = igc_xdp_query_btf(dev, &bpf->btf_enable);
+		return 0;
+
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -6436,6 +6461,11 @@ static void igc_remove(struct pci_dev *pdev)
 	cancel_work_sync(&adapter->reset_task);
 	cancel_work_sync(&adapter->watchdog_task);
 
+	if (adapter->btf) {
+		adapter->btf_enabled = 0;
+		btf_unregister(adapter->btf);
+	}
+
 	/* Release control of h/w to f/w.  If f/w is AMT enabled, this
 	 * would have already happened in close and is redundant.
 	 */
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
index a8cf5374be47..00d223703424 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.c
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
@@ -2,10 +2,103 @@
 /* Copyright (c) 2020, Intel Corporation. */
 
 #include <net/xdp_sock_drv.h>
+#include <linux/btf.h>
 
 #include "igc.h"
 #include "igc_xdp.h"
 
+#define IGC_XDP_HINTS_NUM_MMBRS 0
+static const char names_str[] = XDP_GENERIC_HINTS_NAMES "yet_another_timestamp\0";
+
+/**
+ * struct igc_xdp_hints {
+ *      { generic_xdp_hints }
+ * }
+ */
+
+static const u32 igc_xdp_hints_raw[] = {
+	XDP_GENERIC_HINTS_TYPES,
+	XDP_GENERIC_HINTS_STRUCT(IGC_XDP_HINTS_NUM_MMBRS, 0),
+	XDP_GENERIC_HINTS_MEMBERS(0),
+};
+
+static int igc_xdp_register_btf(struct igc_adapter *priv)
+{
+	unsigned int type_sec_sz, str_sec_sz;
+	char *types_sec, *str_sec;
+	struct btf_header *hdr;
+	unsigned int btf_size;
+	void *raw_btf = NULL;
+	int err = 0;
+
+	type_sec_sz = sizeof(igc_xdp_hints_raw);
+	str_sec_sz  = sizeof(names_str);
+
+	btf_size = sizeof(*hdr) + type_sec_sz + str_sec_sz;
+	raw_btf = kzalloc(btf_size, GFP_KERNEL);
+	if (!raw_btf)
+		return -ENOMEM;
+
+	hdr = raw_btf;
+	hdr->magic	= BTF_MAGIC;
+	hdr->version  = BTF_VERSION;
+	hdr->hdr_len  = sizeof(*hdr);
+	hdr->type_off = 0;
+	hdr->type_len = type_sec_sz;
+	hdr->str_off  = type_sec_sz;
+	hdr->str_len  = str_sec_sz;
+
+	types_sec = raw_btf   + sizeof(*hdr);
+	str_sec   = types_sec + type_sec_sz;
+	memcpy(types_sec, igc_xdp_hints_raw, type_sec_sz);
+	memcpy(str_sec, names_str, str_sec_sz);
+
+	priv->btf = btf_register(raw_btf, btf_size);
+	if (IS_ERR(priv->btf)) {
+		err = PTR_ERR(priv->btf);
+		priv->btf = NULL;
+		netdev_err(priv->netdev, "failed to register BTF MD, err (%d)\n", err);
+	}
+
+	kfree(raw_btf);
+	return err;
+}
+
+int igc_xdp_query_btf(struct net_device *dev, u8 *enabled)
+{
+	struct igc_adapter *priv = netdev_priv(dev);
+	u32 md_btf_id = 0;
+
+	if (!IS_ENABLED(CONFIG_BPF_SYSCALL))
+		return md_btf_id;
+
+	if (!priv->btf)
+		igc_xdp_register_btf(priv);
+
+	*enabled = !!priv->btf_enabled;
+	md_btf_id = priv->btf ? btf_obj_id(priv->btf) : 0;
+
+	return md_btf_id;
+}
+
+int igc_xdp_set_btf_md(struct net_device *dev, u8 enable)
+{
+	struct igc_adapter *priv = netdev_priv(dev);
+	int err = 0;
+
+	if (enable && !priv->btf) {
+		igc_xdp_register_btf(priv);
+		if (!priv->btf) {
+			err = -EINVAL;
+			goto unlock;
+		}
+	}
+
+	priv->btf_enabled = enable;
+unlock:
+	return err;
+}
+
 int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 		     struct netlink_ext_ack *extack)
 {
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.h b/drivers/net/ethernet/intel/igc/igc_xdp.h
index a74e5487d199..2bf591f42cec 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.h
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.h
@@ -4,6 +4,8 @@
 #ifndef _IGC_XDP_H_
 #define _IGC_XDP_H_
 
+#include <linux/btf.h>
+
 int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 		     struct netlink_ext_ack *extack);
 int igc_xdp_setup_pool(struct igc_adapter *adapter, struct xsk_buff_pool *pool,
@@ -14,4 +16,13 @@ static inline bool igc_xdp_is_enabled(struct igc_adapter *adapter)
 	return !!adapter->xdp_prog;
 }
 
+int igc_xdp_query_btf(struct net_device *dev, u8 *enabled);
+int igc_xdp_set_btf_md(struct net_device *dev, u8 enable);
+
+struct xdp_hints___igc {
+	XDP_GENERIC_HINTS_STRUCT_MEMBERS;
+} __packed;
+
+#define IGC_XDP_HINT_DMA_TIMESTAMP BIT(XDP_GENERIC_HINTS_BIT_MAX + 1)
+
 #endif /* _IGC_XDP_H_ */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [[RFC xdp-hints] 11/16] igc: XDP packet TX timestamp
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (9 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 10/16] igc: XDP packet RX timestamp Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 12/16] ethtool,igc: Add "xdp_headroom" driver info Ederson de Souza
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

Add the PTP timestamp of when a packet was transmitted to the XDP hints.
An application using AF_XDP can get this timestamp by inspecting the XDP
frame metadata when the frame reaches the completion queue.

One notable difference from the TX timestamp for an SKB is that the XDP
frame actually resides in the UMEM. As such, the timestamp is added to
the frame itself, and user space applications can access it when the
frame is sent to the completion queue.

When cleaning up TX descriptors, the driver checks whether an XDP socket
frame is "expecting" a TX timestamp. If so, the driver stops the
clean-up to give the TX timestamp interrupt an opportunity to arrive.
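
The clean-up behaviour described above can be modelled in user space
as follows. This is only a sketch under stated assumptions: struct
tx_entry and clean_tx() are hypothetical names, and a tx_timestamp of 0
is taken to mean "not yet delivered", mirroring how the driver zeroes
the field when it requests a timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a TX descriptor the hardware has completed. */
struct tx_entry {
	bool wants_tstamp;	/* a TX timestamp was requested */
	uint64_t tx_timestamp;	/* 0 until the timestamp interrupt fills it */
};

/*
 * Reclaim completed entries in order and stop at the first one that is
 * still waiting for its timestamp, so it is not handed back to user
 * space early. Returns the number of entries reclaimed.
 */
static size_t clean_tx(const struct tx_entry *ring, size_t n)
{
	size_t cleaned = 0;

	assert(ring || n == 0);

	while (cleaned < n) {
		const struct tx_entry *e = &ring[cleaned];

		if (e->wants_tstamp && e->tx_timestamp == 0)
			break;	/* retry on the next clean-up pass */

		cleaned++;
	}

	return cleaned;
}
```

Entries behind a waiting frame are also held back, which keeps
completions in order at the cost of delaying them until the timestamp
arrives or the request times out.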

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |  22 ++--
 drivers/net/ethernet/intel/igc/igc_main.c | 122 ++++++++++++++++++----
 drivers/net/ethernet/intel/igc/igc_ptp.c  |  47 ++++++---
 3 files changed, 152 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 84e5f3c97351..eb955b5fb58f 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -68,11 +68,23 @@ struct igc_rx_packet_stats {
 	u64 other_packets;
 };
 
+enum igc_tx_buffer_type {
+	IGC_TX_BUFFER_TYPE_SKB,
+	IGC_TX_BUFFER_TYPE_XDP,
+	IGC_TX_BUFFER_TYPE_XSK,
+};
+
 #define IGC_MAX_TX_TSTAMP_TIMERS	4
 
 struct igc_tx_timestamp_request {
-	struct sk_buff *skb;
+	union igc_pending_ts_pkt {
+		struct sk_buff *skb;
+		struct xdp_desc xsk_desc;
+		void *ptr;
+	} pending_ts_pkt;
+	struct xsk_buff_pool *xsk_pool;
 	unsigned long start;
+	enum igc_tx_buffer_type type;
 	u32 mask;
 	u32 regl;
 	u32 regh;
@@ -435,12 +447,6 @@ enum igc_boards {
 #define TXD_USE_COUNT(S)	DIV_ROUND_UP((S), IGC_MAX_DATA_PER_TXD)
 #define DESC_NEEDED	(MAX_SKB_FRAGS + 4)
 
-enum igc_tx_buffer_type {
-	IGC_TX_BUFFER_TYPE_SKB,
-	IGC_TX_BUFFER_TYPE_XDP,
-	IGC_TX_BUFFER_TYPE_XSK,
-};
-
 /* wrapper around a pointer to a socket buffer,
  * so a DMA handle can be stored along with the buffer
  */
@@ -451,6 +457,7 @@ struct igc_tx_buffer {
 	union {
 		struct sk_buff *skb;
 		struct xdp_frame *xdpf;
+		struct xdp_desc xsk_desc;
 	};
 	unsigned int bytecount;
 	u16 gso_segs;
@@ -641,6 +648,7 @@ int igc_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
 int igc_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
 void igc_ptp_tx_hang(struct igc_adapter *adapter);
 void igc_ptp_read(struct igc_adapter *adapter, struct timespec64 *ts);
+ktime_t igc_retrieve_ptp_tx_timestamp(struct igc_adapter *adapter);
 
 #define igc_rx_pg_size(_ring) (PAGE_SIZE << igc_rx_pg_order(_ring))
 
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 82e9b493cad6..46c8c393d03e 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1157,7 +1157,8 @@ static u32 igc_tx_cmd_type(struct sk_buff *skb, u32 tx_flags)
 				 (IGC_ADVTXD_TSTAMP_REG_3));
 
 	/* insert frame checksum */
-	cmd_type ^= IGC_SET_FLAG(skb->no_fcs, 1, IGC_ADVTXD_DCMD_IFCS);
+	if (skb)
+		cmd_type ^= IGC_SET_FLAG(skb->no_fcs, 1, IGC_ADVTXD_DCMD_IFCS);
 
 	return cmd_type;
 }
@@ -1413,17 +1414,25 @@ static int igc_tso(struct igc_ring *tx_ring,
 	return 1;
 }
 
-static bool igc_request_tx_tstamp(struct igc_adapter *adapter, struct sk_buff *skb, u32 *flags)
+static bool igc_request_tx_tstamp(struct igc_adapter *adapter,
+				  union igc_pending_ts_pkt ts_pkt, u32 *flags,
+				  struct xsk_buff_pool *xsk_pool)
 {
 	int i;
 
 	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
 		struct igc_tx_timestamp_request *tstamp = &adapter->tx_tstamp[i];
 
-		if (tstamp->skb)
+		if (tstamp->pending_ts_pkt.ptr)
 			continue;
 
-		tstamp->skb = skb_get(skb);
+		tstamp->pending_ts_pkt = ts_pkt;
+		if (xsk_pool) {
+			tstamp->xsk_pool = xsk_pool;
+			tstamp->type = IGC_TX_BUFFER_TYPE_XSK;
+		} else {
+			tstamp->type = IGC_TX_BUFFER_TYPE_SKB;
+		}
 		tstamp->start = jiffies;
 		*flags = tstamp->flags;
 
@@ -1468,17 +1477,20 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 
 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
 		struct igc_adapter *adapter = netdev_priv(tx_ring->netdev);
+		union igc_pending_ts_pkt ts_pkt;
 		unsigned long flags;
 		u32 tstamp_flags;
 
 		spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
 
+		ts_pkt.skb = skb_get(skb);
 		if (adapter->tstamp_config.tx_type == HWTSTAMP_TX_ON &&
-		    igc_request_tx_tstamp(adapter, skb, &tstamp_flags)) {
+		    igc_request_tx_tstamp(adapter, ts_pkt, &tstamp_flags, NULL)) {
 			skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 			tx_flags |= IGC_TX_FLAGS_TSTAMP | tstamp_flags;
 		} else {
 			adapter->tx_hwtstamp_skipped++;
+			skb_unref(skb);
 		}
 
 		spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
@@ -2163,7 +2175,8 @@ static int igc_xdp_init_tx_buffer(struct igc_tx_buffer *buffer,
 
 /* This function requires __netif_tx_lock is held by the caller. */
 static int igc_xdp_init_tx_descriptor(struct igc_ring *ring,
-				      struct xdp_frame *xdpf)
+				      struct xdp_frame *xdpf,
+				      u32 tx_flags)
 {
 	struct igc_tx_buffer *buffer;
 	union igc_adv_tx_desc *desc;
@@ -2191,6 +2204,7 @@ static int igc_xdp_init_tx_descriptor(struct igc_ring *ring,
 	netdev_tx_sent_queue(txring_txq(ring), buffer->bytecount);
 
 	buffer->next_to_watch = desc;
+	buffer->tx_flags = tx_flags;
 
 	ring->next_to_use++;
 	if (ring->next_to_use == ring->count)
@@ -2228,7 +2242,7 @@ static int igc_xdp_xmit_back(struct igc_adapter *adapter, struct xdp_buff *xdp)
 	nq = txring_txq(ring);
 
 	__netif_tx_lock(nq, cpu);
-	res = igc_xdp_init_tx_descriptor(ring, xdpf);
+	res = igc_xdp_init_tx_descriptor(ring, xdpf, 0);
 	__netif_tx_unlock(nq);
 	return res;
 }
@@ -2630,6 +2644,7 @@ static void igc_update_tx_stats(struct igc_q_vector *q_vector,
 
 static void igc_xdp_xmit_zc(struct igc_ring *ring)
 {
+	struct igc_adapter *adapter = netdev_priv(ring->netdev);
 	struct xsk_buff_pool *pool = ring->xsk_pool;
 	struct netdev_queue *nq = txring_txq(ring);
 	union igc_adv_tx_desc *tx_desc = NULL;
@@ -2646,13 +2661,36 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
 	budget = igc_desc_unused(ring);
 
 	while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {
-		u32 cmd_type, olinfo_status;
+		u32 cmd_type, olinfo_status, tx_flags = 0;
 		struct igc_tx_buffer *bi;
+		unsigned long flags;
 		dma_addr_t dma;
 
-		cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
-			   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
-			   xdp_desc.len;
+		if (adapter->tstamp_config.tx_type == HWTSTAMP_TX_ON &&
+		    adapter->btf_enabled) {
+			union igc_pending_ts_pkt ts_pkt;
+			struct xdp_hints___igc *hints;
+			u32 tstamp_flags;
+
+			/* Ensure there's no garbage on metadata */
+			hints = (struct xdp_hints___igc *)
+				((char *)xsk_buff_raw_get_data(pool, xdp_desc.addr)
+				 - sizeof(*hints));
+			hints->valid_map = 0;
+			spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
+
+			ts_pkt.xsk_desc = xdp_desc;
+			if (igc_request_tx_tstamp(
+			    adapter, ts_pkt, &tstamp_flags, pool)) {
+				tx_flags |= IGC_TX_FLAGS_TSTAMP | tstamp_flags;
+				hints->tx_timestamp = 0;
+			} else
+				adapter->tx_hwtstamp_skipped++;
+
+			spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
+		}
+
+		cmd_type = igc_tx_cmd_type(NULL, tx_flags) | IGC_TXD_DCMD | xdp_desc.len;
 		olinfo_status = xdp_desc.len << IGC_ADVTXD_PAYLEN_SHIFT;
 
 		dma = xsk_buff_raw_get_dma(pool, xdp_desc.addr);
@@ -2670,6 +2708,7 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
 		bi->gso_segs = 1;
 		bi->time_stamp = jiffies;
 		bi->next_to_watch = tx_desc;
+		bi->xsk_desc = xdp_desc;
 
 		netdev_tx_sent_queue(txring_txq(ring), xdp_desc.len);
 
@@ -2687,6 +2726,47 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
 	__netif_tx_unlock(nq);
 }
 
+static bool igc_xsk_complete_tx_tstamp(struct igc_adapter *adapter,
+				       struct igc_tx_buffer *tx_buffer)
+{
+	unsigned long flags;
+	bool ret = true;
+	int i;
+
+	if (!adapter->btf_enabled)
+		return ret;
+
+	spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
+	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
+		struct igc_tx_timestamp_request *tstamp = &adapter->tx_tstamp[i];
+
+		if (!tstamp->pending_ts_pkt.ptr)
+			continue;
+
+		if (tstamp->type == IGC_TX_BUFFER_TYPE_XSK) {
+			struct xdp_desc xdp_desc = tstamp->pending_ts_pkt.xsk_desc;
+
+			if (xdp_desc.addr == tx_buffer->xsk_desc.addr) {
+				struct xdp_hints___igc *hints;
+				struct xsk_buff_pool *pool;
+
+				pool = tstamp->xsk_pool;
+				hints = (struct xdp_hints___igc *)
+					((char *)xsk_buff_raw_get_data(pool, xdp_desc.addr)
+					 - sizeof(*hints));
+				if (!hints->tx_timestamp) {
+					ret = false;
+					break;
+				}
+				tstamp->pending_ts_pkt.ptr = NULL;
+			}
+		}
+	}
+	spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
+
+	return ret;
+}
+
 /**
  * igc_clean_tx_irq - Reclaim resources after transmit completes
  * @q_vector: pointer to q_vector containing needed info
@@ -2726,15 +2806,10 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 		if (!(eop_desc->wb.status & cpu_to_le32(IGC_TXD_STAT_DD)))
 			break;
 
-		/* clear next_to_watch to prevent false hangs */
-		tx_buffer->next_to_watch = NULL;
-
-		/* update the statistics for this packet */
-		total_bytes += tx_buffer->bytecount;
-		total_packets += tx_buffer->gso_segs;
-
 		switch (tx_buffer->type) {
 		case IGC_TX_BUFFER_TYPE_XSK:
+			if (!igc_xsk_complete_tx_tstamp(adapter, tx_buffer))
+				goto budget_out;
 			xsk_frames++;
 			break;
 		case IGC_TX_BUFFER_TYPE_XDP:
@@ -2750,6 +2825,13 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 			break;
 		}
 
+		/* clear next_to_watch to prevent false hangs */
+		tx_buffer->next_to_watch = NULL;
+
+		/* update the statistics for this packet */
+		total_bytes += tx_buffer->bytecount;
+		total_packets += tx_buffer->gso_segs;
+
 		/* clear last DMA location and unmap remaining buffers */
 		while (tx_desc != eop_desc) {
 			tx_buffer++;
@@ -2783,6 +2865,7 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 		budget--;
 	} while (likely(budget));
 
+budget_out:
 	netdev_tx_completed_queue(txring_txq(tx_ring),
 				  total_packets, total_bytes);
 
@@ -5986,6 +6069,7 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
 	int cpu = smp_processor_id();
 	struct netdev_queue *nq;
 	struct igc_ring *ring;
+	u32 tx_flags = 0;
 	int i, drops;
 
 	if (unlikely(test_bit(__IGC_DOWN, &adapter->state)))
@@ -6004,7 +6088,7 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
 		int err;
 		struct xdp_frame *xdpf = frames[i];
 
-		err = igc_xdp_init_tx_descriptor(ring, xdpf);
+		err = igc_xdp_init_tx_descriptor(ring, xdpf, tx_flags);
 		if (err) {
 			xdp_return_frame_rx_napi(xdpf);
 			drops++;
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 911c36a909a4..0f6b91a421e9 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -9,6 +9,9 @@
 #include <linux/ptp_classify.h>
 #include <linux/clocksource.h>
 #include <linux/ktime.h>
+#include <net/xdp_sock_drv.h>
+
+#include "igc_xdp.h"
 
 #define INCVALUE_MASK		0x7fffffff
 #define ISGN			0x80000000
@@ -613,9 +616,10 @@ static void igc_ptp_tx_timeout(struct igc_adapter *adapter,
 {
 	struct igc_hw *hw = &adapter->hw;
 
-	dev_kfree_skb_any(tstamp->skb);
-	tstamp->skb = NULL;
+	if (tstamp->type == IGC_TX_BUFFER_TYPE_SKB)
+		dev_kfree_skb_any(tstamp->pending_ts_pkt.skb);
 	tstamp->start = 0;
+	tstamp->pending_ts_pkt.ptr = NULL;
 	adapter->tx_hwtstamp_timeouts++;
 	/* Clear the tx valid bit in TSYNCTXCTL register to enable interrupt. */
 	rd32(tstamp->regh);
@@ -634,7 +638,7 @@ void igc_ptp_tx_hang(struct igc_adapter *adapter)
 	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
 		tstamp = &adapter->tx_tstamp[i];
 
-		if (!tstamp->skb)
+		if (!tstamp->pending_ts_pkt.ptr)
 			continue;
 
 		if (time_is_after_jiffies(tstamp->start + IGC_PTP_TX_TIMEOUT))
@@ -661,7 +665,6 @@ void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter, u32 mask)
 	struct skb_shared_hwtstamps shhwtstamps;
 	struct igc_hw *hw = &adapter->hw;
 	unsigned long flags;
-	struct sk_buff *skb;
 	int adjust = 0;
 	u64 regval;
 	int i;
@@ -675,12 +678,13 @@ void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter, u32 mask)
 		if (!(mask & tstamp->mask))
 			continue;
 
-		skb = tstamp->skb;
-		if (!skb)
-			continue;
-
+		/* Always read the registers, to clear the interrupt cause */
 		regval = rd32(tstamp->regl);
 		regval |= (u64)rd32(tstamp->regh) << 32;
+
+		if (!tstamp->pending_ts_pkt.ptr)
+			continue;
+
 		igc_ptp_systim_to_hwtstamp(adapter, &shhwtstamps, regval);
 
 		switch (adapter->link_speed) {
@@ -706,12 +710,27 @@ void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter, u32 mask)
 		 * a copy of the skb pointer to ensure other threads can't change it
 		 * while we're notifying the stack.
 		 */
-		tstamp->skb = NULL;
 		tstamp->start = 0;
 
 		/* Notify the stack and free the skb after we've unlocked */
-		skb_tstamp_tx(skb, &shhwtstamps);
-		dev_kfree_skb_any(skb);
+		if (tstamp->type == IGC_TX_BUFFER_TYPE_SKB) {
+			skb_tstamp_tx(tstamp->pending_ts_pkt.skb, &shhwtstamps);
+			dev_kfree_skb_any(tstamp->pending_ts_pkt.skb);
+			tstamp->pending_ts_pkt.ptr = NULL;
+		} else if (tstamp->type == IGC_TX_BUFFER_TYPE_XSK) {
+			struct xdp_hints___igc *hints;
+			struct xsk_buff_pool *pool;
+			struct xdp_desc xdp_desc;
+
+			pool = tstamp->xsk_pool;
+			xdp_desc = tstamp->pending_ts_pkt.xsk_desc;
+			hints = (struct xdp_hints___igc *)
+				((char *)xsk_buff_raw_get_data(pool, xdp_desc.addr)
+				 - sizeof(*hints));
+			hints->tx_timestamp = shhwtstamps.hwtstamp;
+			hints->valid_map = XDP_GENERIC_HINTS_TX_TIMESTAMP;
+			hints->btf_id = btf_obj_id(adapter->btf);
+		}
 	}
 
 	spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
@@ -883,8 +902,10 @@ static void igc_tx_tstamp_clear(struct igc_adapter *adapter)
 	for (i = 0; i < IGC_MAX_TX_TSTAMP_TIMERS; i++) {
 		struct igc_tx_timestamp_request *tstamp = &adapter->tx_tstamp[i];
 
-		dev_kfree_skb_any(tstamp->skb);
-		tstamp->skb = NULL;
+		if (tstamp->pending_ts_pkt.ptr && tstamp->type == IGC_TX_BUFFER_TYPE_SKB)
+			dev_kfree_skb_any(tstamp->pending_ts_pkt.skb);
+
+		tstamp->pending_ts_pkt.ptr = NULL;
 		tstamp->start = 0;
 	}
 }
-- 
2.32.0



* [[RFC xdp-hints] 12/16] ethtool,igc: Add "xdp_headroom" driver info
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (10 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 11/16] igc: XDP packet TX timestamp Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 13/16] libbpf: Helpers to access XDP frame metadata Ederson de Souza
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

User space applications can use this information to determine how much
headroom the XDP frame needs.

The igc driver is also changed to report this new information.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_ethtool.c | 2 ++
 include/uapi/linux/ethtool.h                 | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index d3e84416248e..7cfd4eb59234 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -8,6 +8,7 @@
 
 #include "igc.h"
 #include "igc_diag.h"
+#include "igc_xdp.h"
 
 /* forward declaration */
 struct igc_stats {
@@ -156,6 +157,7 @@ static void igc_ethtool_get_drvinfo(struct net_device *netdev,
 		sizeof(drvinfo->bus_info));
 
 	drvinfo->n_priv_flags = IGC_PRIV_FLAGS_STR_LEN;
+	drvinfo->xdp_headroom = XDP_PACKET_HEADROOM;
 }
 
 static int igc_ethtool_get_regs_len(struct net_device *netdev)
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 67aa7134b301..dcf14ad4dccd 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -176,6 +176,8 @@ static inline __u32 ethtool_cmd_speed(const struct ethtool_cmd *ep)
  *	and %ETHTOOL_SEEPROM commands, in bytes
  * @regdump_len: Size of register dump returned by the %ETHTOOL_GREGS
  *	command, in bytes
+ * @xdp_headroom: Size of minimum XDP headroom needed by the driver
+ *	to fill with metadata information.
  *
  * Users can use the %ETHTOOL_GSSET_INFO command to get the number of
  * strings in any string set (from Linux 2.6.34).
@@ -197,6 +199,7 @@ struct ethtool_drvinfo {
 	__u32	testinfo_len;
 	__u32	eedump_len;
 	__u32	regdump_len;
+	__u32	xdp_headroom;
 };
 
 #define SOPASS_MAX	6
-- 
2.32.0



* [[RFC xdp-hints] 13/16] libbpf: Helpers to access XDP frame metadata
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (11 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 12/16] ethtool,igc: Add "xdp_headroom" driver info Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-06 22:59   ` Andrii Nakryiko
  2021-08-03  1:03 ` [[RFC xdp-hints] 14/16] libbpf: Helpers to access XDP hints based on BTF definitions Ederson de Souza
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

Add two new pairs of helpers: `xsk_umem__adjust_prod_data` and
`xsk_umem__adjust_prod_data_meta`, for data that is being produced by the
application, such as data that will be sent; and
`xsk_umem__adjust_cons_data` and `xsk_umem__adjust_cons_data_meta`,
for data being consumed, such as data obtained from the completion
queue.

These functions should usually be used on data obtained via
`xsk_umem__get_data`. That function itself was left unchanged to avoid
breaking the API.
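
As a rough illustration of the pointer arithmetic only, here is a minimal
standalone sketch. `struct umem_cfg` and the `adjust_*` functions below are
hypothetical stand-ins that mirror the helpers added by this patch; they are
not the libbpf API itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two headroom fields of struct xsk_umem_config. */
struct umem_cfg {
	uint32_t frame_headroom;
	uint32_t xdp_headroom;
};

/* Producer side: the payload starts after both headrooms. */
static void *adjust_prod_data(void *umem_data, const struct umem_cfg *cfg)
{
	assert(umem_data && cfg);
	return (char *)umem_data + cfg->frame_headroom + cfg->xdp_headroom;
}

/* Producer side: metadata is only available if an XDP headroom was configured. */
static void *adjust_prod_data_meta(void *umem_data, const struct umem_cfg *cfg)
{
	return cfg->xdp_headroom ? umem_data : NULL;
}

/* Consumer side: the pointer is returned unchanged. */
static void *adjust_cons_data(void *umem_data, const struct umem_cfg *cfg)
{
	(void)cfg;
	return umem_data;
}

/* Consumer side: same pointer, or NULL when no XDP headroom was configured. */
static void *adjust_cons_data_meta(void *umem_data, const struct umem_cfg *cfg)
{
	return cfg->xdp_headroom ? umem_data : NULL;
}
```

With `xdp_headroom` set to 0 (the default in this patch), the `_meta`
variants return NULL and the producer data pointer is only shifted by
`frame_headroom`.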

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
---
 tools/lib/bpf/libbpf.map |  4 ++++
 tools/lib/bpf/xsk.c      | 26 ++++++++++++++++++++++++++
 tools/lib/bpf/xsk.h      |  7 +++++++
 3 files changed, 37 insertions(+)

diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 492db50a4cd7..663585f7f186 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -375,4 +375,8 @@ LIBBPF_0.5.0 {
 		bpf_map_lookup_and_delete_elem_flags;
 		bpf_object__gen_loader;
 		libbpf_set_strict_mode;
+		xsk_umem__adjust_cons_data;
+		xsk_umem__adjust_cons_data_meta;
+		xsk_umem__adjust_prod_data;
+		xsk_umem__adjust_prod_data_meta;
 } LIBBPF_0.4.0;
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index e9b619aa0cdf..17e8045eac0e 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -119,6 +119,30 @@ int xsk_socket__fd(const struct xsk_socket *xsk)
 	return xsk ? xsk->fd : -EINVAL;
 }
 
+void *xsk_umem__adjust_prod_data(void *umem_data, const struct xsk_umem *umem)
+{
+	return umem_data + umem->config.frame_headroom + umem->config.xdp_headroom;
+}
+
+void *xsk_umem__adjust_prod_data_meta(void *umem_data, const struct xsk_umem *umem)
+{
+	if (!umem->config.xdp_headroom)
+		return NULL;
+	return umem_data;
+}
+
+void *xsk_umem__adjust_cons_data(void *umem_data, const struct xsk_umem *umem)
+{
+	return umem_data;
+}
+
+void *xsk_umem__adjust_cons_data_meta(void *umem_data, const struct xsk_umem *umem)
+{
+	if (!umem->config.xdp_headroom)
+		return NULL;
+	return umem_data;
+}
+
 static bool xsk_page_aligned(void *buffer)
 {
 	unsigned long addr = (unsigned long)buffer;
@@ -135,6 +159,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
 		cfg->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
 		cfg->frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM;
 		cfg->flags = XSK_UMEM__DEFAULT_FLAGS;
+		cfg->xdp_headroom = XSK_UMEM__DEFAULT_XDP_HEADROOM;
 		return;
 	}
 
@@ -143,6 +168,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
 	cfg->frame_size = usr_cfg->frame_size;
 	cfg->frame_headroom = usr_cfg->frame_headroom;
 	cfg->flags = usr_cfg->flags;
+	cfg->xdp_headroom = usr_cfg->xdp_headroom;
 }
 
 static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
index 01c12dca9c10..7f4143150746 100644
--- a/tools/lib/bpf/xsk.h
+++ b/tools/lib/bpf/xsk.h
@@ -248,12 +248,18 @@ static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
 LIBBPF_API int xsk_umem__fd(const struct xsk_umem *umem);
 LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
 
+LIBBPF_API void *xsk_umem__adjust_prod_data(void *umem_data, const struct xsk_umem *umem);
+LIBBPF_API void *xsk_umem__adjust_prod_data_meta(void *umem_data, const struct xsk_umem *umem);
+LIBBPF_API void *xsk_umem__adjust_cons_data(void *umem_data, const struct xsk_umem *umem);
+LIBBPF_API void *xsk_umem__adjust_cons_data_meta(void *umem_data, const struct xsk_umem *umem);
+
 #define XSK_RING_CONS__DEFAULT_NUM_DESCS      2048
 #define XSK_RING_PROD__DEFAULT_NUM_DESCS      2048
 #define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
 #define XSK_UMEM__DEFAULT_FRAME_SIZE     (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
 #define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0
 #define XSK_UMEM__DEFAULT_FLAGS 0
+#define XSK_UMEM__DEFAULT_XDP_HEADROOM 0
 
 struct xsk_umem_config {
 	__u32 fill_size;
@@ -261,6 +267,7 @@ struct xsk_umem_config {
 	__u32 frame_size;
 	__u32 frame_headroom;
 	__u32 flags;
+	__u32 xdp_headroom;
 };
 
 LIBBPF_API int xsk_setup_xdp_prog(int ifindex,
-- 
2.32.0



* [[RFC xdp-hints] 14/16] libbpf: Helpers to access XDP hints based on BTF definitions
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (12 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 13/16] libbpf: Helpers to access XDP frame metadata Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 15/16] samples/bpf: XDP hints AF_XDP example Ederson de Souza
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

Add a new set of functions to help get the BTF definition of the XDP hints
structure and read information based on it.

`xsk_umem__btf_id` helps retrieve the BTF id of XDP metadata.
`xsk_btf__init` sets up a context based on the BTF, including a hashmap,
so that subsequent queries are faster.
`xsk_btf__read` returns a pointer to the position in the XDP metadata
containing a given field.
`xsk_btf__has_field` checks the presence of a field in the BTF.
`xsk_btf__free` frees up the context.

Besides those, a macro `XSK_BTF_READ_INTO` acts as a convenient helper
to read the field contents into a given variable.
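
To make the layout assumption concrete, here is a small standalone sketch
of the address computation. `struct demo_hints` and the `demo_*` functions
are hypothetical; the real layout comes from the driver's BTF, and the patch
reads the id as an `int` rather than a `uint32_t`. The sketch only assumes,
as the series does, that the hints struct ends right where the packet data
begins and that `btf_id` is its last member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hints layout, 32 bytes with no implicit padding. */
struct demo_hints {
	uint64_t rx_timestamp;
	uint64_t tx_timestamp;
	uint64_t valid_map;
	uint32_t pad;
	uint32_t btf_id;	/* last member, ends where packet data begins */
};

/* Like xsk_umem__btf_id(): the id sits in the 4 bytes just before the data. */
static uint32_t demo_btf_id(const void *data)
{
	uint32_t id;

	memcpy(&id, (const char *)data - sizeof(id), sizeof(id));
	return id;
}

/* Like the last step of xsk_btf__read(): data - sizeof(struct) + offset. */
static const void *demo_field_addr(const void *data, size_t type_size,
				   size_t field_off)
{
	return (const char *)data - type_size + field_off;
}

/* Read a 64-bit field located field_off bytes into the hints struct. */
static uint64_t demo_read_u64(const void *data, size_t field_off)
{
	uint64_t v;

	memcpy(&v, demo_field_addr(data, sizeof(struct demo_hints), field_off),
	       sizeof(v));
	return v;
}
```

In the patch itself the offset and size come from the BTF member
information rather than from `offsetof()`, so the same arithmetic works for
any driver's `xdp_hints___<driver_name>` struct.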

Note that, currently, the hashmap used to speed up field offset lookups
does not use the field name string as its key. It uses the pointer value
directly instead, as it is expected that, most of the time, field names
will be passed as a shared constant string residing in read-only memory,
which saves some time. If this assumption does not hold, this
optimisation needs to be rethought (or discarded altogether).
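
The caveat can be seen in a small standalone sketch. `key_equal_ptr`
mirrors the comparison done by `__xsk_equal_fn` in this patch, while
`key_equal_str` is a hypothetical alternative that compares contents; both
names are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Pointer identity, as in __xsk_equal_fn(): cheap, but address-sensitive. */
static bool key_equal_ptr(const void *k1, const void *k2)
{
	return k1 == k2;
}

/* Hypothetical alternative: compare the field names themselves. */
static bool key_equal_str(const void *k1, const void *k2)
{
	return strcmp(k1, k2) == 0;
}
```

Two buffers holding the same field name at different addresses are equal
under `key_equal_str` but not under `key_equal_ptr`, so with pointer keys
such a lookup would presumably miss the cache and create a second entry
for the same field.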

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
---
 tools/lib/bpf/libbpf.map |   5 ++
 tools/lib/bpf/xsk.c      | 177 +++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/xsk.h      |  15 ++++
 3 files changed, 197 insertions(+)

diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 663585f7f186..04ffee0dc005 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -375,8 +375,13 @@ LIBBPF_0.5.0 {
 		bpf_map_lookup_and_delete_elem_flags;
 		bpf_object__gen_loader;
 		libbpf_set_strict_mode;
+		xsk_btf__init;
+		xsk_btf__read;
+		xsk_btf__has_field;
+		xsk_btf__free;
 		xsk_umem__adjust_cons_data;
 		xsk_umem__adjust_cons_data_meta;
 		xsk_umem__adjust_prod_data;
 		xsk_umem__adjust_prod_data_meta;
+		xsk_umem__btf_id;
 } LIBBPF_0.4.0;
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 17e8045eac0e..0455ddaa1734 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -31,6 +31,7 @@
 #include <linux/if_link.h>
 
 #include "bpf.h"
+#include "hashmap.h"
 #include "libbpf.h"
 #include "libbpf_internal.h"
 #include "xsk.h"
@@ -143,6 +144,14 @@ void *xsk_umem__adjust_cons_data_meta(void *umem_data, const struct xsk_umem *um
 	return umem_data;
 }
 
+int xsk_umem__btf_id(void *umem_data, const struct xsk_umem *umem)
+{
+	if (umem->config.xdp_headroom < sizeof(int))
+		return -EINVAL;
+
+	return *(int *)(umem_data - sizeof(int));
+}
+
 static bool xsk_page_aligned(void *buffer)
 {
 	unsigned long addr = (unsigned long)buffer;
@@ -1290,3 +1299,171 @@ void xsk_socket__delete(struct xsk_socket *xsk)
 		close(xsk->fd);
 	free(xsk);
 }
+
+struct xsk_btf_info {
+	struct hashmap map;
+	struct btf *btf;
+	const struct btf_type *type;
+};
+
+struct xsk_btf_entry {
+	__u32 offset;
+	__u32 size;
+};
+
+static void __xsk_btf_free_hash(struct xsk_btf_info *xbi)
+{
+	struct hashmap_entry *entry;
+	int i;
+
+	hashmap__for_each_entry((&(xbi->map)), entry, i) {
+		free(entry->value);
+	}
+	hashmap__clear(&(xbi->map));
+}
+
+static size_t __xsk_hash_fn(const void *key, void *ctx)
+{
+	return (size_t)key;
+}
+
+static bool __xsk_equal_fn(const void *k1, const void *k2, void *ctx)
+{
+	return k1 == k2;
+}
+
+int xsk_btf__init(__u32 btf_id, struct xsk_btf_info **xbi)
+{
+	const struct btf_member *m;
+	const struct btf_type *t;
+	unsigned short vlen;
+	struct btf *btf;
+	int i, id, ret = 0;
+
+	if (!xbi)
+		return -EINVAL;
+
+	ret = btf__get_from_id(btf_id, &btf);
+	if (ret < 0)
+		return ret;
+
+	id = btf__find_by_name(btf, "xdp_hints");
+	if (id < 0) {
+		ret = id;
+		goto error_btf;
+	}
+
+	t = btf__type_by_id(btf, id);
+
+	if (!BTF_INFO_KFLAG(t->info)) {
+		ret = -EINVAL;
+		goto error_btf;
+	}
+
+	*xbi = malloc(sizeof(**xbi));
+	if (!*xbi) {
+		ret = -ENOMEM;
+		goto error_btf;
+	}
+
+	hashmap__init(&(*xbi)->map, __xsk_hash_fn, __xsk_equal_fn, NULL);
+
+	/* Validate no BTF field is a bitfield */
+	m = btf_members(t);
+	vlen = BTF_INFO_VLEN(t->info);
+	for (i = 0; i < vlen; i++, m++) {
+		if (BTF_MEMBER_BITFIELD_SIZE(m->offset)) {
+			ret = -ENOTSUP;
+			goto error_entry;
+		}
+	}
+
+	(*xbi)->btf = btf;
+	(*xbi)->type = t;
+
+	return ret;
+
+error_entry:
+	__xsk_btf_free_hash(*xbi);
+	free(*xbi);
+
+error_btf:
+	btf__free(btf);
+	return ret;
+}
+
+static int __xsk_btf_field_entry(struct xsk_btf_info *xbi, const char *field,
+			  struct xsk_btf_entry **entry)
+{
+	const struct btf_member *m;
+	unsigned short vlen;
+	int i;
+
+	m = btf_members(xbi->type);
+	vlen = BTF_INFO_VLEN(xbi->type->info);
+	for (i = 0; i < vlen; i++, m++) {
+		const struct btf_type *member_type;
+		const char *name = btf__name_by_offset(xbi->btf, m->name_off);
+
+		if (strcmp(name, field))
+			continue;
+
+		if (entry) {
+			member_type = btf__type_by_id(xbi->btf, m->type);
+			*entry = malloc(sizeof(**entry));
+			if (!*entry) {
+				return -ENOMEM;
+			}
+
+			/* As we bail out at init for bit fields, there should
+			 * be no entries whose offset is not a multiple of byte */
+			(*entry)->offset = BTF_MEMBER_BIT_OFFSET(m->offset) / 8;
+			(*entry)->size = member_type->size;
+		}
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
+bool xsk_btf__has_field(const char *field, struct xsk_btf_info *xbi)
+{
+	if (!xbi)
+		return false;
+
+	return __xsk_btf_field_entry(xbi, field, NULL) == 0;
+}
+
+void xsk_btf__free(struct xsk_btf_info *xbi)
+{
+	if (!xbi)
+		return;
+
+	__xsk_btf_free_hash(xbi);
+	btf__free(xbi->btf);
+	free(xbi);
+}
+
+int xsk_btf__read(void **dest, size_t size, const char *field, struct xsk_btf_info *xbi,
+		  const void *addr)
+{
+	struct xsk_btf_entry *entry;
+	int err;
+
+	if (!field || !xbi || !dest || !addr)
+		return -EINVAL;
+
+	if (!hashmap__find(&(xbi->map), field, (void **)&entry)) {
+		err = __xsk_btf_field_entry(xbi, field, &entry);
+		if (err)
+			return err;
+
+		hashmap__add(&(xbi->map), field, entry);
+	}
+
+	if (entry->size != size)
+		return -EINVAL;
+
+	*dest = (void *)((char *)addr - xbi->type->size + entry->offset);
+	return 0;
+}
diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
index 7f4143150746..b0bddc70c5a6 100644
--- a/tools/lib/bpf/xsk.h
+++ b/tools/lib/bpf/xsk.h
@@ -253,6 +253,8 @@ LIBBPF_API void *xsk_umem__adjust_prod_data_meta(void *umem_data, const struct x
 LIBBPF_API void *xsk_umem__adjust_cons_data(void *umem_data, const struct xsk_umem *umem);
 LIBBPF_API void *xsk_umem__adjust_cons_data_meta(void *umem_data, const struct xsk_umem *umem);
 
+LIBBPF_API int xsk_umem__btf_id(void *umem_data, const struct xsk_umem *umem);
+
 #define XSK_RING_CONS__DEFAULT_NUM_DESCS      2048
 #define XSK_RING_PROD__DEFAULT_NUM_DESCS      2048
 #define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
@@ -322,6 +324,19 @@ xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
 LIBBPF_API int xsk_umem__delete(struct xsk_umem *umem);
 LIBBPF_API void xsk_socket__delete(struct xsk_socket *xsk);
 
+struct xsk_btf_info;
+
+LIBBPF_API int xsk_btf__init(__u32 btf_id, struct xsk_btf_info **xbi);
+LIBBPF_API int xsk_btf__read(void **dest, size_t size, const char *field, struct xsk_btf_info *xbi,
+			     const void *addr);
+LIBBPF_API bool xsk_btf__has_field(const char *field, struct xsk_btf_info *xbi);
+LIBBPF_API void xsk_btf__free(struct xsk_btf_info *xbi);
+
+#define XSK_BTF_READ_INTO(dest, field, xbi, addr) ({ \
+	typeof(dest) *_d; \
+	xsk_btf__read((void **)&_d, sizeof(dest), #field, xbi, addr); \
+	dest = *_d; })
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
-- 
2.32.0



* [[RFC xdp-hints] 15/16] samples/bpf: XDP hints AF_XDP example
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (13 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 14/16] libbpf: Helpers to access XDP hints based on BTF definitions Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-03  1:03 ` [[RFC xdp-hints] 16/16] samples/bpf: Show XDP hints usage Ederson de Souza
  2021-08-03  9:12 ` [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Alexander Lobakin
  16 siblings, 0 replies; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

With the -D option, xdpsock now shows the RX or TX timestamp of the last
received/sent packet (in rx-only or tx-only mode).

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
---
 samples/bpf/xdpsock_user.c | 146 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 142 insertions(+), 4 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 33d0bdebbed8..9485bb6fe356 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -7,11 +7,13 @@
 #include <libgen.h>
 #include <linux/bpf.h>
 #include <linux/compiler.h>
+#include <linux/ethtool.h>
 #include <linux/if_link.h>
 #include <linux/if_xdp.h>
 #include <linux/if_ether.h>
 #include <linux/ip.h>
 #include <linux/limits.h>
+#include <linux/sockios.h>
 #include <linux/udp.h>
 #include <arpa/inet.h>
 #include <locale.h>
@@ -25,6 +27,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <sys/capability.h>
+#include <sys/ioctl.h>
 #include <sys/mman.h>
 #include <sys/resource.h>
 #include <sys/socket.h>
@@ -99,6 +102,7 @@ static u32 opt_num_xsks = 1;
 static u32 prog_id;
 static bool opt_busy_poll;
 static bool opt_reduced_cap;
+static bool opt_metadata;
 
 struct xsk_ring_stats {
 	unsigned long rx_npkts;
@@ -142,6 +146,14 @@ struct xsk_umem_info {
 	struct xsk_ring_cons cq;
 	struct xsk_umem *umem;
 	void *buffer;
+	u32 frame_headroom;
+};
+
+struct xsk_metadata {
+	struct xsk_btf_info *xbi;
+	unsigned long rx_timestamp;
+	unsigned long tx_timestamp;
+	u32 btf_id;
 };
 
 struct xsk_socket_info {
@@ -152,13 +164,48 @@ struct xsk_socket_info {
 	struct xsk_ring_stats ring_stats;
 	struct xsk_app_stats app_stats;
 	struct xsk_driver_stats drv_stats;
+	struct xsk_metadata metadata;
 	u32 outstanding_tx;
 };
 
+struct xdp_hints {
+	u64 rx_timestamp;
+	u64 tx_timestamp;
+	u32 hash32;
+	u32 extension_id;
+	u64 field_map;
+} __attribute__((packed));
+
 static int num_socks;
 struct xsk_socket_info *xsks[MAX_SOCKS];
 int sock;
 
+static u32 get_xdp_headroom(void)
+{
+	struct ethtool_drvinfo drvinfo = { .cmd = ETHTOOL_GDRVINFO };
+	struct ifreq ifr = {};
+	int fd, err, ret;
+
+	fd = socket(AF_LOCAL, SOCK_DGRAM, 0);
+	if (fd < 0)
+		return 0;
+
+	ifr.ifr_data = (void *)&drvinfo;
+	memcpy(ifr.ifr_name, opt_if, strlen(opt_if) + 1);
+	err = ioctl(fd, SIOCETHTOOL, &ifr);
+
+	if (err) {
+		ret = 0;
+		goto out;
+	}
+
+	ret = drvinfo.xdp_headroom;
+
+out:
+	close(fd);
+	return ret;
+}
+
 static unsigned long get_nsecs(void)
 {
 	struct timespec ts;
@@ -258,6 +305,44 @@ static void dump_app_stats(long dt)
 	}
 }
 
+static struct xsk_btf_info *init_xsk_metadata_info(u32 btf_id)
+{
+	struct xsk_btf_info *xbi;
+
+	if (xsk_btf__init(btf_id, &xbi) < 0)
+		return NULL;
+
+	return xbi;
+}
+
+static void save_metadata_tx(void *meta, struct xsk_socket_info *xsk)
+{
+	u64 valid_map;
+
+	if (!meta)
+		return;
+
+	XSK_BTF_READ_INTO(valid_map, valid_map, xsk->metadata.xbi, meta);
+	if (valid_map & XDP_GENERIC_HINTS_TX_TIMESTAMP) {
+		XSK_BTF_READ_INTO(xsk->metadata.tx_timestamp,
+				  tx_timestamp, xsk->metadata.xbi, meta);
+	}
+}
+
+static void save_metadata_rx(void *meta, struct xsk_socket_info *xsk)
+{
+	u64 valid_map;
+
+	if (!meta)
+		return;
+
+	XSK_BTF_READ_INTO(valid_map, valid_map, xsk->metadata.xbi, meta);
+	if (valid_map & XDP_GENERIC_HINTS_RX_TIMESTAMP) {
+		XSK_BTF_READ_INTO(xsk->metadata.rx_timestamp,
+				  rx_timestamp, xsk->metadata.xbi, meta);
+	}
+}
+
 static bool get_interrupt_number(void)
 {
 	FILE *f_int_proc;
@@ -432,6 +517,12 @@ static void dump_stats(void)
 				printf("%-15s\n", "Error retrieving extra stats");
 			}
 		}
+
+		if (opt_metadata) {
+			printf("Last TX time: %lu\n", xsks[i]->metadata.tx_timestamp);
+			printf("Last RX time: %lu\n", xsks[i]->metadata.rx_timestamp);
+		}
+
 	}
 
 	if (opt_app_stats)
@@ -798,8 +889,10 @@ static void gen_eth_hdr_data(void)
 
 static void gen_eth_frame(struct xsk_umem_info *umem, u64 addr)
 {
-	memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data,
-	       PKT_SIZE);
+	void *data = xsk_umem__get_data(umem->buffer, addr);
+
+	data = xsk_umem__adjust_prod_data(data, umem->umem);
+	memcpy(data, pkt_data, PKT_SIZE);
 }
 
 static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
@@ -819,6 +912,7 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		.comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
 		.frame_size = opt_xsk_frame_size,
 		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
+		.xdp_headroom = get_xdp_headroom(),
 		.flags = opt_umem_flags
 	};
 	int ret;
@@ -833,6 +927,7 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		exit_with_error(-ret);
 
 	umem->buffer = buffer;
+	umem->frame_headroom = cfg.frame_headroom;
 	return umem;
 }
 
@@ -927,6 +1022,7 @@ static struct option long_options[] = {
 	{"irq-string", no_argument, 0, 'I'},
 	{"busy-poll", no_argument, 0, 'B'},
 	{"reduce-cap", no_argument, 0, 'R'},
+	{"metadata", no_argument, 0, 'D'},
 	{0, 0, 0, 0}
 };
 
@@ -967,6 +1063,7 @@ static void usage(const char *prog)
 		"  -I, --irq-string	Display driver interrupt statistics for interface associated with irq-string.\n"
 		"  -B, --busy-poll      Busy poll.\n"
 		"  -R, --reduce-cap	Use reduced capabilities (cannot be used with -M)\n"
+		"  -D, --metadata	Display latest packet metadata\n"
 		"\n";
 	fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE,
 		opt_batch_size, MIN_PKT_SIZE, MIN_PKT_SIZE,
@@ -982,7 +1079,7 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "Frtli:q:pSNn:czf:muMd:b:C:s:P:xQaI:BR",
+		c = getopt_long(argc, argv, "Frtli:q:pSNn:czf:muMd:b:C:s:P:xQaI:BRD",
 				long_options, &option_index);
 		if (c == -1)
 			break;
@@ -1087,6 +1184,9 @@ static void parse_command_line(int argc, char **argv)
 		case 'R':
 			opt_reduced_cap = true;
 			break;
+		case 'D':
+			opt_metadata = true;
+			break;
 		default:
 			usage(basename(argv[0]));
 		}
@@ -1193,6 +1293,25 @@ static inline void complete_tx_only(struct xsk_socket_info *xsk,
 
 	rcvd = xsk_ring_cons__peek(&xsk->umem->cq, batch_size, &idx);
 	if (rcvd > 0) {
+		if (opt_metadata) {
+			const struct xdp_desc *cq_desc = xsk_ring_cons__rx_desc(&xsk->umem->cq,
+					idx);
+			char *pkt = xsk_umem__get_data(xsk->umem->buffer, cq_desc->addr);
+			__u32 btf_id = xsk_umem__btf_id(pkt, xsk->umem->umem);
+			if (btf_id > 0) {
+				if (!xsk->metadata.xbi) {
+					xsk->metadata.xbi = init_xsk_metadata_info(btf_id);
+					if (xsk->metadata.xbi)
+						xsk->metadata.btf_id = btf_id;
+				}
+				if (xsk->metadata.btf_id == btf_id) {
+					void *m;
+
+					m = xsk_umem__adjust_cons_data_meta(pkt, xsk->umem->umem);
+					save_metadata_tx(m, xsk);
+				}
+			}
+		}
 		xsk_ring_cons__release(&xsk->umem->cq, rcvd);
 		xsk->outstanding_tx -= rcvd;
 	}
@@ -1232,6 +1351,23 @@ static void rx_drop(struct xsk_socket_info *xsk)
 		addr = xsk_umem__add_offset_to_addr(addr);
 		char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
 
+		if (opt_metadata) {
+			__u32 btf_id = xsk_umem__btf_id(pkt, xsk->umem->umem);
+			if (btf_id > 0) {
+				if (!xsk->metadata.xbi) {
+					xsk->metadata.xbi = init_xsk_metadata_info(btf_id);
+					if (xsk->metadata.xbi)
+						xsk->metadata.btf_id = btf_id;
+				}
+				if (xsk->metadata.btf_id == btf_id) {
+					void *m;
+
+					m = xsk_umem__adjust_cons_data_meta(pkt, xsk->umem->umem);
+					save_metadata_rx(m, xsk);
+				}
+			}
+		}
+
 		hex_dump(pkt, len, addr);
 		*xsk_ring_prod__fill_addr(&xsk->umem->fq, idx_fq++) = orig;
 	}
@@ -1283,7 +1419,9 @@ static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
 	for (i = 0; i < batch_size; i++) {
 		struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(&xsk->tx,
 								  idx + i);
-		tx_desc->addr = (*frame_nb + i) * opt_xsk_frame_size;
+		tx_desc->addr = (__u64)xsk_umem__adjust_prod_data(
+				(void *)(__u64)((*frame_nb + i) * opt_xsk_frame_size),
+				xsk->umem->umem);
 		tx_desc->len = PKT_SIZE;
 	}
 
-- 
2.32.0



* [[RFC xdp-hints] 16/16] samples/bpf: Show XDP hints usage
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (14 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 15/16] samples/bpf: XDP hints AF_XDP example Ederson de Souza
@ 2021-08-03  1:03 ` Ederson de Souza
  2021-08-06 23:14   ` Andrii Nakryiko
  2021-08-03  9:12 ` [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Alexander Lobakin
  16 siblings, 1 reply; 25+ messages in thread
From: Ederson de Souza @ 2021-08-03  1:03 UTC (permalink / raw)
  To: xdp-hints; +Cc: bpf

An example of how to retrieve XDP hints/metadata from an XDP frame. To
get the xdp_hints struct, first find the BTF id:

$ bpftool net xdp show
  xdp:
  enp6s0(2) md_btf_id(44) md_btf_enabled(0)

and then dump it:

$ bpftool btf dump id 44 format c > btf.h

In this example, however, to demonstrate BTF and CO-RE features, a
simpler struct was defined, containing only the field used by the sample.

A drawback is that it's not currently possible to use some CO-RE features
from the "samples/bpf" directory, as those samples are currently built
without using "clang -target bpf". Because of this, it was not possible to
use the "bpf_core_field_exists" macro to check, at runtime, the presence
of a given XDP hints field.
---
 samples/bpf/xdp_sample_pkts_kern.c | 21 +++++++++++++++++++++
 samples/bpf/xdp_sample_pkts_user.c |  4 +++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/samples/bpf/xdp_sample_pkts_kern.c b/samples/bpf/xdp_sample_pkts_kern.c
index 9cf76b340dd7..9f0b0c5a6237 100644
--- a/samples/bpf/xdp_sample_pkts_kern.c
+++ b/samples/bpf/xdp_sample_pkts_kern.c
@@ -3,6 +3,7 @@
 #include <linux/version.h>
 #include <uapi/linux/bpf.h>
 #include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
 
 #define SAMPLE_SIZE 64ul
 
@@ -12,16 +13,23 @@ struct {
 	__uint(value_size, sizeof(u32));
 } my_map SEC(".maps");
 
+struct xdp_hints {
+	u64 rx_timestamp;
+};
+
 SEC("xdp_sample")
 int xdp_sample_prog(struct xdp_md *ctx)
 {
+	void *meta_data = (void *)(long)ctx->data_meta;
 	void *data_end = (void *)(long)ctx->data_end;
 	void *data = (void *)(long)ctx->data;
+	struct xdp_hints *hints;
 
 	/* Metadata will be in the perf event before the packet data. */
 	struct S {
 		u16 cookie;
 		u16 pkt_len;
+		u64 rx_timestamp;
 	} __packed metadata;
 
 	if (data < data_end) {
@@ -41,9 +49,22 @@ int xdp_sample_prog(struct xdp_md *ctx)
 
 		metadata.cookie = 0xdead;
 		metadata.pkt_len = (u16)(data_end - data);
+		metadata.rx_timestamp = 0;
 		sample_size = min(metadata.pkt_len, SAMPLE_SIZE);
 		flags |= (u64)sample_size << 32;
 
+		if (meta_data < data) {
+			hints = meta_data;
+			/* bpf_core_field_exists doesn't work from samples/bpf,
+			 * as it is only available for "clang -target bpf", which
+			 * is not used on samples/bpf. A program that can use
+			 * the "vmlinux.h" and "clang -target bpf" could use this
+			 * call to check the existence of a given field at runtime
+			 */
+			/*if (bpf_core_field_exists(hints->rx_timestamp))*/
+				metadata.rx_timestamp = BPF_CORE_READ(hints, rx_timestamp);
+		}
+
 		ret = bpf_perf_event_output(ctx, &my_map, flags,
 					    &metadata, sizeof(metadata));
 		if (ret)
diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c
index 495e09897bd3..b87e0ae8eb3d 100644
--- a/samples/bpf/xdp_sample_pkts_user.c
+++ b/samples/bpf/xdp_sample_pkts_user.c
@@ -76,6 +76,7 @@ static void print_bpf_output(void *ctx, int cpu, void *data, __u32 size)
 	struct {
 		__u16 cookie;
 		__u16 pkt_len;
+		__u64 rx_timestamp;
 		__u8  pkt_data[SAMPLE_SIZE];
 	} __packed *e = data;
 	int i;
@@ -85,7 +86,8 @@ static void print_bpf_output(void *ctx, int cpu, void *data, __u32 size)
 		return;
 	}
 
-	printf("Pkt len: %-5d bytes. Ethernet hdr: ", e->pkt_len);
+	printf("Pkt len: %-5d bytes. RX timestamp: %llu Ethernet hdr: ", e->pkt_len,
+	       e->rx_timestamp);
 	for (i = 0; i < 14 && i < e->pkt_len; i++)
 		printf("%02x ", e->pkt_data[i]);
 	printf("\n");
-- 
2.32.0



* Re: [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support
  2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
                   ` (15 preceding siblings ...)
  2021-08-03  1:03 ` [[RFC xdp-hints] 16/16] samples/bpf: Show XDP hints usage Ederson de Souza
@ 2021-08-03  9:12 ` Alexander Lobakin
  2021-08-03 15:23   ` John Fastabend
  16 siblings, 1 reply; 25+ messages in thread
From: Alexander Lobakin @ 2021-08-03  9:12 UTC (permalink / raw)
  To: Ederson de Souza; +Cc: Alexander Lobakin, xdp-hints, bpf

From: Ederson de Souza <ederson.desouza@intel.com>
Date: Mon,  2 Aug 2021 18:03:15 -0700

> While there's some work going on different aspects of the XDP hints, I'd like
> to present and ask for comments on this patch series.
> 
> XDP hints/metadata is a way for the driver to transmit information regarding a
> specific XDP frame along with the frame. Following current discussions and
> based on top of Saeed's early patches, this series provides the XDP hints with
> one (or two, depending on how you view it) use case: RX/TX timestamps for the
> igc driver.
> 
> Keeping with Saeed's patches, to enable XDP hints usage, one has to first
> enable it with bpftool like:
> 
>   bpftool net xdp set dev <iface> md_btf on
> 
> From the driver perspective, support for XDP hints is achieved by:
> 
>  - Adding support for XDP_SETUP_MD_BTF operation, where it can register the BTF.
> 
>  - Adding support for XDP_QUERY_MD_BTF so user space can retrieve the BTF id.
> 
>  - Adding the relevant data to the metadata area of the XDP frame.
> 
>     - One of this relevant data is the BTF id of the BTF in use.
> 
> In order to make use of the BPF CO-RE mechanism, this series makes the driver
> name of the struct for the XDP hints be called `xdp_hints___<driver_name>` (in
> this series, as I'm using igc driver, it becomes `xdp_hints___igc`). This
> should help BPF programs, as they can simply refer to the struct as `xdp_hints`.
> 
> A common issue is how to standardize the names of the fields in the BTF. Here,
> a series of macros is provided on the `include/net/xdp.h`, that goes by
> `XDP_GENERIC_` prefixes. In there, the `btf_id` field was added, that needs
> to be strategically positioned at the end of the struct. Also added are the
> `rx_timestamp` and  `tx_timestamp` fields, as I believe they're generic as
> well. The macros also provide `u32` and `u64` types. Besides, I also ended
> up adding a `valid_map` field. It should help whoever is using the XDP hints
> to be sure of what is valid in that hints. It also makes the driver life
> simple, as it just uses a single struct and validates fields as it fills
> them.
> 
> The BPF sample `xdp_sample_pkts` was modified to demonstrate the usage of XDP
> hints on BPF programs. It's a very simple example, but it shows some nice
> things about it. For instance, instead of getting the struct somehow before,
> it uses CO-RE to simply name the XDP hint field it's interested in and
> read it using `BPF_CORE_READ`. (I also tried to use `bpf_core_field_exists` to
> make it even more dynamic, but couldn't get to build it. I mention why in the
> example.)
> 
> Also, as much of my interest lies in the user space side, the one using
> AF_XDP, to support it a few additional things were done.
> 
> Firstly, a new "driver info" is provided, to be obtained via
> `ioctl(SIOCETHTOOL)`: "xdp_headroom". This is how much XDP headroom is
> required by the driver. While not really important for the RX path (as the
> driver already applies that headroom to the XDP frame), it's
> important for the TX path, as here, it's the application responsibility to
> factor in the XDP headroom area. (Note that the TX timestamp is obtained from
> the XDP frame of the transmitted packet, when that frame goes back to the
> completion queue.)
> 
> A series of helpers was also added to libbpf to help manage this headroom
> area. They go by the prefix " xsk_umem__adjust_", to adjust consumer and
> producer data and metadata.
> 
> In order to read the XDP hints from the memory, another series of helpers was
> added. They read the BTF from the BTF id, and create a hashmap of the offsets
> and sizes of the fields, that is then used to actually retrieve the data.
> 
> I modified the "xdpsock" example to show the use of XDP hints on the AF_XDP
> world, along with the proposed API.
> 
> Finally, I know that Michal and Alexandr (and probably others that I don't
> know) are working in this same front. This RFC is not to race any other work,
> instead I hope it can help in the discussion of the best solution for the
> XDP hints – and I truly think it brings value, specifically for the AF_XDP
> usages.

XDP Hints have been discussed at Netdev 0x15, and we kinda
established the optimal way for doing it. This RFC's approach
is no longer current.
You could just write to me and request write perms on our open
GitHub repo (which was mentioned here several times) for Hints
to do things if not together, then in one place at least.
I'll be off for two weeks starting next Monday, Michal could get
you into things if you decide to join after that point
(if at all).

Thanks,
Al

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support
  2021-08-03  9:12 ` [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Alexander Lobakin
@ 2021-08-03 15:23   ` John Fastabend
  2021-08-04 15:15     ` Alexander Lobakin
  0 siblings, 1 reply; 25+ messages in thread
From: John Fastabend @ 2021-08-03 15:23 UTC (permalink / raw)
  To: Alexander Lobakin, Ederson de Souza; +Cc: Alexander Lobakin, xdp-hints, bpf

Alexander Lobakin wrote:
> From: Ederson de Souza <ederson.desouza@intel.com>
> Date: Mon,  2 Aug 2021 18:03:15 -0700
> 
> > While there's some work going on different aspects of the XDP hints, I'd like
> > to present and ask for comments on this patch series.
> > 
> > XDP hints/metadata is a way for the driver to transmit information regarding a
> > specific XDP frame along with the frame. Following current discussions and
> > based on top of Saeed's early patches, this series provides the XDP hints with
> > one (or two, depending on how you view it) use case: RX/TX timestamps for the
> > igc driver.
> > 
> > Keeping with Saeed's patches, to enable XDP hints usage, one has to first
> > enable it with bpftool like:
> > 
> >   bpftool net xdp set dev <iface> md_btf on
> > 
> > From the driver perspective, support for XDP hints is achieved by:
> > 
> >  - Adding support for XDP_SETUP_MD_BTF operation, where it can register the BTF.
> > 
> >  - Adding support for XDP_QUERY_MD_BTF so user space can retrieve the BTF id.
> > 
> >  - Adding the relevant data to the metadata area of the XDP frame.
> > 
> >     - One of this relevant data is the BTF id of the BTF in use.
> > 
> > In order to make use of the BPF CO-RE mechanism, this series makes the driver
> > name of the struct for the XDP hints be called `xdp_hints___<driver_name>` (in
> > this series, as I'm using igc driver, it becomes `xdp_hints___igc`). This
> > should help BPF programs, as they can simply refer to the struct as `xdp_hints`.
> > 
> > A common issue is how to standardize the names of the fields in the BTF. Here,
> > a series of macros is provided on the `include/net/xdp.h`, that goes by
> > `XDP_GENERIC_` prefixes. In there, the `btf_id` field was added, that needs
> > to be strategically positioned at the end of the struct. Also added are the
> > `rx_timestamp` and  `tx_timestamp` fields, as I believe they're generic as
> > well. The macros also provide `u32` and `u64` types. Besides, I also ended
> > up adding a `valid_map` field. It should help whoever is using the XDP hints
> > to be sure of what is valid in that hints. It also makes the driver life
> > simple, as it just uses a single struct and validates fields as it fills
> > them.
> > 
> > The BPF sample `xdp_sample_pkts` was modified to demonstrate the usage of XDP
> > hints on BPF programs. It's a very simple example, but it shows some nice
> > things about it. For instance, instead of getting the struct somehow before,
> > it uses CO-RE to simply name the XDP hint field it's interested in and
> > read it using `BPF_CORE_READ`. (I also tried to use `bpf_core_field_exists` to
> > make it even more dynamic, but couldn't get to build it. I mention why in the
> > example.)
> > 
> > Also, as much of my interest lies in the user space side, the one using
> > AF_XDP, to support it a few additional things were done.
> > 
> > Firstly, a new "driver info" is provided, to be obtained via
> > `ioctl(SIOCETHTOOL)`: "xdp_headroom". This is how much XDP headroom is
> > required by the driver. While not really important for the RX path (as the
> > driver already applies that headroom to the XDP frame), it's
> > important for the TX path, as here, it's the application responsibility to
> > factor in the XDP headroom area. (Note that the TX timestamp is obtained from
> > the XDP frame of the transmitted packet, when that frame goes back to the
> > completion queue.)
> > 
> > A series of helpers was also added to libbpf to help manage this headroom
> > area. They go by the prefix " xsk_umem__adjust_", to adjust consumer and
> > producer data and metadata.
> > 
> > In order to read the XDP hints from the memory, another series of helpers was
> > added. They read the BTF from the BTF id, and create a hashmap of the offsets
> > and sizes of the fields, that is then used to actually retrieve the data.
> > 
> > I modified the "xdpsock" example to show the use of XDP hints on the AF_XDP
> > world, along with the proposed API.
> > 
> > Finally, I know that Michal and Alexandr (and probably others that I don't
> > know) are working in this same front. This RFC is not to race any other work,
> > instead I hope it can help in the discussion of the best solution for the
> > XDP hints – and I truly think it brings value, specifically for the AF_XDP
> > usages.
> 
> XDP Hints have been discussed at Netdev 0x15, and we kinda
> established the optimal way for doing it. This RFC's approach
> is no longer current.

It's great it was discussed, but you need to also summarize it
on the list. Give us the conclusion, who came to it, and why
it's better than this proposal or what's wrong with this proposal.
Not everyone was in the discussion and here we
have a concrete proposal _with_ code. You can't just out of hand
throw it out based on a conference discussion.

> You could just write to me and request write perms on our open
> GitHub repo (which was mentioned here several times) for Hints
> to do things if not together, then in one place at least.
> I'll be off for two weeks starting next Monday, Michal could get
> you into things if you decide to join after that point
> (if at all).

I'll review code that's posted on the list. Please
do the same or give us a _reason_ to skip it. It has a nice commit
message that on its face looks like a reasonable starting point
even if I have a few issues with the approach in a couple of spots.

Time to get coffee on my side.

Thanks,
John

> 
> Thanks,
> Al

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support
  2021-08-03 15:23   ` John Fastabend
@ 2021-08-04 15:15     ` Alexander Lobakin
  2021-08-04 23:45       ` John Fastabend
  0 siblings, 1 reply; 25+ messages in thread
From: Alexander Lobakin @ 2021-08-04 15:15 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexander Lobakin, Marcin Kubiak, Michal Swiatkowski,
	Ederson de Souza, xdp-hints, bpf

From: John Fastabend <john.fastabend@gmail.com>
Date: Tue, 03 Aug 2021 08:23:36 -0700

> Alexander Lobakin wrote:
> > From: Ederson de Souza <ederson.desouza@intel.com>
> > Date: Mon,  2 Aug 2021 18:03:15 -0700
> >
> > > While there's some work going on different aspects of the XDP hints, I'd like
> > > to present and ask for comments on this patch series.
> > >
> > > XDP hints/metadata is a way for the driver to transmit information regarding a
> > > specific XDP frame along with the frame. Following current discussions and
> > > based on top of Saeed's early patches, this series provides the XDP hints with
> > > one (or two, depending on how you view it) use case: RX/TX timestamps for the
> > > igc driver.
> > >
> > > Keeping with Saeed's patches, to enable XDP hints usage, one has to first
> > > enable it with bpftool like:
> > >
> > >   bpftool net xdp set dev <iface> md_btf on
> > >
> > > > From the driver perspective, support for XDP hints is achieved by:
> > >
> > >  - Adding support for XDP_SETUP_MD_BTF operation, where it can register the BTF.
> > >
> > >  - Adding support for XDP_QUERY_MD_BTF so user space can retrieve the BTF id.
> > >
> > >  - Adding the relevant data to the metadata area of the XDP frame.
> > >
> > >     - One of this relevant data is the BTF id of the BTF in use.
> > >
> > > In order to make use of the BPF CO-RE mechanism, this series makes the driver
> > > name of the struct for the XDP hints be called `xdp_hints___<driver_name>` (in
> > > this series, as I'm using igc driver, it becomes `xdp_hints___igc`). This
> > > should help BPF programs, as they can simply refer to the struct as `xdp_hints`.
> > >
> > > A common issue is how to standardize the names of the fields in the BTF. Here,
> > > a series of macros is provided on the `include/net/xdp.h`, that goes by
> > > `XDP_GENERIC_` prefixes. In there, the `btf_id` field was added, that needs
> > > to be strategically positioned at the end of the struct. Also added are the
> > > `rx_timestamp` and  `tx_timestamp` fields, as I believe they're generic as
> > > well. The macros also provide `u32` and `u64` types. Besides, I also ended
> > > up adding a `valid_map` field. It should help whoever is using the XDP hints
> > > to be sure of what is valid in that hints. It also makes the driver life
> > > simple, as it just uses a single struct and validates fields as it fills
> > > them.
> > >
> > > The BPF sample `xdp_sample_pkts` was modified to demonstrate the usage of XDP
> > > hints on BPF programs. It's a very simple example, but it shows some nice
> > > things about it. For instance, instead of getting the struct somehow before,
> > > it uses CO-RE to simply name the XDP hint field it's interested in and
> > > read it using `BPF_CORE_READ`. (I also tried to use `bpf_core_field_exists` to
> > > make it even more dynamic, but couldn't get to build it. I mention why in the
> > > example.)
> > >
> > > Also, as much of my interest lies in the user space side, the one using
> > > AF_XDP, to support it a few additional things were done.
> > >
> > > Firstly, a new "driver info" is provided, to be obtained via
> > > `ioctl(SIOCETHTOOL)`: "xdp_headroom". This is how much XDP headroom is
> > > required by the driver. While not really important for the RX path (as the
> > > driver already applies that headroom to the XDP frame), it's
> > > important for the TX path, as here, it's the application responsibility to
> > > factor in the XDP headroom area. (Note that the TX timestamp is obtained from
> > > the XDP frame of the transmitted packet, when that frame goes back to the
> > > completion queue.)
> > >
> > > A series of helpers was also added to libbpf to help manage this headroom
> > > area. They go by the prefix " xsk_umem__adjust_", to adjust consumer and
> > > producer data and metadata.
> > >
> > > In order to read the XDP hints from the memory, another series of helpers was
> > > added. They read the BTF from the BTF id, and create a hashmap of the offsets
> > > and sizes of the fields, that is then used to actually retrieve the data.
> > >
> > > I modified the "xdpsock" example to show the use of XDP hints on the AF_XDP
> > > world, along with the proposed API.
> > >
> > > Finally, I know that Michal and Alexandr (and probably others that I don't
> > > know) are working in this same front. This RFC is not to race any other work,
> > > instead I hope it can help in the discussion of the best solution for the
> > > XDP hints and I truly think it brings value, specifically for the AF_XDP
> > > usages.
> >
> > XDP Hints have been discussed at Netdev 0x15, and we kinda
> > established the optimal way for doing it. This RFC's approach
> > is no longer current.
> 
> It's great it was discussed, but you need to also summarize it
> on the list. Give us the conclusion, who came to it, and why
> it's better than this proposal or what's wrong with this proposal.
> Not everyone was in the discussion and here we
> have a concrete proposal _with_ code. You can't just out of hand
> throw it out based on a conference discussion.

The conclusion was:
 * no need to register BTF from drivers themselves, they can be
   obtained through sysfs interface;
 * the verifier has nothing to do with Hints and stuff, BTF ID
   just gets passed along with the other XDP Prog setup flags;
 * no if-else ladders or loops when preparing metadata, only
   a couple of paths to satisfy generic needs (first of all)
   and some sort of extended structure if needed;
 * the first two users of the feature will be veth and cpumap.

This greatly simplifies the code and the hotpath.

It was decided by Jesper, Toke, Lorenzo, Michal, me, and several
more attendees (sorree if I forgot someone).

> > You could just write to me and request write perms on our open
> > GitHub repo (which was mentioned here several times) for Hints
> > to do things if not together, then in one place at least.
> > I'll be off for two weeks starting next Monday, Michal could get
> > you into things if you decide to join after that point
> > (if at all).
> 
> I'll review code that's posted on the list. Please
> do the same or give us a _reason_ to skip it. It has a nice commit
> message that on its face looks like a reasonable starting point
> even if I have a few issues with the approach in a couple of spots.

There's almost nothing for me to review, since I have already seen
most of this code while we were working on it, and it's available
in my open GitHub repo.
I mean, AF_XDP parts are new, that's for sure, but firstly we need
to stop {doing,rushing} things somewhere in the closets. For now,
we have THREE almost identical implementations of XDP Hints even
inside Intel, and they were born just because everyone was doing
something without any discussion or whatever, and I see no good
in such a fragmentation.
At least, that was the reason why XDP Hints mailing list and
XDP Hints workshop topic were created.

For sure, anyone is free to do whatever he wants, but I believe
letting us firstly finish the things discussed and established
and then starting to expand them to support AF_XDP and whatnot
can prevent wasting a lot of everyone's time and resources, and
keeping on reinventing the wheel again and again doesn't help
there. At all.

> Time to get coffee on my side.

I'm a tea man[iac], meh.

> Thanks,
> John

Thanks,
Al

> >
> > Thanks,
> > Al

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support
  2021-08-04 15:15     ` Alexander Lobakin
@ 2021-08-04 23:45       ` John Fastabend
  2021-08-13 22:04         ` Desouza, Ederson
  0 siblings, 1 reply; 25+ messages in thread
From: John Fastabend @ 2021-08-04 23:45 UTC (permalink / raw)
  To: Alexander Lobakin, John Fastabend
  Cc: Alexander Lobakin, Marcin Kubiak, Michal Swiatkowski,
	Ederson de Souza, xdp-hints, bpf

Alexander Lobakin wrote:
> From: John Fastabend <john.fastabend@gmail.com>
> Date: Tue, 03 Aug 2021 08:23:36 -0700
> 
> > Alexander Lobakin wrote:
> > > From: Ederson de Souza <ederson.desouza@intel.com>
> > > Date: Mon,  2 Aug 2021 18:03:15 -0700
> > >
> > > > While there's some work going on different aspects of the XDP hints, I'd like
> > > > to present and ask for comments on this patch series.
> > > >
> > > > XDP hints/metadata is a way for the driver to transmit information regarding a
> > > > specific XDP frame along with the frame. Following current discussions and
> > > > based on top of Saeed's early patches, this series provides the XDP hints with
> > > > one (or two, depending on how you view it) use case: RX/TX timestamps for the
> > > > igc driver.
> > > >
> > > > Keeping with Saeed's patches, to enable XDP hints usage, one has to first
> > > > enable it with bpftool like:
> > > >
> > > >   bpftool net xdp set dev <iface> md_btf on
> > > >
> > > > From the driver perspective, support for XDP hints is achieved by:
> > > >
> > > >  - Adding support for XDP_SETUP_MD_BTF operation, where it can register the BTF.
> > > >
> > > >  - Adding support for XDP_QUERY_MD_BTF so user space can retrieve the BTF id.

It's still unclear to me why the BTF id is being passed around.

> > > >
> > > >  - Adding the relevant data to the metadata area of the XDP frame.
> > > >
> > > >     - One of this relevant data is the BTF id of the BTF in use.
> > > >
> > > > In order to make use of the BPF CO-RE mechanism, this series makes the driver
> > > > name of the struct for the XDP hints be called `xdp_hints___<driver_name>` (in
> > > > this series, as I'm using igc driver, it becomes `xdp_hints___igc`). This
> > > > should help BPF programs, as they can simply refer to the struct as `xdp_hints`.
> > > >
> > > > A common issue is how to standardize the names of the fields in the BTF. Here,
> > > > a series of macros is provided on the `include/net/xdp.h`, that goes by
> > > > `XDP_GENERIC_` prefixes. In there, the `btf_id` field was added, that needs
> > > > to be strategically positioned at the end of the struct. Also added are the
> > > > `rx_timestamp` and  `tx_timestamp` fields, as I believe they're generic as
> > > > well. The macros also provide `u32` and `u64` types. Besides, I also ended
> > > > up adding a `valid_map` field. It should help whoever is using the XDP hints
> > > > to be sure of what is valid in that hints. It also makes the driver life
> > > > simple, as it just uses a single struct and validates fields as it fills
> > > > them.
> > > >
> > > > The BPF sample `xdp_sample_pkts` was modified to demonstrate the usage of XDP
> > > > hints on BPF programs. It's a very simple example, but it shows some nice
> > > > things about it. For instance, instead of getting the struct somehow before,
> > > > it uses CO-RE to simply name the XDP hint field it's interested in and
> > > > read it using `BPF_CORE_READ`. (I also tried to use `bpf_core_field_exists` to
> > > > make it even more dynamic, but couldn't get to build it. I mention why in the
> > > > example.)
> > > >
> > > > Also, as much of my interest lies in the user space side, the one using
> > > > AF_XDP, to support it a few additional things were done.
> > > >
> > > > Firstly, a new "driver info" is provided, to be obtained via
> > > > `ioctl(SIOCETHTOOL)`: "xdp_headroom". This is how much XDP headroom is
> > > > required by the driver. While not really important for the RX path (as the
> > > > driver already applies that headroom to the XDP frame), it's
> > > > important for the TX path, as here, it's the application responsibility to
> > > > factor in the XDP headroom area. (Note that the TX timestamp is obtained from
> > > > the XDP frame of the transmitted packet, when that frame goes back to the
> > > > completion queue.)
> > > >
> > > > A series of helpers was also added to libbpf to help manage this headroom
> > > > area. They go by the prefix " xsk_umem__adjust_", to adjust consumer and
> > > > producer data and metadata.
> > > >
> > > > In order to read the XDP hints from the memory, another series of helpers was
> > > > added. They read the BTF from the BTF id, and create a hashmap of the offsets
> > > > and sizes of the fields, that is then used to actually retrieve the data.
> > > >
> > > > I modified the "xdpsock" example to show the use of XDP hints on the AF_XDP
> > > > world, along with the proposed API.
> > > >
> > > > Finally, I know that Michal and Alexandr (and probably others that I don't
> > > > know) are working in this same front. This RFC is not to race any other work,
> > > > instead I hope it can help in the discussion of the best solution for the
> > > > XDP hints and I truly think it brings value, specifically for the AF_XDP
> > > > usages.
> > >
> > > XDP Hints have been discussed at Netdev 0x15, and we kinda
> > > established the optimal way for doing it. This RFC's approach
> > > is no longer current.
> > 
> > It's great it was discussed, but you need to also summarize it
> > on the list. Give us the conclusion, who came to it, and why
> > it's better than this proposal or what's wrong with this proposal.
> > Not everyone was in the discussion and here we
> > have a concrete proposal _with_ code. You can't just out of hand
> > throw it out based on a conference discussion.
> 
> The conclusion was:
>  * no need to register BTF from drivers themselves, they can be
>    obtained through sysfs interface;

This would put the BTF info in '/sys/kernel/btf/driver_name' correct?
Likely good enough for static configs, but this will only work
for static hardware? Dynamic changes would need another mechanism
to learn BTF? My view on this is that if you reprogram the hardware,
that tooling should also give you a BTF file, so it's not a kernel
problem. Think of a P4 compiler that spits out a firmware image and
a BTF file.
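
Wherever the BTF blob comes from (sysfs or a file emitted by the
tooling), user space can at least sanity-check it before trusting it.
A minimal sketch in plain C, assuming a native-endian blob; the
raw_btf_* names are made up for illustration, the header mirrors
struct btf_header from include/uapi/linux/btf.h, and real code would
more likely hand the file to libbpf's BTF parser instead:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RAW_BTF_MAGIC   0xeB9F /* same value as BTF_MAGIC in linux/btf.h */
#define RAW_BTF_VERSION 1

/* Local mirror of struct btf_header (illustrative name). */
struct raw_btf_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off; /* offsets are relative to the end of the header */
	uint32_t type_len;
	uint32_t str_off;
	uint32_t str_len;
};

/* Return 1 if buf looks like a native-endian BTF blob, 0 otherwise. */
static int raw_btf_looks_valid(const void *buf, size_t len)
{
	struct raw_btf_header hdr;

	assert(buf != NULL);
	if (len < sizeof(hdr))
		return 0;
	memcpy(&hdr, buf, sizeof(hdr)); /* avoid unaligned access */

	if (hdr.magic != RAW_BTF_MAGIC || hdr.version != RAW_BTF_VERSION)
		return 0;
	if (hdr.hdr_len < sizeof(hdr) || hdr.hdr_len > len)
		return 0;

	/* Type and string sections must fit in what follows the header. */
	size_t body = len - hdr.hdr_len;
	if (hdr.type_off > body || hdr.type_len > body - hdr.type_off)
		return 0;
	if (hdr.str_off > body || hdr.str_len > body - hdr.str_off)
		return 0;
	return 1;
}
```

This only validates the header and section bounds; it says nothing
about whether the types inside match what the hardware really writes.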

>  * the verifier has nothing to do with Hints and stuff, BTF ID
>    just gets passed along with the other XDP Prog setup flags;

It's not clear why the BTF ID needs to be passed at all. Can
someone elaborate? What would XDP do with the ID?

I get that on redirect you don't know the xdp hints layout of
wherever the program came from? Do you plan to write programs
like this,

  if (btf_id == 0xfoo) { do something }
  else if (btf_id == 0xe00) { do other thing}

The other way is to sanitize the metadata in the input program. But,
OK, assume you want to pivot on btf_id. What does passing
the BTF ID down with the XDP prog setup have to do with the
above? A driver can probably just get its own BTF_ID, with
no need to pass it down via XDP prog setup.
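
For concreteness, the pivot-on-btf_id style could look roughly like
the sketch below. It is plain C rather than BPF, the IDs and all
names are hypothetical, and it assumes btf_id is the last field of
the metadata area (right before the packet data), as the cover
letter describes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical IDs; real ones are assigned by the kernel at registration. */
#define BTF_ID_LAYOUT_A 44u
#define BTF_ID_LAYOUT_B 45u

enum hints_layout { LAYOUT_UNKNOWN, LAYOUT_A, LAYOUT_B };

/*
 * Read the trailing btf_id from a metadata area that ends at `data`.
 * Returns 0 when the area is too small to hold one.
 */
static uint32_t hints_btf_id(const uint8_t *data_meta, const uint8_t *data)
{
	uint32_t id;

	assert(data_meta != NULL && data != NULL && data_meta <= data);
	if ((size_t)(data - data_meta) < sizeof(id))
		return 0;
	memcpy(&id, data - sizeof(id), sizeof(id));
	return id;
}

/* Map a btf_id to a known layout instead of an if/else chain. */
static enum hints_layout hints_classify(uint32_t btf_id)
{
	switch (btf_id) {
	case BTF_ID_LAYOUT_A: return LAYOUT_A;
	case BTF_ID_LAYOUT_B: return LAYOUT_B;
	default:              return LAYOUT_UNKNOWN;
	}
}
```

Note that nothing here needs the ID to be handed down at prog setup
time; the consumer only needs to know which IDs it understands.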


>  * no if-else ladders or loops when preparing metadata, only
>    a couple of paths to satisfy generic needs (first of all)
>    and some sort of extended structure if needed;

Not sure I follow. I assume here the "hints" are put into the
metadata as a first iteration. If this is correct I expect
with BTF info and normal CO-RE logic nothing new should be needed.

If you want to put the metadata in a page somewhere else then
you need some way to patch code to point at that. A map would
work here. But, I think best approach is to first use the
metadata space.

>  * the first two users of the feature will be veth and cpumap.

Sure, I think it's sufficient to have a BPF program use the
hints. Getting the connection into veth/cpumap seems like
an extra step 2, after a base driver supports putting hints in
the metadata block.

> 
> This greatly simplifies the code and the hotpath.

Sure, I suspect that for a first iteration "all" that's needed is
for the driver/hardware to populate the metadata with hints and
expose a BTF file that the BPF program can use.

> 
> It was decided by Jesper, Toke, Lorenzo, Michal, me, and several
> more attendees (sorree if I forgot someone).

Great.

> 
> > > You could just write to me and request write perms on our open
> > > GitHub repo (which was mentioned here several times) for Hints
> > > to do things if not together, then in one place at least.
> > > I'll be off for two weeks starting next Monday, Michal could get
> > > you into things if you decide to join after that point
> > > (if at all).
> > 
> > I'll review code that's posted on the list. Please
> > do the same or give us a _reason_ to skip it. It has a nice commit
> > message that on its face looks like a reasonable starting point
> > even if I have a few issues with the approach in a couple of spots.
> 
> There's almost nothing for me to review, since I have already seen
> most of this code while we were working on it, and it's available
> in my open GitHub repo.

OK. It looks to me that a feature flag to enable/disable
hints is going to be useful? If this has overhead in the
driver to do the copy or overhead on the bus to push the
data around then we want to turn it on/off. An ethtool feature
flag would be sufficient.

> I mean, AF_XDP parts are new, that's for sure, but firstly we need
> to stop {doing,rushing} things somewhere in the closets. For now,
> we have THREE almost identical implementations of XDP Hints even
> inside Intel, and they were born just because everyone was doing
> something without any discussion or whatever, and I see no good
> in such a fragmentation.
> At least, that was the reason why XDP Hints mailing list and
> XDP Hints workshop topic were created.

My only objection is that off-list discussions need to land eventually
on the list as well. If someone submits a series of patches we
can't/shouldn't tell them it's already been decided in some other
private discussion without providing the conclusions of that
discussion and giving the sender a chance to debate it. I think the
above summary in the initial reply would have been sufficient and I
wouldn't have even commented on it. And with the summary given
now, it's clear to me as well what's being worked on.

Thanks,
John

> 
> For sure, anyone is free to do whatever he wants, but I believe
> letting us firstly finish the things discussed and established
> and then starting to expand them to support AF_XDP and whatnot
> can prevent wasting a lot of everyone's time and resources, and
> keeping on reinventing the wheel again and again doesn't help
> there. At all.
> 
> > Time to get coffee on my side.
> 
> I'm a tea man[iac], meh.
> 
> > Thanks,
> > John
> 
> Thanks,
> Al
> 
> > >
> > > Thanks,
> > > Al



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [[RFC xdp-hints] 13/16] libbpf: Helpers to access XDP frame metadata
  2021-08-03  1:03 ` [[RFC xdp-hints] 13/16] libbpf: Helpers to access XDP frame metadata Ederson de Souza
@ 2021-08-06 22:59   ` Andrii Nakryiko
  2021-08-19 11:47     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Nakryiko @ 2021-08-06 22:59 UTC (permalink / raw)
  To: Ederson de Souza
  Cc: xdp-hints, bpf, Toke Høiland-Jørgensen, Magnus Karlsson

On Mon, Aug 2, 2021 at 6:04 PM Ederson de Souza
<ederson.desouza@intel.com> wrote:
>
> Two new pairs of helpers: `xsk_umem__adjust_prod_data` and
> `xsk_umem__adjust_prod_data_meta` for data that is being produced by the
> application - such as data that will be sent; and
> `xsk_umem__adjust_cons_data` and `xsk_umem__adjust_cons_data_meta`,
> for data being consumed - such as data obtained from the completion
> queue.
>
> Those functions should usually be used on data obtained via
> `xsk_umem__get_data`. Didn't change this function to avoid API breaks.
>
> Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
> ---

AF_XDP parts of libbpf are being moved into libxdp ([0]). We shouldn't
keep adding new APIs if we are actively working on deprecating and
removing existing functionality already. CC'ing Toke and Magnus for
the state of libxsk to libxdp migration.

  [0] https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp#using-af_xdp-sockets


>  tools/lib/bpf/libbpf.map |  4 ++++
>  tools/lib/bpf/xsk.c      | 26 ++++++++++++++++++++++++++
>  tools/lib/bpf/xsk.h      |  7 +++++++
>  3 files changed, 37 insertions(+)
>

[...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [[RFC xdp-hints] 16/16] samples/bpf: Show XDP hints usage
  2021-08-03  1:03 ` [[RFC xdp-hints] 16/16] samples/bpf: Show XDP hints usage Ederson de Souza
@ 2021-08-06 23:14   ` Andrii Nakryiko
  0 siblings, 0 replies; 25+ messages in thread
From: Andrii Nakryiko @ 2021-08-06 23:14 UTC (permalink / raw)
  To: Ederson de Souza; +Cc: xdp-hints, bpf

On Mon, Aug 2, 2021 at 6:05 PM Ederson de Souza
<ederson.desouza@intel.com> wrote:
>
> An example of how to retrieve XDP hints/metadata from an XDP frame. To
> get the xdp_hints struct, one can use:
>
> $ bpftool net xdp show
>   xdp:
>   enp6s0(2) md_btf_id(44) md_btf_enabled(0)
>
> To get the BTF id, and then:
>
> $ bpftool btf dump id 44 format c > btf.h
>
> But, in this example, to demonstrate BTF and CORE features, a simpler
> struct was defined, containing the only field used by the sample.
>
> A lowpoint is that it's not currently possible to use some CORE features
> from "samples/bpf" directory, as those samples are currently built
> without using "clang -target bpf". This way, it was not possible to use
> "bpf_core_field_exists" macro to check, in runtime, the presence of a
> given XDP hints field.
> ---

FYI, Kumar Kartikeya Dwivedi is adding vmlinux.h and CO-RE support to
samples/bpf in [0].

  [0] https://lore.kernel.org/bpf/20210728165552.435050-1-memxor@gmail.com/


>  samples/bpf/xdp_sample_pkts_kern.c | 21 +++++++++++++++++++++
>  samples/bpf/xdp_sample_pkts_user.c |  4 +++-
>  2 files changed, 24 insertions(+), 1 deletion(-)
>

[...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support
  2021-08-04 23:45       ` John Fastabend
@ 2021-08-13 22:04         ` Desouza, Ederson
  0 siblings, 0 replies; 25+ messages in thread
From: Desouza, Ederson @ 2021-08-13 22:04 UTC (permalink / raw)
  To: Lobakin, Alexandr, john.fastabend
  Cc: Kubiak, Marcin, xdp-hints, Swiatkowski, Michal, bpf

On Wed, 2021-08-04 at 16:45 -0700, John Fastabend wrote:
> Alexander Lobakin wrote:
> > From: John Fastabend <john.fastabend@gmail.com>
> > Date: Tue, 03 Aug 2021 08:23:36 -0700
> > 
> > > Alexander Lobakin wrote:
> > > > From: Ederson de Souza <ederson.desouza@intel.com>
> > > > Date: Mon,  2 Aug 2021 18:03:15 -0700
> > > > 
> > > > > While there's some work going on different aspects of the XDP hints, I'd like
> > > > > to present and ask for comments on this patch series.
> > > > > 
> > > > > XDP hints/metadata is a way for the driver to transmit information regarding a
> > > > > specific XDP frame along with the frame. Following current discussions and
> > > > > based on top of Saeed's early patches, this series provides the XDP hints with
> > > > > one (or two, depending on how you view it) use case: RX/TX timestamps for the
> > > > > igc driver.
> > > > > 
> > > > > Keeping with Saeed's patches, to enable XDP hints usage, one has to first
> > > > > enable it with bpftool like:
> > > > > 
> > > > >   bpftool net xdp set dev <iface> md_btf on
> > > > > 
> > > > > > From the driver perspective, support for XDP hints is achieved by:
> > > > > 
> > > > >  - Adding support for XDP_SETUP_MD_BTF operation, where it can register the BTF.
> > > > > 
> > > > >  - Adding support for XDP_QUERY_MD_BTF so user space can retrieve the BTF id.
> 
> It's still unclear to me why BTF id is being passed around.

So that, from user space, the application can retrieve the BTF layout
and get the correct offsets of the fields inside the metadata.
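
A minimal sketch of that user-space lookup, under stated assumptions: the
names `hint_field`, `hint_find` and `hint_read_u64` are hypothetical and
are not the helpers from patch 14. The descriptor table stands in for
whatever walking the BTF struct members would produce, and the hints
struct is assumed to end right where the packet data begins.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-field descriptor. In a real application these would be
 * filled in from the BTF struct members of the metadata BTF id. */
struct hint_field {
	const char *name;
	size_t offset;	/* byte offset from the start of the hints struct */
	size_t size;	/* field size in bytes */
};

/* Find a field descriptor by name; NULL means the layout lacks the field. */
static const struct hint_field *
hint_find(const struct hint_field *fields, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(fields[i].name, name) == 0)
			return &fields[i];
	return NULL;
}

/* Copy a u64 field out of the metadata area. Assumption: the hints struct
 * ends where the packet data starts, so it begins at data - struct_size.
 * Returns 0 on success, -1 if the field is missing, is not 8 bytes wide,
 * or would fall outside the struct. */
static int hint_read_u64(const uint8_t *data, size_t struct_size,
			 const struct hint_field *fields, size_t n,
			 const char *name, uint64_t *out)
{
	const struct hint_field *f = hint_find(fields, n, name);

	if (!f || f->size != sizeof(*out) || f->offset + f->size > struct_size)
		return -1;
	/* memcpy avoids unaligned loads from the frame buffer. */
	memcpy(out, data - struct_size + f->offset, sizeof(*out));
	return 0;
}
```

With a table like this, the application does not hard-code the driver's
struct layout: it asks for `rx_timestamp` by name and gets an error
instead of garbage when the driver's layout does not provide it.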

> 
> > > > > 
> > > > >  - Adding the relevant data to the metadata area of the XDP frame.
> > > > > 
> > > > > >     - One piece of this relevant data is the BTF id of the BTF in use.
> > > > > 
> > > > > In order to make use of the BPF CO-RE mechanism, this series makes the driver
> > > > > name of the struct for the XDP hints be called `xdp_hints___<driver_name>` (in
> > > > > this series, as I'm using igc driver, it becomes `xdp_hints___igc`). This
> > > > > should help BPF programs, as they can simply refer to the struct as `xdp_hints`.
> > > > > 
> > > > > A common issue is how to standardize the names of the fields in the BTF. Here,
> > > > > a series of macros is provided on the `include/net/xdp.h`, that goes by
> > > > > `XDP_GENERIC_` prefixes. In there, the `btf_id` field was added, that needs
> > > > > to be strategically positioned at the end of the struct. Also added are the
> > > > > `rx_timestamp` and  `tx_timestamp` fields, as I believe they're generic as
> > > > > well. The macros also provide `u32` and `u64` types. Besides, I also ended
> > > > > up adding a `valid_map` field. It should help whoever is using the XDP hints
> > > > > to be sure of what is valid in that hints. It also makes the driver life
> > > > > simple, as it just uses a single struct and validates fields as it fills
> > > > > them.
> > > > > 
> > > > > The BPF sample `xdp_sample_pkts` was modified to demonstrate the usage of XDP
> > > > > hints on BPF programs. It's a very simple example, but it shows some nice
> > > > > things about it. For instance, instead of getting the struct somehow before,
> > > > > it uses CO-RE to simply name the XDP hint field it's interested in and
> > > > > read it using `BPF_CORE_READ`. (I also tried to use `bpf_core_field_exists` to
> > > > > make it even more dynamic, but couldn't get to build it. I mention why in the
> > > > > example.)
> > > > > 
> > > > > Also, as much of my interest lies in the user space side, the one using
> > > > > AF_XDP, to support it a few additional things were done.
> > > > > 
> > > > > Firstly, a new "driver info" is provided, to be obtained via
> > > > > `ioctl(SIOCETHTOOL)`: "xdp_headroom". This is how much XDP headroom is
> > > > > required by the driver. While not really important for the RX path (as the
> > > > > driver already applies that headroom to the XDP frame), it's
> > > > > important for the TX path, as here, it's the application responsibility to
> > > > > factor in the XDP headroom area. (Note that the TX timestamp is obtained from
> > > > > the XDP frame of the transmitted packet, when that frame goes back to the
> > > > > completion queue.)
> > > > > 
> > > > > A series of helpers was also added to libbpf to help manage this headroom
> > > > > > area. They go by the prefix "xsk_umem__adjust_", to adjust consumer and
> > > > > producer data and metadata.
> > > > > 
> > > > > In order to read the XDP hints from the memory, another series of helpers was
> > > > > added. They read the BTF from the BTF id, and create a hashmap of the offsets
> > > > > and sizes of the fields, that is then used to actually retrieve the data.
> > > > > 
> > > > > I modified the "xdpsock" example to show the use of XDP hints on the AF_XDP
> > > > > world, along with the proposed API.
> > > > > 
> > > > > Finally, I know that Michal and Alexandr (and probably others that I don't
> > > > > know) are working in this same front. This RFC is not to race any other work,
> > > > > instead I hope it can help in the discussion of the best solution for the
> > > > > XDP hints and I truly think it brings value, specifically for the AF_XDP
> > > > > usages.
> > > > 
> > > > XDP Hints have been discussed on Netdev 0x15, and we kinda
> > > > established the optimal way for doing it. This RFC's approach
> > > > is no longer current.
> > > 
> > > It's great it was discussed, but you need to also summarize it
> > > on the list. Give us the conclusion, who came to it, and why
> > > it's better than this proposal or what's wrong with this proposal.
> > > Not everyone was in the discussion and here we
> > > have a concrete proposal _with_ code. You can't just out of hand
> > > throw it out based on a conference discussion.
> > 
> > The conclusion was:
> >  * no need to register BTF from drivers themselves, they can be
> >    obtained through sysfs interface;
> 
> This would put the BTF info in '/sys/kernel/btf/driver_name' correct?
> Likely good enough for static configs, but this will only work
> for static hardware? Dynamic changes would need another mechanism
> to learn BTF? My view on this is if you reprogram the hardware that
> tooling should also give you a BTF file so it's not a kernel
> problem. Think P4 compiler spits out a firmware image and BTF file.
> 
> >  * the verifier has nothing to do with Hints and stuff, BTF ID
> >    just gets passed along with the other XDP Prog setup flags;
> 
> It's not clear why the BTF ID needs to be passed at all. Can
> someone elaborate? What would XDP do with the ID?
> 
> I get on redirect you don't know the xdp hints layout from
> where the program came from? Do you plan to write programs
> like this,
> 
>   if (btf_id == 0xfoo) { do something }
>   else if (btf_id == 0xe00) { do other thing}
> 
> Other way is to sanitize meta data on input program. But,
> OK assume you want to pivot on btf_id.  What does passing
> the BTF ID down with the XDP prog setup have to do with the
> above? A driver can probably just get its own BTF_ID and
> no need to pass it down via XDP prog setup.
> 
> 
> >  * no if-else ladders or loops when preparing a metadata, only
> >    a couple of paths to satisfy generic needs (first of all)
> >    and some sort of extended structure if needed;
> 
> Not sure I follow. I assume here the "hints" are put into the
> metadata as a first iteration. If this is correct I expect
> with BTF info and normal CO-RE logic nothing new should be needed.
> 
> If you want to put the metadata in a page somewhere else then
> you need some way to patch code to point at that. A map would
> work here. But, I think best approach is to first use the
> metadata space.
> 
> >  * the first two users of the feature will be veth and cpumap.
> 
> Sure, I think it's sufficient to have a BPF program use the
> hints. Getting the connection into veth/cpumap seems like
> extra, step 2 after a base driver supports putting hints in
> the metadata block.
> 
> > 
> > This greatly simplifies the code and the hotpath.
> 
> Sure, I suspect for a first iteration "all" that's needed is a
> driver/hardware to populate the metadata with hints and
> expose a BTF file that the BPF program can use.
> 
> > 
> > It was decided by Jesper, Toke, Lorenzo, Michal, me, and several
> > more attendees (sorree if I forgot someone).
> 
> Great.
> 
> > 
> > > > You could just write to me and request write perms on our open
> > > > GitHub repo (which was mentioned here several times) for Hints
> > > > to do things if not together, then in one place at least.
> > > > I'll be off for two weeks since next Monday, Michal could get
> > > > you into things if you decide to join after that point
> > > > (if at all).
> > > 
> > > I'll review code that's posted on the list. Please
> > > do the same or give us a _reason_ to skip it. It has a nice commit
> > > message that on the face looks like a reasonable starting point
> > > even if I have a few issues with the approach in a couple spots.
> > 
> > There's almost nothing to review for me because I saw most of this
> > code because we were working on it, and it's available in my open
> > GitHub repo.
> 
> OK. It looks to me that a feature flag to enable/disable
> hints is going to be useful? If this has overhead in the
> driver to do the copy or overhead on the bus to push the
> data around then we want to turn it on/off. An ethtool feature
> flag would be sufficient.
> 
> > I mean, AF_XDP parts are new, that's for sure, but firstly we need
> > to stop {doing,rushing} things somewhere in the closets. For now,
> > we have THREE almost identical implementations of XDP Hints even
> > inside Intel, and they were born just because everyone was doing
> > something without any discussion or whatever, and I see no good
> > in such a fragmentation.
> > At least, that was the reason why XDP Hints mailing list and
> > XDP Hints workshop topic were created.
> 
> My only objection is, off-list discussions need to land eventually
> on the list as well. If someone submits a series of patches we
> can't/shouldn't tell them it's already been decided in some other
> private discussion without providing the conclusions of that
> discussion giving the sender a chance to debate it. I think above
> summary in the initial reply would have been sufficient and I
> wouldn't have even commented on it. And by giving the summary
> now it's clear to me as well what's being worked on.
> 
> Thanks,
> John
> 
> > 
> > For sure, anyone is free to do whatever he wants, but I believe
> > letting us firstly finish the things discussed and established
> > and then starting to expand them to support AF_XDP and whatnot
> > can prevent wasting a lot of everyone's time and resources, and
> > keeping on reinventing the wheel again and again doesn't help
> > there. At all.
> > 
> > > Time to get coffee on my side.
> > 
> > I'm a tea man[iac], meh.
> > 
> > > Thanks,
> > > John
> > 
> > Thanks,
> > Al
> > 
> > > > 
> > > > Thanks,
> > > > Al
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [[RFC xdp-hints] 13/16] libbpf: Helpers to access XDP frame metadata
  2021-08-06 22:59   ` Andrii Nakryiko
@ 2021-08-19 11:47     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 25+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-08-19 11:47 UTC (permalink / raw)
  To: Andrii Nakryiko, Ederson de Souza; +Cc: xdp-hints, bpf, Magnus Karlsson

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Mon, Aug 2, 2021 at 6:04 PM Ederson de Souza
> <ederson.desouza@intel.com> wrote:
>>
>> Two new pairs of helpers: `xsk_umem__adjust_prod_data` and
>> `xsk_umem__adjust_prod_data_meta` for data that is being produced by the
>> application - such as data that will be sent; and
>> `xsk_umem__adjust_cons_data` and `xsk_umem__adjust_cons_data_meta`,
>> for data being consumed - such as data obtained from the completion
>> queue.
>>
>> Those functions should usually be used on data obtained via
>> `xsk_umem__get_data`. Didn't change this function to avoid API breaks.
>>
>> Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
>> ---
>
> AF_XDP parts of libbpf are being moved into libxdp ([0]). We shouldn't
> keep adding new APIs if we are actively working on deprecating and
> removing existing functionality already. CC'ing Toke and Magnus for
> the state of libxsk to libxdp migration.
>
>   [0] https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp#using-af_xdp-sockets

The AF_XDP code is merged into libxdp and is fully functional with the
exception of Maciej's XDP program auto-detach feature which we need to
replicate in a different way in libxdp.

So as far as I'm concerned, we can just go ahead and accept patches for
AF_XDP in libxdp. Does anyone have any thoughts on a preferred workflow?
Having the libxdp patches completely separate in a Github PR seems like
it will be an annoying workflow, so what to do instead?

-Toke


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2021-08-19 11:47 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-03  1:03 [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 01/16] bpf: add btf register/unregister API Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 02/16] net/core: XDP metadata BTF netlink API Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 03/16] tools/bpf: Query XDP metadata BTF ID Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 04/16] tools/bpf: Add xdp set command for md btf Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 05/16] igc: Fix race condition in PTP Tx code Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 06/16] igc: Retrieve the TX timestamp directly (instead of in a interrupt) Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 07/16] igc: Add support for multiple in-flight TX timestamps Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 08/16] igc: Use irq safe locks for timestamping Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 09/16] net/xdp: Support for generic XDP hints Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 10/16] igc: XDP packet RX timestamp Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 11/16] igc: XDP packet TX timestamp Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 12/16] ethtool,igc: Add "xdp_headroom" driver info Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 13/16] libbpf: Helpers to access XDP frame metadata Ederson de Souza
2021-08-06 22:59   ` Andrii Nakryiko
2021-08-19 11:47     ` Toke Høiland-Jørgensen
2021-08-03  1:03 ` [[RFC xdp-hints] 14/16] libbpf: Helpers to access XDP hints based on BTF definitions Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 15/16] samples/bpf: XDP hints AF_XDP example Ederson de Souza
2021-08-03  1:03 ` [[RFC xdp-hints] 16/16] samples/bpf: Show XDP hints usage Ederson de Souza
2021-08-06 23:14   ` Andrii Nakryiko
2021-08-03  9:12 ` [[RFC xdp-hints] 00/16] XDP hints and AF_XDP support Alexander Lobakin
2021-08-03 15:23   ` John Fastabend
2021-08-04 15:15     ` Alexander Lobakin
2021-08-04 23:45       ` John Fastabend
2021-08-13 22:04         ` Desouza, Ederson
