BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk
@ 2021-02-15 15:46 Maciej Fijalkowski
  2021-02-15 15:46 ` [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link Maciej Fijalkowski
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-15 15:46 UTC (permalink / raw)
  To: daniel, ast, bpf, netdev
  Cc: andrii, toke, bjorn.topel, magnus.karlsson, ciara.loftus,
	Maciej Fijalkowski

Hi,

This set is another approach towards addressing the below issue:

// load xdp prog and xskmap and add entry to xskmap at idx 10
$ sudo ./xdpsock -i ens801f0 -t -q 10

// add entry to xskmap at idx 11
$ sudo ./xdpsock -i ens801f0 -t -q 11

terminate one of the processes and another one is unable to work due to
the fact that the XDP prog was unloaded from interface.

Previous attempt was, to put it mildly, a bit broken, as there was no
synchronization between updates to additional map, as Bjorn pointed out.
See https://lore.kernel.org/netdev/20190603131907.13395-5-maciej.fijalkowski@intel.com/

In the meantime bpf_link was introduced and it seems that it can address
the issue of refcounting the XDP prog on interface. More info on commit
messages.

Thanks.

Maciej Fijalkowski (3):
  libbpf: xsk: use bpf_link
  libbpf: clear map_info before each bpf_obj_get_info_by_fd
  samples: bpf: do not unload prog within xdpsock

 samples/bpf/xdpsock_user.c |  55 ++++----------
 tools/lib/bpf/xsk.c        | 147 +++++++++++++++++++++++++++++++------
 2 files changed, 139 insertions(+), 63 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 15:46 [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Maciej Fijalkowski
@ 2021-02-15 15:46 ` Maciej Fijalkowski
  2021-02-15 17:07   ` Toke Høiland-Jørgensen
  2021-02-15 20:49   ` John Fastabend
  2021-02-15 15:46 ` [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd Maciej Fijalkowski
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-15 15:46 UTC (permalink / raw)
  To: daniel, ast, bpf, netdev
  Cc: andrii, toke, bjorn.topel, magnus.karlsson, ciara.loftus,
	Maciej Fijalkowski

Currently, if there are multiple xdpsock instances running on a single
interface and in case one of the instances is terminated, the rest of
them are left in an inoperable state due to the fact of unloaded XDP
prog from interface.

To address that, step away from setting bpf prog in favour of bpf_link.
This means that refcounting of BPF resources will be done automatically
by bpf_link itself.

When setting up BPF resources during xsk socket creation, check whether
bpf_link for a given ifindex already exists via set of calls to
bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
and comparing the ifindexes from bpf_link and xsk socket.

If there's no bpf_link yet, create one for a given XDP prog and unload
explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.

If bpf_link is already at a given ifindex and underlying program is not
AF-XDP one, bail out or update the bpf_link's prog given the presence of
XDP_FLAGS_UPDATE_IF_NOEXIST.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 tools/lib/bpf/xsk.c | 143 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 122 insertions(+), 21 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 20500fb1f17e..5911868efa43 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -28,6 +28,7 @@
 #include <sys/mman.h>
 #include <sys/socket.h>
 #include <sys/types.h>
+#include <linux/if_link.h>
 
 #include "bpf.h"
 #include "libbpf.h"
@@ -70,6 +71,7 @@ struct xsk_ctx {
 	int ifindex;
 	struct list_head list;
 	int prog_fd;
+	int link_fd;
 	int xsks_map_fd;
 	char ifname[IFNAMSIZ];
 };
@@ -409,7 +411,7 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
 	static const int log_buf_size = 16 * 1024;
 	struct xsk_ctx *ctx = xsk->ctx;
 	char log_buf[log_buf_size];
-	int err, prog_fd;
+	int prog_fd;
 
 	/* This is the fallback C-program:
 	 * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
@@ -499,14 +501,47 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
 		return prog_fd;
 	}
 
-	err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, prog_fd,
-				  xsk->config.xdp_flags);
-	if (err) {
-		close(prog_fd);
-		return err;
+	ctx->prog_fd = prog_fd;
+	return 0;
+}
+
+static int xsk_create_bpf_link(struct xsk_socket *xsk)
+{
+	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
+	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
+	 */
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
+			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
+	struct xsk_ctx *ctx = xsk->ctx;
+	__u32 prog_id;
+	int link_fd;
+	int err;
+
+	/* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
+	 * so that bpf_link can be attached
+	 */
+	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
+		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
+		if (err) {
+			pr_warn("getting XDP prog id failed\n");
+			return err;
+		}
+		if (prog_id) {
+			err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
+			if (err < 0) {
+				pr_warn("detaching XDP prog failed\n");
+				return err;
+			}
+		}
 	}
 
-	ctx->prog_fd = prog_fd;
+	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
+	if (link_fd < 0) {
+		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
+		return link_fd;
+	}
+
+	ctx->link_fd = link_fd;
 	return 0;
 }
 
@@ -625,8 +660,32 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
 	}
 
 	err = 0;
-	if (ctx->xsks_map_fd == -1)
+	if (ctx->xsks_map_fd != -1)
+		goto out_map_ids;
+
+	/* if prog load is forced and we found bpf_link on a given ifindex BUT
+	 * the underlying XDP prog is not an AF-XDP one, close old prog's fd,
+	 * load AF-XDP and update existing bpf_link
+	 */
+	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
+		close(ctx->prog_fd);
+		err = xsk_create_bpf_maps(xsk);
+		if (err)
+			goto out_map_ids;
+
+		err = xsk_load_xdp_prog(xsk);
+		if (err) {
+			xsk_delete_bpf_maps(xsk);
+			goto out_map_ids;
+		}
+
+		/* if update failed, caller will close fds */
+		err = bpf_link_update(ctx->link_fd, ctx->prog_fd, 0);
+		if (err)
+			pr_warn("bpf_link_update failed: %s\n", strerror(errno));
+	} else {
 		err = -ENOENT;
+	}
 
 out_map_ids:
 	free(map_ids);
@@ -666,6 +725,49 @@ static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
 	return 0;
 }
 
+static int xsk_link_lookup(struct xsk_ctx *ctx, __u32 *prog_id)
+{
+	__u32 link_len = sizeof(struct bpf_link_info);
+	struct bpf_link_info link_info;
+	__u32 id = 0;
+	int err;
+	int fd;
+
+	while (true) {
+		err = bpf_link_get_next_id(id, &id);
+		if (err) {
+			if (errno == ENOENT)
+				break;
+			pr_warn("can't get next link: %s\n", strerror(errno));
+			break;
+		}
+
+		fd = bpf_link_get_fd_by_id(id);
+		if (fd < 0) {
+			if (errno == ENOENT)
+				continue;
+			pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
+			break;
+		}
+
+		memset(&link_info, 0, link_len);
+		err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
+		if (err) {
+			pr_warn("can't get link info: %s\n", strerror(errno));
+			close(fd);
+			break;
+		}
+		if (link_info.xdp.ifindex == ctx->ifindex) {
+			ctx->link_fd = fd;
+			*prog_id = link_info.prog_id;
+			break;
+		}
+		close(fd);
+	}
+
+	return errno == ENOENT ? 0 : err;
+}
+
 static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 				int *xsks_map_fd)
 {
@@ -674,8 +776,7 @@ static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 	__u32 prog_id = 0;
 	int err;
 
-	err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id,
-				  xsk->config.xdp_flags);
+	err = xsk_link_lookup(ctx, &prog_id);
 	if (err)
 		return err;
 
@@ -685,9 +786,12 @@ static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 			return err;
 
 		err = xsk_load_xdp_prog(xsk);
-		if (err) {
+		if (err)
 			goto err_load_xdp_prog;
-		}
+
+		err = xsk_create_bpf_link(xsk);
+		if (err)
+			goto err_create_bpf_link;
 	} else {
 		ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
 		if (ctx->prog_fd < 0)
@@ -695,20 +799,15 @@ static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 		err = xsk_lookup_bpf_maps(xsk);
 		if (err) {
 			close(ctx->prog_fd);
+			close(ctx->link_fd);
 			return err;
 		}
 	}
 
 	if (xsk->rx) {
 		err = xsk_set_bpf_maps(xsk);
-		if (err) {
-			if (!prog_id) {
-				goto err_set_bpf_maps;
-			} else {
-				close(ctx->prog_fd);
-				return err;
-			}
-		}
+		if (err)
+			goto err_set_bpf_maps;
 	}
 	if (xsks_map_fd)
 		*xsks_map_fd = ctx->xsks_map_fd;
@@ -716,8 +815,9 @@ static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 	return 0;
 
 err_set_bpf_maps:
+	close(ctx->link_fd);
+err_create_bpf_link:
 	close(ctx->prog_fd);
-	bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
 err_load_xdp_prog:
 	xsk_delete_bpf_maps(xsk);
 
@@ -1053,6 +1153,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
 	if (ctx->prog_fd != -1) {
 		xsk_delete_bpf_maps(xsk);
 		close(ctx->prog_fd);
+		close(ctx->link_fd);
 	}
 
 	err = xsk_get_mmap_offsets(xsk->fd, &off);
-- 
2.20.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd
  2021-02-15 15:46 [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Maciej Fijalkowski
  2021-02-15 15:46 ` [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link Maciej Fijalkowski
@ 2021-02-15 15:46 ` Maciej Fijalkowski
  2021-02-15 20:33   ` John Fastabend
  2021-02-15 15:46 ` [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock Maciej Fijalkowski
  2021-02-15 16:07 ` [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Björn Töpel
  3 siblings, 1 reply; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-15 15:46 UTC (permalink / raw)
  To: daniel, ast, bpf, netdev
  Cc: andrii, toke, bjorn.topel, magnus.karlsson, ciara.loftus,
	Maciej Fijalkowski

xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a
reference to XSKMAP. BPF prog can include insns that work on various BPF
maps and this is covered by iterating through map_ids.

The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling
needs to be cleared at each iteration, so that it doesn't any outdated
fields and that is currently missing in the function of interest.

To fix that, zero-init map_info via memset before each
bpf_obj_get_info_by_fd call.

Also, since the area of this code is touched, in general strcmp is
considered harmful, so let's convert it to strncmp and provide the
length of the array name that we're looking for.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 tools/lib/bpf/xsk.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 5911868efa43..fb259c0bba93 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -616,6 +616,7 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
 	__u32 i, *map_ids, num_maps, prog_len = sizeof(struct bpf_prog_info);
 	__u32 map_len = sizeof(struct bpf_map_info);
 	struct bpf_prog_info prog_info = {};
+	const char *map_name = "xsks_map";
 	struct xsk_ctx *ctx = xsk->ctx;
 	struct bpf_map_info map_info;
 	int fd, err;
@@ -645,13 +646,14 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
 		if (fd < 0)
 			continue;
 
+		memset(&map_info, 0, map_len);
 		err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len);
 		if (err) {
 			close(fd);
 			continue;
 		}
 
-		if (!strcmp(map_info.name, "xsks_map")) {
+		if (!strncmp(map_info.name, map_name, strlen(map_name))) {
 			ctx->xsks_map_fd = fd;
 			continue;
 		}
-- 
2.20.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock
  2021-02-15 15:46 [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Maciej Fijalkowski
  2021-02-15 15:46 ` [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link Maciej Fijalkowski
  2021-02-15 15:46 ` [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd Maciej Fijalkowski
@ 2021-02-15 15:46 ` Maciej Fijalkowski
  2021-02-15 20:24   ` John Fastabend
  2021-02-15 16:07 ` [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Björn Töpel
  3 siblings, 1 reply; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-15 15:46 UTC (permalink / raw)
  To: daniel, ast, bpf, netdev
  Cc: andrii, toke, bjorn.topel, magnus.karlsson, ciara.loftus,
	Maciej Fijalkowski

With the introduction of bpf_link in xsk's libbpf part, there's no
further need for explicit unload of prog on xdpsock's termination. When
process dies, the bpf_link's refcount will be decremented and resources
will be unloaded/freed under the hood in case when there are no more
active users.

While at it, don't dump stats on error path.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 samples/bpf/xdpsock_user.c | 55 ++++++++++----------------------------
 1 file changed, 14 insertions(+), 41 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index db0cb73513a5..96246313e342 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -96,7 +96,6 @@ static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
 static int opt_timeout = 1000;
 static bool opt_need_wakeup = true;
 static u32 opt_num_xsks = 1;
-static u32 prog_id;
 static bool opt_busy_poll;
 static bool opt_reduced_cap;
 
@@ -462,59 +461,37 @@ static void *poller(void *arg)
 	return NULL;
 }
 
-static void remove_xdp_program(void)
+static void int_exit(int sig)
 {
-	u32 curr_prog_id = 0;
-	int cmd = CLOSE_CONN;
-
-	if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) {
-		printf("bpf_get_link_xdp_id failed\n");
-		exit(EXIT_FAILURE);
-	}
-	if (prog_id == curr_prog_id)
-		bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags);
-	else if (!curr_prog_id)
-		printf("couldn't find a prog id on a given interface\n");
-	else
-		printf("program on interface changed, not removing\n");
-
-	if (opt_reduced_cap) {
-		if (write(sock, &cmd, sizeof(int)) < 0) {
-			fprintf(stderr, "Error writing into stream socket: %s", strerror(errno));
-			exit(EXIT_FAILURE);
-		}
-	}
+	benchmark_done = true;
 }
 
-static void int_exit(int sig)
+static void __exit_with_error(int error, const char *file, const char *func,
+			      int line)
 {
-	benchmark_done = true;
+	fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func,
+		line, error, strerror(error));
+	exit(EXIT_FAILURE);
 }
 
+#define exit_with_error(error) __exit_with_error(error, __FILE__, __func__, __LINE__)
+
 static void xdpsock_cleanup(void)
 {
 	struct xsk_umem *umem = xsks[0]->umem->umem;
-	int i;
+	int i, cmd = CLOSE_CONN;
 
 	dump_stats();
 	for (i = 0; i < num_socks; i++)
 		xsk_socket__delete(xsks[i]->xsk);
 	(void)xsk_umem__delete(umem);
-	remove_xdp_program();
-}
 
-static void __exit_with_error(int error, const char *file, const char *func,
-			      int line)
-{
-	fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func,
-		line, error, strerror(error));
-	dump_stats();
-	remove_xdp_program();
-	exit(EXIT_FAILURE);
+	if (opt_reduced_cap) {
+		if (write(sock, &cmd, sizeof(int)) < 0)
+			exit_with_error(errno);
+	}
 }
 
-#define exit_with_error(error) __exit_with_error(error, __FILE__, __func__, \
-						 __LINE__)
 static void swap_mac_addresses(void *data)
 {
 	struct ether_header *eth = (struct ether_header *)data;
@@ -880,10 +857,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem,
 	if (ret)
 		exit_with_error(-ret);
 
-	ret = bpf_get_link_xdp_id(opt_ifindex, &prog_id, opt_xdp_flags);
-	if (ret)
-		exit_with_error(-ret);
-
 	xsk->app_stats.rx_empty_polls = 0;
 	xsk->app_stats.fill_fail_polls = 0;
 	xsk->app_stats.copy_tx_sendtos = 0;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk
  2021-02-15 15:46 [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Maciej Fijalkowski
                   ` (2 preceding siblings ...)
  2021-02-15 15:46 ` [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock Maciej Fijalkowski
@ 2021-02-15 16:07 ` Björn Töpel
  3 siblings, 0 replies; 37+ messages in thread
From: Björn Töpel @ 2021-02-15 16:07 UTC (permalink / raw)
  To: Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, toke, magnus.karlsson, ciara.loftus

On 2021-02-15 16:46, Maciej Fijalkowski wrote:
> Hi,
> 
> This set is another approach towards addressing the below issue:
> 
> // load xdp prog and xskmap and add entry to xskmap at idx 10
> $ sudo ./xdpsock -i ens801f0 -t -q 10
> 
> // add entry to xskmap at idx 11
> $ sudo ./xdpsock -i ens801f0 -t -q 11
> 
> terminate one of the processes and another one is unable to work due to
> the fact that the XDP prog was unloaded from interface.
> 
> Previous attempt was, to put it mildly, a bit broken, as there was no
> synchronization between updates to additional map, as Bjorn pointed out.
> See https://lore.kernel.org/netdev/20190603131907.13395-5-maciej.fijalkowski@intel.com/
> 
> In the meantime bpf_link was introduced and it seems that it can address
> the issue of refcounting the XDP prog on interface. More info on commit
> messages.
>

For the series:

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>

Finally, bpf_link/scoped XDP support! Thanks a lot!


Björn



> Thanks.
> 
> Maciej Fijalkowski (3):
>    libbpf: xsk: use bpf_link
>    libbpf: clear map_info before each bpf_obj_get_info_by_fd
>    samples: bpf: do not unload prog within xdpsock
> 
>   samples/bpf/xdpsock_user.c |  55 ++++----------
>   tools/lib/bpf/xsk.c        | 147 +++++++++++++++++++++++++++++++------
>   2 files changed, 139 insertions(+), 63 deletions(-)
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 15:46 ` [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link Maciej Fijalkowski
@ 2021-02-15 17:07   ` Toke Høiland-Jørgensen
  2021-02-15 17:38     ` Björn Töpel
  2021-02-15 20:49   ` John Fastabend
  1 sibling, 1 reply; 37+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-15 17:07 UTC (permalink / raw)
  To: Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, bjorn.topel, magnus.karlsson, ciara.loftus, Maciej Fijalkowski

Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:

> Currently, if there are multiple xdpsock instances running on a single
> interface and in case one of the instances is terminated, the rest of
> them are left in an inoperable state due to the fact of unloaded XDP
> prog from interface.
>
> To address that, step away from setting bpf prog in favour of bpf_link.
> This means that refcounting of BPF resources will be done automatically
> by bpf_link itself.
>
> When setting up BPF resources during xsk socket creation, check whether
> bpf_link for a given ifindex already exists via set of calls to
> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> and comparing the ifindexes from bpf_link and xsk socket.

One consideration here is that bpf_link_get_fd_by_id() is a privileged
operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
of making AF_XDP privileged as well. Is that the intention?

Another is that the AF_XDP code is in the process of moving to libxdp
(see in-progress PR [0]), and this approach won't carry over as-is to
that model, because libxdp has to pin the bpf_link fds.

However, in libxdp we can solve the original problem in a different way,
and in fact I already suggested to Magnus that we should do this (see
[1]); so one way forward could be to address it during the merge in
libxdp? It should be possible to address the original issue (two
instances of xdpsock breaking each other when they exit), but
applications will still need to do an explicit unload operation before
exiting (i.e., the automatic detach on bpf_link fd closure will take
more work, and likely require extending the bpf_link kernel support)...

-Toke

[0] https://github.com/xdp-project/xdp-tools/pull/92
[1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 17:07   ` Toke Høiland-Jørgensen
@ 2021-02-15 17:38     ` Björn Töpel
  2021-02-15 19:35       ` Toke Høiland-Jørgensen
  2021-02-15 20:22       ` John Fastabend
  0 siblings, 2 replies; 37+ messages in thread
From: Björn Töpel @ 2021-02-15 17:38 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Maciej Fijalkowski, daniel,
	ast, bpf, netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote:
> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> 
>> Currently, if there are multiple xdpsock instances running on a single
>> interface and in case one of the instances is terminated, the rest of
>> them are left in an inoperable state due to the fact of unloaded XDP
>> prog from interface.
>>
>> To address that, step away from setting bpf prog in favour of bpf_link.
>> This means that refcounting of BPF resources will be done automatically
>> by bpf_link itself.
>>
>> When setting up BPF resources during xsk socket creation, check whether
>> bpf_link for a given ifindex already exists via set of calls to
>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
>> and comparing the ifindexes from bpf_link and xsk socket.
> 
> One consideration here is that bpf_link_get_fd_by_id() is a privileged
> operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
> of making AF_XDP privileged as well. Is that the intention?
>

We're already using, e.g., bpf_map_get_fd_by_id() which has that
as well. So we're assuming that for XDP setup already!

> Another is that the AF_XDP code is in the process of moving to libxdp
> (see in-progress PR [0]), and this approach won't carry over as-is to
> that model, because libxdp has to pin the bpf_link fds.
>

I was assuming there were two modes of operations for AF_XDP in libxdp.
One which is with the multi-program support (which AFAIK is why the
pinning is required), and one "like the current libbpf" one. For the
latter Maciej's series would be a good fit, no?

> However, in libxdp we can solve the original problem in a different way,
> and in fact I already suggested to Magnus that we should do this (see
> [1]); so one way forward could be to address it during the merge in
> libxdp? It should be possible to address the original issue (two
> instances of xdpsock breaking each other when they exit), but
> applications will still need to do an explicit unload operation before
> exiting (i.e., the automatic detach on bpf_link fd closure will take
> more work, and likely require extending the bpf_link kernel support)...
>

I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
we're months ahead, then I'd really like to see this in libbpf until the
merge. However, I'll leave that for Magnus/you to decide!

Bottom line; I'd *really* like bpf_link behavior (process scoped) for
AF_XDP sooner than later! ;-)


Thanks for the input!
Björn


> -Toke
> 
> [0] https://github.com/xdp-project/xdp-tools/pull/92
> [1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 17:38     ` Björn Töpel
@ 2021-02-15 19:35       ` Toke Høiland-Jørgensen
  2021-02-16  2:01         ` Maciej Fijalkowski
  2021-02-15 20:22       ` John Fastabend
  1 sibling, 1 reply; 37+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-15 19:35 UTC (permalink / raw)
  To: Björn Töpel, Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

Björn Töpel <bjorn.topel@intel.com> writes:

> On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote:
>> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
>> 
>>> Currently, if there are multiple xdpsock instances running on a single
>>> interface and in case one of the instances is terminated, the rest of
>>> them are left in an inoperable state due to the fact of unloaded XDP
>>> prog from interface.
>>>
>>> To address that, step away from setting bpf prog in favour of bpf_link.
>>> This means that refcounting of BPF resources will be done automatically
>>> by bpf_link itself.
>>>
>>> When setting up BPF resources during xsk socket creation, check whether
>>> bpf_link for a given ifindex already exists via set of calls to
>>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
>>> and comparing the ifindexes from bpf_link and xsk socket.
>> 
>> One consideration here is that bpf_link_get_fd_by_id() is a privileged
>> operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
>> of making AF_XDP privileged as well. Is that the intention?
>>
>
> We're already using, e.g., bpf_map_get_fd_by_id() which has that
> as well. So we're assuming that for XDP setup already!

Ah, right, didn't realise that one is CAP_SYS_ADMIN as well; I
remembered this as being specific to the bpf_link operation.

>> Another is that the AF_XDP code is in the process of moving to libxdp
>> (see in-progress PR [0]), and this approach won't carry over as-is to
>> that model, because libxdp has to pin the bpf_link fds.
>>
>
> I was assuming there were two modes of operations for AF_XDP in libxdp.
> One which is with the multi-program support (which AFAIK is why the
> pinning is required), and one "like the current libbpf" one. For the
> latter Maciej's series would be a good fit, no?

We haven't added an explicit mode switch for now; libxdp will fall back
to regular interface attach if the kernel doesn't support the needed
features for multi-attach, but if it's possible to just have libxdp
transparently do the right thing I'd much prefer that. So we're still
exploring that (part of which is that Magnus has promised to run some
performance tests to see if there's a difference).

However, even if there's an explicit mode switch I'd like to avoid
different *semantics* between the two modes if possible, to keep the two
as compatible as possible. And since we can't currently do "auto-detach
on bpf_link fd close" when using multi-prog, introducing this now would
lead to just such a semantic difference. So my preference would be to do
it differently... :)

>> However, in libxdp we can solve the original problem in a different way,
>> and in fact I already suggested to Magnus that we should do this (see
>> [1]); so one way forward could be to address it during the merge in
>> libxdp? It should be possible to address the original issue (two
>> instances of xdpsock breaking each other when they exit), but
>> applications will still need to do an explicit unload operation before
>> exiting (i.e., the automatic detach on bpf_link fd closure will take
>> more work, and likely require extending the bpf_link kernel support)...
>>
>
> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
> we're months ahead, then I'd really like to see this in libbpf until the
> merge. However, I'll leave that for Magnus/you to decide!

Well, as far as libxdp support goes, the PR I linked is pretty close to
being mergeable. One of the few outstanding issues is whether we should
solve just this issue before merging, actually :)

Not sure exactly which timeframe Andrii is envisioning for libbpf 1.0,
but last I heard he'll announce something next week.

> Bottom line; I'd *really* like bpf_link behavior (process scoped) for
> AF_XDP sooner than later! ;-)

Totally agree that we should solve the multi-process coexistence
problem! And as I said, I think we can do so in libxdp by using the same
synchronisation mechanism we use for setting up the multi-prog
dispatcher. So it doesn't *have* to hold things up :)

-Toke


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 17:38     ` Björn Töpel
  2021-02-15 19:35       ` Toke Høiland-Jørgensen
@ 2021-02-15 20:22       ` John Fastabend
  2021-02-15 21:38         ` Toke Høiland-Jørgensen
  2021-02-16  2:10         ` Maciej Fijalkowski
  1 sibling, 2 replies; 37+ messages in thread
From: John Fastabend @ 2021-02-15 20:22 UTC (permalink / raw)
  To: Björn Töpel, Toke Høiland-Jørgensen,
	Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

Björn Töpel wrote:
> On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote:
> > Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> > 
> >> Currently, if there are multiple xdpsock instances running on a single
> >> interface and in case one of the instances is terminated, the rest of
> >> them are left in an inoperable state due to the fact of unloaded XDP
> >> prog from interface.

I'm a bit confused by the above. This is only the case if the instance
that terminated is the one that loaded the XDP program and it also didn't
pin the program correct? If so lets make the commit message a bit more
clear about the exact case we are solving.

> >>
> >> To address that, step away from setting bpf prog in favour of bpf_link.
> >> This means that refcounting of BPF resources will be done automatically
> >> by bpf_link itself.

+1

> >>
> >> When setting up BPF resources during xsk socket creation, check whether
> >> bpf_link for a given ifindex already exists via set of calls to
> >> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> >> and comparing the ifindexes from bpf_link and xsk socket.
> > 
> > One consideration here is that bpf_link_get_fd_by_id() is a privileged
> > operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
> > of making AF_XDP privileged as well. Is that the intention?
> >
> 
> We're already using, e.g., bpf_map_get_fd_by_id() which has that
> as well. So we're assuming that for XDP setup already!
> 
> > Another is that the AF_XDP code is in the process of moving to libxdp
> > (see in-progress PR [0]), and this approach won't carry over as-is to
> > that model, because libxdp has to pin the bpf_link fds.
> >
> 
> I was assuming there were two modes of operations for AF_XDP in libxdp.
> One which is with the multi-program support (which AFAIK is why the
> pinning is required), and one "like the current libbpf" one. For the
> latter Maciej's series would be a good fit, no?
> 
> > However, in libxdp we can solve the original problem in a different way,
> > and in fact I already suggested to Magnus that we should do this (see
> > [1]); so one way forward could be to address it during the merge in
> > libxdp? It should be possible to address the original issue (two
> > instances of xdpsock breaking each other when they exit), but
> > applications will still need to do an explicit unload operation before
> > exiting (i.e., the automatic detach on bpf_link fd closure will take
> > more work, and likely require extending the bpf_link kernel support)...
> >
> 
> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
> we're months ahead, then I'd really like to see this in libbpf until the
> merge. However, I'll leave that for Magnus/you to decide!

Did I miss some thread? What does this mean libbpf 1.0/libxdp merge? From
my side libbpf should support the basic operations: load, attach, pin,
and link for all my BPF objects. I view libxdp as providing 'extra'
goodness on top of that. Everyone agree?

> 
> Bottom line; I'd *really* like bpf_link behavior (process scoped) for
> AF_XDP sooner than later! ;-)

Because I use libbpf as my base management for BPF objects I want it
to support the basic ops for all objects so link ops should land there.

> 
> 
> Thanks for the input!
> Björn
> 
> 
> > -Toke
> > 
> > [0] https://github.com/xdp-project/xdp-tools/pull/92
> > [1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719
> > 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock
  2021-02-15 15:46 ` [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock Maciej Fijalkowski
@ 2021-02-15 20:24   ` John Fastabend
  2021-02-16  9:22     ` Björn Töpel
  0 siblings, 1 reply; 37+ messages in thread
From: John Fastabend @ 2021-02-15 20:24 UTC (permalink / raw)
  To: Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, toke, bjorn.topel, magnus.karlsson, ciara.loftus,
	Maciej Fijalkowski

Maciej Fijalkowski wrote:
> With the introduction of bpf_link in xsk's libbpf part, there's no
> further need for explicit unload of prog on xdpsock's termination. When
> process dies, the bpf_link's refcount will be decremented and resources
> will be unloaded/freed under the hood in case when there are no more
> active users.
> 
> While at it, don't dump stats on error path.
> 
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---

Can we also use it from selftests prog so we have a test that is run there
using this? Looks like xdpxceiver.c could do the link step as well?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd
  2021-02-15 15:46 ` [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd Maciej Fijalkowski
@ 2021-02-15 20:33   ` John Fastabend
  2021-02-16  2:42     ` Maciej Fijalkowski
  0 siblings, 1 reply; 37+ messages in thread
From: John Fastabend @ 2021-02-15 20:33 UTC (permalink / raw)
  To: Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, toke, bjorn.topel, magnus.karlsson, ciara.loftus,
	Maciej Fijalkowski

Maciej Fijalkowski wrote:
> xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a
> reference to XSKMAP. BPF prog can include insns that work on various BPF
> maps and this is covered by iterating through map_ids.
> 
> The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling
> needs to be cleared at each iteration, so that it doesn't any outdated
> fields and that is currently missing in the function of interest.
> 
> To fix that, zero-init map_info via memset before each
> bpf_obj_get_info_by_fd call.
> 
> Also, since the area of this code is touched, in general strcmp is
> considered harmful, so let's convert it to strncmp and provide the
> length of the array name that we're looking for.
> 
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---

This is a bugfix independent of the link bits correct? Would be best
to send to bpf then. 

>  tools/lib/bpf/xsk.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index 5911868efa43..fb259c0bba93 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -616,6 +616,7 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
>  	__u32 i, *map_ids, num_maps, prog_len = sizeof(struct bpf_prog_info);
>  	__u32 map_len = sizeof(struct bpf_map_info);
>  	struct bpf_prog_info prog_info = {};
> +	const char *map_name = "xsks_map";
>  	struct xsk_ctx *ctx = xsk->ctx;
>  	struct bpf_map_info map_info;
>  	int fd, err;
> @@ -645,13 +646,14 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
>  		if (fd < 0)
>  			continue;
>  
> +		memset(&map_info, 0, map_len);
>  		err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len);
>  		if (err) {
>  			close(fd);
>  			continue;
>  		}
>  
> -		if (!strcmp(map_info.name, "xsks_map")) {
> +		if (!strncmp(map_info.name, map_name, strlen(map_name))) {
>  			ctx->xsks_map_fd = fd;
>  			continue;

Also just looking at this how is above not buggy? Should be a break instead
of continue? If we match another "xsks_map" here won't we stomp on xsks_map_fd
and leak a file descriptor?

>  		}
> -- 
> 2.20.1
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 15:46 ` [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link Maciej Fijalkowski
  2021-02-15 17:07   ` Toke Høiland-Jørgensen
@ 2021-02-15 20:49   ` John Fastabend
  2021-02-16  2:38     ` Maciej Fijalkowski
  2021-02-16  9:20     ` Björn Töpel
  1 sibling, 2 replies; 37+ messages in thread
From: John Fastabend @ 2021-02-15 20:49 UTC (permalink / raw)
  To: Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, toke, bjorn.topel, magnus.karlsson, ciara.loftus,
	Maciej Fijalkowski

Maciej Fijalkowski wrote:
> Currently, if there are multiple xdpsock instances running on a single
> interface and in case one of the instances is terminated, the rest of
> them are left in an inoperable state due to the fact of unloaded XDP
> prog from interface.
> 
> To address that, step away from setting bpf prog in favour of bpf_link.
> This means that refcounting of BPF resources will be done automatically
> by bpf_link itself.
> 
> When setting up BPF resources during xsk socket creation, check whether
> bpf_link for a given ifindex already exists via set of calls to
> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> and comparing the ifindexes from bpf_link and xsk socket.
> 
> If there's no bpf_link yet, create one for a given XDP prog and unload
> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> 
> If bpf_link is already at a given ifindex and underlying program is not
> AF-XDP one, bail out or update the bpf_link's prog given the presence of
> XDP_FLAGS_UPDATE_IF_NOEXIST.
> 
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>  tools/lib/bpf/xsk.c | 143 +++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 122 insertions(+), 21 deletions(-)

[...]

> +static int xsk_create_bpf_link(struct xsk_socket *xsk)
> +{
> +	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
> +	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
> +	 */
> +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
> +			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
> +	struct xsk_ctx *ctx = xsk->ctx;
> +	__u32 prog_id;
> +	int link_fd;
> +	int err;
> +
> +	/* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
> +	 * so that bpf_link can be attached
> +	 */
> +	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
> +		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
> +		if (err) {
> +			pr_warn("getting XDP prog id failed\n");
> +			return err;
> +		}
> +		if (prog_id) {
> +			err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
> +			if (err < 0) {
> +				pr_warn("detaching XDP prog failed\n");
> +				return err;
> +			}
> +		}
>  	}
>  
> -	ctx->prog_fd = prog_fd;
> +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
> +	if (link_fd < 0) {
> +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
> +		return link_fd;
> +	}
> +

This can leave the system in a bad state where it unloaded the XDP program
above, but then failed to create the link. So we should somehow fix that
if possible or at minimum put a note somewhere so users can't claim they
shouldn't know this.

Also related, its not good for real systems to let XDP program go missing
for some period of time. I didn't check but we should make
XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.

> +	ctx->link_fd = link_fd;
>  	return 0;
>  }
>  

[...]

> +static int xsk_link_lookup(struct xsk_ctx *ctx, __u32 *prog_id)
> +{
> +	__u32 link_len = sizeof(struct bpf_link_info);
> +	struct bpf_link_info link_info;
> +	__u32 id = 0;
> +	int err;
> +	int fd;
> +
> +	while (true) {
> +		err = bpf_link_get_next_id(id, &id);
> +		if (err) {
> +			if (errno == ENOENT)
> +				break;
> +			pr_warn("can't get next link: %s\n", strerror(errno));
> +			break;
> +		}
> +
> +		fd = bpf_link_get_fd_by_id(id);
> +		if (fd < 0) {
> +			if (errno == ENOENT)
> +				continue;
> +			pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
> +			break;
> +		}
> +
> +		memset(&link_info, 0, link_len);
> +		err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
> +		if (err) {
> +			pr_warn("can't get link info: %s\n", strerror(errno));
> +			close(fd);
> +			break;
> +		}
> +		if (link_info.xdp.ifindex == ctx->ifindex) {
> +			ctx->link_fd = fd;
> +			*prog_id = link_info.prog_id;
> +			break;
> +		}
> +		close(fd);
> +	}
> +
> +	return errno == ENOENT ? 0 : err;

But, err wont be set in fd < 0 case? I guess we don't want to return 0 if
bpf_link_get_fd_by_id fails. Although I really don't like the construct
here that much. I think just `return err` and ensuring err is set correctly
would be more clear. At least the fd error case needs to be handled
though.

> +}
> +

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 20:22       ` John Fastabend
@ 2021-02-15 21:38         ` Toke Høiland-Jørgensen
  2021-02-16  0:18           ` John Fastabend
  2021-02-17  2:23           ` Dan Siemon
  2021-02-16  2:10         ` Maciej Fijalkowski
  1 sibling, 2 replies; 37+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-15 21:38 UTC (permalink / raw)
  To: John Fastabend, Björn Töpel, Maciej Fijalkowski,
	daniel, ast, bpf, netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

John Fastabend <john.fastabend@gmail.com> writes:

>> > However, in libxdp we can solve the original problem in a different way,
>> > and in fact I already suggested to Magnus that we should do this (see
>> > [1]); so one way forward could be to address it during the merge in
>> > libxdp? It should be possible to address the original issue (two
>> > instances of xdpsock breaking each other when they exit), but
>> > applications will still need to do an explicit unload operation before
>> > exiting (i.e., the automatic detach on bpf_link fd closure will take
>> > more work, and likely require extending the bpf_link kernel support)...
>> >
>> 
>> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
>> we're months ahead, then I'd really like to see this in libbpf until the
>> merge. However, I'll leave that for Magnus/you to decide!
>
> Did I miss some thread? What does this mean libbpf 1.0/libxdp merge?

The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff to
libxdp (so the socket stuff in xsk.h). We're adding the existing code
wholesale, and keeping API compatibility during the move, so all that's
needed is adding -lxdp when compiling. And obviously the existing libbpf
code isn't going anywhere until such a time as there's a general
backwards compatibility-breaking deprecation in libbpf (which I believe
Andrii is planning to do in an upcoming and as-of-yet unannounced v1.0
release).

While integrating the XSK code into libxdp we're trying to make it
compatible with the rest of the library (i.e., multi-prog). Hence my
preference to avoid introducing something that makes this harder :)

-Toke


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 21:38         ` Toke Høiland-Jørgensen
@ 2021-02-16  0:18           ` John Fastabend
  2021-02-16  2:23             ` Maciej Fijalkowski
  2021-02-16 10:36             ` Toke Høiland-Jørgensen
  2021-02-17  2:23           ` Dan Siemon
  1 sibling, 2 replies; 37+ messages in thread
From: John Fastabend @ 2021-02-16  0:18 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, daniel, ast, bpf,
	netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

Toke Høiland-Jørgensen wrote:
> John Fastabend <john.fastabend@gmail.com> writes:
> 
> >> > However, in libxdp we can solve the original problem in a different way,
> >> > and in fact I already suggested to Magnus that we should do this (see
> >> > [1]); so one way forward could be to address it during the merge in
> >> > libxdp? It should be possible to address the original issue (two
> >> > instances of xdpsock breaking each other when they exit), but
> >> > applications will still need to do an explicit unload operation before
> >> > exiting (i.e., the automatic detach on bpf_link fd closure will take
> >> > more work, and likely require extending the bpf_link kernel support)...
> >> >
> >> 
> >> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
> >> we're months ahead, then I'd really like to see this in libbpf until the
> >> merge. However, I'll leave that for Magnus/you to decide!
> >
> > Did I miss some thread? What does this mean libbpf 1.0/libxdp merge?
> 
> The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff to
> libxdp (so the socket stuff in xsk.h). We're adding the existing code
> wholesale, and keeping API compatibility during the move, so all that's
> needed is adding -lxdp when compiling. And obviously the existing libbpf
> code isn't going anywhere until such a time as there's a general
> backwards compatibility-breaking deprecation in libbpf (which I believe
> Andrii is planning to do in an upcoming and as-of-yet unannounced v1.0
> release).

OK, I would like to keep the basic XDP pieces in libbpf though. For example
bpf_program__attach_xdp(). This way we don't have one lib to attach
everything, but XDP.

> 
> While integrating the XSK code into libxdp we're trying to make it
> compatible with the rest of the library (i.e., multi-prog). Hence my
> preference to avoid introducing something that makes this harder :)
> 
> -Toke
> 

OK that makes sense to me thanks. But, I'm missing something (maybe its
obvious to everyone else?).

When you load an XDP program you should get a reference to it. And then
XDP program should never be unloaded until that id is removed right? It
may or may not have an xsk map. Why does adding/removing programs from
an associated map have any impact on the XDP program? That seems like
the buggy part to me. No other map behaves this way as far as I can
tell. Now if the program with the XDP reference closes without pinning
the map or otherwise doing something with it, sure the map gets destroyed
and any xsk sockets are lost.

Thanks!
John

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 19:35       ` Toke Høiland-Jørgensen
@ 2021-02-16  2:01         ` Maciej Fijalkowski
  2021-02-16  9:15           ` Björn Töpel
  2021-02-16 10:27           ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16  2:01 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Björn Töpel, daniel, ast, bpf, netdev, andrii,
	magnus.karlsson, ciara.loftus

On Mon, Feb 15, 2021 at 08:35:29PM +0100, Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@intel.com> writes:
> 
> > On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote:
> >> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> >> 
> >>> Currently, if there are multiple xdpsock instances running on a single
> >>> interface and in case one of the instances is terminated, the rest of
> >>> them are left in an inoperable state due to the fact of unloaded XDP
> >>> prog from interface.
> >>>
> >>> To address that, step away from setting bpf prog in favour of bpf_link.
> >>> This means that refcounting of BPF resources will be done automatically
> >>> by bpf_link itself.
> >>>
> >>> When setting up BPF resources during xsk socket creation, check whether
> >>> bpf_link for a given ifindex already exists via set of calls to
> >>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> >>> and comparing the ifindexes from bpf_link and xsk socket.
> >> 
> >> One consideration here is that bpf_link_get_fd_by_id() is a privileged
> >> operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
> >> of making AF_XDP privileged as well. Is that the intention?
> >>
> >
> > We're already using, e.g., bpf_map_get_fd_by_id() which has that
> > as well. So we're assuming that for XDP setup already!
> 
> Ah, right, didn't realise that one is CAP_SYS_ADMIN as well; I
> remembered this as being specific to the bpf_link operation.
> 
> >> Another is that the AF_XDP code is in the process of moving to libxdp
> >> (see in-progress PR [0]), and this approach won't carry over as-is to
> >> that model, because libxdp has to pin the bpf_link fds.
> >>
> >
> > I was assuming there were two modes of operations for AF_XDP in libxdp.
> > One which is with the multi-program support (which AFAIK is why the
> > pinning is required), and one "like the current libbpf" one. For the
> > latter Maciej's series would be a good fit, no?
> 
> We haven't added an explicit mode switch for now; libxdp will fall back
> to regular interface attach if the kernel doesn't support the needed
> features for multi-attach, but if it's possible to just have libxdp
> transparently do the right thing I'd much prefer that. So we're still
> exploring that (part of which is that Magnus has promised to run some
> performance tests to see if there's a difference).
> 
> However, even if there's an explicit mode switch I'd like to avoid
> different *semantics* between the two modes if possible, to keep the two
> as compatible as possible. And since we can't currently do "auto-detach
> on bpf_link fd close" when using multi-prog, introducing this now would
> lead to just such a semantic difference. So my preference would be to do
> it differently... :)
> 
> >> However, in libxdp we can solve the original problem in a different way,
> >> and in fact I already suggested to Magnus that we should do this (see
> >> [1]); so one way forward could be to address it during the merge in
> >> libxdp? It should be possible to address the original issue (two
> >> instances of xdpsock breaking each other when they exit), but
> >> applications will still need to do an explicit unload operation before
> >> exiting (i.e., the automatic detach on bpf_link fd closure will take
> >> more work, and likely require extending the bpf_link kernel support)...
> >>
> >
> > I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
> > we're months ahead, then I'd really like to see this in libbpf until the
> > merge. However, I'll leave that for Magnus/you to decide!

WDYM by libbpf 1.0/libxdp merge? I glanced through thread and I saw that
John was also not aware of that. Not sure where it was discussed?

If you're saying 'merge', then is libxdp going to be a part of kernel or
as an AF-XDP related guy I would be forced to include yet another
repository in the BPF developer toolchain? :<

> 
> Well, as far as libxdp support goes, the PR I linked is pretty close to
> being mergeable. One of the few outstanding issues is whether we should
> solve just this issue before merging, actually :)
> 
> Not sure exactly which timeframe Andrii is envisioning for libbpf 1.0,
> but last I heard he'll announce something next week.
> 
> > Bottom line; I'd *really* like bpf_link behavior (process scoped) for
> > AF_XDP sooner than later! ;-)
> 
> Totally agree that we should solve the multi-process coexistence
> problem! And as I said, I think we can do so in libxdp by using the same
> synchronisation mechanism we use for setting up the multi-prog
> dispatcher. So it doesn't *have* to hold things up :)

Am I reading this right or you're trying to reject the fix of the long
standing issue due to a PR that is not ready yet on a standalone
project/library? :P

Once libxdp is the standard way of playing with AF-XDP and there are
actual users of that, then I'm happy to work/help on that issue.

Spawning a few xdpsock instances on the same interface has been a
standard/easiest way of measuring the scalability of AF-XDP ZC
implementations. This has been a real PITA for quite a long time. So, I
second Bjorn's statement - the sooner we have this fixed, the better.

Thanks! I hope I haven't sounded harsh, not my intent at all,
Maciej

> 
> -Toke
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 20:22       ` John Fastabend
  2021-02-15 21:38         ` Toke Høiland-Jørgensen
@ 2021-02-16  2:10         ` Maciej Fijalkowski
  1 sibling, 0 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16  2:10 UTC (permalink / raw)
  To: John Fastabend
  Cc: Björn Töpel, Toke Høiland-Jørgensen, daniel,
	ast, bpf, netdev, andrii, magnus.karlsson, ciara.loftus

On Mon, Feb 15, 2021 at 12:22:36PM -0800, John Fastabend wrote:
> Björn Töpel wrote:
> > On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote:
> > > Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> > > 
> > >> Currently, if there are multiple xdpsock instances running on a single
> > >> interface and in case one of the instances is terminated, the rest of
> > >> them are left in an inoperable state due to the fact of unloaded XDP
> > >> prog from interface.
> 
> I'm a bit confused by the above. This is only the case if the instance
> that terminated is the one that loaded the XDP program and it also didn't
> pin the program correct? If so lets make the commit message a bit more
> clear about the exact case we are solving.

I can include the following from the cover letter:

<quote>
$ sudo ./xdpsock -i ens801f0 -t -q 10 // load xdp prog and xskmap and
                                      // add entry to xskmap at idx 10

$ sudo ./xdpsock -i ens801f0 -t -q 11 // add entry to xskmap at idx 11

terminate one of the processes and another one is unable to work due to
the fact that the XDP prog was unloaded from interface.
</quote>

We don't do the prog pinning and it doesn't really matter which of the
instances you terminate.

> 
> > >>
> > >> To address that, step away from setting bpf prog in favour of bpf_link.
> > >> This means that refcounting of BPF resources will be done automatically
> > >> by bpf_link itself.
> 
> +1
> 
> > >>
> > >> When setting up BPF resources during xsk socket creation, check whether
> > >> bpf_link for a given ifindex already exists via set of calls to
> > >> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> > >> and comparing the ifindexes from bpf_link and xsk socket.
> > > 
> > > One consideration here is that bpf_link_get_fd_by_id() is a privileged
> > > operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
> > > of making AF_XDP privileged as well. Is that the intention?
> > >
> > 
> > We're already using, e.g., bpf_map_get_fd_by_id() which has that
> > as well. So we're assuming that for XDP setup already!
> > 
> > > Another is that the AF_XDP code is in the process of moving to libxdp
> > > (see in-progress PR [0]), and this approach won't carry over as-is to
> > > that model, because libxdp has to pin the bpf_link fds.
> > >
> > 
> > I was assuming there were two modes of operations for AF_XDP in libxdp.
> > One which is with the multi-program support (which AFAIK is why the
> > pinning is required), and one "like the current libbpf" one. For the
> > latter Maciej's series would be a good fit, no?
> > 
> > > However, in libxdp we can solve the original problem in a different way,
> > > and in fact I already suggested to Magnus that we should do this (see
> > > [1]); so one way forward could be to address it during the merge in
> > > libxdp? It should be possible to address the original issue (two
> > > instances of xdpsock breaking each other when they exit), but
> > > applications will still need to do an explicit unload operation before
> > > exiting (i.e., the automatic detach on bpf_link fd closure will take
> > > more work, and likely require extending the bpf_link kernel support)...
> > >
> > 
> > I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
> > we're months ahead, then I'd really like to see this in libbpf until the
> > merge. However, I'll leave that for Magnus/you to decide!
> 
> Did I miss some thread? What does this mean libbpf 1.0/libxdp merge? From
> my side libbpf should support the basic operations: load, attach, pin,
> and link for all my BPF objects. I view libxdp as providing 'extra'
> goodness on top of that. Everyone agree?
> 
> > 
> > Bottom line; I'd *really* like bpf_link behavior (process scoped) for
> > AF_XDP sooner than later! ;-)
> 
> Because I use libbpf as my base management for BPF objects I want it
> to support the basic ops for all objects so link ops should land there.
> 
> > 
> > 
> > Thanks for the input!
> > Björn
> > 
> > 
> > > -Toke
> > > 
> > > [0] https://github.com/xdp-project/xdp-tools/pull/92
> > > [1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719
> > > 
> 
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16  0:18           ` John Fastabend
@ 2021-02-16  2:23             ` Maciej Fijalkowski
  2021-02-16  9:23               ` Björn Töpel
  2021-02-16 10:36             ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16  2:23 UTC (permalink / raw)
  To: John Fastabend
  Cc: Toke Høiland-Jørgensen, Björn Töpel, daniel,
	ast, bpf, netdev, andrii, magnus.karlsson, ciara.loftus

On Mon, Feb 15, 2021 at 04:18:28PM -0800, John Fastabend wrote:
> Toke Høiland-Jørgensen wrote:
> > John Fastabend <john.fastabend@gmail.com> writes:
> > 
> > >> > However, in libxdp we can solve the original problem in a different way,
> > >> > and in fact I already suggested to Magnus that we should do this (see
> > >> > [1]); so one way forward could be to address it during the merge in
> > >> > libxdp? It should be possible to address the original issue (two
> > >> > instances of xdpsock breaking each other when they exit), but
> > >> > applications will still need to do an explicit unload operation before
> > >> > exiting (i.e., the automatic detach on bpf_link fd closure will take
> > >> > more work, and likely require extending the bpf_link kernel support)...
> > >> >
> > >> 
> > >> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
> > >> we're months ahead, then I'd really like to see this in libbpf until the
> > >> merge. However, I'll leave that for Magnus/you to decide!
> > >
> > > Did I miss some thread? What does this mean libbpf 1.0/libxdp merge?
> > 
> > The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff to
> > libxdp (so the socket stuff in xsk.h). We're adding the existing code
> > wholesale, and keeping API compatibility during the move, so all that's
> > needed is adding -lxdp when compiling. And obviously the existing libbpf
> > code isn't going anywhere until such a time as there's a general
> > backwards compatibility-breaking deprecation in libbpf (which I believe
> > Andrii is planning to do in an upcoming and as-of-yet unannounced v1.0
> > release).

Once again, is libxdp going to land in th kernel? Still not clear to me.

> 
> OK, I would like to keep the basic XDP pieces in libbpf though. For example
> bpf_program__attach_xdp(). This way we don't have one lib to attach
> everything, but XDP.
> 
> > 
> > While integrating the XSK code into libxdp we're trying to make it
> > compatible with the rest of the library (i.e., multi-prog). Hence my
> > preference to avoid introducing something that makes this harder :)

Do you see the issue with solving it one way in libbpf currently given
that we can't really tell when the merge of libs is going to happen and
the other way within the libxdp itself?

> > 
> > -Toke
> > 
> 
> OK that makes sense to me thanks. But, I'm missing something (maybe its
> obvious to everyone else?).
> 
> When you load an XDP program you should get a reference to it. And then
> XDP program should never be unloaded until that id is removed right? It

WDYM by 'that id is removed' ?

> may or may not have an xsk map. Why does adding/removing programs from

you meant adding/removing 'sockets'?

> an associated map have any impact on the XDP program? That seems like
> the buggy part to me. No other map behaves this way as far as I can
> tell. Now if the program with the XDP reference closes without pinning
> the map or otherwise doing something with it, sure the map gets destroyed
> and any xsk sockets are lost.

It's the XDP program that is getting lost, not XSKMAP. XDP prog presence
determines whether your driver works in ZC or not. If XDP prog is gone
then xdpsock bails out (or any other AF-XDP app won't be able to work).

> 
> Thanks!
> John

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 20:49   ` John Fastabend
@ 2021-02-16  2:38     ` Maciej Fijalkowski
  2021-02-16 18:19       ` John Fastabend
  2021-02-16  9:20     ` Björn Töpel
  1 sibling, 1 reply; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16  2:38 UTC (permalink / raw)
  To: John Fastabend
  Cc: daniel, ast, bpf, netdev, andrii, toke, bjorn.topel,
	magnus.karlsson, ciara.loftus

On Mon, Feb 15, 2021 at 12:49:27PM -0800, John Fastabend wrote:
> Maciej Fijalkowski wrote:
> > Currently, if there are multiple xdpsock instances running on a single
> > interface and in case one of the instances is terminated, the rest of
> > them are left in an inoperable state due to the fact of unloaded XDP
> > prog from interface.
> > 
> > To address that, step away from setting bpf prog in favour of bpf_link.
> > This means that refcounting of BPF resources will be done automatically
> > by bpf_link itself.
> > 
> > When setting up BPF resources during xsk socket creation, check whether
> > bpf_link for a given ifindex already exists via set of calls to
> > bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> > and comparing the ifindexes from bpf_link and xsk socket.
> > 
> > If there's no bpf_link yet, create one for a given XDP prog and unload
> > explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> > 
> > If bpf_link is already at a given ifindex and underlying program is not
> > AF-XDP one, bail out or update the bpf_link's prog given the presence of
> > XDP_FLAGS_UPDATE_IF_NOEXIST.
> > 
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> >  tools/lib/bpf/xsk.c | 143 +++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 122 insertions(+), 21 deletions(-)
> 
> [...]
> 
> > +static int xsk_create_bpf_link(struct xsk_socket *xsk)
> > +{
> > +	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
> > +	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
> > +	 */
> > +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
> > +			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
> > +	struct xsk_ctx *ctx = xsk->ctx;
> > +	__u32 prog_id;
> > +	int link_fd;
> > +	int err;
> > +
> > +	/* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
> > +	 * so that bpf_link can be attached
> > +	 */
> > +	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
> > +		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
> > +		if (err) {
> > +			pr_warn("getting XDP prog id failed\n");
> > +			return err;
> > +		}
> > +		if (prog_id) {
> > +			err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
> > +			if (err < 0) {
> > +				pr_warn("detaching XDP prog failed\n");
> > +				return err;
> > +			}
> > +		}
> >  	}
> >  
> > -	ctx->prog_fd = prog_fd;
> > +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
> > +	if (link_fd < 0) {
> > +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
> > +		return link_fd;
> > +	}
> > +
> 
> This can leave the system in a bad state where it unloaded the XDP program
> above, but then failed to create the link. So we should somehow fix that
> if possible or at minimum put a note somewhere so users can't claim they
> shouldn't know this.
> 
> Also related, its not good for real systems to let XDP program go missing
> for some period of time. I didn't check but we should make
> XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.

Old way of attaching prog is mutual exclusive with bpf_link, right?
What I'm saying is in order to use one of the two, you need to wipe out
the current one in favour of the second that you would like to load.

> 
> > +	ctx->link_fd = link_fd;
> >  	return 0;
> >  }
> >  
> 
> [...]
> 
> > +static int xsk_link_lookup(struct xsk_ctx *ctx, __u32 *prog_id)
> > +{
> > +	__u32 link_len = sizeof(struct bpf_link_info);
> > +	struct bpf_link_info link_info;
> > +	__u32 id = 0;
> > +	int err;
> > +	int fd;
> > +
> > +	while (true) {
> > +		err = bpf_link_get_next_id(id, &id);
> > +		if (err) {
> > +			if (errno == ENOENT)
> > +				break;
> > +			pr_warn("can't get next link: %s\n", strerror(errno));
> > +			break;
> > +		}
> > +
> > +		fd = bpf_link_get_fd_by_id(id);
> > +		if (fd < 0) {
> > +			if (errno == ENOENT)
> > +				continue;
> > +			pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
> > +			break;
> > +		}
> > +
> > +		memset(&link_info, 0, link_len);
> > +		err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
> > +		if (err) {
> > +			pr_warn("can't get link info: %s\n", strerror(errno));
> > +			close(fd);
> > +			break;
> > +		}
> > +		if (link_info.xdp.ifindex == ctx->ifindex) {
> > +			ctx->link_fd = fd;
> > +			*prog_id = link_info.prog_id;
> > +			break;
> > +		}
> > +		close(fd);
> > +	}
> > +
> > +	return errno == ENOENT ? 0 : err;
> 
> But, err wont be set in fd < 0 case? I guess we don't want to return 0 if
> bpf_link_get_fd_by_id fails.

Good catch!

> Although I really don't like the construct
> here that much. I think just `return err` and ensuring err is set correctly
> would be more clear. At least the fd error case needs to be handled
> though.

FWIW, this was inspired by tools/bpf/bpftool/link.c:do_show()

> 
> > +}
> > +

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd
  2021-02-15 20:33   ` John Fastabend
@ 2021-02-16  2:42     ` Maciej Fijalkowski
  0 siblings, 0 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16  2:42 UTC (permalink / raw)
  To: John Fastabend
  Cc: daniel, ast, bpf, netdev, andrii, toke, bjorn.topel,
	magnus.karlsson, ciara.loftus

On Mon, Feb 15, 2021 at 12:33:37PM -0800, John Fastabend wrote:
> Maciej Fijalkowski wrote:
> > xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a
> > reference to XSKMAP. BPF prog can include insns that work on various BPF
> > maps and this is covered by iterating through map_ids.
> > 
> > The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling
> > needs to be cleared at each iteration, so that it doesn't any outdated
> > fields and that is currently missing in the function of interest.
> > 
> > To fix that, zero-init map_info via memset before each
> > bpf_obj_get_info_by_fd call.
> > 
> > Also, since the area of this code is touched, in general strcmp is
> > considered harmful, so let's convert it to strncmp and provide the
> > length of the array name that we're looking for.
> > 
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> 
> This is a bugfix independent of the link bits correct? Would be best
> to send to bpf then. 

Right, I can pull this out of the series.

> 
> >  tools/lib/bpf/xsk.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> > index 5911868efa43..fb259c0bba93 100644
> > --- a/tools/lib/bpf/xsk.c
> > +++ b/tools/lib/bpf/xsk.c
> > @@ -616,6 +616,7 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
> >  	__u32 i, *map_ids, num_maps, prog_len = sizeof(struct bpf_prog_info);
> >  	__u32 map_len = sizeof(struct bpf_map_info);
> >  	struct bpf_prog_info prog_info = {};
> > +	const char *map_name = "xsks_map";
> >  	struct xsk_ctx *ctx = xsk->ctx;
> >  	struct bpf_map_info map_info;
> >  	int fd, err;
> > @@ -645,13 +646,14 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
> >  		if (fd < 0)
> >  			continue;
> >  
> > +		memset(&map_info, 0, map_len);
> >  		err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len);
> >  		if (err) {
> >  			close(fd);
> >  			continue;
> >  		}
> >  
> > -		if (!strcmp(map_info.name, "xsks_map")) {
> > +		if (!strncmp(map_info.name, map_name, strlen(map_name))) {
> >  			ctx->xsks_map_fd = fd;
> >  			continue;
> 
> Also just looking at this how is above not buggy? Should be a break instead
> of continue? If we match another "xsks_map" here won't we stomp on xsks_map_fd
> and leak a file descriptor?

Will fix!

> 
> >  		}
> > -- 
> > 2.20.1
> > 
> 
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16  2:01         ` Maciej Fijalkowski
@ 2021-02-16  9:15           ` Björn Töpel
  2021-02-16 10:27           ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 37+ messages in thread
From: Björn Töpel @ 2021-02-16  9:15 UTC (permalink / raw)
  To: Maciej Fijalkowski, Toke Høiland-Jørgensen
  Cc: daniel, ast, bpf, netdev, andrii, magnus.karlsson, ciara.loftus

On 2021-02-16 03:01, Maciej Fijalkowski wrote:
> On Mon, Feb 15, 2021 at 08:35:29PM +0100, Toke Høiland-Jørgensen wrote:
>> Björn Töpel <bjorn.topel@intel.com> writes:

[...]

>>>
>>> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
>>> we're months ahead, then I'd really like to see this in libbpf until the
>>> merge. However, I'll leave that for Magnus/you to decide!
> 
> WDYM by libbpf 1.0/libxdp merge? I glanced through thread and I saw that
> John was also not aware of that. Not sure where it was discussed?
>

Oh, right. Yeah, we've had some offlist discussions about moving the
AF_XDP functionality from libbpf to libxdp in the libbpf 1.0 timeframe.

> If you're saying 'merge', then is libxdp going to be a part of kernel or
> as an AF-XDP related guy I would be forced to include yet another
> repository in the BPF developer toolchain? :<
>

The AF_XDP functionality of libbpf will be part of libxdp, which is not
in the kernel tree. libxdp depend on libbpf, which includes the core BPF
functionality. For AF_XDP this is a good thing IMO. libxdp includes more
higher lever abstractions than libbpf, which is more aligned to AF_XDP.

Yes, that would mean that you would get another dependency for AF_XDP,
and one that is not in the kernel tree. For most *users* this is not a
problem, in fact it might be easier to consume and to contribute for
most users. We can't optimize just for the kernel hackers. ;-)


Björn

[...]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 20:49   ` John Fastabend
  2021-02-16  2:38     ` Maciej Fijalkowski
@ 2021-02-16  9:20     ` Björn Töpel
  2021-02-16 10:39       ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 37+ messages in thread
From: Björn Töpel @ 2021-02-16  9:20 UTC (permalink / raw)
  To: John Fastabend, Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, toke, magnus.karlsson, ciara.loftus

On 2021-02-15 21:49, John Fastabend wrote:
> Maciej Fijalkowski wrote:
>> Currently, if there are multiple xdpsock instances running on a single
>> interface and in case one of the instances is terminated, the rest of
>> them are left in an inoperable state due to the fact of unloaded XDP
>> prog from interface.
>>
>> To address that, step away from setting bpf prog in favour of bpf_link.
>> This means that refcounting of BPF resources will be done automatically
>> by bpf_link itself.
>>
>> When setting up BPF resources during xsk socket creation, check whether
>> bpf_link for a given ifindex already exists via set of calls to
>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
>> and comparing the ifindexes from bpf_link and xsk socket.
>>
>> If there's no bpf_link yet, create one for a given XDP prog and unload
>> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
>>
>> If bpf_link is already at a given ifindex and underlying program is not
>> AF-XDP one, bail out or update the bpf_link's prog given the presence of
>> XDP_FLAGS_UPDATE_IF_NOEXIST.
>>
>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>> ---
>>   tools/lib/bpf/xsk.c | 143 +++++++++++++++++++++++++++++++++++++-------
>>   1 file changed, 122 insertions(+), 21 deletions(-)
> 
> [...]
> 
>> +static int xsk_create_bpf_link(struct xsk_socket *xsk)
>> +{
>> +	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
>> +	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
>> +	 */
>> +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
>> +			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
>> +	struct xsk_ctx *ctx = xsk->ctx;
>> +	__u32 prog_id;
>> +	int link_fd;
>> +	int err;
>> +
>> +	/* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
>> +	 * so that bpf_link can be attached
>> +	 */
>> +	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
>> +		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
>> +		if (err) {
>> +			pr_warn("getting XDP prog id failed\n");
>> +			return err;
>> +		}
>> +		if (prog_id) {
>> +			err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
>> +			if (err < 0) {
>> +				pr_warn("detaching XDP prog failed\n");
>> +				return err;
>> +			}
>> +		}
>>   	}
>>   
>> -	ctx->prog_fd = prog_fd;
>> +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
>> +	if (link_fd < 0) {
>> +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
>> +		return link_fd;
>> +	}
>> +
> 
> This can leave the system in a bad state where it unloaded the XDP program
> above, but then failed to create the link. So we should somehow fix that
> if possible or at minimum put a note somewhere so users can't claim they
> shouldn't know this.
> 
> Also related, its not good for real systems to let XDP program go missing
> for some period of time. I didn't check but we should make
> XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.
>

This is the default for XDP sockets library. The
"bpf_set_link_xdp_fd(...-1)" way is only when a user sets it explicitly.
One could maybe argue that the "force remove" would be out of scope for
AF_XDP; Meaning that if an XDP program is running, attached via netlink,
the AF_XDP library simply cannot remove it. The user would need to rely
on some other mechanism.


Björn


[...]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock
  2021-02-15 20:24   ` John Fastabend
@ 2021-02-16  9:22     ` Björn Töpel
  2021-02-16 14:15       ` Maciej Fijalkowski
  0 siblings, 1 reply; 37+ messages in thread
From: Björn Töpel @ 2021-02-16  9:22 UTC (permalink / raw)
  To: John Fastabend, Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, toke, magnus.karlsson, ciara.loftus

On 2021-02-15 21:24, John Fastabend wrote:
> Maciej Fijalkowski wrote:
>> With the introduction of bpf_link in xsk's libbpf part, there's no
>> further need for explicit unload of prog on xdpsock's termination. When
>> process dies, the bpf_link's refcount will be decremented and resources
>> will be unloaded/freed under the hood in case when there are no more
>> active users.
>>
>> While at it, don't dump stats on error path.
>>
>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>> ---
> 
> Can we also use it from selftests prog so we have a test that is run there
> using this? Looks like xdpxceiver.c could do the link step as well?
> 

Yes! Good point!


Björn

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16  2:23             ` Maciej Fijalkowski
@ 2021-02-16  9:23               ` Björn Töpel
  0 siblings, 0 replies; 37+ messages in thread
From: Björn Töpel @ 2021-02-16  9:23 UTC (permalink / raw)
  To: Maciej Fijalkowski, John Fastabend
  Cc: Toke Høiland-Jørgensen, daniel, ast, bpf, netdev,
	andrii, magnus.karlsson, ciara.loftus

On 2021-02-16 03:23, Maciej Fijalkowski wrote:
> On Mon, Feb 15, 2021 at 04:18:28PM -0800, John Fastabend wrote:

[...]

> 
> Once again, is libxdp going to land in th kernel? Still not clear to me.
>

No, libxdp does not live in the kernel tree.


Björn

[...]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16  2:01         ` Maciej Fijalkowski
  2021-02-16  9:15           ` Björn Töpel
@ 2021-02-16 10:27           ` Toke Høiland-Jørgensen
  2021-02-16 20:15             ` Maciej Fijalkowski
  1 sibling, 1 reply; 37+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-16 10:27 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: Björn Töpel, daniel, ast, bpf, netdev, andrii,
	magnus.karlsson, ciara.loftus

Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:

> On Mon, Feb 15, 2021 at 08:35:29PM +0100, Toke Høiland-Jørgensen wrote:
>> Björn Töpel <bjorn.topel@intel.com> writes:
>> 
>> > On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote:
>> >> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
>> >> 
>> >>> Currently, if there are multiple xdpsock instances running on a single
>> >>> interface and in case one of the instances is terminated, the rest of
>> >>> them are left in an inoperable state due to the fact of unloaded XDP
>> >>> prog from interface.
>> >>>
>> >>> To address that, step away from setting bpf prog in favour of bpf_link.
>> >>> This means that refcounting of BPF resources will be done automatically
>> >>> by bpf_link itself.
>> >>>
>> >>> When setting up BPF resources during xsk socket creation, check whether
>> >>> bpf_link for a given ifindex already exists via set of calls to
>> >>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
>> >>> and comparing the ifindexes from bpf_link and xsk socket.
>> >> 
>> >> One consideration here is that bpf_link_get_fd_by_id() is a privileged
>> >> operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
>> >> of making AF_XDP privileged as well. Is that the intention?
>> >>
>> >
>> > We're already using, e.g., bpf_map_get_fd_by_id() which has that
>> > as well. So we're assuming that for XDP setup already!
>> 
>> Ah, right, didn't realise that one is CAP_SYS_ADMIN as well; I
>> remembered this as being specific to the bpf_link operation.
>> 
>> >> Another is that the AF_XDP code is in the process of moving to libxdp
>> >> (see in-progress PR [0]), and this approach won't carry over as-is to
>> >> that model, because libxdp has to pin the bpf_link fds.
>> >>
>> >
>> > I was assuming there were two modes of operations for AF_XDP in libxdp.
>> > One which is with the multi-program support (which AFAIK is why the
>> > pinning is required), and one "like the current libbpf" one. For the
>> > latter Maciej's series would be a good fit, no?
>> 
>> We haven't added an explicit mode switch for now; libxdp will fall back
>> to regular interface attach if the kernel doesn't support the needed
>> features for multi-attach, but if it's possible to just have libxdp
>> transparently do the right thing I'd much prefer that. So we're still
>> exploring that (part of which is that Magnus has promised to run some
>> performance tests to see if there's a difference).
>> 
>> However, even if there's an explicit mode switch I'd like to avoid
>> different *semantics* between the two modes if possible, to keep the two
>> as compatible as possible. And since we can't currently do "auto-detach
>> on bpf_link fd close" when using multi-prog, introducing this now would
>> lead to just such a semantic difference. So my preference would be to do
>> it differently... :)
>> 
>> >> However, in libxdp we can solve the original problem in a different way,
>> >> and in fact I already suggested to Magnus that we should do this (see
>> >> [1]); so one way forward could be to address it during the merge in
>> >> libxdp? It should be possible to address the original issue (two
>> >> instances of xdpsock breaking each other when they exit), but
>> >> applications will still need to do an explicit unload operation before
>> >> exiting (i.e., the automatic detach on bpf_link fd closure will take
>> >> more work, and likely require extending the bpf_link kernel support)...
>> >>
>> >
>> > I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
>> > we're months ahead, then I'd really like to see this in libbpf until the
>> > merge. However, I'll leave that for Magnus/you to decide!
>
> WDYM by libbpf 1.0/libxdp merge? I glanced through thread and I saw that
> John was also not aware of that. Not sure where it was discussed?
>
> If you're saying 'merge', then is libxdp going to be a part of kernel or
> as an AF-XDP related guy I would be forced to include yet another
> repository in the BPF developer toolchain? :<

As I replied to John, we're trying to do this in as compatible and
painless a way as possible. In particular, we'll maintain (source) API
compatibility with the code currently in libbpf. And yeah, currently
libxdp lives outside the kernel tree. Not sure whether we'll end up
moving it into the kernel tree, as Björn noted up-thread there are
arguments in both directions :)

>> Well, as far as libxdp support goes, the PR I linked is pretty close to
>> being mergeable. One of the few outstanding issues is whether we should
>> solve just this issue before merging, actually :)
>> 
>> Not sure exactly which timeframe Andrii is envisioning for libbpf 1.0,
>> but last I heard he'll announce something next week.
>> 
>> > Bottom line; I'd *really* like bpf_link behavior (process scoped) for
>> > AF_XDP sooner than later! ;-)
>> 
>> Totally agree that we should solve the multi-process coexistence
>> problem! And as I said, I think we can do so in libxdp by using the same
>> synchronisation mechanism we use for setting up the multi-prog
>> dispatcher. So it doesn't *have* to hold things up :)
>
> Am I reading this right or you're trying to reject the fix of the long
> standing issue due to a PR that is not ready yet on a standalone
> project/library? :P

Haha, no, that is not what I'm saying. As I said up-thread I agree that
this is something we should fix, obviously. I'm just suggesting we fix
it in a way that will also be compatible with libxdp and multiprog so we
won't have to introduce yet-another-flag that users will have to decide
on.

However, now that I'm looking at your patch again I think my fears may
have been overblown. I got hung up on the bit in the commit message
where you said "refcounting of BPF resources will be done automatically
by bpf_link itself", and wrongly assumed that you were exposing the
bpf_link fd to the application. But I see now that it's kept in the
private xsk_ctx structure, and applications still just call
xsk_socket__delete(). So in libxdp we can just implement the same API
with a different synchronisation mechanism; that's fine. Apologies for
jumping to conclusions :/

> Once libxdp is the standard way of playing with AF-XDP and there are
> actual users of that, then I'm happy to work/help on that issue.

That is good to know, thanks! I opened an issue in the xdp-tools
repository to track this for the libxdp side (Magnus and I agreed that
we'll merge the existing code first, then fix this on top):
https://github.com/xdp-project/xdp-tools/issues/93

As noted above, the mechanism may have to change, but I think it's
possible to achieve the same thing in a libxdp context :)

-Toke


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16  0:18           ` John Fastabend
  2021-02-16  2:23             ` Maciej Fijalkowski
@ 2021-02-16 10:36             ` Toke Høiland-Jørgensen
  2021-02-23  1:15               ` Andrii Nakryiko
  1 sibling, 1 reply; 37+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-16 10:36 UTC (permalink / raw)
  To: John Fastabend, John Fastabend, Björn Töpel,
	Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

John Fastabend <john.fastabend@gmail.com> writes:

> Toke Høiland-Jørgensen wrote:
>> John Fastabend <john.fastabend@gmail.com> writes:
>> 
>> >> > However, in libxdp we can solve the original problem in a different way,
>> >> > and in fact I already suggested to Magnus that we should do this (see
>> >> > [1]); so one way forward could be to address it during the merge in
>> >> > libxdp? It should be possible to address the original issue (two
>> >> > instances of xdpsock breaking each other when they exit), but
>> >> > applications will still need to do an explicit unload operation before
>> >> > exiting (i.e., the automatic detach on bpf_link fd closure will take
>> >> > more work, and likely require extending the bpf_link kernel support)...
>> >> >
>> >> 
>> >> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
>> >> we're months ahead, then I'd really like to see this in libbpf until the
>> >> merge. However, I'll leave that for Magnus/you to decide!
>> >
>> > Did I miss some thread? What does this mean libbpf 1.0/libxdp merge?
>> 
>> The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff to
>> libxdp (so the socket stuff in xsk.h). We're adding the existing code
>> wholesale, and keeping API compatibility during the move, so all that's
>> needed is adding -lxdp when compiling. And obviously the existing libbpf
>> code isn't going anywhere until such a time as there's a general
>> backwards compatibility-breaking deprecation in libbpf (which I believe
>> Andrii is planning to do in an upcoming and as-of-yet unannounced v1.0
>> release).
>
> OK, I would like to keep the basic XDP pieces in libbpf though. For example
> bpf_program__attach_xdp(). This way we don't have one lib to attach
> everything, but XDP.

The details are still TDB; for now, we're just merging in the XSK code
to the libxdp repo. I expect Andrii to announce his plans for the rest
soonish. I wouldn't expect basic things like that to go away, though :)

>> 
>> While integrating the XSK code into libxdp we're trying to make it
>> compatible with the rest of the library (i.e., multi-prog). Hence my
>> preference to avoid introducing something that makes this harder :)
>> 
>> -Toke
>> 
>
> OK that makes sense to me thanks. But, I'm missing something (maybe its
> obvious to everyone else?).
>
> When you load an XDP program you should get a reference to it. And then
> XDP program should never be unloaded until that id is removed right? It
> may or may not have an xsk map. Why does adding/removing programs from
> an associated map have any impact on the XDP program? That seems like
> the buggy part to me. No other map behaves this way as far as I can
> tell. Now if the program with the XDP reference closes without pinning
> the map or otherwise doing something with it, sure the map gets destroyed
> and any xsk sockets are lost.

The original bug comes from the XSK code abstracting away the fact that
an AF_XDP socket needs an XDP program on the interface to work; so if
none exists, the library will just load a program that redirects into
the socket. Which breaks since the xdpsock example application is trying
to be nice and clean up after itself, by removing the XDP program when
it's done with the socket, thus breaking any other programs using that
XDP program. So this patch introduces proper synchronisation on both add
and remove of the XDP program...

-Toke


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16  9:20     ` Björn Töpel
@ 2021-02-16 10:39       ` Toke Høiland-Jørgensen
  2021-02-16 19:15         ` John Fastabend
  0 siblings, 1 reply; 37+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-16 10:39 UTC (permalink / raw)
  To: Björn Töpel, John Fastabend, Maciej Fijalkowski,
	daniel, ast, bpf, netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

Björn Töpel <bjorn.topel@intel.com> writes:

> On 2021-02-15 21:49, John Fastabend wrote:
>> Maciej Fijalkowski wrote:
>>> Currently, if there are multiple xdpsock instances running on a single
>>> interface and in case one of the instances is terminated, the rest of
>>> them are left in an inoperable state due to the fact of unloaded XDP
>>> prog from interface.
>>>
>>> To address that, step away from setting bpf prog in favour of bpf_link.
>>> This means that refcounting of BPF resources will be done automatically
>>> by bpf_link itself.
>>>
>>> When setting up BPF resources during xsk socket creation, check whether
>>> bpf_link for a given ifindex already exists via set of calls to
>>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
>>> and comparing the ifindexes from bpf_link and xsk socket.
>>>
>>> If there's no bpf_link yet, create one for a given XDP prog and unload
>>> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
>>>
>>> If bpf_link is already at a given ifindex and underlying program is not
>>> AF-XDP one, bail out or update the bpf_link's prog given the presence of
>>> XDP_FLAGS_UPDATE_IF_NOEXIST.
>>>
>>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>>> ---
>>>   tools/lib/bpf/xsk.c | 143 +++++++++++++++++++++++++++++++++++++-------
>>>   1 file changed, 122 insertions(+), 21 deletions(-)
>> 
>> [...]
>> 
>>> +static int xsk_create_bpf_link(struct xsk_socket *xsk)
>>> +{
>>> +	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
>>> +	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
>>> +	 */
>>> +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
>>> +			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
>>> +	struct xsk_ctx *ctx = xsk->ctx;
>>> +	__u32 prog_id;
>>> +	int link_fd;
>>> +	int err;
>>> +
>>> +	/* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
>>> +	 * so that bpf_link can be attached
>>> +	 */
>>> +	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
>>> +		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
>>> +		if (err) {
>>> +			pr_warn("getting XDP prog id failed\n");
>>> +			return err;
>>> +		}
>>> +		if (prog_id) {
>>> +			err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
>>> +			if (err < 0) {
>>> +				pr_warn("detaching XDP prog failed\n");
>>> +				return err;
>>> +			}
>>> +		}
>>>   	}
>>>   
>>> -	ctx->prog_fd = prog_fd;
>>> +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
>>> +	if (link_fd < 0) {
>>> +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
>>> +		return link_fd;
>>> +	}
>>> +
>> 
>> This can leave the system in a bad state where it unloaded the XDP program
>> above, but then failed to create the link. So we should somehow fix that
>> if possible or at minimum put a note somewhere so users can't claim they
>> shouldn't know this.
>> 
>> Also related, its not good for real systems to let XDP program go missing
>> for some period of time. I didn't check but we should make
>> XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.
>>
>
> This is the default for XDP sockets library. The
> "bpf_set_link_xdp_fd(...-1)" way is only when a user sets it explicitly.
> One could maybe argue that the "force remove" would be out of scope for
> AF_XDP; Meaning that if an XDP program is running, attached via netlink,
> the AF_XDP library simply cannot remove it. The user would need to rely
> on some other mechanism.

Yeah, I'd tend to agree with that. In general, I think the proliferation
of "just force-remove (or override) the running program" in code and
instructions has been a mistake; and application should only really be
adding and removing its own program...

-Toke


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock
  2021-02-16  9:22     ` Björn Töpel
@ 2021-02-16 14:15       ` Maciej Fijalkowski
  0 siblings, 0 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16 14:15 UTC (permalink / raw)
  To: Björn Töpel
  Cc: John Fastabend, daniel, ast, bpf, netdev, andrii, toke,
	magnus.karlsson, ciara.loftus

On Tue, Feb 16, 2021 at 10:22:15AM +0100, Björn Töpel wrote:
> On 2021-02-15 21:24, John Fastabend wrote:
> > Maciej Fijalkowski wrote:
> > > With the introduction of bpf_link in xsk's libbpf part, there's no
> > > further need for explicit unload of prog on xdpsock's termination. When
> > > process dies, the bpf_link's refcount will be decremented and resources
> > > will be unloaded/freed under the hood in case when there are no more
> > > active users.
> > > 
> > > While at it, don't dump stats on error path.
> > > 
> > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > ---
> > 
> > Can we also use it from selftests prog so we have a test that is run there
> > using this? Looks like xdpxceiver.c could do the link step as well?
> > 
> 
> Yes! Good point!

Somehow John's mail didn't end up in my inbox.
xdpxceiver is using libbpf API for socket handling (create/delete), so
with that set included it is working on bpf_link.

> 
> 
> Björn

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16  2:38     ` Maciej Fijalkowski
@ 2021-02-16 18:19       ` John Fastabend
  2021-02-16 20:10         ` Maciej Fijalkowski
  0 siblings, 1 reply; 37+ messages in thread
From: John Fastabend @ 2021-02-16 18:19 UTC (permalink / raw)
  To: Maciej Fijalkowski, John Fastabend
  Cc: daniel, ast, bpf, netdev, andrii, toke, bjorn.topel,
	magnus.karlsson, ciara.loftus

Maciej Fijalkowski wrote:
> On Mon, Feb 15, 2021 at 12:49:27PM -0800, John Fastabend wrote:
> > Maciej Fijalkowski wrote:
> > > Currently, if there are multiple xdpsock instances running on a single
> > > interface and in case one of the instances is terminated, the rest of
> > > them are left in an inoperable state due to the fact of unloaded XDP
> > > prog from interface.
> > > 
> > > To address that, step away from setting bpf prog in favour of bpf_link.
> > > This means that refcounting of BPF resources will be done automatically
> > > by bpf_link itself.
> > > 
> > > When setting up BPF resources during xsk socket creation, check whether
> > > bpf_link for a given ifindex already exists via set of calls to
> > > bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> > > and comparing the ifindexes from bpf_link and xsk socket.
> > > 
> > > If there's no bpf_link yet, create one for a given XDP prog and unload
> > > explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> > > 
> > > If bpf_link is already at a given ifindex and underlying program is not
> > > AF-XDP one, bail out or update the bpf_link's prog given the presence of
> > > XDP_FLAGS_UPDATE_IF_NOEXIST.
> > > 
> > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > ---
> > >  tools/lib/bpf/xsk.c | 143 +++++++++++++++++++++++++++++++++++++-------
> > >  1 file changed, 122 insertions(+), 21 deletions(-)
> > 
> > [...]
> > 
> > > +static int xsk_create_bpf_link(struct xsk_socket *xsk)
> > > +{
> > > +	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
> > > +	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
> > > +	 */
> > > +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
> > > +			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
> > > +	struct xsk_ctx *ctx = xsk->ctx;
> > > +	__u32 prog_id;
> > > +	int link_fd;
> > > +	int err;
> > > +
> > > +	/* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
> > > +	 * so that bpf_link can be attached
> > > +	 */
> > > +	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
> > > +		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
> > > +		if (err) {
> > > +			pr_warn("getting XDP prog id failed\n");
> > > +			return err;
> > > +		}
> > > +		if (prog_id) {
> > > +			err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
> > > +			if (err < 0) {
> > > +				pr_warn("detaching XDP prog failed\n");
> > > +				return err;
> > > +			}
> > > +		}
> > >  	}
> > >  
> > > -	ctx->prog_fd = prog_fd;
> > > +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
> > > +	if (link_fd < 0) {
> > > +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
> > > +		return link_fd;
> > > +	}
> > > +
> > 
> > This can leave the system in a bad state where it unloaded the XDP program
> > above, but then failed to create the link. So we should somehow fix that
> > if possible or at minimum put a note somewhere so users can't claim they
> > shouldn't know this.
> > 
> > Also related, its not good for real systems to let XDP program go missing
> > for some period of time. I didn't check but we should make
> > XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.
> 
> Old way of attaching prog is mutual exclusive with bpf_link, right?
> What I'm saying is in order to use one of the two, you need to wipe out
> the current one in favour of the second that you would like to load.

Personally, if I were using above I want the operation to error
if a XDP program is already attached. Then user is forced to remove the
XDP program directly if thats even safe to do.

Reusing UPDATE_IF_NOEXIST flag above seems like an abuse of that flag.
The kernel side does an atomic program swap (or at least it should imo) 
of the programs when it is set. Atomic here is not exactly right though
because driver might reset or do other things, but the point is no
packets are missed without policy. In above some N packets will pass
through the device without policy being applied. This is going to be
subtle and buggy if used in real production systems.

The API needs to do a replace operation not a delete/create and if it
can't do that it needs to error out so the user can figure out what
to do about it.

Do you really need this automatic behavior for something? It clutters
up the API with more flags and I can't see how its useful. If it
errors out just delete the prog using the existing interfaces from the
API user side.

> 
> > 
> > > +	ctx->link_fd = link_fd;
> > >  	return 0;
> > >  }
> > >  
> > 
> > [...]
> > 
> > > +static int xsk_link_lookup(struct xsk_ctx *ctx, __u32 *prog_id)
> > > +{
> > > +	__u32 link_len = sizeof(struct bpf_link_info);
> > > +	struct bpf_link_info link_info;
> > > +	__u32 id = 0;
> > > +	int err;
> > > +	int fd;
> > > +
> > > +	while (true) {
> > > +		err = bpf_link_get_next_id(id, &id);
> > > +		if (err) {
> > > +			if (errno == ENOENT)
> > > +				break;
> > > +			pr_warn("can't get next link: %s\n", strerror(errno));
> > > +			break;
> > > +		}
> > > +
> > > +		fd = bpf_link_get_fd_by_id(id);
> > > +		if (fd < 0) {
> > > +			if (errno == ENOENT)
> > > +				continue;
> > > +			pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
> > > +			break;
> > > +		}
> > > +
> > > +		memset(&link_info, 0, link_len);
> > > +		err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
> > > +		if (err) {
> > > +			pr_warn("can't get link info: %s\n", strerror(errno));
> > > +			close(fd);
> > > +			break;
> > > +		}
> > > +		if (link_info.xdp.ifindex == ctx->ifindex) {
> > > +			ctx->link_fd = fd;
> > > +			*prog_id = link_info.prog_id;
> > > +			break;
> > > +		}
> > > +		close(fd);
> > > +	}
> > > +
> > > +	return errno == ENOENT ? 0 : err;
> > 
> > But, err wont be set in fd < 0 case? I guess we don't want to return 0 if
> > bpf_link_get_fd_by_id fails.
> 
> Good catch!
> 
> > Although I really don't like the construct
> > here that much. I think just `return err` and ensuring err is set correctly
> > would be more clear. At least the fd error case needs to be handled
> > though.
> 
> FWIW, this was inspired by tools/bpf/bpftool/link.c:do_show()

Sure its not my preference, but as long as the bug is resolved I
wont complain. If I hadn't seen the bug I wouldn't have said
anything.

> 
> > 
> > > +}
> > > +



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16 10:39       ` Toke Høiland-Jørgensen
@ 2021-02-16 19:15         ` John Fastabend
  2021-02-16 20:50           ` Maciej Fijalkowski
  0 siblings, 1 reply; 37+ messages in thread
From: John Fastabend @ 2021-02-16 19:15 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Björn Töpel,
	John Fastabend, Maciej Fijalkowski, daniel, ast, bpf, netdev
  Cc: andrii, magnus.karlsson, ciara.loftus

Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@intel.com> writes:
> 
> > On 2021-02-15 21:49, John Fastabend wrote:
> >> Maciej Fijalkowski wrote:
> >>> Currently, if there are multiple xdpsock instances running on a single
> >>> interface and in case one of the instances is terminated, the rest of
> >>> them are left in an inoperable state due to the fact of unloaded XDP
> >>> prog from interface.
> >>>
> >>> To address that, step away from setting bpf prog in favour of bpf_link.
> >>> This means that refcounting of BPF resources will be done automatically
> >>> by bpf_link itself.
> >>>
> >>> When setting up BPF resources during xsk socket creation, check whether
> >>> bpf_link for a given ifindex already exists via set of calls to
> >>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> >>> and comparing the ifindexes from bpf_link and xsk socket.
> >>>
> >>> If there's no bpf_link yet, create one for a given XDP prog and unload
> >>> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> >>>
> >>> If bpf_link is already at a given ifindex and underlying program is not
> >>> AF-XDP one, bail out or update the bpf_link's prog given the presence of
> >>> XDP_FLAGS_UPDATE_IF_NOEXIST.
> >>>
> >>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> >>> ---

[...]

> >>> -	ctx->prog_fd = prog_fd;
> >>> +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
> >>> +	if (link_fd < 0) {
> >>> +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
> >>> +		return link_fd;
> >>> +	}
> >>> +
> >> 
> >> This can leave the system in a bad state where it unloaded the XDP program
> >> above, but then failed to create the link. So we should somehow fix that
> >> if possible or at minimum put a note somewhere so users can't claim they
> >> shouldn't know this.
> >> 
> >> Also related, its not good for real systems to let XDP program go missing
> >> for some period of time. I didn't check but we should make
> >> XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.
> >>
> >
> > This is the default for XDP sockets library. The
> > "bpf_set_link_xdp_fd(...-1)" way is only when a user sets it explicitly.
> > One could maybe argue that the "force remove" would be out of scope for
> > AF_XDP; Meaning that if an XDP program is running, attached via netlink,
> > the AF_XDP library simply cannot remove it. The user would need to rely
> > on some other mechanism.
> 
> Yeah, I'd tend to agree with that. In general, I think the proliferation
> of "just force-remove (or override) the running program" in code and
> instructions has been a mistake; and application should only really be
> adding and removing its own program...
> 
> -Toke
> 

I'll try to consolidate some of my opinions from a couple threads here. It
looks to me many of these issues are self-inflicted by the API. We built
the API with out the right abstractions or at least different abstractions
from the rest of the BPF APIs. Not too surprising seeing the kernel side
and user side were all being built up at once.

For example this specific issue where the xsk API also deletes the XDP
program seems to be due to merging the xsk with the creation of the XDP
programs.

I'm not a real user of AF_XDP (yet.), but here is how I would expect it
to work based on how the sockmap pieces work, which are somewhat similar
given they also deal with sockets.

Program 
(1) load and pin an XDP BPF program
    - obj = bpf_object__open(prog);
    - bpf_object__load_xattr(&attr);
    - bpf_program__pin()
(2) pin the map, find map_xsk using any of the map APIs
    - bpf_map__pin(map_xsk, path_to_pin)
(3) attach to XDP
    - link = bpf_program__attach_xdp()
    - bpf_link__pin()

At this point you have a BPF program loaded, a xsk map, and a link all
pinned and ready. And we can add socks using the process found in
`enter_xsks_into_map` in the sample. This can be the same program that
loaded/pinned the XDP program or some other program it doesn't really
matter.

 - create xsk fd
      . xsk_umem__create()
      . xsk_socket__create
 - open map @ pinned path
 - bpf_map_update_elem(xsks_map, &key, &fd, 0);

Then it looks like we don't have any conflicts? The XDP program is pinned
and exists in its normal scope. The xsk elements can be added/deleted
as normal. If the XDP program is removed and the map referencing (using
normal ref rules) reaches zero its also deleted. Above is more or less
the same flow we use for any BPF program so looks good to me.

The trouble seems to pop up when using the higher abstraction APIs
xsk_setup_xdp_prog and friends I guess? I just see above as already
fairly easy to use and we have good helpers to create the sockets it looks
like. Maybe I missed some design considerations. IMO higher level
abstractions should go in new libxdp and above should stay in libbpf.

/rant off ;)

Thanks,
John

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16 18:19       ` John Fastabend
@ 2021-02-16 20:10         ` Maciej Fijalkowski
  0 siblings, 0 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16 20:10 UTC (permalink / raw)
  To: John Fastabend
  Cc: daniel, ast, bpf, netdev, andrii, toke, bjorn.topel,
	magnus.karlsson, ciara.loftus

On Tue, Feb 16, 2021 at 10:19:17AM -0800, John Fastabend wrote:
> Maciej Fijalkowski wrote:
> > On Mon, Feb 15, 2021 at 12:49:27PM -0800, John Fastabend wrote:
> > > Maciej Fijalkowski wrote:
> > > > Currently, if there are multiple xdpsock instances running on a single
> > > > interface and in case one of the instances is terminated, the rest of
> > > > them are left in an inoperable state due to the fact of unloaded XDP
> > > > prog from interface.
> > > > 
> > > > To address that, step away from setting bpf prog in favour of bpf_link.
> > > > This means that refcounting of BPF resources will be done automatically
> > > > by bpf_link itself.
> > > > 
> > > > When setting up BPF resources during xsk socket creation, check whether
> > > > bpf_link for a given ifindex already exists via set of calls to
> > > > bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> > > > and comparing the ifindexes from bpf_link and xsk socket.
> > > > 
> > > > If there's no bpf_link yet, create one for a given XDP prog and unload
> > > > explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> > > > 
> > > > If bpf_link is already at a given ifindex and underlying program is not
> > > > AF-XDP one, bail out or update the bpf_link's prog given the presence of
> > > > XDP_FLAGS_UPDATE_IF_NOEXIST.
> > > > 
> > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > > ---
> > > >  tools/lib/bpf/xsk.c | 143 +++++++++++++++++++++++++++++++++++++-------
> > > >  1 file changed, 122 insertions(+), 21 deletions(-)
> > > 
> > > [...]
> > > 
> > > > +static int xsk_create_bpf_link(struct xsk_socket *xsk)
> > > > +{
> > > > +	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
> > > > +	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
> > > > +	 */
> > > > +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
> > > > +			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
> > > > +	struct xsk_ctx *ctx = xsk->ctx;
> > > > +	__u32 prog_id;
> > > > +	int link_fd;
> > > > +	int err;
> > > > +
> > > > +	/* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
> > > > +	 * so that bpf_link can be attached
> > > > +	 */
> > > > +	if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
> > > > +		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
> > > > +		if (err) {
> > > > +			pr_warn("getting XDP prog id failed\n");
> > > > +			return err;
> > > > +		}
> > > > +		if (prog_id) {
> > > > +			err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
> > > > +			if (err < 0) {
> > > > +				pr_warn("detaching XDP prog failed\n");
> > > > +				return err;
> > > > +			}
> > > > +		}
> > > >  	}
> > > >  
> > > > -	ctx->prog_fd = prog_fd;
> > > > +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
> > > > +	if (link_fd < 0) {
> > > > +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
> > > > +		return link_fd;
> > > > +	}
> > > > +
> > > 
> > > This can leave the system in a bad state where it unloaded the XDP program
> > > above, but then failed to create the link. So we should somehow fix that
> > > if possible or at minimum put a note somewhere so users can't claim they
> > > shouldn't know this.
> > > 
> > > Also related, its not good for real systems to let XDP program go missing
> > > for some period of time. I didn't check but we should make
> > > XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.
> > 
> > Old way of attaching prog is mutual exclusive with bpf_link, right?
> > What I'm saying is in order to use one of the two, you need to wipe out
> > the current one in favour of the second that you would like to load.
> 
> Personally, if I were using above I want the operation to error
> if a XDP program is already attached. Then user is forced to remove the
> XDP program directly if thats even safe to do.
> 
> Reusing UPDATE_IF_NOEXIST flag above seems like an abuse of that flag.
> The kernel side does an atomic program swap (or at least it should imo) 
> of the programs when it is set. Atomic here is not exactly right though
> because driver might reset or do other things, but the point is no
> packets are missed without policy. In above some N packets will pass
> through the device without policy being applied. This is going to be
> subtle and buggy if used in real production systems.
> 
> The API needs to do a replace operation not a delete/create and if it
> can't do that it needs to error out so the user can figure out what
> to do about it.
> 
> Do you really need this automatic behavior for something? It clutters
> up the API with more flags and I can't see how its useful. If it
> errors out just delete the prog using the existing interfaces from the
> API user side.

Fair argument, I went too far with --force flag. Given what you said, I'll
drop that wipe out of netlink-based XDP prog, but I think we can keep the
logic around updating the bpf_link if it already exists (in case --force
flag was set). This provides the atomic xchg and we won't expose systems
to a time frame without XDP policy as you point out.

> 
> > 
> > > 
> > > > +	ctx->link_fd = link_fd;
> > > >  	return 0;
> > > >  }
> > > >  
> > > 
> > > [...]
> > > 
> > > > +static int xsk_link_lookup(struct xsk_ctx *ctx, __u32 *prog_id)
> > > > +{
> > > > +	__u32 link_len = sizeof(struct bpf_link_info);
> > > > +	struct bpf_link_info link_info;
> > > > +	__u32 id = 0;
> > > > +	int err;
> > > > +	int fd;
> > > > +
> > > > +	while (true) {
> > > > +		err = bpf_link_get_next_id(id, &id);
> > > > +		if (err) {
> > > > +			if (errno == ENOENT)
> > > > +				break;
> > > > +			pr_warn("can't get next link: %s\n", strerror(errno));
> > > > +			break;
> > > > +		}
> > > > +
> > > > +		fd = bpf_link_get_fd_by_id(id);
> > > > +		if (fd < 0) {
> > > > +			if (errno == ENOENT)
> > > > +				continue;
> > > > +			pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
> > > > +			break;
> > > > +		}
> > > > +
> > > > +		memset(&link_info, 0, link_len);
> > > > +		err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
> > > > +		if (err) {
> > > > +			pr_warn("can't get link info: %s\n", strerror(errno));
> > > > +			close(fd);
> > > > +			break;
> > > > +		}
> > > > +		if (link_info.xdp.ifindex == ctx->ifindex) {
> > > > +			ctx->link_fd = fd;
> > > > +			*prog_id = link_info.prog_id;
> > > > +			break;
> > > > +		}
> > > > +		close(fd);
> > > > +	}
> > > > +
> > > > +	return errno == ENOENT ? 0 : err;
> > > 
> > > But, err wont be set in fd < 0 case? I guess we don't want to return 0 if
> > > bpf_link_get_fd_by_id fails.
> > 
> > Good catch!
> > 
> > > Although I really don't like the construct
> > > here that much. I think just `return err` and ensuring err is set correctly
> > > would be more clear. At least the fd error case needs to be handled
> > > though.
> > 
> > FWIW, this was inspired by tools/bpf/bpftool/link.c:do_show()
> 
> Sure its not my preference, but as long as the bug is resolved I
> wont complain. If I hadn't seen the bug I wouldn't have said
> anything.
> 
> > 
> > > 
> > > > +}
> > > > +
> 
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16 10:27           ` Toke Høiland-Jørgensen
@ 2021-02-16 20:15             ` Maciej Fijalkowski
  0 siblings, 0 replies; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16 20:15 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Björn Töpel, daniel, ast, bpf, netdev, andrii,
	magnus.karlsson, ciara.loftus

On Tue, Feb 16, 2021 at 11:27:55AM +0100, Toke Høiland-Jørgensen wrote:
> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> >
> > Am I reading this right or you're trying to reject the fix of the long
> > standing issue due to a PR that is not ready yet on a standalone
> > project/library? :P
> 
> Haha, no, that is not what I'm saying. As I said up-thread I agree that
> this is something we should fix, obviously. I'm just suggesting we fix
> it in a way that will also be compatible with libxdp and multiprog so we
> won't have to introduce yet-another-flag that users will have to decide
> on.
> 
> However, now that I'm looking at your patch again I think my fears may
> have been overblown. I got hung up on the bit in the commit message
> where you said "refcounting of BPF resources will be done automatically
> by bpf_link itself", and wrongly assumed that you were exposing the
> bpf_link fd to the application. But I see now that it's kept in the
> private xsk_ctx structure, and applications still just call
> xsk_socket__delete(). So in libxdp we can just implement the same API
> with a different synchronisation mechanism; that's fine. Apologies for
> jumping to conclusions :/

No worries, this shows how important a thorough commit message is, seems
that I failed on that part.

> 
> > Once libxdp is the standard way of playing with AF-XDP and there are
> > actual users of that, then I'm happy to work/help on that issue.
> 
> That is good to know, thanks! I opened an issue in the xdp-tools
> repository to track this for the libxdp side (Magnus and I agreed that
> we'll merge the existing code first, then fix this on top):
> https://github.com/xdp-project/xdp-tools/issues/93

Thanks! And good to hear that there's some sort of agreement.

> 
> As noted above, the mechanism may have to change, but I think it's
> possible to achieve the same thing in a libxdp context :)
> 
> -Toke
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16 19:15         ` John Fastabend
@ 2021-02-16 20:50           ` Maciej Fijalkowski
  2021-02-16 21:17             ` John Fastabend
  0 siblings, 1 reply; 37+ messages in thread
From: Maciej Fijalkowski @ 2021-02-16 20:50 UTC (permalink / raw)
  To: John Fastabend
  Cc: Toke Høiland-Jørgensen, Björn Töpel, daniel,
	ast, bpf, netdev, andrii, magnus.karlsson, ciara.loftus

On Tue, Feb 16, 2021 at 11:15:41AM -0800, John Fastabend wrote:
> Toke Høiland-Jørgensen wrote:
> > Björn Töpel <bjorn.topel@intel.com> writes:
> > 
> > > On 2021-02-15 21:49, John Fastabend wrote:
> > >> Maciej Fijalkowski wrote:
> > >>> Currently, if there are multiple xdpsock instances running on a single
> > >>> interface and in case one of the instances is terminated, the rest of
> > >>> them are left in an inoperable state due to the fact of unloaded XDP
> > >>> prog from interface.
> > >>>
> > >>> To address that, step away from setting bpf prog in favour of bpf_link.
> > >>> This means that refcounting of BPF resources will be done automatically
> > >>> by bpf_link itself.
> > >>>
> > >>> When setting up BPF resources during xsk socket creation, check whether
> > >>> bpf_link for a given ifindex already exists via set of calls to
> > >>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> > >>> and comparing the ifindexes from bpf_link and xsk socket.
> > >>>
> > >>> If there's no bpf_link yet, create one for a given XDP prog and unload
> > >>> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> > >>>
> > >>> If bpf_link is already at a given ifindex and underlying program is not
> > >>> AF-XDP one, bail out or update the bpf_link's prog given the presence of
> > >>> XDP_FLAGS_UPDATE_IF_NOEXIST.
> > >>>
> > >>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > >>> ---
> 
> [...]
> 
> > >>> -	ctx->prog_fd = prog_fd;
> > >>> +	link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts);
> > >>> +	if (link_fd < 0) {
> > >>> +		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
> > >>> +		return link_fd;
> > >>> +	}
> > >>> +
> > >> 
> > >> This can leave the system in a bad state where it unloaded the XDP program
> > >> above, but then failed to create the link. So we should somehow fix that
> > >> if possible or at minimum put a note somewhere so users can't claim they
> > >> shouldn't know this.
> > >> 
> > >> Also related, its not good for real systems to let XDP program go missing
> > >> for some period of time. I didn't check but we should make
> > >> XDP_FLAGS_UPDATE_IF_NOEXIST the default if its not already.
> > >>
> > >
> > > This is the default for XDP sockets library. The
> > > "bpf_set_link_xdp_fd(...-1)" way is only when a user sets it explicitly.
> > > One could maybe argue that the "force remove" would be out of scope for
> > > AF_XDP; Meaning that if an XDP program is running, attached via netlink,
> > > the AF_XDP library simply cannot remove it. The user would need to rely
> > > on some other mechanism.
> > 
> > Yeah, I'd tend to agree with that. In general, I think the proliferation
> > of "just force-remove (or override) the running program" in code and
> > instructions has been a mistake; and application should only really be
> > adding and removing its own program...
> > 
> > -Toke
> > 
> 
> I'll try to consolidate some of my opinions from a couple threads here. It
> looks to me many of these issues are self-inflicted by the API. We built
> the API with out the right abstractions or at least different abstractions
> from the rest of the BPF APIs. Not too surprising seeing the kernel side
> and user side were all being built up at once.
> 
> For example this specific issue where the xsk API also deletes the XDP
> program seems to be due to merging the xsk with the creation of the XDP
> programs.

IMHO the issue I fixed and that is the topic of those discussions is the
fact that handling of XDP program for xsk case was not balanced, IOW
libbpf was creating the prog if it wasn't present whereas app side was
blindly removing the XDP prog, even if it wasn't aware whether it has
loaded the prog among with XSKMAP or only attached the xsk socket to
XSKMAP that already existed.

> 
> I'm not a real user of AF_XDP (yet.), but here is how I would expect it
> to work based on how the sockmap pieces work, which are somewhat similar
> given they also deal with sockets.

Don't want to be picky, but I suppose sockmap won't play with ndo_bpf()
and that's what was bothering us.

> 
> Program 
> (1) load and pin an XDP BPF program
>     - obj = bpf_object__open(prog);
>     - bpf_object__load_xattr(&attr);
>     - bpf_program__pin()
> (2) pin the map, find map_xsk using any of the map APIs
>     - bpf_map__pin(map_xsk, path_to_pin)
> (3) attach to XDP
>     - link = bpf_program__attach_xdp()
>     - bpf_link__pin()
> 
> At this point you have a BPF program loaded, a xsk map, and a link all
> pinned and ready. And we can add socks using the process found in
> `enter_xsks_into_map` in the sample. This can be the same program that
> loaded/pinned the XDP program or some other program it doesn't really
> matter.
> 
>  - create xsk fd
>       . xsk_umem__create()
>       . xsk_socket__create
>  - open map @ pinned path
>  - bpf_map_update_elem(xsks_map, &key, &fd, 0);
> 
> Then it looks like we don't have any conflicts? The XDP program is pinned
> and exists in its normal scope. The xsk elements can be added/deleted
> as normal.

The only difference from what you wrote up is the resource pinning, when
compared to what we currently have + the set I am proposing.

So, if you're saying it looks like we don't have any conflicts and I am
saying that the flow is what we have, then...? :)

You would have to ask Magnus what was behind the decision to avoid API
from tools/lib/bpf/libbpf.c but rather directly call underlying functions
from tools/lib/bpf/bpf.c. Nevertheless, it doesn't really make a big
difference to me.

> If the XDP program is removed and the map referencing (using
> normal ref rules) reaches zero its also deleted. Above is more or less
> the same flow we use for any BPF program so looks good to me.

We have to be a bit more specific around the "XDP program is removed".
Is it removed from the network interface? That's the thing that we want to
avoid when there are other xsk sockets active on a given interface.

With bpf_link, XDP prog is removed only when bpf_link's refcount reaches
zero, via link->ops->release() callback that is invoked from
bpf_link_put().

And the refcount is updated with each process that attaches/detaches from
the bpf_link on interface.

> 
> The trouble seems to pop up when using the higher abstraction APIs
> xsk_setup_xdp_prog and friends I guess? I just see above as already
> fairly easy to use and we have good helpers to create the sockets it looks
> like. Maybe I missed some design considerations. IMO higher level
> abstractions should go in new libxdp and above should stay in libbpf.

xsk_setup_xdp_prog doesn't really feel like higher level abstraction, as I
mentioned, to me it has one layer of abstraction peeled off.

> 
> /rant off ;)
> 
> Thanks,
> John

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16 20:50           ` Maciej Fijalkowski
@ 2021-02-16 21:17             ` John Fastabend
  0 siblings, 0 replies; 37+ messages in thread
From: John Fastabend @ 2021-02-16 21:17 UTC (permalink / raw)
  To: Maciej Fijalkowski, John Fastabend
  Cc: Toke Høiland-Jørgensen, Björn Töpel, daniel,
	ast, bpf, netdev, andrii, magnus.karlsson, ciara.loftus

Maciej Fijalkowski wrote:
> On Tue, Feb 16, 2021 at 11:15:41AM -0800, John Fastabend wrote:
> > Toke Høiland-Jørgensen wrote:
> > > Björn Töpel <bjorn.topel@intel.com> writes:
> > > 
> > > > On 2021-02-15 21:49, John Fastabend wrote:
> > > >> Maciej Fijalkowski wrote:
> > > >>> Currently, if there are multiple xdpsock instances running on a single
> > > >>> interface and in case one of the instances is terminated, the rest of
> > > >>> them are left in an inoperable state due to the fact of unloaded XDP
> > > >>> prog from interface.
> > > >>>
> > > >>> To address that, step away from setting bpf prog in favour of bpf_link.
> > > >>> This means that refcounting of BPF resources will be done automatically
> > > >>> by bpf_link itself.
> > > >>>
> > > >>> When setting up BPF resources during xsk socket creation, check whether
> > > >>> bpf_link for a given ifindex already exists via set of calls to
> > > >>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> > > >>> and comparing the ifindexes from bpf_link and xsk socket.
> > > >>>
> > > >>> If there's no bpf_link yet, create one for a given XDP prog and unload
> > > >>> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> > > >>>
> > > >>> If bpf_link is already at a given ifindex and underlying program is not
> > > >>> AF-XDP one, bail out or update the bpf_link's prog given the presence of
> > > >>> XDP_FLAGS_UPDATE_IF_NOEXIST.
> > > >>>
> > > >>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

[...]

> > 
> > I'm not a real user of AF_XDP (yet.), but here is how I would expect it
> > to work based on how the sockmap pieces work, which are somewhat similar
> > given they also deal with sockets.
> 
> Don't want to be picky, but I suppose sockmap won't play with ndo_bpf()
> and that's what was bothering us.
> 
> > 
> > Program 
> > (1) load and pin an XDP BPF program
> >     - obj = bpf_object__open(prog);
> >     - bpf_object__load_xattr(&attr);
> >     - bpf_program__pin()
> > (2) pin the map, find map_xsk using any of the map APIs
> >     - bpf_map__pin(map_xsk, path_to_pin)
> > (3) attach to XDP
> >     - link = bpf_program__attach_xdp()
> >     - bpf_link__pin()
> > 
> > At this point you have a BPF program loaded, a xsk map, and a link all
> > pinned and ready. And we can add socks using the process found in
> > `enter_xsks_into_map` in the sample. This can be the same program that
> > loaded/pinned the XDP program or some other program it doesn't really
> > matter.
> > 
> >  - create xsk fd
> >       . xsk_umem__create()
> >       . xsk_socket__create
> >  - open map @ pinned path
> >  - bpf_map_update_elem(xsks_map, &key, &fd, 0);
> > 
> > Then it looks like we don't have any conflicts? The XDP program is pinned
> > and exists in its normal scope. The xsk elements can be added/deleted
> > as normal.
> 
> The only difference from what you wrote up is the resource pinning, when
> compared to what we currently have + the set I am proposing.
> 
> So, if you're saying it looks like we don't have any conflicts and I am
> saying that the flow is what we have, then...? :)
> 
> You would have to ask Magnus what was behind the decision to avoid API
> from tools/lib/bpf/libbpf.c but rather directly call underlying functions
> from tools/lib/bpf/bpf.c. Nevertheless, it doesn't really make a big
> difference to me.
> 
> > If the XDP program is removed and the map referencing (using
> > normal ref rules) reaches zero its also deleted. Above is more or less
> > the same flow we use for any BPF program so looks good to me.
> 
> We have to be a bit more specific around the "XDP program is removed".
> Is it removed from the network interface? That's the thing that we want to
> avoid when there are other xsk sockets active on a given interface.

What I'm suggesting here is to use the normal rules for when an
XDP program is removed from the network interface. I don't think we
should do anything extra to keep the XDP program attached/loaded
just because it has a xsk map which may or may not have some
entries in it. If the user wants this behavior they should pin
the bpf_link pointer associated with the xdp program. This is the
same as any other program/map I have in my BPF system.

> 
> With bpf_link, XDP prog is removed only when bpf_link's refcount reaches
> zero, via link->ops->release() callback that is invoked from
> bpf_link_put().
> 
> And the refcount is updated with each process that attaches/detaches from
> the bpf_link on interface.
> 
> > 
> > The trouble seems to pop up when using the higher abstraction APIs
> > xsk_setup_xdp_prog and friends I guess? I just see above as already
> > fairly easy to use and we have good helpers to create the sockets it looks
> > like. Maybe I missed some design considerations. IMO higher level
> > abstractions should go in new libxdp and above should stay in libbpf.
> 
> xsk_setup_xdp_prog doesn't really feel like higher level abstraction, as I
> mentioned, to me it has one layer of abstraction peeled off.

Except it seems to have caused some issues. I don't think I gain
much from the API personally vs just doing above steps. But, I
appreciate you are just trying to fix it here so patches are
a good idea with v2 improvements. And I expect whenever
libbpf/libxdp split the above "high level" APIs will land in
libxdp.

Thanks,
John

> 
> > 
> > /rant off ;)
> > 
> > Thanks,
> > John



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-15 21:38         ` Toke Høiland-Jørgensen
  2021-02-16  0:18           ` John Fastabend
@ 2021-02-17  2:23           ` Dan Siemon
  2021-02-17  7:16             ` Magnus Karlsson
  1 sibling, 1 reply; 37+ messages in thread
From: Dan Siemon @ 2021-02-17  2:23 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, daniel, ast, bpf,
	netdev, magnus.karlsson
  Cc: andrii, ciara.loftus

On Mon, 2021-02-15 at 22:38 +0100, Toke Høiland-Jørgensen wrote:
> The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff
> to
> libxdp (so the socket stuff in xsk.h). We're adding the existing code
> wholesale, and keeping API compatibility during the move, so all
> that's
> needed is adding -lxdp when compiling. And obviously the existing
> libbpf
> code isn't going anywhere until such a time as there's a general
> backwards compatibility-breaking deprecation in libbpf (which I
> believe
> Andrii is planning to do in an upcoming and as-of-yet unannounced
> v1.0
> release).

I maintain a Rust binding to the AF_XDP parts of libbpf [1][2]. On the
chance that more significant changes can be entertained in the switch
to libxdp... The fact that many required functions like the ring access
functions exist only in xsk.h makes building a binding more difficult
because we need to wrap it with an extra C function [3]. From that
perspective, it would be great if those could move to xsk.c.

[1] - https://github.com/aterlo/afxdp-rs
[2] - https://github.com/alexforster/libbpf-sys
[3] - https://github.com/alexforster/libbpf-sys/blob/master/bindings.c


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-17  2:23           ` Dan Siemon
@ 2021-02-17  7:16             ` Magnus Karlsson
  2021-02-17  7:36               ` Magnus Karlsson
  0 siblings, 1 reply; 37+ messages in thread
From: Magnus Karlsson @ 2021-02-17  7:16 UTC (permalink / raw)
  To: Dan Siemon
  Cc: Toke Høiland-Jørgensen, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Daniel Borkmann,
	Alexei Starovoitov, bpf, Network Development, Karlsson, Magnus,
	Andrii Nakryiko, Ciara Loftus

On Wed, Feb 17, 2021 at 3:26 AM Dan Siemon <dan@coverfire.com> wrote:
>
> On Mon, 2021-02-15 at 22:38 +0100, Toke Høiland-Jørgensen wrote:
> > The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff
> > to
> > libxdp (so the socket stuff in xsk.h). We're adding the existing code
> > wholesale, and keeping API compatibility during the move, so all
> > that's
> > needed is adding -lxdp when compiling. And obviously the existing
> > libbpf
> > code isn't going anywhere until such a time as there's a general
> > backwards compatibility-breaking deprecation in libbpf (which I
> > believe
> > Andrii is planning to do in an upcoming and as-of-yet unannounced
> > v1.0
> > release).
>
> I maintain a Rust binding to the AF_XDP parts of libbpf [1][2]. On the
> chance that more significant changes can be entertained in the switch
> to libxdp... The fact that many required functions like the ring access
> functions exist only in xsk.h makes building a binding more difficult
> because we need to wrap it with an extra C function [3]. From that
> perspective, it would be great if those could move to xsk.c.

The only reason they were put in xsk.h is performance. But with LTO
(link-time optimizations) being present in most C-compilers these
days, it might not be a valid argument anymore. I will perform some
experiments and let you know. As you say, it would be much nicer to
hide away these functions in the library proper and make your life
easier.

> [1] - https://github.com/aterlo/afxdp-rs
> [2] - https://github.com/alexforster/libbpf-sys
> [3] - https://github.com/alexforster/libbpf-sys/blob/master/bindings.c
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-17  7:16             ` Magnus Karlsson
@ 2021-02-17  7:36               ` Magnus Karlsson
  0 siblings, 0 replies; 37+ messages in thread
From: Magnus Karlsson @ 2021-02-17  7:36 UTC (permalink / raw)
  To: Dan Siemon
  Cc: Toke Høiland-Jørgensen, John Fastabend,
	Björn Töpel, Maciej Fijalkowski, Daniel Borkmann,
	Alexei Starovoitov, bpf, Network Development, Karlsson, Magnus,
	Andrii Nakryiko, Ciara Loftus

On Wed, Feb 17, 2021 at 8:16 AM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
>
> On Wed, Feb 17, 2021 at 3:26 AM Dan Siemon <dan@coverfire.com> wrote:
> >
> > On Mon, 2021-02-15 at 22:38 +0100, Toke Høiland-Jørgensen wrote:
> > > The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff
> > > to
> > > libxdp (so the socket stuff in xsk.h). We're adding the existing code
> > > wholesale, and keeping API compatibility during the move, so all
> > > that's
> > > needed is adding -lxdp when compiling. And obviously the existing
> > > libbpf
> > > code isn't going anywhere until such a time as there's a general
> > > backwards compatibility-breaking deprecation in libbpf (which I
> > > believe
> > > Andrii is planning to do in an upcoming and as-of-yet unannounced
> > > v1.0
> > > release).
> >
> > I maintain a Rust binding to the AF_XDP parts of libbpf [1][2]. On the
> > chance that more significant changes can be entertained in the switch
> > to libxdp... The fact that many required functions like the ring access
> > functions exist only in xsk.h makes building a binding more difficult
> > because we need to wrap it with an extra C function [3]. From that
> > perspective, it would be great if those could move to xsk.c.
>
> The only reason they were put in xsk.h is performance. But with LTO
> (link-time optimizations) being present in most C-compilers these
> days, it might not be a valid argument anymore. I will perform some
> experiments and let you know. As you say, it would be much nicer to
> hide away these functions in the library proper and make your life
> easier.

I would be very grateful for any more suggested changes that users out
there would like to see. Now, when we move to libxdp is the perfect
chance to fix those things. We might even decide to partially break
compatibility or change some behavior if the gain is large enough.

> > [1] - https://github.com/aterlo/afxdp-rs
> > [2] - https://github.com/alexforster/libbpf-sys
> > [3] - https://github.com/alexforster/libbpf-sys/blob/master/bindings.c
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
  2021-02-16 10:36             ` Toke Høiland-Jørgensen
@ 2021-02-23  1:15               ` Andrii Nakryiko
  0 siblings, 0 replies; 37+ messages in thread
From: Andrii Nakryiko @ 2021-02-23  1:15 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: John Fastabend, Björn Töpel, Maciej Fijalkowski,
	Daniel Borkmann, Alexei Starovoitov, bpf, Networking,
	Andrii Nakryiko, Magnus Karlsson, ciara.loftus

On Tue, Feb 16, 2021 at 2:37 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> John Fastabend <john.fastabend@gmail.com> writes:
>
> > Toke Høiland-Jørgensen wrote:
> >> John Fastabend <john.fastabend@gmail.com> writes:
> >>
> >> >> > However, in libxdp we can solve the original problem in a different way,
> >> >> > and in fact I already suggested to Magnus that we should do this (see
> >> >> > [1]); so one way forward could be to address it during the merge in
> >> >> > libxdp? It should be possible to address the original issue (two
> >> >> > instances of xdpsock breaking each other when they exit), but
> >> >> > applications will still need to do an explicit unload operation before
> >> >> > exiting (i.e., the automatic detach on bpf_link fd closure will take
> >> >> > more work, and likely require extending the bpf_link kernel support)...
> >> >> >
> >> >>
> >> >> I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
> >> >> we're months ahead, then I'd really like to see this in libbpf until the
> >> >> merge. However, I'll leave that for Magnus/you to decide!
> >> >
> >> > Did I miss some thread? What does this mean libbpf 1.0/libxdp merge?
> >>
> >> The idea is to keep libbpf focused on bpf, and move the AF_XDP stuff to
> >> libxdp (so the socket stuff in xsk.h). We're adding the existing code
> >> wholesale, and keeping API compatibility during the move, so all that's
> >> needed is adding -lxdp when compiling. And obviously the existing libbpf
> >> code isn't going anywhere until such a time as there's a general
> >> backwards compatibility-breaking deprecation in libbpf (which I believe
> >> Andrii is planning to do in an upcoming and as-of-yet unannounced v1.0
> >> release).
> >
> > OK, I would like to keep the basic XDP pieces in libbpf though. For example
> > bpf_program__attach_xdp(). This way we don't have one lib to attach
> > everything, but XDP.
>
> The details are still TDB; for now, we're just merging in the XSK code
> to the libxdp repo. I expect Andrii to announce his plans for the rest
> soonish. I wouldn't expect basic things like that to go away, though :)

Yeah, I'll probably post more details this week. Just catching up on
stuff after vacation.

As mentioned already, all the basic APIs (i.e., APIs like
bpf_program__attach_xdp and bpf_set_link_xdp_fd, though I hope we can
give the latter a better name) will stay intact. Stay tuned!

>
> >>
> >> While integrating the XSK code into libxdp we're trying to make it
> >> compatible with the rest of the library (i.e., multi-prog). Hence my
> >> preference to avoid introducing something that makes this harder :)
> >>
> >> -Toke
> >>
> >
> > OK that makes sense to me thanks. But, I'm missing something (maybe its
> > obvious to everyone else?).
> >
> > When you load an XDP program you should get a reference to it. And then
> > XDP program should never be unloaded until that id is removed right? It
> > may or may not have an xsk map. Why does adding/removing programs from
> > an associated map have any impact on the XDP program? That seems like
> > the buggy part to me. No other map behaves this way as far as I can
> > tell. Now if the program with the XDP reference closes without pinning
> > the map or otherwise doing something with it, sure the map gets destroyed
> > and any xsk sockets are lost.
>
> The original bug comes from the XSK code abstracting away the fact that
> an AF_XDP socket needs an XDP program on the interface to work; so if
> none exists, the library will just load a program that redirects into
> the socket. Which breaks since the xdpsock example application is trying
> to be nice and clean up after itself, by removing the XDP program when
> it's done with the socket, thus breaking any other programs using that
> XDP program. So this patch introduces proper synchronisation on both add
> and remove of the XDP program...
>
> -Toke
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, back to index

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-15 15:46 [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Maciej Fijalkowski
2021-02-15 15:46 ` [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link Maciej Fijalkowski
2021-02-15 17:07   ` Toke Høiland-Jørgensen
2021-02-15 17:38     ` Björn Töpel
2021-02-15 19:35       ` Toke Høiland-Jørgensen
2021-02-16  2:01         ` Maciej Fijalkowski
2021-02-16  9:15           ` Björn Töpel
2021-02-16 10:27           ` Toke Høiland-Jørgensen
2021-02-16 20:15             ` Maciej Fijalkowski
2021-02-15 20:22       ` John Fastabend
2021-02-15 21:38         ` Toke Høiland-Jørgensen
2021-02-16  0:18           ` John Fastabend
2021-02-16  2:23             ` Maciej Fijalkowski
2021-02-16  9:23               ` Björn Töpel
2021-02-16 10:36             ` Toke Høiland-Jørgensen
2021-02-23  1:15               ` Andrii Nakryiko
2021-02-17  2:23           ` Dan Siemon
2021-02-17  7:16             ` Magnus Karlsson
2021-02-17  7:36               ` Magnus Karlsson
2021-02-16  2:10         ` Maciej Fijalkowski
2021-02-15 20:49   ` John Fastabend
2021-02-16  2:38     ` Maciej Fijalkowski
2021-02-16 18:19       ` John Fastabend
2021-02-16 20:10         ` Maciej Fijalkowski
2021-02-16  9:20     ` Björn Töpel
2021-02-16 10:39       ` Toke Høiland-Jørgensen
2021-02-16 19:15         ` John Fastabend
2021-02-16 20:50           ` Maciej Fijalkowski
2021-02-16 21:17             ` John Fastabend
2021-02-15 15:46 ` [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd Maciej Fijalkowski
2021-02-15 20:33   ` John Fastabend
2021-02-16  2:42     ` Maciej Fijalkowski
2021-02-15 15:46 ` [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock Maciej Fijalkowski
2021-02-15 20:24   ` John Fastabend
2021-02-16  9:22     ` Björn Töpel
2021-02-16 14:15       ` Maciej Fijalkowski
2021-02-15 16:07 ` [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Björn Töpel

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git