* [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets
@ 2019-11-07 17:47 Magnus Karlsson
  2019-11-07 17:47 ` [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program Magnus Karlsson
                   ` (6 more replies)
  0 siblings, 7 replies; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-07 17:47 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev,
	jonathan.lemon, u9012063
  Cc: bpf

This patch set extends libbpf and the xdpsock sample program to
demonstrate the shared umem mode (XDP_SHARED_UMEM) as well as Rx-only
and Tx-only sockets. The goal is to give users an example to use as a
blueprint and to make sure these modes are exercised more frequently.

Note that in XDP_SHARED_UMEM mode the user needs to supply an XDP
program that distributes the packets over the sockets according to
some policy. An example is supplied with the xdpsock program, but,
unlike the case where XDP_SHARED_UMEM is not used, libbpf has no
default one. The reason is that I felt that supplying one that would
work for all users in this mode is futile. There are just tons of
ways to distribute packets, so whatever I come up with and build into
libbpf would be wrong in most cases.
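
As a hedged illustration of how much such policies can differ, the plain
userspace C sketch below (not part of this patch set, and not BPF code)
keeps every packet of a flow on the same socket instead of rotating
round-robin. The names NUM_SOCKS and pick_by_flow and the XOR-fold hash
are assumptions made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical socket count; a power of two so a mask can replace modulo. */
#define NUM_SOCKS 4

/*
 * Flow-based policy: fold the two addresses into a small hash so that all
 * packets between the same pair of hosts land on the same socket index.
 */
static unsigned int pick_by_flow(uint32_t saddr, uint32_t daddr)
{
	uint32_t h = saddr ^ daddr;

	h ^= h >> 16;
	h ^= h >> 8;
	return h & (NUM_SOCKS - 1);
}

/* Small self-check: same flow, same socket, in either direction. */
static void pick_by_flow_selfcheck(void)
{
	uint32_t a = 0x0a000001, b = 0x0a000002; /* 10.0.0.1, 10.0.0.2 */

	assert(pick_by_flow(a, b) < NUM_SOCKS);
	assert(pick_by_flow(a, b) == pick_by_flow(b, a));
}
```

A round-robin policy spreads load evenly but scatters a flow over all
sockets. A flow-based policy like this one keeps a flow on one socket at
the cost of possibly uneven load. Neither is right for every user.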

This patch set applies on top of commit 30ee348c1267 ("Merge branch 'bpf-libbpf-fixes'").

Structure of the patch set:

Patch 1: Adds shared umem support to libbpf
Patch 2: Adds shared umem support and an example XDP program to the
         xdpsock sample
Patch 3: Adds Rx-only and Tx-only support to libbpf
Patch 4: Uses Rx-only sockets for rxdrop and Tx-only sockets for txonly in
         the xdpsock sample
Patch 5: Adds documentation entries for these two features

Thanks: Magnus

Magnus Karlsson (5):
  libbpf: support XDP_SHARED_UMEM with external XDP program
  samples/bpf: add XDP_SHARED_UMEM support to xdpsock
  libbpf: allow for creating Rx or Tx only AF_XDP sockets
  samples/bpf: use Rx-only and Tx-only sockets in xdpsock
  xsk: extend documentation for Rx|Tx-only sockets and shared umems

 Documentation/networking/af_xdp.rst |  28 +++++--
 samples/bpf/Makefile                |   1 +
 samples/bpf/xdpsock.h               |  11 +++
 samples/bpf/xdpsock_kern.c          |  24 ++++++
 samples/bpf/xdpsock_user.c          | 158 ++++++++++++++++++++++++++----------
 tools/lib/bpf/xsk.c                 |  32 +++++---
 6 files changed, 195 insertions(+), 59 deletions(-)
 create mode 100644 samples/bpf/xdpsock.h
 create mode 100644 samples/bpf/xdpsock_kern.c

--
2.7.4


* [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
@ 2019-11-07 17:47 ` Magnus Karlsson
  2019-11-08 18:03   ` William Tu
  2019-11-08 22:56   ` Jonathan Lemon
  2019-11-07 17:47 ` [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock Magnus Karlsson
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-07 17:47 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev,
	jonathan.lemon, u9012063
  Cc: bpf

Add support in libbpf to create multiple sockets that share a single
umem. Note that an external XDP program needs to be supplied that
routes the incoming traffic to the desired sockets. So you need to
supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
your own XDP program.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 tools/lib/bpf/xsk.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 86c1b61..8ebd810 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -586,15 +586,21 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	if (!umem || !xsk_ptr || !rx || !tx)
 		return -EFAULT;
 
-	if (umem->refcount) {
-		pr_warn("Error: shared umems not supported by libbpf.\n");
-		return -EBUSY;
-	}
-
 	xsk = calloc(1, sizeof(*xsk));
 	if (!xsk)
 		return -ENOMEM;
 
+	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
+	if (err)
+		goto out_xsk_alloc;
+
+	if (umem->refcount &&
+	    !(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
+		pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");
+		err = -EBUSY;
+		goto out_xsk_alloc;
+	}
+
 	if (umem->refcount++ > 0) {
 		xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
 		if (xsk->fd < 0) {
@@ -616,10 +622,6 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
 	xsk->ifname[IFNAMSIZ - 1] = '\0';
 
-	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
-	if (err)
-		goto out_socket;
-
 	if (rx) {
 		err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
 				 &xsk->config.rx_size,
@@ -687,7 +689,12 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	sxdp.sxdp_family = PF_XDP;
 	sxdp.sxdp_ifindex = xsk->ifindex;
 	sxdp.sxdp_queue_id = xsk->queue_id;
-	sxdp.sxdp_flags = xsk->config.bind_flags;
+	if (umem->refcount > 1) {
+		sxdp.sxdp_flags = XDP_SHARED_UMEM;
+		sxdp.sxdp_shared_umem_fd = umem->fd;
+	} else {
+		sxdp.sxdp_flags = xsk->config.bind_flags;
+	}
 
 	err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
 	if (err) {
-- 
2.7.4
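
The control flow this patch adds to xsk_socket__create can be modelled
without libbpf. The sketch below is an illustration under stated
assumptions, not the library code: the sketch_* names, both structs, and
the two constant values are stand-ins chosen so that the example builds
on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in constants so the sketch builds without libbpf or kernel headers. */
#define SKETCH_INHIBIT_PROG_LOAD (1U << 0)
#define SKETCH_SHARED_UMEM       (1U << 0)

struct sketch_umem {
	int refcount;	/* number of sockets already using this umem */
	int fd;		/* fd of the socket that registered the umem */
};

struct sketch_bind {
	uint16_t flags;
	uint32_t shared_umem_fd;
};

static int sketch_socket_create(struct sketch_umem *umem, uint32_t libbpf_flags,
				uint16_t bind_flags, struct sketch_bind *out)
{
	assert(umem && out);

	/* A second socket is only allowed if the caller loads its own program. */
	if (umem->refcount && !(libbpf_flags & SKETCH_INHIBIT_PROG_LOAD))
		return -EBUSY;

	umem->refcount++;
	if (umem->refcount > 1) {
		/* Later sockets bind in shared mode and point at the umem's fd. */
		out->flags = SKETCH_SHARED_UMEM;
		out->shared_umem_fd = (uint32_t)umem->fd;
	} else {
		/* The first socket binds with the caller's own flags. */
		out->flags = bind_flags;
		out->shared_umem_fd = 0;
	}
	return 0;
}
```

As in the hunk above, the sketch ignores the caller's bind flags for
every socket after the first and binds those sockets with the shared
umem flag only.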



* [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock
  2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
  2019-11-07 17:47 ` [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program Magnus Karlsson
@ 2019-11-07 17:47 ` Magnus Karlsson
  2019-11-08 18:13   ` William Tu
  2019-11-08 22:59   ` Jonathan Lemon
  2019-11-07 17:47 ` [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets Magnus Karlsson
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-07 17:47 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev,
	jonathan.lemon, u9012063
  Cc: bpf

Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
application. As libbpf does not have a built-in XDP program for this
mode, we use an explicitly loaded XDP program. This also serves as an
example of how to write your own XDP program that can route to an
AF_XDP socket.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 samples/bpf/Makefile       |   1 +
 samples/bpf/xdpsock.h      |  11 ++++
 samples/bpf/xdpsock_kern.c |  24 ++++++++
 samples/bpf/xdpsock_user.c | 141 +++++++++++++++++++++++++++++++--------------
 4 files changed, 135 insertions(+), 42 deletions(-)
 create mode 100644 samples/bpf/xdpsock.h
 create mode 100644 samples/bpf/xdpsock_kern.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 4df11dd..8a9af3a 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -167,6 +167,7 @@ always += xdp_sample_pkts_kern.o
 always += ibumad_kern.o
 always += hbm_out_kern.o
 always += hbm_edt_kern.o
+always += xdpsock_kern.o
 
 ifeq ($(ARCH), arm)
 # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
diff --git a/samples/bpf/xdpsock.h b/samples/bpf/xdpsock.h
new file mode 100644
index 0000000..b7eca15
--- /dev/null
+++ b/samples/bpf/xdpsock.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright(c) 2019 Intel Corporation.
+ */
+
+#ifndef XDPSOCK_H_
+#define XDPSOCK_H_
+
+#define MAX_SOCKS 4
+
+#endif /* XDPSOCK_H_ */
diff --git a/samples/bpf/xdpsock_kern.c b/samples/bpf/xdpsock_kern.c
new file mode 100644
index 0000000..a06177c
--- /dev/null
+++ b/samples/bpf/xdpsock_kern.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+#include "xdpsock.h"
+
+/* This XDP program is only needed for the XDP_SHARED_UMEM mode.
+ * If you do not use this mode, libbpf can supply an XDP program for you.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_XSKMAP);
+	__uint(max_entries, MAX_SOCKS);
+	__uint(key_size, sizeof(int));
+	__uint(value_size, sizeof(int));
+} xsks_map SEC(".maps");
+
+static unsigned int rr;
+
+SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
+{
+	rr = (rr + 1) & (MAX_SOCKS - 1);
+
+	return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
+}
diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 405c4e0..d3dba93 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -29,6 +29,7 @@
 
 #include "libbpf.h"
 #include "xsk.h"
+#include "xdpsock.h"
 #include <bpf/bpf.h>
 
 #ifndef SOL_XDP
@@ -47,7 +48,6 @@
 #define BATCH_SIZE 64
 
 #define DEBUG_HEXDUMP 0
-#define MAX_SOCKS 8
 
 typedef __u64 u64;
 typedef __u32 u32;
@@ -75,7 +75,8 @@ static u32 opt_xdp_bind_flags;
 static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
 static int opt_timeout = 1000;
 static bool opt_need_wakeup = true;
-static __u32 prog_id;
+static u32 opt_num_xsks = 1;
+static u32 prog_id;
 
 struct xsk_umem_info {
 	struct xsk_ring_prod fq;
@@ -179,7 +180,7 @@ static void *poller(void *arg)
 
 static void remove_xdp_program(void)
 {
-	__u32 curr_prog_id = 0;
+	u32 curr_prog_id = 0;
 
 	if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) {
 		printf("bpf_get_link_xdp_id failed\n");
@@ -196,11 +197,11 @@ static void remove_xdp_program(void)
 static void int_exit(int sig)
 {
 	struct xsk_umem *umem = xsks[0]->umem->umem;
-
-	(void)sig;
+	int i;
 
 	dump_stats();
-	xsk_socket__delete(xsks[0]->xsk);
+	for (i = 0; i < num_socks; i++)
+		xsk_socket__delete(xsks[i]->xsk);
 	(void)xsk_umem__delete(umem);
 	remove_xdp_program();
 
@@ -290,8 +291,8 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
 		.flags = opt_umem_flags
 	};
-
-	int ret;
+	int ret, i;
+	u32 idx;
 
 	umem = calloc(1, sizeof(*umem));
 	if (!umem)
@@ -303,6 +304,15 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 	if (ret)
 		exit_with_error(-ret);
 
+	ret = xsk_ring_prod__reserve(&umem->fq,
+				     XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
+	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
+		exit_with_error(-ret);
+	for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
+		*xsk_ring_prod__fill_addr(&umem->fq, idx++) =
+			i * opt_xsk_frame_size;
+	xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
+
 	umem->buffer = buffer;
 	return umem;
 }
@@ -312,8 +322,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	struct xsk_socket_config cfg;
 	struct xsk_socket_info *xsk;
 	int ret;
-	u32 idx;
-	int i;
 
 	xsk = calloc(1, sizeof(*xsk));
 	if (!xsk)
@@ -322,11 +330,15 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	xsk->umem = umem;
 	cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
 	cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
-	cfg.libbpf_flags = 0;
+	if (opt_num_xsks > 1)
+		cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
+	else
+		cfg.libbpf_flags = 0;
 	cfg.xdp_flags = opt_xdp_flags;
 	cfg.bind_flags = opt_xdp_bind_flags;
-	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
-				 &xsk->rx, &xsk->tx, &cfg);
+
+	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue,
+				 umem->umem, &xsk->rx, &xsk->tx, &cfg);
 	if (ret)
 		exit_with_error(-ret);
 
@@ -334,17 +346,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	if (ret)
 		exit_with_error(-ret);
 
-	ret = xsk_ring_prod__reserve(&xsk->umem->fq,
-				     XSK_RING_PROD__DEFAULT_NUM_DESCS,
-				     &idx);
-	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
-		exit_with_error(-ret);
-	for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
-		*xsk_ring_prod__fill_addr(&xsk->umem->fq, idx++) =
-			i * opt_xsk_frame_size;
-	xsk_ring_prod__submit(&xsk->umem->fq,
-			      XSK_RING_PROD__DEFAULT_NUM_DESCS);
-
 	return xsk;
 }
 
@@ -363,6 +364,7 @@ static struct option long_options[] = {
 	{"frame-size", required_argument, 0, 'f'},
 	{"no-need-wakeup", no_argument, 0, 'm'},
 	{"unaligned", no_argument, 0, 'u'},
+	{"shared-umem", no_argument, 0, 'M'},
 	{0, 0, 0, 0}
 };
 
@@ -386,6 +388,7 @@ static void usage(const char *prog)
 		"  -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
 		"  -f, --frame-size=n   Set the frame size (must be a power of two in aligned mode, default is %d).\n"
 		"  -u, --unaligned	Enable unaligned chunk placement\n"
+		"  -M, --shared-umem	Enable XDP_SHARED_UMEM\n"
 		"\n";
 	fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE);
 	exit(EXIT_FAILURE);
@@ -398,7 +401,7 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:mu",
+		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:muM",
 				long_options, &option_index);
 		if (c == -1)
 			break;
@@ -448,11 +451,14 @@ static void parse_command_line(int argc, char **argv)
 			break;
 		case 'f':
 			opt_xsk_frame_size = atoi(optarg);
+			break;
 		case 'm':
 			opt_need_wakeup = false;
 			opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP;
 			break;
-
+		case 'M':
+			opt_num_xsks = MAX_SOCKS;
+			break;
 		default:
 			usage(basename(argv[0]));
 		}
@@ -586,11 +592,9 @@ static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds)
 
 static void rx_drop_all(void)
 {
-	struct pollfd fds[MAX_SOCKS + 1];
+	struct pollfd fds[MAX_SOCKS] = {};
 	int i, ret;
 
-	memset(fds, 0, sizeof(fds));
-
 	for (i = 0; i < num_socks; i++) {
 		fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
 		fds[i].events = POLLIN;
@@ -633,11 +637,10 @@ static void tx_only(struct xsk_socket_info *xsk, u32 frame_nb)
 
 static void tx_only_all(void)
 {
-	struct pollfd fds[MAX_SOCKS];
+	struct pollfd fds[MAX_SOCKS] = {};
 	u32 frame_nb[MAX_SOCKS] = {};
 	int i, ret;
 
-	memset(fds, 0, sizeof(fds));
 	for (i = 0; i < num_socks; i++) {
 		fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
 		fds[0].events = POLLOUT;
@@ -706,11 +709,9 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
 
 static void l2fwd_all(void)
 {
-	struct pollfd fds[MAX_SOCKS];
+	struct pollfd fds[MAX_SOCKS] = {};
 	int i, ret;
 
-	memset(fds, 0, sizeof(fds));
-
 	for (i = 0; i < num_socks; i++) {
 		fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
 		fds[i].events = POLLOUT | POLLIN;
@@ -728,13 +729,65 @@ static void l2fwd_all(void)
 	}
 }
 
+static void load_xdp_program(char **argv, struct bpf_object **obj)
+{
+	struct bpf_prog_load_attr prog_load_attr = {
+		.prog_type      = BPF_PROG_TYPE_XDP,
+	};
+	char xdp_filename[256];
+	int prog_fd;
+
+	snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]);
+	prog_load_attr.file = xdp_filename;
+
+	if (bpf_prog_load_xattr(&prog_load_attr, obj, &prog_fd))
+		exit(EXIT_FAILURE);
+	if (prog_fd < 0) {
+		fprintf(stderr, "ERROR: no program found: %s\n",
+			strerror(prog_fd));
+		exit(EXIT_FAILURE);
+	}
+
+	if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
+		fprintf(stderr, "ERROR: link set xdp fd failed\n");
+		exit(EXIT_FAILURE);
+	}
+}
+
+static void enter_xsks_into_map(struct bpf_object *obj)
+{
+	struct bpf_map *map;
+	int i, xsks_map;
+
+	map = bpf_object__find_map_by_name(obj, "xsks_map");
+	xsks_map = bpf_map__fd(map);
+	if (xsks_map < 0) {
+		fprintf(stderr, "ERROR: no xsks map found: %s\n",
+			strerror(xsks_map));
+		exit(EXIT_FAILURE);
+	}
+
+	for (i = 0; i < num_socks; i++) {
+		int fd = xsk_socket__fd(xsks[i]->xsk);
+		int key, ret;
+
+		key = i;
+		ret = bpf_map_update_elem(xsks_map, &key, &fd, 0);
+		if (ret) {
+			fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
+			exit(EXIT_FAILURE);
+		}
+	}
+}
+
 int main(int argc, char **argv)
 {
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct xsk_umem_info *umem;
+	struct bpf_object *obj;
 	pthread_t pt;
+	int i, ret;
 	void *bufs;
-	int ret;
 
 	parse_command_line(argc, argv);
 
@@ -744,6 +797,9 @@ int main(int argc, char **argv)
 		exit(EXIT_FAILURE);
 	}
 
+	if (opt_num_xsks > 1)
+		load_xdp_program(argv, &obj);
+
 	/* Reserve memory for the umem. Use hugepages if unaligned chunk mode */
 	bufs = mmap(NULL, NUM_FRAMES * opt_xsk_frame_size,
 		    PROT_READ | PROT_WRITE,
@@ -752,16 +808,17 @@ int main(int argc, char **argv)
 		printf("ERROR: mmap failed\n");
 		exit(EXIT_FAILURE);
 	}
-       /* Create sockets... */
+
+	/* Create sockets... */
 	umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size);
-	xsks[num_socks++] = xsk_configure_socket(umem);
+	for (i = 0; i < opt_num_xsks; i++)
+		xsks[num_socks++] = xsk_configure_socket(umem);
 
-	if (opt_bench == BENCH_TXONLY) {
-		int i;
+	for (i = 0; i < NUM_FRAMES; i++)
+		gen_eth_frame(umem, i * opt_xsk_frame_size);
 
-		for (i = 0; i < NUM_FRAMES; i++)
-			(void)gen_eth_frame(umem, i * opt_xsk_frame_size);
-	}
+	if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY)
+		enter_xsks_into_map(obj);
 
 	signal(SIGINT, int_exit);
 	signal(SIGTERM, int_exit);
-- 
2.7.4
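
The round-robin step in xdpsock_kern.c uses a mask instead of a modulo,
which only cycles through every index when MAX_SOCKS is a power of two.
The sketch below is a userspace illustration of that constraint, not part
of the patch. The helpers next_index_mask and next_index_mod are
hypothetical names, and the modulo variant is shown only as an
alternative for counts that are not powers of two.

```c
#include <assert.h>

#define MAX_SOCKS 4

/* The mask trick silently skips indices unless MAX_SOCKS is a power of two. */
_Static_assert((MAX_SOCKS & (MAX_SOCKS - 1)) == 0,
	       "MAX_SOCKS must be a power of two");

/* Same step as in the sample: advance and wrap with a bit mask. */
static unsigned int next_index_mask(unsigned int rr)
{
	return (rr + 1) & (MAX_SOCKS - 1);
}

/* Alternative for arbitrary socket counts: advance and wrap with modulo. */
static unsigned int next_index_mod(unsigned int rr, unsigned int num_socks)
{
	assert(num_socks > 0);
	return (rr + 1) % num_socks;
}
```

Starting from 0, next_index_mask visits 1, 2, 3 and then wraps to 0, so
every map slot in the range 0 to MAX_SOCKS - 1 is used in turn.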



* [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets
  2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
  2019-11-07 17:47 ` [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program Magnus Karlsson
  2019-11-07 17:47 ` [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock Magnus Karlsson
@ 2019-11-07 17:47 ` Magnus Karlsson
  2019-11-08 23:00   ` Jonathan Lemon
  2019-11-07 17:47 ` [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock Magnus Karlsson
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-07 17:47 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev,
	jonathan.lemon, u9012063
  Cc: bpf

The libbpf AF_XDP code is extended to allow for the creation of
Rx-only or Tx-only sockets. Previously it returned an error if the
socket was not initialized for both Rx and Tx.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 tools/lib/bpf/xsk.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 8ebd810..303ed63 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -562,7 +562,8 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
 		}
 	}
 
-	err = xsk_set_bpf_maps(xsk);
+	if (xsk->rx)
+		err = xsk_set_bpf_maps(xsk);
 	if (err) {
 		xsk_delete_bpf_maps(xsk);
 		close(xsk->prog_fd);
@@ -583,7 +584,7 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	struct xsk_socket *xsk;
 	int err;
 
-	if (!umem || !xsk_ptr || !rx || !tx)
+	if (!umem || !xsk_ptr || !(rx || tx))
 		return -EFAULT;
 
 	xsk = calloc(1, sizeof(*xsk));
-- 
2.7.4
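
The argument check changed by this patch can be compared in isolation.
The sketch below is a standalone illustration, not libbpf code: the
check_rings_* helpers are hypothetical and look only at the two ring
pointers, leaving out the umem and xsk_ptr checks that the real function
also performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Check before the patch: both ring pointers had to be non-NULL. */
static int check_rings_old(const void *rx, const void *tx)
{
	return (!rx || !tx) ? -EFAULT : 0;
}

/* Check after the patch: at least one ring pointer has to be non-NULL. */
static int check_rings_new(const void *rx, const void *tx)
{
	return !(rx || tx) ? -EFAULT : 0;
}

static void check_rings_selfcheck(void)
{
	int rx_ring, tx_ring; /* dummy objects; only their addresses matter */

	assert(check_rings_old(&rx_ring, NULL) == -EFAULT);
	assert(check_rings_new(&rx_ring, NULL) == 0);	/* Rx-only */
	assert(check_rings_new(NULL, &tx_ring) == 0);	/* Tx-only */
	assert(check_rings_new(NULL, NULL) == -EFAULT);	/* neither */
}
```

Passing NULL for tx therefore gives an Rx-only socket, and passing NULL
for rx gives a Tx-only socket.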



* [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock
  2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
                   ` (2 preceding siblings ...)
  2019-11-07 17:47 ` [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets Magnus Karlsson
@ 2019-11-07 17:47 ` Magnus Karlsson
  2019-11-08 23:02   ` Jonathan Lemon
  2019-11-07 17:47 ` [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems Magnus Karlsson
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-07 17:47 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev,
	jonathan.lemon, u9012063
  Cc: bpf

Use Rx-only sockets for the rxdrop sample and Tx-only sockets for the
txonly sample in the xdpsock application. This way we exercise and
showcase these socket types too.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 samples/bpf/xdpsock_user.c | 41 +++++++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index d3dba93..a1f96e5 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -291,8 +291,7 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
 		.flags = opt_umem_flags
 	};
-	int ret, i;
-	u32 idx;
+	int ret;
 
 	umem = calloc(1, sizeof(*umem));
 	if (!umem)
@@ -300,10 +299,18 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 
 	ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq,
 			       &cfg);
-
 	if (ret)
 		exit_with_error(-ret);
 
+	umem->buffer = buffer;
+	return umem;
+}
+
+static void xsk_populate_fill_ring(struct xsk_umem_info *umem)
+{
+	int ret, i;
+	u32 idx;
+
 	ret = xsk_ring_prod__reserve(&umem->fq,
 				     XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
 	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
@@ -312,15 +319,15 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		*xsk_ring_prod__fill_addr(&umem->fq, idx++) =
 			i * opt_xsk_frame_size;
 	xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
-
-	umem->buffer = buffer;
-	return umem;
 }
 
-static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
+static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem,
+						    bool rx, bool tx)
 {
 	struct xsk_socket_config cfg;
 	struct xsk_socket_info *xsk;
+	struct xsk_ring_cons *rxr;
+	struct xsk_ring_prod *txr;
 	int ret;
 
 	xsk = calloc(1, sizeof(*xsk));
@@ -337,8 +344,10 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	cfg.xdp_flags = opt_xdp_flags;
 	cfg.bind_flags = opt_xdp_bind_flags;
 
-	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue,
-				 umem->umem, &xsk->rx, &xsk->tx, &cfg);
+	rxr = rx ? &xsk->rx : NULL;
+	txr = tx ? &xsk->tx : NULL;
+	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
+				 rxr, txr, &cfg);
 	if (ret)
 		exit_with_error(-ret);
 
@@ -783,6 +792,7 @@ static void enter_xsks_into_map(struct bpf_object *obj)
 int main(int argc, char **argv)
 {
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
+	bool rx = false, tx = false;
 	struct xsk_umem_info *umem;
 	struct bpf_object *obj;
 	pthread_t pt;
@@ -811,11 +821,18 @@ int main(int argc, char **argv)
 
 	/* Create sockets... */
 	umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size);
+	if (opt_bench == BENCH_RXDROP || opt_bench == BENCH_L2FWD) {
+		rx = true;
+		xsk_populate_fill_ring(umem);
+	}
+	if (opt_bench == BENCH_L2FWD || opt_bench == BENCH_TXONLY)
+		tx = true;
 	for (i = 0; i < opt_num_xsks; i++)
-		xsks[num_socks++] = xsk_configure_socket(umem);
+		xsks[num_socks++] = xsk_configure_socket(umem, rx, tx);
 
-	for (i = 0; i < NUM_FRAMES; i++)
-		gen_eth_frame(umem, i * opt_xsk_frame_size);
+	if (opt_bench == BENCH_TXONLY)
+		for (i = 0; i < NUM_FRAMES; i++)
+			gen_eth_frame(umem, i * opt_xsk_frame_size);
 
 	if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY)
 		enter_xsks_into_map(obj);
-- 
2.7.4
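
The mapping from benchmark mode to socket direction that this patch adds
to main() can be written as a small pure function. This is a sketch for
illustration only. The sketch_* names are hypothetical stand-ins for the
sample's BENCH_* modes and do not appear in xdpsock_user.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the sample's BENCH_* modes. */
enum sketch_bench { SKETCH_RXDROP, SKETCH_TXONLY, SKETCH_L2FWD };

struct sketch_dirs {
	bool rx;		/* create an Rx ring */
	bool tx;		/* create a Tx ring */
	bool fill_ring;		/* populate the fill ring with buffers */
};

static struct sketch_dirs sketch_dirs_for(enum sketch_bench bench)
{
	struct sketch_dirs d = { false, false, false };

	if (bench == SKETCH_RXDROP || bench == SKETCH_L2FWD) {
		d.rx = true;
		d.fill_ring = true;	/* only receivers need fill-ring buffers */
	}
	if (bench == SKETCH_L2FWD || bench == SKETCH_TXONLY)
		d.tx = true;
	return d;
}

static void sketch_dirs_selfcheck(void)
{
	struct sketch_dirs d = sketch_dirs_for(SKETCH_TXONLY);

	/* A Tx-only socket gets no Rx ring and leaves the fill ring empty. */
	assert(!d.rx && d.tx && !d.fill_ring);
}
```

Leaving the fill ring empty in the Tx-only case matches the patch, which
calls xsk_populate_fill_ring only for the rxdrop and l2fwd modes.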



* [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems
  2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
                   ` (3 preceding siblings ...)
  2019-11-07 17:47 ` [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock Magnus Karlsson
@ 2019-11-07 17:47 ` Magnus Karlsson
  2019-11-08 23:03   ` Jonathan Lemon
  2019-11-08 14:57 ` [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets William Tu
  2019-11-11  3:32 ` Alexei Starovoitov
  6 siblings, 1 reply; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-07 17:47 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev,
	jonathan.lemon, u9012063
  Cc: bpf

Add more documentation about the new Rx-only and Tx-only sockets in
libbpf and also how libbpf can now support shared umems. Also fix two
pieces of the existing text that could be improved.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 Documentation/networking/af_xdp.rst | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
index 7a4caaa..5bc55a4 100644
--- a/Documentation/networking/af_xdp.rst
+++ b/Documentation/networking/af_xdp.rst
@@ -295,7 +295,7 @@ round-robin example of distributing packets is shown below:
    {
 	rr = (rr + 1) & (MAX_SOCKS - 1);
 
-	return bpf_redirect_map(&xsks_map, rr, 0);
+	return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
    }
 
 Note, that since there is only a single set of FILL and COMPLETION
@@ -304,6 +304,12 @@ to make sure that multiple processes or threads do not use these rings
 concurrently. There are no synchronization primitives in the
 libbpf code that protects multiple users at this point in time.
 
+Libbpf uses this mode if you create more than one socket tied to the
+same umem. However, note that you need to supply the
+XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the
+xsk_socket__create calls and load your own XDP program as there is no
+built-in one in libbpf that will route the traffic for you.
+
 XDP_USE_NEED_WAKEUP bind flag
 -----------------------------
 
@@ -355,10 +361,22 @@ to set the size of at least one of the RX and TX rings. If you set
 both, you will be able to both receive and send traffic from your
 application, but if you only want to do one of them, you can save
 resources by only setting up one of them. Both the FILL ring and the
-COMPLETION ring are mandatory if you have a UMEM tied to your socket,
-which is the normal case. But if the XDP_SHARED_UMEM flag is used, any
-socket after the first one does not have a UMEM and should in that
-case not have any FILL or COMPLETION rings created.
+COMPLETION ring are mandatory as you need to have a UMEM tied to your
+socket. But if the XDP_SHARED_UMEM flag is used, any socket after the
+first one does not have a UMEM and should in that case not have any
+FILL or COMPLETION rings created as the ones from the shared umem will
+be used. Note, that the rings are single-producer single-consumer, so
+do not try to access them from multiple processes at the same
+time. See the XDP_SHARED_UMEM section.
+
+In libbpf, you can create Rx-only and Tx-only sockets by supplying
+NULL to the tx and rx arguments, respectively, to the
+xsk_socket__create function.
+
+If you create a Tx-only socket, we recommend that you do not put any
+packets on the fill ring. If you do this, drivers might think you are
+going to receive something when you in fact will not, and this can
+negatively impact performance.
 
 XDP_UMEM_REG setsockopt
 -----------------------
-- 
2.7.4



* Re: [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets
  2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
                   ` (4 preceding siblings ...)
  2019-11-07 17:47 ` [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems Magnus Karlsson
@ 2019-11-08 14:57 ` William Tu
  2019-11-08 18:09   ` Magnus Karlsson
  2019-11-11  3:32 ` Alexei Starovoitov
  6 siblings, 1 reply; 27+ messages in thread
From: William Tu @ 2019-11-08 14:57 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, jonathan.lemon, bpf

On Thu, Nov 07, 2019 at 06:47:35PM +0100, Magnus Karlsson wrote:
> This patch set extends libbpf and the xdpsock sample program to
> demonstrate the shared umem mode (XDP_SHARED_UMEM) as well as Rx-only
> and Tx-only sockets. The goal is to give users an example to use as a
> blueprint and to make sure these modes are exercised more frequently.
> 
> Note that in XDP_SHARED_UMEM mode the user needs to supply an XDP
> program that distributes the packets over the sockets according to
> some policy. An example is supplied with the xdpsock program, but,
> unlike the case where XDP_SHARED_UMEM is not used, libbpf has no
> default one. The reason is that I felt that supplying one that would
> work for all users in this mode is futile. There are just tons of
> ways to distribute packets, so whatever I come up with and build into
> libbpf would be wrong in most cases.
> 
Hi Magnus,

Thanks for the patch.
I looked at the sample code, and it shares a umem among multiple queues in
the same netdev. Is it possible to share one umem across multiple netdevs?

For example, in OVS one might create multiple tap/veth devices (using skb-mode
or native-mode), and I want to save memory by having just one shared umem for
these devices.

Thanks
--William

> This patch set applies on top of commit 30ee348c1267 ("Merge branch 'bpf-libbpf-fixes'").
> 
> Structure of the patch set:
> 
> Patch 1: Adds shared umem support to libbpf
> Patch 2: Adds shared umem support and an example XDP program to the
>          xdpsock sample
> Patch 3: Adds Rx-only and Tx-only support to libbpf
> Patch 4: Uses Rx-only sockets for rxdrop and Tx-only sockets for txonly in
>          the xdpsock sample
> Patch 5: Adds documentation entries for these two features
> 
> Thanks: Magnus
> 
> Magnus Karlsson (5):
>   libbpf: support XDP_SHARED_UMEM with external XDP program
>   samples/bpf: add XDP_SHARED_UMEM support to xdpsock
>   libbpf: allow for creating Rx or Tx only AF_XDP sockets
>   samples/bpf: use Rx-only and Tx-only sockets in xdpsock
>   xsk: extend documentation for Rx|Tx-only sockets and shared umems
> 
>  Documentation/networking/af_xdp.rst |  28 +++++--
>  samples/bpf/Makefile                |   1 +
>  samples/bpf/xdpsock.h               |  11 +++
>  samples/bpf/xdpsock_kern.c          |  24 ++++++
>  samples/bpf/xdpsock_user.c          | 158 ++++++++++++++++++++++++++----------
>  tools/lib/bpf/xsk.c                 |  32 +++++---
>  6 files changed, 195 insertions(+), 59 deletions(-)
>  create mode 100644 samples/bpf/xdpsock.h
>  create mode 100644 samples/bpf/xdpsock_kern.c
> 
> --
> 2.7.4


* Re: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-07 17:47 ` [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program Magnus Karlsson
@ 2019-11-08 18:03   ` William Tu
  2019-11-08 18:19     ` Magnus Karlsson
  2019-11-08 22:56   ` Jonathan Lemon
  1 sibling, 1 reply; 27+ messages in thread
From: William Tu @ 2019-11-08 18:03 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, jonathan.lemon, bpf

Hi Magnus,

Thanks for the patch.

On Thu, Nov 07, 2019 at 06:47:36PM +0100, Magnus Karlsson wrote:
> Add support in libbpf to create multiple sockets that share a single
> umem. Note that an external XDP program needs to be supplied that
> routes the incoming traffic to the desired sockets. So you need to
> supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
> your own XDP program.
> 
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
>  tools/lib/bpf/xsk.c | 27 +++++++++++++++++----------
>  1 file changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index 86c1b61..8ebd810 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -586,15 +586,21 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
>  	if (!umem || !xsk_ptr || !rx || !tx)
>  		return -EFAULT;
>  
> -	if (umem->refcount) {
> -		pr_warn("Error: shared umems not supported by libbpf.\n");
> -		return -EBUSY;
> -	}
> -
>  	xsk = calloc(1, sizeof(*xsk));
>  	if (!xsk)
>  		return -ENOMEM;
>  
> +	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> +	if (err)
> +		goto out_xsk_alloc;
> +
> +	if (umem->refcount &&
> +	    !(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
> +		pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");

Why can't we use the existing default program in libbpf?
If users don't want to redistribute packets to different queues,
they can still use the libbpf default one.

William
> +		err = -EBUSY;
> +		goto out_xsk_alloc;
> +	}
> +
>  	if (umem->refcount++ > 0) {
>  		xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
>  		if (xsk->fd < 0) {
> @@ -616,10 +622,6 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
>  	memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
>  	xsk->ifname[IFNAMSIZ - 1] = '\0';
>  
> -	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> -	if (err)
> -		goto out_socket;
> -
>  	if (rx) {
>  		err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
>  				 &xsk->config.rx_size,
> @@ -687,7 +689,12 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
>  	sxdp.sxdp_family = PF_XDP;
>  	sxdp.sxdp_ifindex = xsk->ifindex;
>  	sxdp.sxdp_queue_id = xsk->queue_id;
> -	sxdp.sxdp_flags = xsk->config.bind_flags;
> +	if (umem->refcount > 1) {
> +		sxdp.sxdp_flags = XDP_SHARED_UMEM;
> +		sxdp.sxdp_shared_umem_fd = umem->fd;
> +	} else {
> +		sxdp.sxdp_flags = xsk->config.bind_flags;
> +	}
>  
>  	err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
>  	if (err) {
> -- 
> 2.7.4
> 

* Re: [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets
  2019-11-08 14:57 ` [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets William Tu
@ 2019-11-08 18:09   ` Magnus Karlsson
  0 siblings, 0 replies; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-08 18:09 UTC (permalink / raw)
  To: William Tu
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Network Development, Jonathan Lemon, bpf

On Fri, Nov 8, 2019 at 3:58 PM William Tu <u9012063@gmail.com> wrote:
>
> On Thu, Nov 07, 2019 at 06:47:35PM +0100, Magnus Karlsson wrote:
> > This patch set extends libbpf and the xdpsock sample program to
> > demonstrate the shared umem mode (XDP_SHARED_UMEM) as well as Rx-only
> > and Tx-only sockets. The aim is to give users an example to use
> > as a blueprint and to make sure these modes are exercised more
> > frequently.
> >
> > Note that the user needs to supply an XDP program with the
> > XDP_SHARED_UMEM mode that distributes the packets over the sockets
> > according to some policy. There is an example supplied with the
> > xdpsock program, but there is no default one in libbpf similarly to
> > when XDP_SHARED_UMEM is not used. The reason for this is that I felt
> > that supplying one that would work for all users in this mode is
> > futile. There are just tons of ways to distribute packets, so whatever
> > I come up with and build into libbpf would be wrong in most cases.
> >
> Hi Magnus,
>
> Thanks for the patch.
> I look at the sample code and it's sharing a umem among multiple queues in
> the same netdev. Is it possible to shared one umem across multiple netdevs?

It should be possible to register the same umem area multiple times
(although this wastes memory in the current implementation). I have not
tried it, so I might be surprised. You really have to make
sure that you only give a buffer (through the Tx or fill rings) to a
single device. If you do not, your packets will be garbled. But this
needs some testing first and some extension to libbpf to make it
simple. I can try it out, but this will be another patch set.

/Magnus
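
The single-owner rule above (a buffer may only ever be handed to one
device) can be illustrated with a small userspace sketch. This is not
code from the patch set or from libbpf: frame_slice() and frame_owner()
are hypothetical helpers, and NUM_FRAMES and FRAME_SIZE are assumed
values. It only shows one way to carve a shared umem into disjoint
per-device slices.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FRAMES 4096 /* assumed total number of frames in the umem */
#define FRAME_SIZE 2048 /* assumed frame size in bytes */

/* Hypothetical helper: device `dev` out of `ndevs` owns one contiguous,
 * disjoint slice of frames. Returns the umem byte offset of the first
 * frame in the slice and stores the number of frames in *count.
 * Assumes ndevs divides NUM_FRAMES evenly.
 */
static uint64_t frame_slice(unsigned int dev, unsigned int ndevs,
			    unsigned int *count)
{
	unsigned int per_dev;

	assert(ndevs > 0 && dev < ndevs && NUM_FRAMES % ndevs == 0);
	per_dev = NUM_FRAMES / ndevs;
	*count = per_dev;
	return (uint64_t)dev * per_dev * FRAME_SIZE;
}

/* Hypothetical helper: map a umem address back to the device that owns
 * it, so a buffer can be checked before it is put on a fill or Tx ring.
 */
static unsigned int frame_owner(uint64_t addr, unsigned int ndevs)
{
	unsigned int per_dev;

	assert(ndevs > 0 && NUM_FRAMES % ndevs == 0);
	per_dev = NUM_FRAMES / ndevs;
	return (unsigned int)(addr / FRAME_SIZE / per_dev);
}
```

With two devices, device 0 would own offsets [0, 2048 * 2048) and device 1
the rest. An application would call frame_owner() before posting an
address and refuse to post it to any other device.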

> For example in OVS, one might create multiple tap/veth devices (using skb-mode
> or native-mode). And I want to save memory by having just one shared umem for
> these devices.
>
> Thanks
> --William
>
> > This patch has been applied against commit 30ee348c1267 ("Merge branch 'bpf-libbpf-fixes'")
> >
> > Structure of the patch set:
> >
> > Patch 1: Adds shared umem support to libbpf
> > Patch 2: Shared umem support and example XDP program added to xdpsock sample
> > Patch 3: Adds Rx-only and Tx-only support to libbpf
> > Patch 4: Uses Rx-only sockets for rxdrop and Tx-only sockets for txpush in
> >          the xdpsock sample
> > Patch 5: Add documentation entries for these two features
> >
> > Thanks: Magnus
> >
> > Magnus Karlsson (5):
> >   libbpf: support XDP_SHARED_UMEM with external XDP program
> >   samples/bpf: add XDP_SHARED_UMEM support to xdpsock
> >   libbpf: allow for creating Rx or Tx only AF_XDP sockets
> >   samples/bpf: use Rx-only and Tx-only sockets in xdpsock
> >   xsk: extend documentation for Rx|Tx-only sockets and shared umems
> >
> >  Documentation/networking/af_xdp.rst |  28 +++++--
> >  samples/bpf/Makefile                |   1 +
> >  samples/bpf/xdpsock.h               |  11 +++
> >  samples/bpf/xdpsock_kern.c          |  24 ++++++
> >  samples/bpf/xdpsock_user.c          | 158 ++++++++++++++++++++++++++----------
> >  tools/lib/bpf/xsk.c                 |  32 +++++---
> >  6 files changed, 195 insertions(+), 59 deletions(-)
> >  create mode 100644 samples/bpf/xdpsock.h
> >  create mode 100644 samples/bpf/xdpsock_kern.c
> >
> > --
> > 2.7.4

* Re: [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock
  2019-11-07 17:47 ` [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock Magnus Karlsson
@ 2019-11-08 18:13   ` William Tu
  2019-11-08 18:33     ` Magnus Karlsson
  2019-11-08 22:59   ` Jonathan Lemon
  1 sibling, 1 reply; 27+ messages in thread
From: William Tu @ 2019-11-08 18:13 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, jonathan.lemon, bpf

On Thu, Nov 07, 2019 at 06:47:37PM +0100, Magnus Karlsson wrote:
> Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
> application. As libbpf does not have a built in XDP program for this
> mode, we use an explicitly loaded XDP program. This also serves as an
> example on how to write your own XDP program that can route to an
> AF_XDP socket.
> 
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
>  samples/bpf/Makefile       |   1 +
>  samples/bpf/xdpsock.h      |  11 ++++
>  samples/bpf/xdpsock_kern.c |  24 ++++++++
>  samples/bpf/xdpsock_user.c | 141 +++++++++++++++++++++++++++++++--------------
>  4 files changed, 135 insertions(+), 42 deletions(-)
>  create mode 100644 samples/bpf/xdpsock.h
>  create mode 100644 samples/bpf/xdpsock_kern.c
> 
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index 4df11dd..8a9af3a 100644
> --- a/samples/bpf/Makefile
> +++ b/samples/bpf/Makefile
> @@ -167,6 +167,7 @@ always += xdp_sample_pkts_kern.o
>  always += ibumad_kern.o
>  always += hbm_out_kern.o
>  always += hbm_edt_kern.o
> +always += xdpsock_kern.o
>  
>  ifeq ($(ARCH), arm)
>  # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
> diff --git a/samples/bpf/xdpsock.h b/samples/bpf/xdpsock.h
> new file mode 100644
> index 0000000..b7eca15
> --- /dev/null
> +++ b/samples/bpf/xdpsock.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0
> + *
> + * Copyright(c) 2019 Intel Corporation.
> + */
> +
> +#ifndef XDPSOCK_H_
> +#define XDPSOCK_H_
> +
> +#define MAX_SOCKS 4
> +
> +#endif /* XDPSOCK_H_ */
> diff --git a/samples/bpf/xdpsock_kern.c b/samples/bpf/xdpsock_kern.c
> new file mode 100644
> index 0000000..a06177c
> --- /dev/null
> +++ b/samples/bpf/xdpsock_kern.c
> @@ -0,0 +1,24 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include "bpf_helpers.h"
> +#include "xdpsock.h"
> +
> +/* This XDP program is only needed for the XDP_SHARED_UMEM mode.
> + * If you do not use this mode, libbpf can supply an XDP program for you.
> + */
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_XSKMAP);
> +	__uint(max_entries, MAX_SOCKS);
> +	__uint(key_size, sizeof(int));
> +	__uint(value_size, sizeof(int));
> +} xsks_map SEC(".maps");
> +
> +static unsigned int rr;
> +
> +SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
> +{
> +	rr = (rr + 1) & (MAX_SOCKS - 1);
> +
> +	return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
> +}
> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> index 405c4e0..d3dba93 100644
> --- a/samples/bpf/xdpsock_user.c
> +++ b/samples/bpf/xdpsock_user.c
> @@ -29,6 +29,7 @@
>  
>  #include "libbpf.h"
>  #include "xsk.h"
> +#include "xdpsock.h"
>  #include <bpf/bpf.h>
>  
>  #ifndef SOL_XDP
> @@ -47,7 +48,6 @@
>  #define BATCH_SIZE 64
>  
>  #define DEBUG_HEXDUMP 0
> -#define MAX_SOCKS 8
>  
>  typedef __u64 u64;
>  typedef __u32 u32;
> @@ -75,7 +75,8 @@ static u32 opt_xdp_bind_flags;
>  static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
>  static int opt_timeout = 1000;
>  static bool opt_need_wakeup = true;
> -static __u32 prog_id;
> +static u32 opt_num_xsks = 1;
> +static u32 prog_id;
>  
>  struct xsk_umem_info {
>  	struct xsk_ring_prod fq;
> @@ -179,7 +180,7 @@ static void *poller(void *arg)
>  
>  static void remove_xdp_program(void)
>  {
> -	__u32 curr_prog_id = 0;
> +	u32 curr_prog_id = 0;
>  
>  	if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) {
>  		printf("bpf_get_link_xdp_id failed\n");
> @@ -196,11 +197,11 @@ static void remove_xdp_program(void)
>  static void int_exit(int sig)
>  {
>  	struct xsk_umem *umem = xsks[0]->umem->umem;
> -
> -	(void)sig;
> +	int i;
>  
>  	dump_stats();
> -	xsk_socket__delete(xsks[0]->xsk);
> +	for (i = 0; i < num_socks; i++)
> +		xsk_socket__delete(xsks[i]->xsk);
>  	(void)xsk_umem__delete(umem);
>  	remove_xdp_program();
>  
> @@ -290,8 +291,8 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
>  		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
>  		.flags = opt_umem_flags
>  	};
> -
> -	int ret;
> +	int ret, i;
> +	u32 idx;
>  
>  	umem = calloc(1, sizeof(*umem));
>  	if (!umem)
> @@ -303,6 +304,15 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
>  	if (ret)
>  		exit_with_error(-ret);
>  
> +	ret = xsk_ring_prod__reserve(&umem->fq,
> +				     XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
> +	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
> +		exit_with_error(-ret);
> +	for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
> +		*xsk_ring_prod__fill_addr(&umem->fq, idx++) =
> +			i * opt_xsk_frame_size;
> +	xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
> +
>  	umem->buffer = buffer;
>  	return umem;
>  }
> @@ -312,8 +322,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
>  	struct xsk_socket_config cfg;
>  	struct xsk_socket_info *xsk;
>  	int ret;
> -	u32 idx;
> -	int i;
>  
>  	xsk = calloc(1, sizeof(*xsk));
>  	if (!xsk)
> @@ -322,11 +330,15 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
>  	xsk->umem = umem;
>  	cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
>  	cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> -	cfg.libbpf_flags = 0;
> +	if (opt_num_xsks > 1)
> +		cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;

I think we can still load our own XDP program without setting
XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD.
Then xsk_setup_xdp_prog() will find the loaded XDP program
and set the xsk map.

> +	else
> +		cfg.libbpf_flags = 0;
>  	cfg.xdp_flags = opt_xdp_flags;
>  	cfg.bind_flags = opt_xdp_bind_flags;

Do we need to set
cfg.bind_flags |= XDP_SHARED_UMEM?

Thanks
William

> -	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
> -				 &xsk->rx, &xsk->tx, &cfg);
> +
> +	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue,
> +				 umem->umem, &xsk->rx, &xsk->tx, &cfg);
>  	if (ret)
>  		exit_with_error(-ret);
>  
> @@ -334,17 +346,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
>  	if (ret)
>  		exit_with_error(-ret);
>  
> -	ret = xsk_ring_prod__reserve(&xsk->umem->fq,
> -				     XSK_RING_PROD__DEFAULT_NUM_DESCS,
> -				     &idx);
> -	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
> -		exit_with_error(-ret);
> -	for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
> -		*xsk_ring_prod__fill_addr(&xsk->umem->fq, idx++) =
> -			i * opt_xsk_frame_size;
> -	xsk_ring_prod__submit(&xsk->umem->fq,
> -			      XSK_RING_PROD__DEFAULT_NUM_DESCS);
> -
>  	return xsk;
>  }
>  
> @@ -363,6 +364,7 @@ static struct option long_options[] = {
>  	{"frame-size", required_argument, 0, 'f'},
>  	{"no-need-wakeup", no_argument, 0, 'm'},
>  	{"unaligned", no_argument, 0, 'u'},
> +	{"shared-umem", no_argument, 0, 'M'},
>  	{0, 0, 0, 0}
>  };
>  
> @@ -386,6 +388,7 @@ static void usage(const char *prog)
>  		"  -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
>  		"  -f, --frame-size=n   Set the frame size (must be a power of two in aligned mode, default is %d).\n"
>  		"  -u, --unaligned	Enable unaligned chunk placement\n"
> +		"  -M, --shared-umem	Enable XDP_SHARED_UMEM\n"
>  		"\n";
>  	fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE);
>  	exit(EXIT_FAILURE);
> @@ -398,7 +401,7 @@ static void parse_command_line(int argc, char **argv)
>  	opterr = 0;
>  
>  	for (;;) {
> -		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:mu",
> +		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:muM",
>  				long_options, &option_index);
>  		if (c == -1)
>  			break;
> @@ -448,11 +451,14 @@ static void parse_command_line(int argc, char **argv)
>  			break;
>  		case 'f':
>  			opt_xsk_frame_size = atoi(optarg);
> +			break;
>  		case 'm':
>  			opt_need_wakeup = false;
>  			opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP;
>  			break;
> -
> +		case 'M':
> +			opt_num_xsks = MAX_SOCKS;
> +			break;
>  		default:
>  			usage(basename(argv[0]));
>  		}
> @@ -586,11 +592,9 @@ static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds)
>  
>  static void rx_drop_all(void)
>  {
> -	struct pollfd fds[MAX_SOCKS + 1];
> +	struct pollfd fds[MAX_SOCKS] = {};
>  	int i, ret;
>  
> -	memset(fds, 0, sizeof(fds));
> -
>  	for (i = 0; i < num_socks; i++) {
>  		fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
>  		fds[i].events = POLLIN;
> @@ -633,11 +637,10 @@ static void tx_only(struct xsk_socket_info *xsk, u32 frame_nb)
>  
>  static void tx_only_all(void)
>  {
> -	struct pollfd fds[MAX_SOCKS];
> +	struct pollfd fds[MAX_SOCKS] = {};
>  	u32 frame_nb[MAX_SOCKS] = {};
>  	int i, ret;
>  
> -	memset(fds, 0, sizeof(fds));
>  	for (i = 0; i < num_socks; i++) {
>  		fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
>  		fds[0].events = POLLOUT;
> @@ -706,11 +709,9 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
>  
>  static void l2fwd_all(void)
>  {
> -	struct pollfd fds[MAX_SOCKS];
> +	struct pollfd fds[MAX_SOCKS] = {};
>  	int i, ret;
>  
> -	memset(fds, 0, sizeof(fds));
> -
>  	for (i = 0; i < num_socks; i++) {
>  		fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
>  		fds[i].events = POLLOUT | POLLIN;
> @@ -728,13 +729,65 @@ static void l2fwd_all(void)
>  	}
>  }
>  
> +static void load_xdp_program(char **argv, struct bpf_object **obj)
> +{
> +	struct bpf_prog_load_attr prog_load_attr = {
> +		.prog_type      = BPF_PROG_TYPE_XDP,
> +	};
> +	char xdp_filename[256];
> +	int prog_fd;
> +
> +	snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]);
> +	prog_load_attr.file = xdp_filename;
> +
> +	if (bpf_prog_load_xattr(&prog_load_attr, obj, &prog_fd))
> +		exit(EXIT_FAILURE);
> +	if (prog_fd < 0) {
> +		fprintf(stderr, "ERROR: no program found: %s\n",
> +			strerror(prog_fd));
> +		exit(EXIT_FAILURE);
> +	}
> +
> +	if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
> +		fprintf(stderr, "ERROR: link set xdp fd failed\n");
> +		exit(EXIT_FAILURE);
> +	}
> +}
> +
> +static void enter_xsks_into_map(struct bpf_object *obj)
> +{
> +	struct bpf_map *map;
> +	int i, xsks_map;
> +
> +	map = bpf_object__find_map_by_name(obj, "xsks_map");
> +	xsks_map = bpf_map__fd(map);
> +	if (xsks_map < 0) {
> +		fprintf(stderr, "ERROR: no xsks map found: %s\n",
> +			strerror(xsks_map));
> +			exit(EXIT_FAILURE);
> +	}
> +
> +	for (i = 0; i < num_socks; i++) {
> +		int fd = xsk_socket__fd(xsks[i]->xsk);
> +		int key, ret;
> +
> +		key = i;
> +		ret = bpf_map_update_elem(xsks_map, &key, &fd, 0);
> +		if (ret) {
> +			fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
> +			exit(EXIT_FAILURE);
> +		}
> +	}
> +}
> +
>  int main(int argc, char **argv)
>  {
>  	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
>  	struct xsk_umem_info *umem;
> +	struct bpf_object *obj;
>  	pthread_t pt;
> +	int i, ret;
>  	void *bufs;
> -	int ret;
>  
>  	parse_command_line(argc, argv);
>  
> @@ -744,6 +797,9 @@ int main(int argc, char **argv)
>  		exit(EXIT_FAILURE);
>  	}
>  
> +	if (opt_num_xsks > 1)
> +		load_xdp_program(argv, &obj);
> +
>  	/* Reserve memory for the umem. Use hugepages if unaligned chunk mode */
>  	bufs = mmap(NULL, NUM_FRAMES * opt_xsk_frame_size,
>  		    PROT_READ | PROT_WRITE,
> @@ -752,16 +808,17 @@ int main(int argc, char **argv)
>  		printf("ERROR: mmap failed\n");
>  		exit(EXIT_FAILURE);
>  	}
> -       /* Create sockets... */
> +
> +	/* Create sockets... */
>  	umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size);
> -	xsks[num_socks++] = xsk_configure_socket(umem);
> +	for (i = 0; i < opt_num_xsks; i++)
> +		xsks[num_socks++] = xsk_configure_socket(umem);
>  
> -	if (opt_bench == BENCH_TXONLY) {
> -		int i;
> +	for (i = 0; i < NUM_FRAMES; i++)
> +		gen_eth_frame(umem, i * opt_xsk_frame_size);
>  
> -		for (i = 0; i < NUM_FRAMES; i++)
> -			(void)gen_eth_frame(umem, i * opt_xsk_frame_size);
> -	}
> +	if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY)
> +		enter_xsks_into_map(obj);
>  
>  	signal(SIGINT, int_exit);
>  	signal(SIGTERM, int_exit);
> -- 
> 2.7.4
> 

* Re: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-08 18:03   ` William Tu
@ 2019-11-08 18:19     ` Magnus Karlsson
  2019-11-08 18:43       ` William Tu
  0 siblings, 1 reply; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-08 18:19 UTC (permalink / raw)
  To: William Tu
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Network Development, Jonathan Lemon, bpf

On Fri, Nov 8, 2019 at 7:03 PM William Tu <u9012063@gmail.com> wrote:
>
> Hi Magnus,
>
> Thanks for the patch.
>
> On Thu, Nov 07, 2019 at 06:47:36PM +0100, Magnus Karlsson wrote:
> > Add support in libbpf to create multiple sockets that share a single
> > umem. Note that an external XDP program needs to be supplied that
> > routes the incoming traffic to the desired sockets. So you need to
> > supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
> > your own XDP program.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
> >  tools/lib/bpf/xsk.c | 27 +++++++++++++++++----------
> >  1 file changed, 17 insertions(+), 10 deletions(-)
> >
> > diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> > index 86c1b61..8ebd810 100644
> > --- a/tools/lib/bpf/xsk.c
> > +++ b/tools/lib/bpf/xsk.c
> > @@ -586,15 +586,21 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> >       if (!umem || !xsk_ptr || !rx || !tx)
> >               return -EFAULT;
> >
> > -     if (umem->refcount) {
> > -             pr_warn("Error: shared umems not supported by libbpf.\n");
> > -             return -EBUSY;
> > -     }
> > -
> >       xsk = calloc(1, sizeof(*xsk));
> >       if (!xsk)
> >               return -ENOMEM;
> >
> > +     err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> > +     if (err)
> > +             goto out_xsk_alloc;
> > +
> > +     if (umem->refcount &&
> > +         !(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
> > +             pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");
>
> Why can't we use the existing default program in libbpf?
> If users don't want to redistribute packets to different queues,
> they can still use the libbpf default one.

Is there any point in creating two or more sockets tied to the same
umem and directing all traffic to just one socket? IMHO, most users
in this case would want to distribute the packets over
the sockets in some way. I also think that users might be unpleasantly
surprised if they create multiple sockets and all packets only get to
a single socket because libbpf loaded an XDP program that makes little
sense in the XDP_SHARED_UMEM case. If we force them to supply an XDP
program, they need to make this decision. I also wanted to extend the
sample with an explicit user loaded XDP program as an example of how
to do this. What do you think?

/Magnus
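
For readers following the argument, the gate that the quoted hunk adds
to xsk_socket__create() can be isolated as a tiny standalone function.
This is only a sketch: shared_umem_allowed() is a hypothetical name,
and the flag value is assumed to match libbpf's xsk.h and is redefined
here solely so the snippet compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to mirror the definition in libbpf's xsk.h. */
#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0)

/* Hypothetical stand-in for the check in the patch: a umem that already
 * has a user may only be shared if the caller has told libbpf not to
 * load its default XDP program, i.e. the caller loads its own.
 */
static int shared_umem_allowed(int umem_refcount, uint32_t libbpf_flags)
{
	assert(umem_refcount >= 0);

	if (umem_refcount &&
	    !(libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD))
		return -EBUSY;

	return 0;
}
```

The first socket on a umem (refcount 0) passes regardless of flags.
Every later socket gets -EBUSY unless the inhibit flag is set, which is
what forces the user to pick a distribution policy.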

> William
> > +             err = -EBUSY;
> > +             goto out_xsk_alloc;
> > +     }
> > +
> >       if (umem->refcount++ > 0) {
> >               xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
> >               if (xsk->fd < 0) {
> > @@ -616,10 +622,6 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> >       memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
> >       xsk->ifname[IFNAMSIZ - 1] = '\0';
> >
> > -     err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> > -     if (err)
> > -             goto out_socket;
> > -
> >       if (rx) {
> >               err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
> >                                &xsk->config.rx_size,
> > @@ -687,7 +689,12 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> >       sxdp.sxdp_family = PF_XDP;
> >       sxdp.sxdp_ifindex = xsk->ifindex;
> >       sxdp.sxdp_queue_id = xsk->queue_id;
> > -     sxdp.sxdp_flags = xsk->config.bind_flags;
> > +     if (umem->refcount > 1) {
> > +             sxdp.sxdp_flags = XDP_SHARED_UMEM;
> > +             sxdp.sxdp_shared_umem_fd = umem->fd;
> > +     } else {
> > +             sxdp.sxdp_flags = xsk->config.bind_flags;
> > +     }
> >
> >       err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
> >       if (err) {
> > --
> > 2.7.4
> >

* Re: [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock
  2019-11-08 18:13   ` William Tu
@ 2019-11-08 18:33     ` Magnus Karlsson
  2019-11-08 19:09       ` William Tu
  0 siblings, 1 reply; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-08 18:33 UTC (permalink / raw)
  To: William Tu
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Network Development, Jonathan Lemon, bpf

On Fri, Nov 8, 2019 at 7:15 PM William Tu <u9012063@gmail.com> wrote:
>
> On Thu, Nov 07, 2019 at 06:47:37PM +0100, Magnus Karlsson wrote:
> > Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
> > application. As libbpf does not have a built in XDP program for this
> > mode, we use an explicitly loaded XDP program. This also serves as an
> > example on how to write your own XDP program that can route to an
> > AF_XDP socket.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
> >  samples/bpf/Makefile       |   1 +
> >  samples/bpf/xdpsock.h      |  11 ++++
> >  samples/bpf/xdpsock_kern.c |  24 ++++++++
> >  samples/bpf/xdpsock_user.c | 141 +++++++++++++++++++++++++++++++--------------
> >  4 files changed, 135 insertions(+), 42 deletions(-)
> >  create mode 100644 samples/bpf/xdpsock.h
> >  create mode 100644 samples/bpf/xdpsock_kern.c
> >
> > diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> > index 4df11dd..8a9af3a 100644
> > --- a/samples/bpf/Makefile
> > +++ b/samples/bpf/Makefile
> > @@ -167,6 +167,7 @@ always += xdp_sample_pkts_kern.o
> >  always += ibumad_kern.o
> >  always += hbm_out_kern.o
> >  always += hbm_edt_kern.o
> > +always += xdpsock_kern.o
> >
> >  ifeq ($(ARCH), arm)
> >  # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
> > diff --git a/samples/bpf/xdpsock.h b/samples/bpf/xdpsock.h
> > new file mode 100644
> > index 0000000..b7eca15
> > --- /dev/null
> > +++ b/samples/bpf/xdpsock.h
> > @@ -0,0 +1,11 @@
> > +/* SPDX-License-Identifier: GPL-2.0
> > + *
> > + * Copyright(c) 2019 Intel Corporation.
> > + */
> > +
> > +#ifndef XDPSOCK_H_
> > +#define XDPSOCK_H_
> > +
> > +#define MAX_SOCKS 4
> > +
> > +#endif /* XDPSOCK_H_ */
> > diff --git a/samples/bpf/xdpsock_kern.c b/samples/bpf/xdpsock_kern.c
> > new file mode 100644
> > index 0000000..a06177c
> > --- /dev/null
> > +++ b/samples/bpf/xdpsock_kern.c
> > @@ -0,0 +1,24 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/bpf.h>
> > +#include "bpf_helpers.h"
> > +#include "xdpsock.h"
> > +
> > +/* This XDP program is only needed for the XDP_SHARED_UMEM mode.
> > + * If you do not use this mode, libbpf can supply an XDP program for you.
> > + */
> > +
> > +struct {
> > +     __uint(type, BPF_MAP_TYPE_XSKMAP);
> > +     __uint(max_entries, MAX_SOCKS);
> > +     __uint(key_size, sizeof(int));
> > +     __uint(value_size, sizeof(int));
> > +} xsks_map SEC(".maps");
> > +
> > +static unsigned int rr;
> > +
> > +SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
> > +{
> > +     rr = (rr + 1) & (MAX_SOCKS - 1);
> > +
> > +     return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
> > +}
> > diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> > index 405c4e0..d3dba93 100644
> > --- a/samples/bpf/xdpsock_user.c
> > +++ b/samples/bpf/xdpsock_user.c
> > @@ -29,6 +29,7 @@
> >
> >  #include "libbpf.h"
> >  #include "xsk.h"
> > +#include "xdpsock.h"
> >  #include <bpf/bpf.h>
> >
> >  #ifndef SOL_XDP
> > @@ -47,7 +48,6 @@
> >  #define BATCH_SIZE 64
> >
> >  #define DEBUG_HEXDUMP 0
> > -#define MAX_SOCKS 8
> >
> >  typedef __u64 u64;
> >  typedef __u32 u32;
> > @@ -75,7 +75,8 @@ static u32 opt_xdp_bind_flags;
> >  static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
> >  static int opt_timeout = 1000;
> >  static bool opt_need_wakeup = true;
> > -static __u32 prog_id;
> > +static u32 opt_num_xsks = 1;
> > +static u32 prog_id;
> >
> >  struct xsk_umem_info {
> >       struct xsk_ring_prod fq;
> > @@ -179,7 +180,7 @@ static void *poller(void *arg)
> >
> >  static void remove_xdp_program(void)
> >  {
> > -     __u32 curr_prog_id = 0;
> > +     u32 curr_prog_id = 0;
> >
> >       if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) {
> >               printf("bpf_get_link_xdp_id failed\n");
> > @@ -196,11 +197,11 @@ static void remove_xdp_program(void)
> >  static void int_exit(int sig)
> >  {
> >       struct xsk_umem *umem = xsks[0]->umem->umem;
> > -
> > -     (void)sig;
> > +     int i;
> >
> >       dump_stats();
> > -     xsk_socket__delete(xsks[0]->xsk);
> > +     for (i = 0; i < num_socks; i++)
> > +             xsk_socket__delete(xsks[i]->xsk);
> >       (void)xsk_umem__delete(umem);
> >       remove_xdp_program();
> >
> > @@ -290,8 +291,8 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
> >               .frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
> >               .flags = opt_umem_flags
> >       };
> > -
> > -     int ret;
> > +     int ret, i;
> > +     u32 idx;
> >
> >       umem = calloc(1, sizeof(*umem));
> >       if (!umem)
> > @@ -303,6 +304,15 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
> >       if (ret)
> >               exit_with_error(-ret);
> >
> > +     ret = xsk_ring_prod__reserve(&umem->fq,
> > +                                  XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
> > +     if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
> > +             exit_with_error(-ret);
> > +     for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
> > +             *xsk_ring_prod__fill_addr(&umem->fq, idx++) =
> > +                     i * opt_xsk_frame_size;
> > +     xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
> > +
> >       umem->buffer = buffer;
> >       return umem;
> >  }
> > @@ -312,8 +322,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> >       struct xsk_socket_config cfg;
> >       struct xsk_socket_info *xsk;
> >       int ret;
> > -     u32 idx;
> > -     int i;
> >
> >       xsk = calloc(1, sizeof(*xsk));
> >       if (!xsk)
> > @@ -322,11 +330,15 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> >       xsk->umem = umem;
> >       cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
> >       cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> > -     cfg.libbpf_flags = 0;
> > +     if (opt_num_xsks > 1)
> > +             cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
>
> I think we can still load our own XDP program without setting
> XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD.
> Then xsk_setup_xdp_prog() will find the loaded XDP program
> and set the xsk map.

So what you are saying is that you would like libbpf to be smarter and
insert the sockets into the xskmap automatically? Doable in the simple
case, but what if the XDP program has multiple xskmaps, or an xskmap
with a different name? Seems complicated to do this in the general
case. Or maybe I am just chicken to say the user has to load and
manage his/her own XDP program when XDP_SHARED_UMEM is used :-).
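
One decision a user-managed XDP program has to make is the distribution
policy. The round-robin choice used by the sample program quoted above
can be simulated in plain userspace C, as a rough sketch. Here
next_socket() is a hypothetical name, and the counter is passed in
explicitly rather than kept in a static variable as the sample does.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SOCKS 4 /* same value as in the sample's xdpsock.h */

/* The mask trick below is only a valid modulo when MAX_SOCKS is a
 * power of two, so fail the build otherwise.
 */
_Static_assert((MAX_SOCKS & (MAX_SOCKS - 1)) == 0,
	       "MAX_SOCKS must be a power of two");

/* Advance the round-robin counter and return the xskmap index that the
 * next packet would be redirected to.
 */
static unsigned int next_socket(unsigned int *rr)
{
	assert(rr != NULL);

	*rr = (*rr + 1) & (MAX_SOCKS - 1);
	return *rr;
}
```

Starting from a zeroed counter, the indices cycle 1, 2, 3, 0, 1, and so
on. Any other policy (hashing on flow, steering by protocol) would
replace only the body of this function, which is the kind of choice
libbpf cannot make on the user's behalf.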

> > +     else
> > +             cfg.libbpf_flags = 0;
> >       cfg.xdp_flags = opt_xdp_flags;
> >       cfg.bind_flags = opt_xdp_bind_flags;
>
> Do we need to
> cfg.bind_flags |= XDP_SHARED_UMEM?

It is set by libbpf automatically, so no need here.
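
As a sketch of what "automatically" means here, the selection made in
the last hunk of patch 1 can be restated as a standalone function.
struct bind_args and pick_bind_args() are hypothetical names, and the
XDP_SHARED_UMEM value is assumed to match linux/if_xdp.h; it is
redefined only so the snippet builds without kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the definition in linux/if_xdp.h. */
#define XDP_SHARED_UMEM (1 << 0)

/* Hypothetical subset of struct sockaddr_xdp that matters here. */
struct bind_args {
	uint16_t flags;
	uint32_t shared_umem_fd;
};

/* umem_refcount is the count after the new socket has been added, as
 * in the patch: 1 means this is the first socket on the umem.
 */
static struct bind_args pick_bind_args(int umem_refcount, int umem_fd,
				       uint16_t user_bind_flags)
{
	struct bind_args args = { 0, 0 };

	assert(umem_refcount >= 1 && umem_fd >= 0);

	if (umem_refcount > 1) {
		/* Later sockets: share the umem owned by umem_fd. */
		args.flags = XDP_SHARED_UMEM;
		args.shared_umem_fd = (uint32_t)umem_fd;
	} else {
		/* First socket: pass the user's bind flags through. */
		args.flags = user_bind_flags;
	}

	return args;
}
```

Note that in this version of the patch the user's bind_flags are not
applied to the second and later sockets. Only XDP_SHARED_UMEM and the
umem fd are passed to bind().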

> Thanks
> William
>
> > -     ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
> > -                              &xsk->rx, &xsk->tx, &cfg);
> > +
> > +     ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue,
> > +                              umem->umem, &xsk->rx, &xsk->tx, &cfg);
> >       if (ret)
> >               exit_with_error(-ret);
> >
> > @@ -334,17 +346,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> >       if (ret)
> >               exit_with_error(-ret);
> >
> > -     ret = xsk_ring_prod__reserve(&xsk->umem->fq,
> > -                                  XSK_RING_PROD__DEFAULT_NUM_DESCS,
> > -                                  &idx);
> > -     if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
> > -             exit_with_error(-ret);
> > -     for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
> > -             *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx++) =
> > -                     i * opt_xsk_frame_size;
> > -     xsk_ring_prod__submit(&xsk->umem->fq,
> > -                           XSK_RING_PROD__DEFAULT_NUM_DESCS);
> > -
> >       return xsk;
> >  }
> >
> > @@ -363,6 +364,7 @@ static struct option long_options[] = {
> >       {"frame-size", required_argument, 0, 'f'},
> >       {"no-need-wakeup", no_argument, 0, 'm'},
> >       {"unaligned", no_argument, 0, 'u'},
> > +     {"shared-umem", no_argument, 0, 'M'},
> >       {0, 0, 0, 0}
> >  };
> >
> > @@ -386,6 +388,7 @@ static void usage(const char *prog)
> >               "  -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
> >               "  -f, --frame-size=n   Set the frame size (must be a power of two in aligned mode, default is %d).\n"
> >               "  -u, --unaligned      Enable unaligned chunk placement\n"
> > +             "  -M, --shared-umem    Enable XDP_SHARED_UMEM\n"
> >               "\n";
> >       fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE);
> >       exit(EXIT_FAILURE);
> > @@ -398,7 +401,7 @@ static void parse_command_line(int argc, char **argv)
> >       opterr = 0;
> >
> >       for (;;) {
> > -             c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:mu",
> > +             c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:muM",
> >                               long_options, &option_index);
> >               if (c == -1)
> >                       break;
> > @@ -448,11 +451,14 @@ static void parse_command_line(int argc, char **argv)
> >                       break;
> >               case 'f':
> >                       opt_xsk_frame_size = atoi(optarg);
> > +                     break;
> >               case 'm':
> >                       opt_need_wakeup = false;
> >                       opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP;
> >                       break;
> > -
> > +             case 'M':
> > +                     opt_num_xsks = MAX_SOCKS;
> > +                     break;
> >               default:
> >                       usage(basename(argv[0]));
> >               }
> > @@ -586,11 +592,9 @@ static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds)
> >
> >  static void rx_drop_all(void)
> >  {
> > -     struct pollfd fds[MAX_SOCKS + 1];
> > +     struct pollfd fds[MAX_SOCKS] = {};
> >       int i, ret;
> >
> > -     memset(fds, 0, sizeof(fds));
> > -
> >       for (i = 0; i < num_socks; i++) {
> >               fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> >               fds[i].events = POLLIN;
> > @@ -633,11 +637,10 @@ static void tx_only(struct xsk_socket_info *xsk, u32 frame_nb)
> >
> >  static void tx_only_all(void)
> >  {
> > -     struct pollfd fds[MAX_SOCKS];
> > +     struct pollfd fds[MAX_SOCKS] = {};
> >       u32 frame_nb[MAX_SOCKS] = {};
> >       int i, ret;
> >
> > -     memset(fds, 0, sizeof(fds));
> >       for (i = 0; i < num_socks; i++) {
> >               fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
> >               fds[0].events = POLLOUT;
> > @@ -706,11 +709,9 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
> >
> >  static void l2fwd_all(void)
> >  {
> > -     struct pollfd fds[MAX_SOCKS];
> > +     struct pollfd fds[MAX_SOCKS] = {};
> >       int i, ret;
> >
> > -     memset(fds, 0, sizeof(fds));
> > -
> >       for (i = 0; i < num_socks; i++) {
> >               fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> >               fds[i].events = POLLOUT | POLLIN;
> > @@ -728,13 +729,65 @@ static void l2fwd_all(void)
> >       }
> >  }
> >
> > +static void load_xdp_program(char **argv, struct bpf_object **obj)
> > +{
> > +     struct bpf_prog_load_attr prog_load_attr = {
> > +             .prog_type      = BPF_PROG_TYPE_XDP,
> > +     };
> > +     char xdp_filename[256];
> > +     int prog_fd;
> > +
> > +     snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]);
> > +     prog_load_attr.file = xdp_filename;
> > +
> > +     if (bpf_prog_load_xattr(&prog_load_attr, obj, &prog_fd))
> > +             exit(EXIT_FAILURE);
> > +     if (prog_fd < 0) {
> > +             fprintf(stderr, "ERROR: no program found: %s\n",
> > +                     strerror(prog_fd));
> > +             exit(EXIT_FAILURE);
> > +     }
> > +
> > +     if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
> > +             fprintf(stderr, "ERROR: link set xdp fd failed\n");
> > +             exit(EXIT_FAILURE);
> > +     }
> > +}
> > +
> > +static void enter_xsks_into_map(struct bpf_object *obj)
> > +{
> > +     struct bpf_map *map;
> > +     int i, xsks_map;
> > +
> > +     map = bpf_object__find_map_by_name(obj, "xsks_map");
> > +     xsks_map = bpf_map__fd(map);
> > +     if (xsks_map < 0) {
> > +             fprintf(stderr, "ERROR: no xsks map found: %s\n",
> > +                     strerror(xsks_map));
> > +                     exit(EXIT_FAILURE);
> > +     }
> > +
> > +     for (i = 0; i < num_socks; i++) {
> > +             int fd = xsk_socket__fd(xsks[i]->xsk);
> > +             int key, ret;
> > +
> > +             key = i;
> > +             ret = bpf_map_update_elem(xsks_map, &key, &fd, 0);
> > +             if (ret) {
> > +                     fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
> > +                     exit(EXIT_FAILURE);
> > +             }
> > +     }
> > +}
> > +
> >  int main(int argc, char **argv)
> >  {
> >       struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
> >       struct xsk_umem_info *umem;
> > +     struct bpf_object *obj;
> >       pthread_t pt;
> > +     int i, ret;
> >       void *bufs;
> > -     int ret;
> >
> >       parse_command_line(argc, argv);
> >
> > @@ -744,6 +797,9 @@ int main(int argc, char **argv)
> >               exit(EXIT_FAILURE);
> >       }
> >
> > +     if (opt_num_xsks > 1)
> > +             load_xdp_program(argv, &obj);
> > +
> >       /* Reserve memory for the umem. Use hugepages if unaligned chunk mode */
> >       bufs = mmap(NULL, NUM_FRAMES * opt_xsk_frame_size,
> >                   PROT_READ | PROT_WRITE,
> > @@ -752,16 +808,17 @@ int main(int argc, char **argv)
> >               printf("ERROR: mmap failed\n");
> >               exit(EXIT_FAILURE);
> >       }
> > -       /* Create sockets... */
> > +
> > +     /* Create sockets... */
> >       umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size);
> > -     xsks[num_socks++] = xsk_configure_socket(umem);
> > +     for (i = 0; i < opt_num_xsks; i++)
> > +             xsks[num_socks++] = xsk_configure_socket(umem);
> >
> > -     if (opt_bench == BENCH_TXONLY) {
> > -             int i;
> > +     for (i = 0; i < NUM_FRAMES; i++)
> > +             gen_eth_frame(umem, i * opt_xsk_frame_size);
> >
> > -             for (i = 0; i < NUM_FRAMES; i++)
> > -                     (void)gen_eth_frame(umem, i * opt_xsk_frame_size);
> > -     }
> > +     if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY)
> > +             enter_xsks_into_map(obj);
> >
> >       signal(SIGINT, int_exit);
> >       signal(SIGTERM, int_exit);
> > --
> > 2.7.4
> >

* Re: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-08 18:19     ` Magnus Karlsson
@ 2019-11-08 18:43       ` William Tu
  2019-11-08 19:17         ` Magnus Karlsson
  0 siblings, 1 reply; 27+ messages in thread
From: William Tu @ 2019-11-08 18:43 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Network Development, Jonathan Lemon, bpf

On Fri, Nov 08, 2019 at 07:19:18PM +0100, Magnus Karlsson wrote:
> On Fri, Nov 8, 2019 at 7:03 PM William Tu <u9012063@gmail.com> wrote:
> >
> > Hi Magnus,
> >
> > Thanks for the patch.
> >
> > On Thu, Nov 07, 2019 at 06:47:36PM +0100, Magnus Karlsson wrote:
> > > Add support in libbpf to create multiple sockets that share a single
> > > umem. Note that an external XDP program need to be supplied that
> > > routes the incoming traffic to the desired sockets. So you need to
> > > supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
> > > your own XDP program.
> > >
> > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > ---
> > >  tools/lib/bpf/xsk.c | 27 +++++++++++++++++----------
> > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> > > index 86c1b61..8ebd810 100644
> > > --- a/tools/lib/bpf/xsk.c
> > > +++ b/tools/lib/bpf/xsk.c
> > > @@ -586,15 +586,21 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> > >       if (!umem || !xsk_ptr || !rx || !tx)
> > >               return -EFAULT;
> > >
> > > -     if (umem->refcount) {
> > > -             pr_warn("Error: shared umems not supported by libbpf.\n");
> > > -             return -EBUSY;
> > > -     }
> > > -
> > >       xsk = calloc(1, sizeof(*xsk));
> > >       if (!xsk)
> > >               return -ENOMEM;
> > >
> > > +     err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> > > +     if (err)
> > > +             goto out_xsk_alloc;
> > > +
> > > +     if (umem->refcount &&
> > > +         !(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
> > > +             pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");
> >
> > Why can't we use the existing default one in libbpf?
> > If users don't want to redistribute packet to different queue,
> > then they can still use the libbpf default one.
> 
> Is there any point in creating two or more sockets tied to the same
> umem and directing all traffic to just one socket? IMHO, I believe

When using the built-in XDP program, isn't the traffic directed to each
socket on its own queue (so not just to one xsk socket)?

So with the built-in XDP program and, for example, queue1/xsk1 and
queue2/xsk2 sharing one umem, both xsk1 and xsk2 receive packets from
their own queue.

> that most users in this case would want to distribute the packets over
> the sockets in some way. I also think that users might be unpleasantly
> surprised if they create multiple sockets and all packets only get to
> a single socket because libbpf loaded an XDP program that makes little
> sense in the XDP_SHARED_UMEM case. If we force them to supply an XDP

Do I misunderstand the code?
I looked at xsk_setup_xdp_prog, xsk_load_xdp_prog, and xsk_set_bpf_maps.
The built-in prog will distribute packets to different xsk sockets,
not a single socket.

> program, they need to make this decision. I also wanted to extend the
> sample with an explicit user loaded XDP program as an example of how
> to do this. What do you think?

Yes, I like it. It is like the previous version, which had
xdpsock_kern.c as an example for people to follow.

William

> 
> /Magnus
> 
> > William
> > > +             err = -EBUSY;
> > > +             goto out_xsk_alloc;
> > > +     }
> > > +
> > >       if (umem->refcount++ > 0) {
> > >               xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
> > >               if (xsk->fd < 0) {
> > > @@ -616,10 +622,6 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> > >       memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
> > >       xsk->ifname[IFNAMSIZ - 1] = '\0';
> > >
> > > -     err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> > > -     if (err)
> > > -             goto out_socket;
> > > -
> > >       if (rx) {
> > >               err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
> > >                                &xsk->config.rx_size,
> > > @@ -687,7 +689,12 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> > >       sxdp.sxdp_family = PF_XDP;
> > >       sxdp.sxdp_ifindex = xsk->ifindex;
> > >       sxdp.sxdp_queue_id = xsk->queue_id;
> > > -     sxdp.sxdp_flags = xsk->config.bind_flags;
> > > +     if (umem->refcount > 1) {
> > > +             sxdp.sxdp_flags = XDP_SHARED_UMEM;
> > > +             sxdp.sxdp_shared_umem_fd = umem->fd;
> > > +     } else {
> > > +             sxdp.sxdp_flags = xsk->config.bind_flags;
> > > +     }
> > >
> > >       err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
> > >       if (err) {
> > > --
> > > 2.7.4
> > >

* Re: [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock
  2019-11-08 18:33     ` Magnus Karlsson
@ 2019-11-08 19:09       ` William Tu
  0 siblings, 0 replies; 27+ messages in thread
From: William Tu @ 2019-11-08 19:09 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Network Development, Jonathan Lemon, bpf

On Fri, Nov 08, 2019 at 07:33:40PM +0100, Magnus Karlsson wrote:
> On Fri, Nov 8, 2019 at 7:15 PM William Tu <u9012063@gmail.com> wrote:
> >
> > On Thu, Nov 07, 2019 at 06:47:37PM +0100, Magnus Karlsson wrote:
> > > Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
> > > application. As libbpf does not have a built in XDP program for this
> > > mode, we use an explicitly loaded XDP program. This also serves as an
> > > example on how to write your own XDP program that can route to an
> > > AF_XDP socket.
> > >
> > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > ---
> > >  samples/bpf/Makefile       |   1 +
> > >  samples/bpf/xdpsock.h      |  11 ++++
> > >  samples/bpf/xdpsock_kern.c |  24 ++++++++
> > >  samples/bpf/xdpsock_user.c | 141 +++++++++++++++++++++++++++++++--------------
> > >  4 files changed, 135 insertions(+), 42 deletions(-)
> > >  create mode 100644 samples/bpf/xdpsock.h
> > >  create mode 100644 samples/bpf/xdpsock_kern.c
> > >
> > > diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> > > index 4df11dd..8a9af3a 100644
> > > --- a/samples/bpf/Makefile
> > > +++ b/samples/bpf/Makefile
> > > @@ -167,6 +167,7 @@ always += xdp_sample_pkts_kern.o
> > >  always += ibumad_kern.o
> > >  always += hbm_out_kern.o
> > >  always += hbm_edt_kern.o
> > > +always += xdpsock_kern.o
> > >
> > >  ifeq ($(ARCH), arm)
> > >  # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
> > > diff --git a/samples/bpf/xdpsock.h b/samples/bpf/xdpsock.h
> > > new file mode 100644
> > > index 0000000..b7eca15
> > > --- /dev/null
> > > +++ b/samples/bpf/xdpsock.h
> > > @@ -0,0 +1,11 @@
> > > +/* SPDX-License-Identifier: GPL-2.0
> > > + *
> > > + * Copyright(c) 2019 Intel Corporation.
> > > + */
> > > +
> > > +#ifndef XDPSOCK_H_
> > > +#define XDPSOCK_H_
> > > +
> > > +#define MAX_SOCKS 4
> > > +
> > > +#endif /* XDPSOCK_H */
> > > diff --git a/samples/bpf/xdpsock_kern.c b/samples/bpf/xdpsock_kern.c
> > > new file mode 100644
> > > index 0000000..a06177c
> > > --- /dev/null
> > > +++ b/samples/bpf/xdpsock_kern.c
> > > @@ -0,0 +1,24 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +#include <linux/bpf.h>
> > > +#include "bpf_helpers.h"
> > > +#include "xdpsock.h"
> > > +
> > > +/* This XDP program is only needed for the XDP_SHARED_UMEM mode.
> > > + * If you do not use this mode, libbpf can supply an XDP program for you.
> > > + */
> > > +
> > > +struct {
> > > +     __uint(type, BPF_MAP_TYPE_XSKMAP);
> > > +     __uint(max_entries, MAX_SOCKS);
> > > +     __uint(key_size, sizeof(int));
> > > +     __uint(value_size, sizeof(int));
> > > +} xsks_map SEC(".maps");
> > > +
> > > +static unsigned int rr;
> > > +
> > > +SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
> > > +{
> > > +     rr = (rr + 1) & (MAX_SOCKS - 1);
> > > +
> > > +     return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
> > > +}
> > > diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> > > index 405c4e0..d3dba93 100644
> > > --- a/samples/bpf/xdpsock_user.c
> > > +++ b/samples/bpf/xdpsock_user.c
> > > @@ -29,6 +29,7 @@
> > >
> > >  #include "libbpf.h"
> > >  #include "xsk.h"
> > > +#include "xdpsock.h"
> > >  #include <bpf/bpf.h>
> > >
> > >  #ifndef SOL_XDP
> > > @@ -47,7 +48,6 @@
> > >  #define BATCH_SIZE 64
> > >
> > >  #define DEBUG_HEXDUMP 0
> > > -#define MAX_SOCKS 8
> > >
> > >  typedef __u64 u64;
> > >  typedef __u32 u32;
> > > @@ -75,7 +75,8 @@ static u32 opt_xdp_bind_flags;
> > >  static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
> > >  static int opt_timeout = 1000;
> > >  static bool opt_need_wakeup = true;
> > > -static __u32 prog_id;
> > > +static u32 opt_num_xsks = 1;
> > > +static u32 prog_id;
> > >
> > >  struct xsk_umem_info {
> > >       struct xsk_ring_prod fq;
> > > @@ -179,7 +180,7 @@ static void *poller(void *arg)
> > >
> > >  static void remove_xdp_program(void)
> > >  {
> > > -     __u32 curr_prog_id = 0;
> > > +     u32 curr_prog_id = 0;
> > >
> > >       if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) {
> > >               printf("bpf_get_link_xdp_id failed\n");
> > > @@ -196,11 +197,11 @@ static void remove_xdp_program(void)
> > >  static void int_exit(int sig)
> > >  {
> > >       struct xsk_umem *umem = xsks[0]->umem->umem;
> > > -
> > > -     (void)sig;
> > > +     int i;
> > >
> > >       dump_stats();
> > > -     xsk_socket__delete(xsks[0]->xsk);
> > > +     for (i = 0; i < num_socks; i++)
> > > +             xsk_socket__delete(xsks[i]->xsk);
> > >       (void)xsk_umem__delete(umem);
> > >       remove_xdp_program();
> > >
> > > @@ -290,8 +291,8 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
> > >               .frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
> > >               .flags = opt_umem_flags
> > >       };
> > > -
> > > -     int ret;
> > > +     int ret, i;
> > > +     u32 idx;
> > >
> > >       umem = calloc(1, sizeof(*umem));
> > >       if (!umem)
> > > @@ -303,6 +304,15 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
> > >       if (ret)
> > >               exit_with_error(-ret);
> > >
> > > +     ret = xsk_ring_prod__reserve(&umem->fq,
> > > +                                  XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
> > > +     if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
> > > +             exit_with_error(-ret);
> > > +     for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
> > > +             *xsk_ring_prod__fill_addr(&umem->fq, idx++) =
> > > +                     i * opt_xsk_frame_size;
> > > +     xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
> > > +
> > >       umem->buffer = buffer;
> > >       return umem;
> > >  }
> > > @@ -312,8 +322,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> > >       struct xsk_socket_config cfg;
> > >       struct xsk_socket_info *xsk;
> > >       int ret;
> > > -     u32 idx;
> > > -     int i;
> > >
> > >       xsk = calloc(1, sizeof(*xsk));
> > >       if (!xsk)
> > > @@ -322,11 +330,15 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> > >       xsk->umem = umem;
> > >       cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
> > >       cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> > > -     cfg.libbpf_flags = 0;
> > > +     if (opt_num_xsks > 1)
> > > +             cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
> >
> > I think we can still load our own XDP program, and don't set
> > XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD.
> > So xsk_setup_xdp_prog() will find the loaded XDP program
> > and set the xsk map.
> 
> So what you are saying is that you would like libbpf to be smarter and
> insert the sockets into the xskmap automatically? Doable in the simple

Isn't it already there?
xsk_setup_xdp_prog()
    xsk_lookup_bpf_maps() (it searches for "xsks_map")
    xsk_set_bpf_maps()
        insert socket into xskmap.

> case, but what if the XDP program has multiple xskmaps, or an xskmap
> with a different name? Seems complicated to do this in the general
> case. Or maybe I am just chicken to say the user has to load and
> manage his/her own XDP program when XDP_SHARED_UMEM is used :-).

Yes, if the name is different, then users have to program the xskmap manually.

--William
> 
> > > +     else
> > > +             cfg.libbpf_flags = 0;
> > >       cfg.xdp_flags = opt_xdp_flags;
> > >       cfg.bind_flags = opt_xdp_bind_flags;
> >
> > Do we need to
> > cfg.bind_flags |= XDP_SHARED_UMEM?
> 
> It is set by libbpf automatically, so no need here.
> 
> > Thanks
> > William
> >
> > > -     ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
> > > -                              &xsk->rx, &xsk->tx, &cfg);
> > > +
> > > +     ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue,
> > > +                              umem->umem, &xsk->rx, &xsk->tx, &cfg);
> > >       if (ret)
> > >               exit_with_error(-ret);
> > >
> > > @@ -334,17 +346,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> > >       if (ret)
> > >               exit_with_error(-ret);
> > >
> > > -     ret = xsk_ring_prod__reserve(&xsk->umem->fq,
> > > -                                  XSK_RING_PROD__DEFAULT_NUM_DESCS,
> > > -                                  &idx);
> > > -     if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
> > > -             exit_with_error(-ret);
> > > -     for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
> > > -             *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx++) =
> > > -                     i * opt_xsk_frame_size;
> > > -     xsk_ring_prod__submit(&xsk->umem->fq,
> > > -                           XSK_RING_PROD__DEFAULT_NUM_DESCS);
> > > -
> > >       return xsk;
> > >  }
> > >
> > > @@ -363,6 +364,7 @@ static struct option long_options[] = {
> > >       {"frame-size", required_argument, 0, 'f'},
> > >       {"no-need-wakeup", no_argument, 0, 'm'},
> > >       {"unaligned", no_argument, 0, 'u'},
> > > +     {"shared-umem", no_argument, 0, 'M'},
> > >       {0, 0, 0, 0}
> > >  };
> > >
> > > @@ -386,6 +388,7 @@ static void usage(const char *prog)
> > >               "  -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
> > >               "  -f, --frame-size=n   Set the frame size (must be a power of two in aligned mode, default is %d).\n"
> > >               "  -u, --unaligned      Enable unaligned chunk placement\n"
> > > +             "  -M, --shared-umem    Enable XDP_SHARED_UMEM\n"
> > >               "\n";
> > >       fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE);
> > >       exit(EXIT_FAILURE);
> > > @@ -398,7 +401,7 @@ static void parse_command_line(int argc, char **argv)
> > >       opterr = 0;
> > >
> > >       for (;;) {
> > > -             c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:mu",
> > > +             c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:muM",
> > >                               long_options, &option_index);
> > >               if (c == -1)
> > >                       break;
> > > @@ -448,11 +451,14 @@ static void parse_command_line(int argc, char **argv)
> > >                       break;
> > >               case 'f':
> > >                       opt_xsk_frame_size = atoi(optarg);
> > > +                     break;
> > >               case 'm':
> > >                       opt_need_wakeup = false;
> > >                       opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP;
> > >                       break;
> > > -
> > > +             case 'M':
> > > +                     opt_num_xsks = MAX_SOCKS;
> > > +                     break;
> > >               default:
> > >                       usage(basename(argv[0]));
> > >               }
> > > @@ -586,11 +592,9 @@ static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds)
> > >
> > >  static void rx_drop_all(void)
> > >  {
> > > -     struct pollfd fds[MAX_SOCKS + 1];
> > > +     struct pollfd fds[MAX_SOCKS] = {};
> > >       int i, ret;
> > >
> > > -     memset(fds, 0, sizeof(fds));
> > > -
> > >       for (i = 0; i < num_socks; i++) {
> > >               fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> > >               fds[i].events = POLLIN;
> > > @@ -633,11 +637,10 @@ static void tx_only(struct xsk_socket_info *xsk, u32 frame_nb)
> > >
> > >  static void tx_only_all(void)
> > >  {
> > > -     struct pollfd fds[MAX_SOCKS];
> > > +     struct pollfd fds[MAX_SOCKS] = {};
> > >       u32 frame_nb[MAX_SOCKS] = {};
> > >       int i, ret;
> > >
> > > -     memset(fds, 0, sizeof(fds));
> > >       for (i = 0; i < num_socks; i++) {
> > >               fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
> > >               fds[0].events = POLLOUT;
> > > @@ -706,11 +709,9 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
> > >
> > >  static void l2fwd_all(void)
> > >  {
> > > -     struct pollfd fds[MAX_SOCKS];
> > > +     struct pollfd fds[MAX_SOCKS] = {};
> > >       int i, ret;
> > >
> > > -     memset(fds, 0, sizeof(fds));
> > > -
> > >       for (i = 0; i < num_socks; i++) {
> > >               fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> > >               fds[i].events = POLLOUT | POLLIN;
> > > @@ -728,13 +729,65 @@ static void l2fwd_all(void)
> > >       }
> > >  }
> > >
> > > +static void load_xdp_program(char **argv, struct bpf_object **obj)
> > > +{
> > > +     struct bpf_prog_load_attr prog_load_attr = {
> > > +             .prog_type      = BPF_PROG_TYPE_XDP,
> > > +     };
> > > +     char xdp_filename[256];
> > > +     int prog_fd;
> > > +
> > > +     snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]);
> > > +     prog_load_attr.file = xdp_filename;
> > > +
> > > +     if (bpf_prog_load_xattr(&prog_load_attr, obj, &prog_fd))
> > > +             exit(EXIT_FAILURE);
> > > +     if (prog_fd < 0) {
> > > +             fprintf(stderr, "ERROR: no program found: %s\n",
> > > +                     strerror(prog_fd));
> > > +             exit(EXIT_FAILURE);
> > > +     }
> > > +
> > > +     if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
> > > +             fprintf(stderr, "ERROR: link set xdp fd failed\n");
> > > +             exit(EXIT_FAILURE);
> > > +     }
> > > +}
> > > +
> > > +static void enter_xsks_into_map(struct bpf_object *obj)
> > > +{
> > > +     struct bpf_map *map;
> > > +     int i, xsks_map;
> > > +
> > > +     map = bpf_object__find_map_by_name(obj, "xsks_map");
> > > +     xsks_map = bpf_map__fd(map);
> > > +     if (xsks_map < 0) {
> > > +             fprintf(stderr, "ERROR: no xsks map found: %s\n",
> > > +                     strerror(xsks_map));
> > > +                     exit(EXIT_FAILURE);
> > > +     }
> > > +
> > > +     for (i = 0; i < num_socks; i++) {
> > > +             int fd = xsk_socket__fd(xsks[i]->xsk);
> > > +             int key, ret;
> > > +
> > > +             key = i;
> > > +             ret = bpf_map_update_elem(xsks_map, &key, &fd, 0);
> > > +             if (ret) {
> > > +                     fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
> > > +                     exit(EXIT_FAILURE);
> > > +             }
> > > +     }
> > > +}
> > > +
> > >  int main(int argc, char **argv)
> > >  {
> > >       struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
> > >       struct xsk_umem_info *umem;
> > > +     struct bpf_object *obj;
> > >       pthread_t pt;
> > > +     int i, ret;
> > >       void *bufs;
> > > -     int ret;
> > >
> > >       parse_command_line(argc, argv);
> > >
> > > @@ -744,6 +797,9 @@ int main(int argc, char **argv)
> > >               exit(EXIT_FAILURE);
> > >       }
> > >
> > > +     if (opt_num_xsks > 1)
> > > +             load_xdp_program(argv, &obj);
> > > +
> > >       /* Reserve memory for the umem. Use hugepages if unaligned chunk mode */
> > >       bufs = mmap(NULL, NUM_FRAMES * opt_xsk_frame_size,
> > >                   PROT_READ | PROT_WRITE,
> > > @@ -752,16 +808,17 @@ int main(int argc, char **argv)
> > >               printf("ERROR: mmap failed\n");
> > >               exit(EXIT_FAILURE);
> > >       }
> > > -       /* Create sockets... */
> > > +
> > > +     /* Create sockets... */
> > >       umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size);
> > > -     xsks[num_socks++] = xsk_configure_socket(umem);
> > > +     for (i = 0; i < opt_num_xsks; i++)
> > > +             xsks[num_socks++] = xsk_configure_socket(umem);
> > >
> > > -     if (opt_bench == BENCH_TXONLY) {
> > > -             int i;
> > > +     for (i = 0; i < NUM_FRAMES; i++)
> > > +             gen_eth_frame(umem, i * opt_xsk_frame_size);
> > >
> > > -             for (i = 0; i < NUM_FRAMES; i++)
> > > -                     (void)gen_eth_frame(umem, i * opt_xsk_frame_size);
> > > -     }
> > > +     if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY)
> > > +             enter_xsks_into_map(obj);
> > >
> > >       signal(SIGINT, int_exit);
> > >       signal(SIGTERM, int_exit);
> > > --
> > > 2.7.4
> > >

* Re: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-08 18:43       ` William Tu
@ 2019-11-08 19:17         ` Magnus Karlsson
  2019-11-08 22:31           ` William Tu
  0 siblings, 1 reply; 27+ messages in thread
From: Magnus Karlsson @ 2019-11-08 19:17 UTC (permalink / raw)
  To: William Tu
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Network Development, Jonathan Lemon, bpf

On Fri, Nov 8, 2019 at 7:43 PM William Tu <u9012063@gmail.com> wrote:
>
> On Fri, Nov 08, 2019 at 07:19:18PM +0100, Magnus Karlsson wrote:
> > On Fri, Nov 8, 2019 at 7:03 PM William Tu <u9012063@gmail.com> wrote:
> > >
> > > Hi Magnus,
> > >
> > > Thanks for the patch.
> > >
> > > On Thu, Nov 07, 2019 at 06:47:36PM +0100, Magnus Karlsson wrote:
> > > > Add support in libbpf to create multiple sockets that share a single
> > > > umem. Note that an external XDP program need to be supplied that
> > > > routes the incoming traffic to the desired sockets. So you need to
> > > > supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
> > > > your own XDP program.
> > > >
> > > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > > ---
> > > >  tools/lib/bpf/xsk.c | 27 +++++++++++++++++----------
> > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> > > > index 86c1b61..8ebd810 100644
> > > > --- a/tools/lib/bpf/xsk.c
> > > > +++ b/tools/lib/bpf/xsk.c
> > > > @@ -586,15 +586,21 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> > > >       if (!umem || !xsk_ptr || !rx || !tx)
> > > >               return -EFAULT;
> > > >
> > > > -     if (umem->refcount) {
> > > > -             pr_warn("Error: shared umems not supported by libbpf.\n");
> > > > -             return -EBUSY;
> > > > -     }
> > > > -
> > > >       xsk = calloc(1, sizeof(*xsk));
> > > >       if (!xsk)
> > > >               return -ENOMEM;
> > > >
> > > > +     err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> > > > +     if (err)
> > > > +             goto out_xsk_alloc;
> > > > +
> > > > +     if (umem->refcount &&
> > > > +         !(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
> > > > +             pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");
> > >
> > > Why can't we use the existing default one in libbpf?
> > > If users don't want to redistribute packet to different queue,
> > > then they can still use the libbpf default one.
> >
> > Is there any point in creating two or more sockets tied to the same
> > umem and directing all traffic to just one socket? IMHO, I believe
>
> When using the built-in XDP program, isn't the traffic directed to each
> socket on its own queue (so not just to one xsk socket)?
>
> So with the built-in XDP program and, for example, queue1/xsk1 and
> queue2/xsk2 sharing one umem, both xsk1 and xsk2 receive packets from
> their own queue.

With the XDP_SHARED_UMEM flag this is not allowed. In your example,
queue1/xsk1 and queue1/xsk2 would be allowed. All sockets need to be
tied to the same queue id if they share a umem. In this case an XDP
program has to decide how to distribute the packets since they all
arrive on the same queue.

If you want queue1/xsk1 and queue2/xsk2 you need separate umems since
it would otherwise violate the SPSC requirement of the rings. The
alternative would be to implement MPSC and SPMC queues for this
configuration.
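
For illustration only, here is a rough userspace model of that policy
decision. All sockets sharing the umem sit on one queue, so something
has to pick a socket for each packet. The sketch mirrors the
round-robin choice made by xdp_sock_prog in patch 2. The names
pick_socket_rr and NUM_SHARED_SOCKS are hypothetical, and real code
would run as an XDP program that calls bpf_redirect_map() instead of
plain C.

```c
#include <assert.h>

/* Hypothetical: number of sockets sharing one umem on one queue.
 * Must be a power of two for the mask below to work.
 */
#define NUM_SHARED_SOCKS 4

/* Advance the round-robin cursor and return the xskmap index that the
 * next packet would be redirected to.
 */
static unsigned int pick_socket_rr(unsigned int *cursor)
{
	assert((NUM_SHARED_SOCKS & (NUM_SHARED_SOCKS - 1)) == 0);
	*cursor = (*cursor + 1) & (NUM_SHARED_SOCKS - 1);
	return *cursor;
}
```

Any other policy (hashing on flow, steering on a header field) would
only change the body of this function, which is why no default is
built into libbpf for this mode.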

> > that most users in this case would want to distribute the packets over
> > the sockets in some way. I also think that users might be unpleasantly
> > surprised if they create multiple sockets and all packets only get to
> > a single socket because libbpf loaded an XDP program that makes little
> > sense in the XDP_SHARED_UMEM case. If we force them to supply an XDP
>
> Do I misunderstand the code?
> I looked at xsk_setup_xdp_prog, xsk_load_xdp_prog, and xsk_set_bpf_maps.
> The built-in prog will distribute packets to different xsk sockets,
> not a single socket.

True, but only for the case above (queue1/xsk1 and queue2/xsk2) where
they have separate umems. For the queue1/xsk1 and queue1/xsk2 case, it
would send everything to xsk1.
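
To make the reason concrete, here is a small userspace model. It
assumes the default program looks up the xskmap using only the Rx
queue index as key, which is my reading of it and not something
spelled out in this thread. With one slot per queue, at most one
socket per queue can ever be the redirect target. The names
model_xskmap, model_map_init and model_default_prog are hypothetical.

```c
#include <stddef.h>

#define MODEL_MAX_QUEUES 8	/* hypothetical map size */
#define MODEL_NO_SOCK	 (-1)	/* empty slot: no socket on this queue */

/* Hypothetical stand-in for an xskmap keyed by Rx queue id. */
struct model_xskmap {
	int sock_fd[MODEL_MAX_QUEUES];
};

/* Mark every queue slot as empty. */
static void model_map_init(struct model_xskmap *m)
{
	for (size_t i = 0; i < MODEL_MAX_QUEUES; i++)
		m->sock_fd[i] = MODEL_NO_SOCK;
}

/* Model of the assumed default policy: the only lookup key is the queue
 * the packet arrived on, so there is one possible target per queue.
 */
static int model_default_prog(const struct model_xskmap *m,
			      unsigned int rx_queue)
{
	if (rx_queue >= MODEL_MAX_QUEUES)
		return MODEL_NO_SOCK;
	return m->sock_fd[rx_queue];
}
```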

/Magnus

> > program, they need to make this decision. I also wanted to extend the
> > sample with an explicit user loaded XDP program as an example of how
> > to do this. What do you think?
>
> Yes, I like it. It is like the previous version, which had
> xdpsock_kern.c as an example for people to follow.
>
> William
>
> >
> > /Magnus
> >
> > > William
> > > > +             err = -EBUSY;
> > > > +             goto out_xsk_alloc;
> > > > +     }
> > > > +
> > > >       if (umem->refcount++ > 0) {
> > > >               xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
> > > >               if (xsk->fd < 0) {
> > > > @@ -616,10 +622,6 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> > > >       memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
> > > >       xsk->ifname[IFNAMSIZ - 1] = '\0';
> > > >
> > > > -     err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> > > > -     if (err)
> > > > -             goto out_socket;
> > > > -
> > > >       if (rx) {
> > > >               err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
> > > >                                &xsk->config.rx_size,
> > > > @@ -687,7 +689,12 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> > > >       sxdp.sxdp_family = PF_XDP;
> > > >       sxdp.sxdp_ifindex = xsk->ifindex;
> > > >       sxdp.sxdp_queue_id = xsk->queue_id;
> > > > -     sxdp.sxdp_flags = xsk->config.bind_flags;
> > > > +     if (umem->refcount > 1) {
> > > > +             sxdp.sxdp_flags = XDP_SHARED_UMEM;
> > > > +             sxdp.sxdp_shared_umem_fd = umem->fd;
> > > > +     } else {
> > > > +             sxdp.sxdp_flags = xsk->config.bind_flags;
> > > > +     }
> > > >
> > > >       err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
> > > >       if (err) {
> > > > --
> > > > 2.7.4
> > > >

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-08 19:17         ` Magnus Karlsson
@ 2019-11-08 22:31           ` William Tu
  0 siblings, 0 replies; 27+ messages in thread
From: William Tu @ 2019-11-08 22:31 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Network Development, Jonathan Lemon, bpf

On Fri, Nov 08, 2019 at 08:17:53PM +0100, Magnus Karlsson wrote:
> On Fri, Nov 8, 2019 at 7:43 PM William Tu <u9012063@gmail.com> wrote:
> >
> > On Fri, Nov 08, 2019 at 07:19:18PM +0100, Magnus Karlsson wrote:
> > > On Fri, Nov 8, 2019 at 7:03 PM William Tu <u9012063@gmail.com> wrote:
> > > >
> > > > Hi Magnus,
> > > >
> > > > Thanks for the patch.
> > > >
> > > > On Thu, Nov 07, 2019 at 06:47:36PM +0100, Magnus Karlsson wrote:
> > > > > Add support in libbpf to create multiple sockets that share a single
> > > > > umem. Note that an external XDP program needs to be supplied that
> > > > > routes the incoming traffic to the desired sockets. So you need to
> > > > > supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
> > > > > your own XDP program.
> > > > >
> > > > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > > > ---
> > > > >  tools/lib/bpf/xsk.c | 27 +++++++++++++++++----------
> > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > > >
> > > > > diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> > > > > index 86c1b61..8ebd810 100644
> > > > > --- a/tools/lib/bpf/xsk.c
> > > > > +++ b/tools/lib/bpf/xsk.c
> > > > > @@ -586,15 +586,21 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
> > > > >       if (!umem || !xsk_ptr || !rx || !tx)
> > > > >               return -EFAULT;
> > > > >
> > > > > -     if (umem->refcount) {
> > > > > -             pr_warn("Error: shared umems not supported by libbpf.\n");
> > > > > -             return -EBUSY;
> > > > > -     }
> > > > > -
> > > > >       xsk = calloc(1, sizeof(*xsk));
> > > > >       if (!xsk)
> > > > >               return -ENOMEM;
> > > > >
> > > > > +     err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
> > > > > +     if (err)
> > > > > +             goto out_xsk_alloc;
> > > > > +
> > > > > +     if (umem->refcount &&
> > > > > +         !(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
> > > > > +             pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");
> > > >
> > > > Why can't we use the existing default one in libbpf?
> > > > If users don't want to redistribute packets to different queues,
> > > > then they can still use the libbpf default one.
> > >
> > > Is there any point in creating two or more sockets tied to the same
> > > umem and directing all traffic to just one socket? IMHO, I believe
> >
> > When using built-in XDP, isn't the traffic being directed to its
> > own xsk on its queue? (so not just one xsk socket)
> >
> > So using built-in XDP, for example, queue1/xsk1 and queue2/xsk2, and
> > sharing one umem. Both xsk1 and xsk2 receive packets from their queue.
> 
> With the XDP_SHARED_UMEM flag this is not allowed. In your example,
> queue1/xsk1 and queue1/xsk2 would be allowed. All sockets need to be
> tied to the same queue id if they share a umem. In this case an XDP
> program has to decide how to distribute the packets since they all
> arrive on the same queue.
> 
> If you want queue1/xsk1 and queue2/xsk2 you need separate umems since
> it would otherwise violate the SPSC requirement of the rings. Or
> implement MPSC and SPMC queues to be used in this configuration.
> 
> > > that most users in this case would want to distribute the packets over
> > > the sockets in some way. I also think that users might be unpleasantly
> > > surprised if they create multiple sockets and all packets only get to
> > > a single socket because libbpf loaded an XDP program that makes little
> > > sense in the XDP_SHARED_UMEM case. If we force them to supply an XDP
> >
> > Do I misunderstand the code?
> > I looked at xsk_setup_xdp_prog, xsk_load_xdp_prog, and xsk_set_bpf_maps.
> > The built-in prog will distribute packets to different xsk sockets,
> > not a single socket.
> 
> True, but only for the case above (queue1/xsk1 and queue2/xsk2) where
> they have separate umems. For the queue1/xsk1 and queue1/xsk2 case, it
> would send everything to xsk1.
> 
> /Magnus

Hi Magnus,

Thanks for your explanation. Now I understand.

William


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-07 17:47 ` [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program Magnus Karlsson
  2019-11-08 18:03   ` William Tu
@ 2019-11-08 22:56   ` Jonathan Lemon
  2019-11-10 18:34     ` William Tu
  1 sibling, 1 reply; 27+ messages in thread
From: Jonathan Lemon @ 2019-11-08 22:56 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, u9012063, bpf



On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:

> Add support in libbpf to create multiple sockets that share a single
> umem. Note that an external XDP program needs to be supplied that
> routes the incoming traffic to the desired sockets. So you need to
> supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
> your own XDP program.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock
  2019-11-07 17:47 ` [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock Magnus Karlsson
  2019-11-08 18:13   ` William Tu
@ 2019-11-08 22:59   ` Jonathan Lemon
  2019-11-10 18:34     ` William Tu
  1 sibling, 1 reply; 27+ messages in thread
From: Jonathan Lemon @ 2019-11-08 22:59 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, u9012063, bpf



On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:

> Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
> application. As libbpf does not have a built in XDP program for this
> mode, we use an explicitly loaded XDP program. This also serves as an
> example on how to write your own XDP program that can route to an
> AF_XDP socket.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets
  2019-11-07 17:47 ` [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets Magnus Karlsson
@ 2019-11-08 23:00   ` Jonathan Lemon
  2019-11-10 18:34     ` William Tu
  0 siblings, 1 reply; 27+ messages in thread
From: Jonathan Lemon @ 2019-11-08 23:00 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, u9012063, bpf



On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:

> The libbpf AF_XDP code is extended to allow for the creation of Rx
> only or Tx only sockets. Previously it returned an error if the socket
> was not initialized for both Rx and Tx.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock
  2019-11-07 17:47 ` [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock Magnus Karlsson
@ 2019-11-08 23:02   ` Jonathan Lemon
  2019-11-10 18:34     ` William Tu
  0 siblings, 1 reply; 27+ messages in thread
From: Jonathan Lemon @ 2019-11-08 23:02 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, u9012063, bpf



On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:

> Use Rx-only sockets for the rxdrop sample and Tx-only sockets for the
> txpush sample in the xdpsock application. This is so that we exercise and
> showcase these socket types too.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems
  2019-11-07 17:47 ` [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems Magnus Karlsson
@ 2019-11-08 23:03   ` Jonathan Lemon
  2019-11-10 18:35     ` William Tu
  0 siblings, 1 reply; 27+ messages in thread
From: Jonathan Lemon @ 2019-11-08 23:03 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev, u9012063, bpf



On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:

> Add more documentation about the new Rx-only and Tx-only sockets in
> libbpf and also how libbpf can now support shared umems. Also found
> two pieces that could be improved in the text, that got fixed in this
> commit.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
  2019-11-08 22:56   ` Jonathan Lemon
@ 2019-11-10 18:34     ` William Tu
  0 siblings, 0 replies; 27+ messages in thread
From: William Tu @ 2019-11-10 18:34 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Linux Kernel Network Developers, bpf

On Fri, Nov 8, 2019 at 2:56 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>
>
> On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:
>
> > Add support in libbpf to create multiple sockets that share a single
> > umem. Note that an external XDP program needs to be supplied that
> > routes the incoming traffic to the desired sockets. So you need to
> > supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load
> > your own XDP program.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>
> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

Tested-by: William Tu <u9012063@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock
  2019-11-08 22:59   ` Jonathan Lemon
@ 2019-11-10 18:34     ` William Tu
  0 siblings, 0 replies; 27+ messages in thread
From: William Tu @ 2019-11-10 18:34 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Linux Kernel Network Developers, bpf

On Fri, Nov 8, 2019 at 2:59 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>
>
> On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:
>
> > Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
> > application. As libbpf does not have a built in XDP program for this
> > mode, we use an explicitly loaded XDP program. This also serves as an
> > example on how to write your own XDP program that can route to an
> > AF_XDP socket.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>
> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

Tested-by: William Tu <u9012063@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets
  2019-11-08 23:00   ` Jonathan Lemon
@ 2019-11-10 18:34     ` William Tu
  0 siblings, 0 replies; 27+ messages in thread
From: William Tu @ 2019-11-10 18:34 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Linux Kernel Network Developers, bpf

On Fri, Nov 8, 2019 at 3:00 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>
>
> On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:
>
> > The libbpf AF_XDP code is extended to allow for the creation of Rx
> > only or Tx only sockets. Previously it returned an error if the socket
> > was not initialized for both Rx and Tx.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>
> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

Tested-by: William Tu <u9012063@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock
  2019-11-08 23:02   ` Jonathan Lemon
@ 2019-11-10 18:34     ` William Tu
  0 siblings, 0 replies; 27+ messages in thread
From: William Tu @ 2019-11-10 18:34 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Linux Kernel Network Developers, bpf

On Fri, Nov 8, 2019 at 3:02 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>
>
> On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:
>
> > Use Rx-only sockets for the rxdrop sample and Tx-only sockets for the
> > txpush sample in the xdpsock application. This is so that we exercise and
> > showcase these socket types too.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>
> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

Tested-by: William Tu <u9012063@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems
  2019-11-08 23:03   ` Jonathan Lemon
@ 2019-11-10 18:35     ` William Tu
  0 siblings, 0 replies; 27+ messages in thread
From: William Tu @ 2019-11-10 18:35 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
	Daniel Borkmann, Linux Kernel Network Developers, bpf

On Fri, Nov 8, 2019 at 3:03 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>
>
> On 7 Nov 2019, at 9:47, Magnus Karlsson wrote:
>
> > Add more documentation about the new Rx-only and Tx-only sockets in
> > libbpf and also how libbpf can now support shared umems. Also found
> > two pieces that could be improved in the text, that got fixed in this
> > commit.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>
> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

Tested-by: William Tu <u9012063@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets
  2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
                   ` (5 preceding siblings ...)
  2019-11-08 14:57 ` [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets William Tu
@ 2019-11-11  3:32 ` Alexei Starovoitov
  6 siblings, 0 replies; 27+ messages in thread
From: Alexei Starovoitov @ 2019-11-11  3:32 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: Björn Töpel, Alexei Starovoitov, Daniel Borkmann,
	Network Development, Jonathan Lemon, William Tu, bpf

On Thu, Nov 7, 2019 at 9:48 AM Magnus Karlsson
<magnus.karlsson@intel.com> wrote:
>
> This patch set extends libbpf and the xdpsock sample program to
> demonstrate the shared umem mode (XDP_SHARED_UMEM) as well as Rx-only
> and Tx-only sockets. This is in order for users to have an example to use
> as a blueprint and also so that these modes will be exercised more
> frequently.
>
> Note that the user needs to supply an XDP program with the
> XDP_SHARED_UMEM mode that distributes the packets over the sockets
> according to some policy. There is an example supplied with the
> xdpsock program, but libbpf has no default one, unlike the case
> when XDP_SHARED_UMEM is not used. The reason for this is that I felt
> that supplying one that would work for all users in this mode is
> futile. There are just tons of ways to distribute packets, so whatever
> I come up with and build into libbpf would be wrong in most cases.
>
> This patch has been applied against commit 30ee348c1267 ("Merge branch 'bpf-libbpf-fixes'")
>
> Structure of the patch set:
>
> Patch 1: Adds shared umem support to libbpf
> Patch 2: Shared umem support and example XDP program added to xdpsock sample
> Patch 3: Adds Rx-only and Tx-only support to libbpf
> Patch 4: Uses Rx-only sockets for rxdrop and Tx-only sockets for txpush in
>          the xdpsock sample
> Patch 5: Add documentation entries for these two features

Applied. Thanks

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-11-11  3:32 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
2019-11-07 17:47 [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets Magnus Karlsson
2019-11-07 17:47 ` [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program Magnus Karlsson
2019-11-08 18:03   ` William Tu
2019-11-08 18:19     ` Magnus Karlsson
2019-11-08 18:43       ` William Tu
2019-11-08 19:17         ` Magnus Karlsson
2019-11-08 22:31           ` William Tu
2019-11-08 22:56   ` Jonathan Lemon
2019-11-10 18:34     ` William Tu
2019-11-07 17:47 ` [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock Magnus Karlsson
2019-11-08 18:13   ` William Tu
2019-11-08 18:33     ` Magnus Karlsson
2019-11-08 19:09       ` William Tu
2019-11-08 22:59   ` Jonathan Lemon
2019-11-10 18:34     ` William Tu
2019-11-07 17:47 ` [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets Magnus Karlsson
2019-11-08 23:00   ` Jonathan Lemon
2019-11-10 18:34     ` William Tu
2019-11-07 17:47 ` [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock Magnus Karlsson
2019-11-08 23:02   ` Jonathan Lemon
2019-11-10 18:34     ` William Tu
2019-11-07 17:47 ` [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems Magnus Karlsson
2019-11-08 23:03   ` Jonathan Lemon
2019-11-10 18:35     ` William Tu
2019-11-08 14:57 ` [PATCH bpf-next 0/5] Extend libbpf to support shared umems and Rx|Tx-only sockets William Tu
2019-11-08 18:09   ` Magnus Karlsson
2019-11-11  3:32 ` Alexei Starovoitov
