* [PATCH bpf-next v4 0/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN
@ 2021-06-04 22:02 Zvi Effron
  2021-06-04 22:02 ` [PATCH bpf-next v4 1/3] " Zvi Effron
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Zvi Effron @ 2021-06-04 22:02 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Zvi Effron

This patchset adds support for passing an xdp_md via ctx_in/ctx_out in
bpf_attr for BPF_PROG_TEST_RUN of XDP programs.

Patch 1 adds initial support for passing XDP metadata in addition to
packet data.

Patch 2 adds support for also specifying the ingress interface and
rx queue.

Patch 3 adds selftests to ensure functionality is correct.
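
For reference, a minimal sketch of how a userspace caller would use the new
interface, based on the libbpf bpf_test_run_opts API exercised by the
selftests in patch 3 (prog_fd and pkt_v4 are placeholders borrowed from that
test):

	struct xdp_md ctx_in = {}, ctx_out = {};
	/* 4 bytes of metadata placed in front of the packet payload */
	char data[sizeof(pkt_v4) + sizeof(__u32)];
	int err;

	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
			    .data_in = data,
			    .data_size_in = sizeof(data),
			    .ctx_in = &ctx_in,
			    .ctx_size_in = sizeof(ctx_in),
			    .ctx_out = &ctx_out,
			    .ctx_size_out = sizeof(ctx_out),
			    .repeat = 1,
	);

	ctx_in.data = sizeof(__u32);	/* metadata ends where packet data begins */
	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
	err = bpf_prog_test_run_opts(prog_fd, &opts);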

Changelog:
----------
v3->v4
v3: https://lore.kernel.org/bpf/20210602190815.8096-1-zeffron@riotgames.com/

 * Clean up nits
 * Validate xdp_md->data_end in bpf_prog_test_run_xdp
 * Remove intermediate metalen variables

v2 -> v3
v2: https://lore.kernel.org/bpf/20210527201341.7128-1-zeffron@riotgames.com/

 * Check errno first in selftests
 * Use DECLARE_LIBBPF_OPTS
 * Rename tattr to opts in selftests
 * Remove extra new line
 * Rename convert_xdpmd_to_xdpb to xdp_convert_md_to_buff
 * Rename convert_xdpb_to_xdpmd to xdp_convert_buff_to_md
 * Move declaration of device and rxqueue in xdp_convert_md_to_buff to
  patch 2
 * Reorder the kfree calls in bpf_prog_test_run_xdp

v1 -> v2
v1: https://lore.kernel.org/bpf/20210524220555.251473-1-zeffron@riotgames.com

 * Fix null pointer dereference with no context
 * Use the BPF skeleton and replace CHECK with ASSERT macros

Zvi Effron (3):
  bpf: support input xdp_md context in BPF_PROG_TEST_RUN
  bpf: support specifying ingress via xdp_md context in
    BPF_PROG_TEST_RUN
  selftests/bpf: Add test for xdp_md context in BPF_PROG_TEST_RUN

 include/uapi/linux/bpf.h                      |   3 -
 net/bpf/test_run.c                            |  91 ++++++++++++--
 .../bpf/prog_tests/xdp_context_test_run.c     | 114 ++++++++++++++++++
 .../bpf/progs/test_xdp_context_test_run.c     |  20 +++
 4 files changed, 218 insertions(+), 10 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c


base-commit: 56b8b7f9533b5c40cbc1266b5cc6a3b19dfd2aad
-- 
2.31.1



* [PATCH bpf-next v4 1/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN
  2021-06-04 22:02 [PATCH bpf-next v4 0/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN Zvi Effron
@ 2021-06-04 22:02 ` Zvi Effron
  2021-06-06  3:17   ` Yonghong Song
  2021-06-04 22:02 ` [PATCH bpf-next v4 2/3] bpf: support specifying ingress via " Zvi Effron
  2021-06-04 22:02 ` [PATCH bpf-next v4 3/3] selftests/bpf: Add test for " Zvi Effron
  2 siblings, 1 reply; 13+ messages in thread
From: Zvi Effron @ 2021-06-04 22:02 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Zvi Effron, Cody Haas, Lisa Watanabe

Support passing an xdp_md via ctx_in/ctx_out in bpf_attr for
BPF_PROG_TEST_RUN.

The intended use case is to pass some XDP metadata to the test runs of
XDP programs that are used as tail calls.

For programs that use bpf_prog_test_run_xdp, support xdp_md input and
output. Unlike with an actual xdp_md during a non-test run, data_meta must
be 0 because it must point to the start of the provided user data. From
the initial xdp_md, use data and data_end to adjust the pointers in the
generated xdp_buff. All other non-zero fields are prohibited (with
EINVAL). If the user has set ctx_out/ctx_size_out, copy the (potentially
different) xdp_md back to the userspace.

We require all fields of input xdp_md except the ones we explicitly
support to be set to zero. The expectation is that in the future we might
add support for more fields and we want to fail explicitly if the user
runs the program on the kernel where we don't yet support them.
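
As an illustrative example (the sizes are made up): with 8 bytes of metadata
placed in front of a 100-byte packet in data_in, the input xdp_md would be

	ctx.data_meta = 0;	/* metadata starts at offset 0 of the user data */
	ctx.data      = 8;	/* packet data starts after the 8 metadata bytes */
	ctx.data_end  = 108;	/* metadata plus packet, i.e. data_size_in */

with all other fields left at zero.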

Co-developed-by: Cody Haas <chaas@riotgames.com>
Signed-off-by: Cody Haas <chaas@riotgames.com>
Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Zvi Effron <zeffron@riotgames.com>
---
 include/uapi/linux/bpf.h |  3 --
 net/bpf/test_run.c       | 77 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2c1ba70abbf1..a9dcf3d8c85a 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -324,9 +324,6 @@ union bpf_iter_link_info {
  *		**BPF_PROG_TYPE_SK_LOOKUP**
  *			*data_in* and *data_out* must be NULL.
  *
- *		**BPF_PROG_TYPE_XDP**
- *			*ctx_in* and *ctx_out* must be NULL.
- *
  *		**BPF_PROG_TYPE_RAW_TRACEPOINT**,
  *		**BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE**
  *
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index aa47af349ba8..698618f2b27e 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -687,6 +687,38 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 	return ret;
 }
 
+static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
+{
+	void *data;
+
+	if (!xdp_md)
+		return 0;
+
+	if (xdp_md->egress_ifindex != 0)
+		return -EINVAL;
+
+	if (xdp_md->data > xdp_md->data_end)
+		return -EINVAL;
+
+	xdp->data = xdp->data_meta + xdp_md->data;
+
+	if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
+		return -EINVAL;
+
+	return 0;
+}
+
+static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)
+{
+	if (!xdp_md)
+		return;
+
+	/* xdp_md->data_meta must always point to the start of the out buffer */
+	xdp_md->data_meta = 0;
+	xdp_md->data = xdp->data - xdp->data_meta;
+	xdp_md->data_end = xdp->data_end - xdp->data_meta;
+}
+
 int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 			  union bpf_attr __user *uattr)
 {
@@ -696,36 +728,68 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	u32 repeat = kattr->test.repeat;
 	struct netdev_rx_queue *rxqueue;
 	struct xdp_buff xdp = {};
+	struct xdp_md *ctx;
 	u32 retval, duration;
 	u32 max_data_sz;
 	void *data;
 	int ret;
 
-	if (kattr->test.ctx_in || kattr->test.ctx_out)
-		return -EINVAL;
+	ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	/* There can't be user provided data before the metadata */
+	if (ctx) {
+		if (ctx->data_meta)
+			return -EINVAL;
+		if (ctx->data_end != size)
+			return -EINVAL;
+		if (unlikely((ctx->data & (sizeof(__u32) - 1)) ||
+			     ctx->data > 32))
+			return -EINVAL;
+		/* Metadata is allocated from the headroom */
+		headroom -= ctx->data;
+	}
 
 	/* XDP have extra tailroom as (most) drivers use full page */
 	max_data_sz = 4096 - headroom - tailroom;
 
 	data = bpf_test_init(kattr, max_data_sz, headroom, tailroom);
-	if (IS_ERR(data))
+	if (IS_ERR(data)) {
+		kfree(ctx);
 		return PTR_ERR(data);
+	}
 
 	rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0);
 	xdp_init_buff(&xdp, headroom + max_data_sz + tailroom,
 		      &rxqueue->xdp_rxq);
 	xdp_prepare_buff(&xdp, data, headroom, size, true);
 
+	ret = xdp_convert_md_to_buff(&xdp, ctx);
+	if (ret) {
+		kfree(data);
+		kfree(ctx);
+		return ret;
+	}
+
 	bpf_prog_change_xdp(NULL, prog);
 	ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
 	if (ret)
 		goto out;
-	if (xdp.data != data + headroom || xdp.data_end != xdp.data + size)
-		size = xdp.data_end - xdp.data;
-	ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
+
+	if (xdp.data_meta != data + headroom || xdp.data_end != xdp.data_meta + size)
+		size = xdp.data_end - xdp.data_meta;
+
+	xdp_convert_buff_to_md(&xdp, ctx);
+
+	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, size, retval, duration);
+	if (!ret)
+		ret = bpf_ctx_finish(kattr, uattr, ctx,
+				     sizeof(struct xdp_md));
 out:
 	bpf_prog_change_xdp(prog, NULL);
 	kfree(data);
+	kfree(ctx);
 	return ret;
 }
 
@@ -809,7 +873,6 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	if (!ret)
 		ret = bpf_ctx_finish(kattr, uattr, user_ctx,
 				     sizeof(struct bpf_flow_keys));
-
 out:
 	kfree(user_ctx);
 	kfree(data);
-- 
2.31.1



* [PATCH bpf-next v4 2/3] bpf: support specifying ingress via xdp_md context in BPF_PROG_TEST_RUN
  2021-06-04 22:02 [PATCH bpf-next v4 0/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN Zvi Effron
  2021-06-04 22:02 ` [PATCH bpf-next v4 1/3] " Zvi Effron
@ 2021-06-04 22:02 ` Zvi Effron
  2021-06-06  3:36   ` Yonghong Song
  2021-06-04 22:02 ` [PATCH bpf-next v4 3/3] selftests/bpf: Add test for " Zvi Effron
  2 siblings, 1 reply; 13+ messages in thread
From: Zvi Effron @ 2021-06-04 22:02 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Zvi Effron, Cody Haas, Lisa Watanabe

Support specifying the ingress_ifindex and rx_queue_index of xdp_md
contexts for BPF_PROG_TEST_RUN.

The intended use case is to allow testing XDP programs that make decisions
based on the ingress interface or RX queue.

If ingress_ifindex is specified, look up the device by the provided index
in the current namespace and use its xdp_rxq for the xdp_buff. If the
rx_queue_index is out of range, or is non-zero when the ingress_ifindex is
0, return EINVAL.
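
For example, from userspace (ifindex 3 is a made-up example; the queue index
must be below that device's real_num_rx_queues):

	ctx_in.ingress_ifindex = 3;	/* looked up in the caller's network namespace */
	ctx_in.rx_queue_index = 0;	/* queue 0 exists on every device */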

Co-developed-by: Cody Haas <chaas@riotgames.com>
Signed-off-by: Cody Haas <chaas@riotgames.com>
Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Zvi Effron <zeffron@riotgames.com>
---
 net/bpf/test_run.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 698618f2b27e..3916205fc3d4 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -690,6 +690,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
 {
 	void *data;
+	struct net_device *device;
+	struct netdev_rx_queue *rxqueue;
 
 	if (!xdp_md)
 		return 0;
@@ -702,9 +704,21 @@ static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
 
 	xdp->data = xdp->data_meta + xdp_md->data;
 
-	if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
+	if (!xdp_md->ingress_ifindex && xdp_md->rx_queue_index)
 		return -EINVAL;
 
+	if (xdp_md->ingress_ifindex) {
+		device = dev_get_by_index(current->nsproxy->net_ns, xdp_md->ingress_ifindex);
+		if (!device)
+			return -EINVAL;
+
+		if (xdp_md->rx_queue_index >= device->real_num_rx_queues)
+			return -EINVAL;
+
+		rxqueue = __netif_get_rx_queue(device, xdp_md->rx_queue_index);
+		xdp->rxq = &rxqueue->xdp_rxq;
+	}
+
 	return 0;
 }
 
-- 
2.31.1



* [PATCH bpf-next v4 3/3] selftests/bpf: Add test for xdp_md context in BPF_PROG_TEST_RUN
  2021-06-04 22:02 [PATCH bpf-next v4 0/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN Zvi Effron
  2021-06-04 22:02 ` [PATCH bpf-next v4 1/3] " Zvi Effron
  2021-06-04 22:02 ` [PATCH bpf-next v4 2/3] bpf: support specifying ingress via " Zvi Effron
@ 2021-06-04 22:02 ` Zvi Effron
  2021-06-06  4:18   ` Yonghong Song
  2021-06-06  5:36   ` Yonghong Song
  2 siblings, 2 replies; 13+ messages in thread
From: Zvi Effron @ 2021-06-04 22:02 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Zvi Effron, Cody Haas, Lisa Watanabe

Add a test for using xdp_md as a context to BPF_PROG_TEST_RUN for XDP
programs.

The test uses a BPF program that takes in a return value from XDP
metadata, then reduces the size of the XDP metadata by 4 bytes.

Test cases validate the possible failure cases for passing in invalid
xdp_md contexts, that the return value is successfully passed
in, and that the adjusted metadata is successfully copied out.

Co-developed-by: Cody Haas <chaas@riotgames.com>
Signed-off-by: Cody Haas <chaas@riotgames.com>
Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Zvi Effron <zeffron@riotgames.com>
---
 .../bpf/prog_tests/xdp_context_test_run.c     | 114 ++++++++++++++++++
 .../bpf/progs/test_xdp_context_test_run.c     |  20 +++
 2 files changed, 134 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
new file mode 100644
index 000000000000..0dbdebbc66ce
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <network_helpers.h>
+#include "test_xdp_context_test_run.skel.h"
+
+void test_xdp_context_test_run(void)
+{
+	struct test_xdp_context_test_run *skel = NULL;
+	char data[sizeof(pkt_v4) + sizeof(__u32)];
+	char buf[128];
+	char bad_ctx[sizeof(struct xdp_md)];
+	struct xdp_md ctx_in, ctx_out;
+	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
+			    .data_in = &data,
+			    .data_out = buf,
+				.data_size_in = sizeof(data),
+			    .data_size_out = sizeof(buf),
+			    .ctx_out = &ctx_out,
+			    .ctx_size_out = sizeof(ctx_out),
+			    .repeat = 1,
+		);
+	int err, prog_fd;
+
+	skel = test_xdp_context_test_run__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel"))
+		return;
+	prog_fd = bpf_program__fd(skel->progs._xdp_context);
+
+	*(__u32 *)data = XDP_PASS;
+	*(struct ipv4_packet *)(data + sizeof(__u32)) = pkt_v4;
+
+	memset(&ctx_in, 0, sizeof(ctx_in));
+	opts.ctx_in = &ctx_in;
+	opts.ctx_size_in = sizeof(ctx_in);
+
+	opts.ctx_in = &ctx_in;
+	opts.ctx_size_in = sizeof(ctx_in);
+	ctx_in.data_meta = 0;
+	ctx_in.data = sizeof(__u32);
+	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_OK(err, "bpf_prog_test_run(test1)");
+	ASSERT_EQ(opts.retval, XDP_PASS, "test1-retval");
+	ASSERT_EQ(opts.data_size_out, sizeof(pkt_v4), "test1-datasize");
+	ASSERT_EQ(opts.ctx_size_out, opts.ctx_size_in, "test1-ctxsize");
+	ASSERT_EQ(ctx_out.data_meta, 0, "test1-datameta");
+	ASSERT_EQ(ctx_out.data, ctx_out.data_meta, "test1-data");
+	ASSERT_EQ(ctx_out.data_end, sizeof(pkt_v4), "test1-dataend");
+
+	/* Data past the end of the kernel's struct xdp_md must be 0 */
+	bad_ctx[sizeof(bad_ctx) - 1] = 1;
+	opts.ctx_in = bad_ctx;
+	opts.ctx_size_in = sizeof(bad_ctx);
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test2-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test2)");
+
+	/* The egress cannot be specified */
+	ctx_in.egress_ifindex = 1;
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test3-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test3)");
+
+	/* data_meta must reference the start of data */
+	ctx_in.data_meta = sizeof(__u32);
+	ctx_in.data = ctx_in.data_meta;
+	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
+	ctx_in.egress_ifindex = 0;
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test4-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test4)");
+
+	/* Metadata must be 32 bytes or smaller */
+	ctx_in.data_meta = 0;
+	ctx_in.data = sizeof(__u32)*9;
+	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test5-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test5)");
+
+	/* Metadata's size must be a multiple of 4 */
+	ctx_in.data = 3;
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test6-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test6)");
+
+	/* Total size of data must match data_end - data_meta */
+	ctx_in.data = 0;
+	ctx_in.data_end = sizeof(pkt_v4) - 4;
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test7-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test7)");
+
+	ctx_in.data_end = sizeof(pkt_v4) + 4;
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test8-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test8)");
+
+	/* RX queue cannot be specified without specifying an ingress */
+	ctx_in.data_end = sizeof(pkt_v4);
+	ctx_in.ingress_ifindex = 0;
+	ctx_in.rx_queue_index = 1;
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test9-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test9)");
+
+	ctx_in.ingress_ifindex = 1;
+	ctx_in.rx_queue_index = 1;
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, 22, "test10-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(test10)");
+
+	test_xdp_context_test_run__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c b/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
new file mode 100644
index 000000000000..56fd0995b67c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+SEC("xdp")
+int _xdp_context(struct xdp_md *xdp)
+{
+	void *data = (void *)(unsigned long)xdp->data;
+	__u32 *metadata = (void *)(unsigned long)xdp->data_meta;
+	__u32 ret;
+
+	if (metadata + 1 > data)
+		return XDP_ABORTED;
+	ret = *metadata;
+	if (bpf_xdp_adjust_meta(xdp, 4))
+		return XDP_ABORTED;
+	return ret;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.31.1



* Re: [PATCH bpf-next v4 1/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN
  2021-06-04 22:02 ` [PATCH bpf-next v4 1/3] " Zvi Effron
@ 2021-06-06  3:17   ` Yonghong Song
  2021-06-07 17:58     ` Martin KaFai Lau
  2021-06-09 17:06     ` Zvi Effron
  0 siblings, 2 replies; 13+ messages in thread
From: Yonghong Song @ 2021-06-06  3:17 UTC (permalink / raw)
  To: Zvi Effron, bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe



On 6/4/21 3:02 PM, Zvi Effron wrote:
> Support passing a xdp_md via ctx_in/ctx_out in bpf_attr for
> BPF_PROG_TEST_RUN.
> 
> The intended use case is to pass some XDP meta data to the test runs of
> XDP programs that are used as tail calls.
> 
> For programs that use bpf_prog_test_run_xdp, support xdp_md input and
> output. Unlike with an actual xdp_md during a non-test run, data_meta must
> be 0 because it must point to the start of the provided user data. From
> the initial xdp_md, use data and data_end to adjust the pointers in the
> generated xdp_buff. All other non-zero fields are prohibited (with
> EINVAL). If the user has set ctx_out/ctx_size_out, copy the (potentially
> different) xdp_md back to the userspace.
> 
> We require all fields of input xdp_md except the ones we explicitly
> support to be set to zero. The expectation is that in the future we might
> add support for more fields and we want to fail explicitly if the user
> runs the program on the kernel where we don't yet support them.
> 
> Co-developed-by: Cody Haas <chaas@riotgames.com>
> Signed-off-by: Cody Haas <chaas@riotgames.com>
> Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Zvi Effron <zeffron@riotgames.com>
> ---
>   include/uapi/linux/bpf.h |  3 --
>   net/bpf/test_run.c       | 77 ++++++++++++++++++++++++++++++++++++----
>   2 files changed, 70 insertions(+), 10 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 2c1ba70abbf1..a9dcf3d8c85a 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -324,9 +324,6 @@ union bpf_iter_link_info {
>    *		**BPF_PROG_TYPE_SK_LOOKUP**
>    *			*data_in* and *data_out* must be NULL.
>    *
> - *		**BPF_PROG_TYPE_XDP**
> - *			*ctx_in* and *ctx_out* must be NULL.
> - *
>    *		**BPF_PROG_TYPE_RAW_TRACEPOINT**,
>    *		**BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE**
>    *
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index aa47af349ba8..698618f2b27e 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -687,6 +687,38 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>   	return ret;
>   }
>   
> +static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)

Should the order of parameters be switched to (xdp_md, xdp)?
This will follow the convention of below function xdp_convert_buff_to_md().

> +{
> +	void *data;
> +
> +	if (!xdp_md)
> +		return 0;
> +
> +	if (xdp_md->egress_ifindex != 0)
> +		return -EINVAL;
> +
> +	if (xdp_md->data > xdp_md->data_end)
> +		return -EINVAL;
> +
> +	xdp->data = xdp->data_meta + xdp_md->data;
> +
> +	if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
> +		return -EINVAL;

It would be good if you did all error checking before doing xdp->data
assignment. Also looks like xdp_md error checking happens here and
bpf_prog_test_run_xdp(). If it is hard to put all error checking
in bpf_prog_test_run_xdp(), at least put the "xdp_md->data >
xdp_md->data_end" check in bpf_prog_test_run_xdp(), so this function only
checks *_ifindex and rx_queue_index?
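
I.e., something along these lines in bpf_prog_test_run_xdp() (only a sketch
of the idea, not tested):

	if (ctx) {
		/* all data_meta/data/data_end sanity checks in one place */
		if (ctx->data_meta || ctx->data_end != size ||
		    ctx->data > ctx->data_end ||
		    (ctx->data & (sizeof(__u32) - 1)) || ctx->data > 32)
			return -EINVAL;
		/* Metadata is allocated from the headroom */
		headroom -= ctx->data;
	}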


> +
> +	return 0;
> +}
> +
> +static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)
> +{
> +	if (!xdp_md)
> +		return;
> +
> +	/* xdp_md->data_meta must always point to the start of the out buffer */
> +	xdp_md->data_meta = 0;
> +	xdp_md->data = xdp->data - xdp->data_meta;
> +	xdp_md->data_end = xdp->data_end - xdp->data_meta;
> +}
> +
>   int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>   			  union bpf_attr __user *uattr)
>   {
> @@ -696,36 +728,68 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>   	u32 repeat = kattr->test.repeat;
>   	struct netdev_rx_queue *rxqueue;
>   	struct xdp_buff xdp = {};
> +	struct xdp_md *ctx;

Let us try to maintain reverse christmas tree?

>   	u32 retval, duration;
>   	u32 max_data_sz;
>   	void *data;
>   	int ret;
>   
> -	if (kattr->test.ctx_in || kattr->test.ctx_out)
> -		return -EINVAL;
> +	ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
> +	if (IS_ERR(ctx))
> +		return PTR_ERR(ctx);
> +
> +	/* There can't be user provided data before the metadata */
> +	if (ctx) {
> +		if (ctx->data_meta)
> +			return -EINVAL;
> +		if (ctx->data_end != size)
> +			return -EINVAL;
> +		if (unlikely((ctx->data & (sizeof(__u32) - 1)) ||
> +			     ctx->data > 32))

Why 32? Should it be sizeof(struct xdp_md)?

> +			return -EINVAL;

As I mentioned in early comments, it would be good if we can
do some or all input parameter validation here.

> +		/* Metadata is allocated from the headroom */
> +		headroom -= ctx->data;

sizeof(struct xdp_md) should be smaller than headroom
(XDP_PACKET_HEADROOM), so we don't need a check, but
some comments might be helpful so people looking at the
code don't need to double check.

> +	}
>   
>   	/* XDP have extra tailroom as (most) drivers use full page */
>   	max_data_sz = 4096 - headroom - tailroom;
>   
>   	data = bpf_test_init(kattr, max_data_sz, headroom, tailroom);
> -	if (IS_ERR(data))
> +	if (IS_ERR(data)) {
> +		kfree(ctx);
>   		return PTR_ERR(data);
> +	}
>   
>   	rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0);
>   	xdp_init_buff(&xdp, headroom + max_data_sz + tailroom,
>   		      &rxqueue->xdp_rxq);
>   	xdp_prepare_buff(&xdp, data, headroom, size, true);
>   
> +	ret = xdp_convert_md_to_buff(&xdp, ctx);
> +	if (ret) {
> +		kfree(data);
> +		kfree(ctx);
> +		return ret;
> +	}
> +
>   	bpf_prog_change_xdp(NULL, prog);
>   	ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
>   	if (ret)
>   		goto out;
> -	if (xdp.data != data + headroom || xdp.data_end != xdp.data + size)
> -		size = xdp.data_end - xdp.data;
> -	ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
> +
> +	if (xdp.data_meta != data + headroom || xdp.data_end != xdp.data_meta + size)
> +		size = xdp.data_end - xdp.data_meta;
> +
> +	xdp_convert_buff_to_md(&xdp, ctx);
> +
> +	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, size, retval, duration);
> +	if (!ret)
> +		ret = bpf_ctx_finish(kattr, uattr, ctx,
> +				     sizeof(struct xdp_md));
>   out:
>   	bpf_prog_change_xdp(prog, NULL);
>   	kfree(data);
> +	kfree(ctx);
>   	return ret;
>   }
>   
> @@ -809,7 +873,6 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
>   	if (!ret)
>   		ret = bpf_ctx_finish(kattr, uattr, user_ctx,
>   				     sizeof(struct bpf_flow_keys));
> -
>   out:
>   	kfree(user_ctx);
>   	kfree(data);
> 


* Re: [PATCH bpf-next v4 2/3] bpf: support specifying ingress via xdp_md context in BPF_PROG_TEST_RUN
  2021-06-04 22:02 ` [PATCH bpf-next v4 2/3] bpf: support specifying ingress via " Zvi Effron
@ 2021-06-06  3:36   ` Yonghong Song
  0 siblings, 0 replies; 13+ messages in thread
From: Yonghong Song @ 2021-06-06  3:36 UTC (permalink / raw)
  To: Zvi Effron, bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe



On 6/4/21 3:02 PM, Zvi Effron wrote:
> Support specifying the ingress_ifindex and rx_queue_index of xdp_md
> contexts for BPF_PROG_TEST_RUN.
> 
> The intended use case is to allow testing XDP programs that make decisions
> based on the ingress interface or RX queue.
> 
> If ingress_ifindex is specified, look up the device by the provided index
> in the current namespace and use its xdp_rxq for the xdp_buff. If the
> rx_queue_index is out of range, or is non-zero when the ingress_ifindex is
> 0, return EINVAL.
> 
> Co-developed-by: Cody Haas <chaas@riotgames.com>
> Signed-off-by: Cody Haas <chaas@riotgames.com>
> Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Zvi Effron <zeffron@riotgames.com>
> ---
>   net/bpf/test_run.c | 16 +++++++++++++++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 698618f2b27e..3916205fc3d4 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -690,6 +690,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>   static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
>   {
>   	void *data;
> +	struct net_device *device;
> +	struct netdev_rx_queue *rxqueue;

reverse christmas tree?

>   
>   	if (!xdp_md)
>   		return 0;
> @@ -702,9 +704,21 @@ static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
>   
>   	xdp->data = xdp->data_meta + xdp_md->data;
>   
> -	if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
> +	if (!xdp_md->ingress_ifindex && xdp_md->rx_queue_index)
>   		return -EINVAL;

xdp_md->ingress_ifindex and xdp_md->rx_queue_index are used three times 
each in this function here. Maybe worthwhile to assign them to
temporary variables?

>   
> +	if (xdp_md->ingress_ifindex) {
> +		device = dev_get_by_index(current->nsproxy->net_ns, xdp_md->ingress_ifindex);
> +		if (!device)
> +			return -EINVAL;
> +
> +		if (xdp_md->rx_queue_index >= device->real_num_rx_queues)
> +			return -EINVAL;
> +
> +		rxqueue = __netif_get_rx_queue(device, xdp_md->rx_queue_index);
> +		xdp->rxq = &rxqueue->xdp_rxq;
> +	}
> +
>   	return 0;
>   }
>   
> 


* Re: [PATCH bpf-next v4 3/3] selftests/bpf: Add test for xdp_md context in BPF_PROG_TEST_RUN
  2021-06-04 22:02 ` [PATCH bpf-next v4 3/3] selftests/bpf: Add test for " Zvi Effron
@ 2021-06-06  4:18   ` Yonghong Song
  2021-06-09 17:07     ` Zvi Effron
  2021-06-06  5:36   ` Yonghong Song
  1 sibling, 1 reply; 13+ messages in thread
From: Yonghong Song @ 2021-06-06  4:18 UTC (permalink / raw)
  To: Zvi Effron, bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe



On 6/4/21 3:02 PM, Zvi Effron wrote:
> Add a test for using xdp_md as a context to BPF_PROG_TEST_RUN for XDP
> programs.
> 
> The test uses a BPF program that takes in a return value from XDP
> metadata, then reduces the size of the XDP metadata by 4 bytes.
> 
> Test cases validate the possible failure cases for passing in invalid
> xdp_md contexts, that the return value is successfully passed
> in, and that the adjusted metadata is successfully copied out.
> 
> Co-developed-by: Cody Haas <chaas@riotgames.com>
> Signed-off-by: Cody Haas <chaas@riotgames.com>
> Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Zvi Effron <zeffron@riotgames.com>
> ---
>   .../bpf/prog_tests/xdp_context_test_run.c     | 114 ++++++++++++++++++
>   .../bpf/progs/test_xdp_context_test_run.c     |  20 +++
>   2 files changed, 134 insertions(+)
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
>   create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> new file mode 100644
> index 000000000000..0dbdebbc66ce
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> @@ -0,0 +1,114 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +#include <network_helpers.h>
> +#include "test_xdp_context_test_run.skel.h"
> +
> +void test_xdp_context_test_run(void)
> +{
> +	struct test_xdp_context_test_run *skel = NULL;
> +	char data[sizeof(pkt_v4) + sizeof(__u32)];
> +	char buf[128];
> +	char bad_ctx[sizeof(struct xdp_md)];
> +	struct xdp_md ctx_in, ctx_out;
> +	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
> +			    .data_in = &data,
> +			    .data_out = buf,
> +				.data_size_in = sizeof(data),
> +			    .data_size_out = sizeof(buf),
> +			    .ctx_out = &ctx_out,
> +			    .ctx_size_out = sizeof(ctx_out),
> +			    .repeat = 1,
> +		);
> +	int err, prog_fd;
> +
> +	skel = test_xdp_context_test_run__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel"))
> +		return;
> +	prog_fd = bpf_program__fd(skel->progs._xdp_context);
> +
> +	*(__u32 *)data = XDP_PASS;
> +	*(struct ipv4_packet *)(data + sizeof(__u32)) = pkt_v4;
> +
> +	memset(&ctx_in, 0, sizeof(ctx_in));
> +	opts.ctx_in = &ctx_in;
> +	opts.ctx_size_in = sizeof(ctx_in);
> +
> +	opts.ctx_in = &ctx_in;
> +	opts.ctx_size_in = sizeof(ctx_in);

The above two assignments are redundant.

> +	ctx_in.data_meta = 0;
> +	ctx_in.data = sizeof(__u32);
> +	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_OK(err, "bpf_prog_test_run(test1)");
> +	ASSERT_EQ(opts.retval, XDP_PASS, "test1-retval");
> +	ASSERT_EQ(opts.data_size_out, sizeof(pkt_v4), "test1-datasize");
> +	ASSERT_EQ(opts.ctx_size_out, opts.ctx_size_in, "test1-ctxsize");
> +	ASSERT_EQ(ctx_out.data_meta, 0, "test1-datameta");
> +	ASSERT_EQ(ctx_out.data, ctx_out.data_meta, "test1-data");

I suggest just to test ctx_out.data == 0. It just happens
the input data - meta = 4 and the bpf program adjusted by 4.
If they are not the same, the result won't be equal to data_meta.

> +	ASSERT_EQ(ctx_out.data_end, sizeof(pkt_v4), "test1-dataend");
> +
> +	/* Data past the end of the kernel's struct xdp_md must be 0 */
> +	bad_ctx[sizeof(bad_ctx) - 1] = 1;
> +	opts.ctx_in = bad_ctx;
> +	opts.ctx_size_in = sizeof(bad_ctx);
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test2-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test2)");

I suggest to drop this test. Basically what you did here
is have a non-zero egress_ifindex, which is not allowed.
You have a test for that below.

> +
> +	/* The egress cannot be specified */
> +	ctx_in.egress_ifindex = 1;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test3-errno");

Use EINVAL explicitly? The same for a few other cases below.

> +	ASSERT_ERR(err, "bpf_prog_test_run(test3)");
> +
> +	/* data_meta must reference the start of data */
> +	ctx_in.data_meta = sizeof(__u32);
> +	ctx_in.data = ctx_in.data_meta;
> +	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> +	ctx_in.egress_ifindex = 0;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test4-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test4)");
> +
> +	/* Metadata must be 32 bytes or smaller */
> +	ctx_in.data_meta = 0;
> +	ctx_in.data = sizeof(__u32)*9;
> +	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test5-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test5)");

This test is not necessary if ctx size should be
<= sizeof(struct xdp_md). So far, I think we can
require it must be sizeof(struct xdp_md). If
in the future, kernel struct xdp_md is extended,
it may be changed to accept both old and new
xdp_md's, similar to other uapi data structures
like struct bpf_prog_info, if there is a desire.
In my opinion, the kernel should just stick
to sizeof(struct xdp_md) size since the functionality
is implemented as a *testing* mechanism.

> +
> +	/* Metadata's size must be a multiple of 4 */
> +	ctx_in.data = 3;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test6-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test6)");
> +
> +	/* Total size of data must match data_end - data_meta */
> +	ctx_in.data = 0;
> +	ctx_in.data_end = sizeof(pkt_v4) - 4;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test7-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test7)");
> +
> +	ctx_in.data_end = sizeof(pkt_v4) + 4;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test8-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test8)");
> +
> +	/* RX queue cannot be specified without specifying an ingress */
> +	ctx_in.data_end = sizeof(pkt_v4);
> +	ctx_in.ingress_ifindex = 0;
> +	ctx_in.rx_queue_index = 1;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test9-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test9)");
> +
> +	ctx_in.ingress_ifindex = 1;
> +	ctx_in.rx_queue_index = 1;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test10-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test10)");

Why this failure? I guess it is due to device search failure, right?
So this test MAY succeed if the underlying host happens to have
a proper configuration with ingress_ifindex = 1 and rx_queue_index = 1,
right?

> +
> +	test_xdp_context_test_run__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c b/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
> new file mode 100644
> index 000000000000..56fd0995b67c
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
> @@ -0,0 +1,20 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +
> +SEC("xdp")
> +int _xdp_context(struct xdp_md *xdp)

Maybe drop prefix "_" from the function name?

> +{
> +	void *data = (void *)(unsigned long)xdp->data;
> +	__u32 *metadata = (void *)(unsigned long)xdp->data_meta;

The above code is okay as the verifier will rewrite it correctly with the
actual address. But I still suggest to use "long" instead of "unsigned long"
to be consistent with other bpf programs.

> +	__u32 ret;
> +
> +	if (metadata + 1 > data)
> +		return XDP_ABORTED;
> +	ret = *metadata;
> +	if (bpf_xdp_adjust_meta(xdp, 4))
> +		return XDP_ABORTED;
> +	return ret;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> 


* Re: [PATCH bpf-next v4 3/3] selftests/bpf: Add test for xdp_md context in BPF_PROG_TEST_RUN
  2021-06-04 22:02 ` [PATCH bpf-next v4 3/3] selftests/bpf: Add test for " Zvi Effron
  2021-06-06  4:18   ` Yonghong Song
@ 2021-06-06  5:36   ` Yonghong Song
  1 sibling, 0 replies; 13+ messages in thread
From: Yonghong Song @ 2021-06-06  5:36 UTC (permalink / raw)
  To: Zvi Effron, bpf
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe



On 6/4/21 3:02 PM, Zvi Effron wrote:
> Add a test for using xdp_md as a context to BPF_PROG_TEST_RUN for XDP
> programs.
> 
> The test uses a BPF program that takes in a return value from XDP
> metadata, then reduces the size of the XDP metadata by 4 bytes.
> 
> Test cases validate the possible failure cases for passing in invalid
> xdp_md contexts, that the return value is successfully passed
> in, and that the adjusted metadata is successfully copied out.
> 
> Co-developed-by: Cody Haas <chaas@riotgames.com>
> Signed-off-by: Cody Haas <chaas@riotgames.com>
> Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Zvi Effron <zeffron@riotgames.com>
> ---
>   .../bpf/prog_tests/xdp_context_test_run.c     | 114 ++++++++++++++++++
>   .../bpf/progs/test_xdp_context_test_run.c     |  20 +++
>   2 files changed, 134 insertions(+)
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
>   create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> new file mode 100644
> index 000000000000..0dbdebbc66ce
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> @@ -0,0 +1,114 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +#include <network_helpers.h>
> +#include "test_xdp_context_test_run.skel.h"
> +
> +void test_xdp_context_test_run(void)
> +{
> +	struct test_xdp_context_test_run *skel = NULL;
> +	char data[sizeof(pkt_v4) + sizeof(__u32)];
> +	char buf[128];
> +	char bad_ctx[sizeof(struct xdp_md)];
> +	struct xdp_md ctx_in, ctx_out;
> +	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
> +			    .data_in = &data,
> +			    .data_out = buf,
> +				.data_size_in = sizeof(data),
> +			    .data_size_out = sizeof(buf),
> +			    .ctx_out = &ctx_out,
> +			    .ctx_size_out = sizeof(ctx_out),
> +			    .repeat = 1,
> +		);
> +	int err, prog_fd;
> +
> +	skel = test_xdp_context_test_run__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel"))
> +		return;
> +	prog_fd = bpf_program__fd(skel->progs._xdp_context);
> +
> +	*(__u32 *)data = XDP_PASS;
> +	*(struct ipv4_packet *)(data + sizeof(__u32)) = pkt_v4;
> +
> +	memset(&ctx_in, 0, sizeof(ctx_in));
> +	opts.ctx_in = &ctx_in;
> +	opts.ctx_size_in = sizeof(ctx_in);
> +
> +	opts.ctx_in = &ctx_in;
> +	opts.ctx_size_in = sizeof(ctx_in);
> +	ctx_in.data_meta = 0;
> +	ctx_in.data = sizeof(__u32);
> +	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_OK(err, "bpf_prog_test_run(test1)");
> +	ASSERT_EQ(opts.retval, XDP_PASS, "test1-retval");
> +	ASSERT_EQ(opts.data_size_out, sizeof(pkt_v4), "test1-datasize");
> +	ASSERT_EQ(opts.ctx_size_out, opts.ctx_size_in, "test1-ctxsize");
> +	ASSERT_EQ(ctx_out.data_meta, 0, "test1-datameta");
> +	ASSERT_EQ(ctx_out.data, ctx_out.data_meta, "test1-data");
> +	ASSERT_EQ(ctx_out.data_end, sizeof(pkt_v4), "test1-dataend");
> +
> +	/* Data past the end of the kernel's struct xdp_md must be 0 */
> +	bad_ctx[sizeof(bad_ctx) - 1] = 1;
> +	opts.ctx_in = bad_ctx;
> +	opts.ctx_size_in = sizeof(bad_ctx);
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test2-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test2)");
> +
> +	/* The egress cannot be specified */
> +	ctx_in.egress_ifindex = 1;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test3-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test3)");
> +
> +	/* data_meta must reference the start of data */
> +	ctx_in.data_meta = sizeof(__u32);
> +	ctx_in.data = ctx_in.data_meta;
> +	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> +	ctx_in.egress_ifindex = 0;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test4-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test4)");
> +
> +	/* Metadata must be 32 bytes or smaller */
> +	ctx_in.data_meta = 0;
> +	ctx_in.data = sizeof(__u32)*9;
> +	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test5-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test5)");
> +
> +	/* Metadata's size must be a multiple of 4 */
> +	ctx_in.data = 3;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test6-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test6)");
> +
> +	/* Total size of data must match data_end - data_meta */
> +	ctx_in.data = 0;
> +	ctx_in.data_end = sizeof(pkt_v4) - 4;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test7-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test7)");
> +
> +	ctx_in.data_end = sizeof(pkt_v4) + 4;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test8-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test8)");
> +
> +	/* RX queue cannot be specified without specifying an ingress */
> +	ctx_in.data_end = sizeof(pkt_v4);
> +	ctx_in.ingress_ifindex = 0;
> +	ctx_in.rx_queue_index = 1;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test9-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test9)");

Also, these failure tests are very similar. I suggest to have
a static function do the common work. This should
simplify the code, and it will also be clear for each
test what the ctx_in parameters are. In the current form,
one may have to search previous tests to get some
parameter value. For example, for the above, one has
to search previous tests to find ctx_in.data = 0 and
ctx_in.data_meta = 0.
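
Something like the below (an untested sketch; the helper name and passing
opts by value are just my suggestion):

	static void test_xdp_context_error(int prog_fd, struct bpf_test_run_opts opts,
					   __u32 data_meta, __u32 data, __u32 data_end,
					   __u32 ingress_ifindex, __u32 rx_queue_index,
					   __u32 egress_ifindex)
	{
		struct xdp_md ctx = {
			.data_meta = data_meta,
			.data = data,
			.data_end = data_end,
			.ingress_ifindex = ingress_ifindex,
			.rx_queue_index = rx_queue_index,
			.egress_ifindex = egress_ifindex,
		};
		int err;

		opts.ctx_in = &ctx;
		opts.ctx_size_in = sizeof(ctx);
		err = bpf_prog_test_run_opts(prog_fd, &opts);
		ASSERT_EQ(errno, EINVAL, "errno-EINVAL");
		ASSERT_ERR(err, "bpf_prog_test_run");
	}

so each failure case becomes a single call that spells out every ctx_in field.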

> +
> +	ctx_in.ingress_ifindex = 1;
> +	ctx_in.rx_queue_index = 1;
> +	err = bpf_prog_test_run_opts(prog_fd, &opts);
> +	ASSERT_EQ(errno, 22, "test10-errno");
> +	ASSERT_ERR(err, "bpf_prog_test_run(test10)");
> +
> +	test_xdp_context_test_run__destroy(skel);
> +}
[...]


* Re: [PATCH bpf-next v4 1/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN
  2021-06-06  3:17   ` Yonghong Song
@ 2021-06-07 17:58     ` Martin KaFai Lau
  2021-06-09 17:06     ` Zvi Effron
  1 sibling, 0 replies; 13+ messages in thread
From: Martin KaFai Lau @ 2021-06-07 17:58 UTC (permalink / raw)
  To: Zvi Effron
  Cc: Yonghong Song, bpf, Alexei Starovoitov, David S. Miller,
	Daniel Borkmann, Jesper Dangaard Brouer, Andrii Nakryiko,
	Maciej Fijalkowski, Cody Haas, Lisa Watanabe

On Sat, Jun 05, 2021 at 08:17:00PM -0700, Yonghong Song wrote:
> 
> 
> On 6/4/21 3:02 PM, Zvi Effron wrote:
> > Support passing a xdp_md via ctx_in/ctx_out in bpf_attr for
> > BPF_PROG_TEST_RUN.
> > 
> > The intended use case is to pass some XDP meta data to the test runs of
> > XDP programs that are used as tail calls.
> > 
> > For programs that use bpf_prog_test_run_xdp, support xdp_md input and
> > output. Unlike with an actual xdp_md during a non-test run, data_meta must
> > be 0 because it must point to the start of the provided user data. From
> > the initial xdp_md, use data and data_end to adjust the pointers in the
> > generated xdp_buff. All other non-zero fields are prohibited (with
> > EINVAL). If the user has set ctx_out/ctx_size_out, copy the (potentially
> > different) xdp_md back to the userspace.
> > 
> > We require all fields of input xdp_md except the ones we explicitly
> > support to be set to zero. The expectation is that in the future we might
> > add support for more fields and we want to fail explicitly if the user
> > runs the program on the kernel where we don't yet support them.
> > 
> > Co-developed-by: Cody Haas <chaas@riotgames.com>
> > Signed-off-by: Cody Haas <chaas@riotgames.com>
> > Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
> > Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
> > Signed-off-by: Zvi Effron <zeffron@riotgames.com>
> > ---
> >   include/uapi/linux/bpf.h |  3 --
> >   net/bpf/test_run.c       | 77 ++++++++++++++++++++++++++++++++++++----
> >   2 files changed, 70 insertions(+), 10 deletions(-)
> > 
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 2c1ba70abbf1..a9dcf3d8c85a 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -324,9 +324,6 @@ union bpf_iter_link_info {
> >    *		**BPF_PROG_TYPE_SK_LOOKUP**
> >    *			*data_in* and *data_out* must be NULL.
> >    *
> > - *		**BPF_PROG_TYPE_XDP**
> > - *			*ctx_in* and *ctx_out* must be NULL.
> > - *
> >    *		**BPF_PROG_TYPE_RAW_TRACEPOINT**,
> >    *		**BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE**
> >    *
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index aa47af349ba8..698618f2b27e 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -687,6 +687,38 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> >   	return ret;
> >   }
> > +static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
> 
> Should the order of parameters be switched to (xdp_md, xdp)?
> This will follow the convention of below function xdp_convert_buff_to_md().
> 
> > +{
> > +	void *data;
> > +
> > +	if (!xdp_md)
> > +		return 0;
> > +
> > +	if (xdp_md->egress_ifindex != 0)
> > +		return -EINVAL;
> > +
> > +	if (xdp_md->data > xdp_md->data_end)
> > +		return -EINVAL;
> > +
> > +	xdp->data = xdp->data_meta + xdp_md->data;
> > +
> > +	if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
> > +		return -EINVAL;
> 
> It would be good if you did all error checking before doing xdp->data
> assignment. Also looks like xdp_md error checking happens here and
> bpf_prog_test_run_xdp(). If it is hard to put all error checking
> in bpf_prog_test_run_xdp(), at least put "xdp_md->data > xdp_md->data_end)
> in bpf_prog_test_run_xdp(),
+1 on having at least all data_meta/data/data_end checks in one place
in bpf_prog_test_run_xdp().

> so this function only
> checks *_ifindex and rx_queue_index?
> 
> 
> > +
> > +	return 0;
> > +}
> > +
> > +static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)
> > +{
> > +	if (!xdp_md)
> > +		return;
> > +
> > +	/* xdp_md->data_meta must always point to the start of the out buffer */
> > +	xdp_md->data_meta = 0;
Is this necessary?  data_meta should not have been changed.

> > +	xdp_md->data = xdp->data - xdp->data_meta;
> > +	xdp_md->data_end = xdp->data_end - xdp->data_meta;
> > +}
> > +


* Re: [PATCH bpf-next v4 1/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN
  2021-06-06  3:17   ` Yonghong Song
  2021-06-07 17:58     ` Martin KaFai Lau
@ 2021-06-09 17:06     ` Zvi Effron
  2021-06-10  0:07       ` Yonghong Song
  1 sibling, 1 reply; 13+ messages in thread
From: Zvi Effron @ 2021-06-09 17:06 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe

On Sat, Jun 5, 2021 at 10:17 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 6/4/21 3:02 PM, Zvi Effron wrote:
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -687,6 +687,38 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> >       return ret;
> >   }
> >
> > +static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
>
> Should the order of parameters be switched to (xdp_md, xdp)?
> This will follow the convention of below function xdp_convert_buff_to_md().
>

The order was done to match the skb versions of these functions, which seem to
have the output format first and the input format second, which is why the
order flips between conversion functions. We're not particular about order, so
we can definitely make it consistent.

> > +{
> > +     void *data;
> > +
> > +     if (!xdp_md)
> > +             return 0;
> > +
> > +     if (xdp_md->egress_ifindex != 0)
> > +             return -EINVAL;
> > +
> > +     if (xdp_md->data > xdp_md->data_end)
> > +             return -EINVAL;
> > +
> > +     xdp->data = xdp->data_meta + xdp_md->data;
> > +
> > +     if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
> > +             return -EINVAL;
>
> It would be good if you did all error checking before doing xdp->data
> assignment. Also looks like xdp_md error checking happens here and
> bpf_prog_test_run_xdp(). If it is hard to put all error checking
> in bpf_prog_test_run_xdp(), at least put "xdp_md->data >
> xdp_md->data_end) in bpf_prog_test_run_xdp(), so this function only
> checks *_ifindex and rx_queue_index?
>

bpf_prog_test_run_xdp() was already a large function, which is why this was
turned into a helper. Initially, we tried to have all xdp_md related logic in
the helper, with only the required logic in bpf_prog_test_run_xdp(). Based on
a prior suggestion, we moved one additional check from the helper to
bpf_prog_test_run_xdp() as it simplified the logic. It's not clear to us what
benefit moving the other checks to bpf_prog_test_run_xdp() provides, but it
does reduce the benefit of having the helper function.

> > @@ -696,36 +728,68 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> >       u32 repeat = kattr->test.repeat;
> >       struct netdev_rx_queue *rxqueue;
> >       struct xdp_buff xdp = {};
> > +     struct xdp_md *ctx;
>
> Let us try to maintain reverse christmas tree?

Sure.


>
> >       u32 retval, duration;
> >       u32 max_data_sz;
> >       void *data;
> >       int ret;
> >
> > -     if (kattr->test.ctx_in || kattr->test.ctx_out)
> > -             return -EINVAL;
> > +     ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
> > +     if (IS_ERR(ctx))
> > +             return PTR_ERR(ctx);
> > +
> > +     /* There can't be user provided data before the metadata */
> > +     if (ctx) {
> > +             if (ctx->data_meta)
> > +                     return -EINVAL;
> > +             if (ctx->data_end != size)
> > +                     return -EINVAL;
> > +             if (unlikely((ctx->data & (sizeof(__u32) - 1)) ||
> > +                          ctx->data > 32))
>
> Why 32? Should it be sizeof(struct xdp_md)?

This is not checking the context itself, but the amount of metadata. XDP allows
at most 32 bytes of metadata.

>
> > +             /* Metadata is allocated from the headroom */
> > +             headroom -= ctx->data;
>
> sizeof(struct xdp_md) should be smaller than headroom
> (XDP_PACKET_HEADROOM), so we don't need to a check, but
> some comments might be helpful so people looking at the
> code doesn't need to double check.

We're not sure what check you're referring to, as there's no check here. This
subtraction is, as the comment says, because the XDP metadata is allocated out
of the XDP headroom, so the headroom size needs to be reduced by the metadata
size.


* Re: [PATCH bpf-next v4 3/3] selftests/bpf: Add test for xdp_md context in BPF_PROG_TEST_RUN
  2021-06-06  4:18   ` Yonghong Song
@ 2021-06-09 17:07     ` Zvi Effron
  2021-06-10  0:11       ` Yonghong Song
  0 siblings, 1 reply; 13+ messages in thread
From: Zvi Effron @ 2021-06-09 17:07 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe

On Sat, Jun 5, 2021 at 11:19 PM Yonghong Song <yhs@fb.com> wrote:
> On 6/4/21 3:02 PM, Zvi Effron wrote:
> > +     opts.ctx_in = &ctx_in;
> > +     opts.ctx_size_in = sizeof(ctx_in);
> > +
> > +     opts.ctx_in = &ctx_in;
> > +     opts.ctx_size_in = sizeof(ctx_in);
>
> The above two assignments are redundant.
>

Good catch.

> > +     ctx_in.data_meta = 0;
> > +     ctx_in.data = sizeof(__u32);
> > +     ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> > +     err = bpf_prog_test_run_opts(prog_fd, &opts);
> > +     ASSERT_OK(err, "bpf_prog_test_run(test1)");
> > +     ASSERT_EQ(opts.retval, XDP_PASS, "test1-retval");
> > +     ASSERT_EQ(opts.data_size_out, sizeof(pkt_v4), "test1-datasize");
> > +     ASSERT_EQ(opts.ctx_size_out, opts.ctx_size_in, "test1-ctxsize");
> > +     ASSERT_EQ(ctx_out.data_meta, 0, "test1-datameta");
> > +     ASSERT_EQ(ctx_out.data, ctx_out.data_meta, "test1-data");
>
> I suggest just to test ctx_out.data == 0. It just happens
> the input data - meta = 4 and bpf program adjuested by 4.
> If they are not the same, the result won't be equal to data_meta.
>

Sure.

> > +     ASSERT_EQ(ctx_out.data_end, sizeof(pkt_v4), "test1-dataend");
> > +
> > +     /* Data past the end of the kernel's struct xdp_md must be 0 */
> > +     bad_ctx[sizeof(bad_ctx) - 1] = 1;
> > +     opts.ctx_in = bad_ctx;
> > +     opts.ctx_size_in = sizeof(bad_ctx);
> > +     err = bpf_prog_test_run_opts(prog_fd, &opts);
> > +     ASSERT_EQ(errno, 22, "test2-errno");
> > +     ASSERT_ERR(err, "bpf_prog_test_run(test2)");
>
> I suggest to drop this test. Basically you did here
> is to have non-zero egress_ifindex which is not allowed.
> You have a test below.
>

We think the actual correction here is that bad_ctx is supposed to be one byte
larger than struct xdp_md. It is misdeclared. We'll correct that.

> > +
> > +     /* The egress cannot be specified */
> > +     ctx_in.egress_ifindex = 1;
> > +     err = bpf_prog_test_run_opts(prog_fd, &opts);
> > +     ASSERT_EQ(errno, 22, "test3-errno");
>
> Use EINVAL explicitly? The same for below a few other cases.
>

Good suggestion.

> > +     ASSERT_ERR(err, "bpf_prog_test_run(test3)");
> > +
> > +     /* data_meta must reference the start of data */
> > +     ctx_in.data_meta = sizeof(__u32);
> > +     ctx_in.data = ctx_in.data_meta;
> > +     ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> > +     ctx_in.egress_ifindex = 0;
> > +     err = bpf_prog_test_run_opts(prog_fd, &opts);
> > +     ASSERT_EQ(errno, 22, "test4-errno");
> > +     ASSERT_ERR(err, "bpf_prog_test_run(test4)");
> > +
> > +     /* Metadata must be 32 bytes or smaller */
> > +     ctx_in.data_meta = 0;
> > +     ctx_in.data = sizeof(__u32)*9;
> > +     ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
> > +     err = bpf_prog_test_run_opts(prog_fd, &opts);
> > +     ASSERT_EQ(errno, 22, "test5-errno");
> > +     ASSERT_ERR(err, "bpf_prog_test_run(test5)");
>
> This test is not necessary if ctx size should be
> <= sizeof(struct xdp_md). So far, I think we can
> require it must be sizeof(struct xdp_md). If
> in the future, kernel struct xdp_md is extended,
> it may be changed to accept both old and new
> xdp_md's similar to other uapi data strcture
> like struct bpf_prog_info if there is a desire.
> In my opinion, the kernel should just stick
> to sizeof(struct xdp_md) size since the functionality
> is implemented as a *testing* mechanism.
>

You might be confusing the context (struct xdp_md) with the XDP metadata (data
just before the frame data). XDP allows at most 32 bytes of metadata. This test
is verifying that a metadata size >32 bytes is rejected.

> > +     ctx_in.ingress_ifindex = 1;
> > +     ctx_in.rx_queue_index = 1;
> > +     err = bpf_prog_test_run_opts(prog_fd, &opts);
> > +     ASSERT_EQ(errno, 22, "test10-errno");
> > +     ASSERT_ERR(err, "bpf_prog_test_run(test10)");
>
> Why this failure? I guess it is due to device search failure, right?
> So this test MAY succeed if the underlying host happens with
> a proper configuration with ingress_ifindex = 1 and rx_queue_index = 1,
> right?
>

I may be making incorrect assumptions, but my understanding is that interface
index 1 is always the loopback interface, and the loopback interface only ever
(in current kernels) has one rx queue. If that's not the case, we'll need to
adjust (or remove) the test.

> > +
> > +     test_xdp_context_test_run__destroy(skel);
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c b/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
> > new file mode 100644
> > index 000000000000..56fd0995b67c
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
> > @@ -0,0 +1,20 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +
> > +SEC("xdp")
> > +int _xdp_context(struct xdp_md *xdp)
>
> Maybe drop prefix "_" from the function name?
>

Sure.

> > +{
> > +     void *data = (void *)(unsigned long)xdp->data;
> > +     __u32 *metadata = (void *)(unsigned long)xdp->data_meta;
>
> The above code is okay as the verifier will rewrite it correctly with the
> actual address. But I still suggest using "long" instead of "unsigned long"
> to be consistent with other bpf programs.
>

Sure.
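
I.e. the casts will become something like:

    void *data = (void *)(long)xdp->data;
    __u32 *metadata = (void *)(long)xdp->data_meta;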

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next v4 1/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN
  2021-06-09 17:06     ` Zvi Effron
@ 2021-06-10  0:07       ` Yonghong Song
  0 siblings, 0 replies; 13+ messages in thread
From: Yonghong Song @ 2021-06-10  0:07 UTC (permalink / raw)
  To: Zvi Effron
  Cc: bpf, Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe



On 6/9/21 10:06 AM, Zvi Effron wrote:
> On Sat, Jun 5, 2021 at 10:17 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 6/4/21 3:02 PM, Zvi Effron wrote:
>>> --- a/net/bpf/test_run.c
>>> +++ b/net/bpf/test_run.c
>>> @@ -687,6 +687,38 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>>>        return ret;
>>>    }
>>>
>>> +static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)
>>
>> Should the order of parameters be switched to (xdp_md, xdp)?
>> This would follow the convention of xdp_convert_buff_to_md() below.
>>
> 
> The order was chosen to match the skb versions of these functions, which
> seem to put the output format first and the input format second; that is
> why the order flips between the two conversion functions. We're not
> particular about the order, so we can definitely make it consistent.

But for the other function we have

+static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)

i.e. the input first and the output second. In my opinion, we should keep
the same ordering convention within the same file.
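
Something like this would put both helpers on the same (input, output)
ordering (prototypes only, as a sketch):

    static int xdp_convert_md_to_buff(struct xdp_md *xdp_md, struct xdp_buff *xdp);
    static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md);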

> 
>>> +{
>>> +     void *data;
>>> +
>>> +     if (!xdp_md)
>>> +             return 0;
>>> +
>>> +     if (xdp_md->egress_ifindex != 0)
>>> +             return -EINVAL;
>>> +
>>> +     if (xdp_md->data > xdp_md->data_end)
>>> +             return -EINVAL;
>>> +
>>> +     xdp->data = xdp->data_meta + xdp_md->data;
>>> +
>>> +     if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
>>> +             return -EINVAL;
>>
>> It would be good if you did all the error checking before the xdp->data
>> assignment. It also looks like xdp_md error checking happens both here and
>> in bpf_prog_test_run_xdp(). If it is hard to put all the error checking
>> in bpf_prog_test_run_xdp(), at least put the "xdp_md->data >
>> xdp_md->data_end" check in bpf_prog_test_run_xdp(), so this function only
>> checks *_ifindex and rx_queue_index?
>>
> 
> bpf_prog_test_run_xdp() was already a large function, which is why this was
> turned into a helper. Initially, we tried to have all xdp_md related logic in
> the helper, with only the required logic in bpf_prog_test_run_xdp(). Based on
> a prior suggestion, we moved one additional check from the helper to
> bpf_prog_test_run_xdp() as it simplified the logic. It's not clear to us what
> benefit moving the other checks to bpf_prog_test_run_xdp() provides, but it
> does reduce the benefit of having the helper function.

At least put the "if (xdp_md->data > xdp_md->data_end)" check in
bpf_prog_test_run_xdp(), as similar fields are already checked there.

It is okay to keep the *_ifindex/rx_queue_index checks in this function,
since you need to look up the device for them anyway and there is no need
to do that lookup twice.
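
Concretely, the ctx validation in bpf_prog_test_run_xdp() would gain one more
condition next to the existing ones, roughly (a sketch, not the final diff):

    if (ctx) {
            if (ctx->data_meta || ctx->data_end != size)
                    return -EINVAL;
            /* moved out of xdp_convert_md_to_buff() */
            if (ctx->data > ctx->data_end)
                    return -EINVAL;
            /* existing metadata size/alignment checks follow */
    }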

> 
>>> @@ -696,36 +728,68 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>>>        u32 repeat = kattr->test.repeat;
>>>        struct netdev_rx_queue *rxqueue;
>>>        struct xdp_buff xdp = {};
>>> +     struct xdp_md *ctx;
>>
>> Let us try to maintain reverse christmas tree ordering?
> 
> Sure.
> 
> 
>>
>>>        u32 retval, duration;
>>>        u32 max_data_sz;
>>>        void *data;
>>>        int ret;
>>>
>>> -     if (kattr->test.ctx_in || kattr->test.ctx_out)
>>> -             return -EINVAL;
>>> +     ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
>>> +     if (IS_ERR(ctx))
>>> +             return PTR_ERR(ctx);
>>> +
>>> +     /* There can't be user provided data before the metadata */
>>> +     if (ctx) {
>>> +             if (ctx->data_meta)
>>> +                     return -EINVAL;
>>> +             if (ctx->data_end != size)
>>> +                     return -EINVAL;
>>> +             if (unlikely((ctx->data & (sizeof(__u32) - 1)) ||
>>> +                          ctx->data > 32))
>>
>> Why 32? Should it be sizeof(struct xdp_md)?
> 
> This is not checking the context itself, but the amount of metadata. XDP allows
> at most 32 bytes of metadata.

Do we have a macro for this "32"? It would be good if we had one.
Otherwise, a comment would be good.

Previously I was thinking of just enforcing ctx->data to be
sizeof(struct xdp_md). But on second thought, that is a little too
restrictive. So your current handling is fine.
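
If a named constant were added, it could look roughly like this (the macro
name below is made up for illustration):

    /* XDP metadata is limited to 32 bytes; naming the limit makes the
     * check below self-documenting.
     */
    #define MAX_XDP_METADATA_SIZE 32

    if (unlikely((ctx->data & (sizeof(__u32) - 1)) ||
                 ctx->data > MAX_XDP_METADATA_SIZE))
            return -EINVAL;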

> 
>>
>>> +             /* Metadata is allocated from the headroom */
>>> +             headroom -= ctx->data;
>>
>> sizeof(struct xdp_md) should be smaller than the headroom
>> (XDP_PACKET_HEADROOM), so we don't need a check, but
>> a comment might be helpful so people looking at the
>> code don't need to double-check.
> 
> We're not sure what check you're referring to, as there's no check here. This
> subtraction is, as the comment says, because the XDP metadata is allocated out
> of the XDP headroom, so the headroom size needs to be reduced by the metadata
> size.

I was wondering whether we need to check
   if (headroom < ctx->data)
           return -EINVAL;
   headroom -= ctx->data;
But since we have
   headroom = XDP_PACKET_HEADROOM;
   ctx->data <= 32
we should be okay without it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next v4 3/3] selftests/bpf: Add test for xdp_md context in BPF_PROG_TEST_RUN
  2021-06-09 17:07     ` Zvi Effron
@ 2021-06-10  0:11       ` Yonghong Song
  0 siblings, 0 replies; 13+ messages in thread
From: Yonghong Song @ 2021-06-10  0:11 UTC (permalink / raw)
  To: Zvi Effron
  Cc: bpf, Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Jesper Dangaard Brouer, Andrii Nakryiko, Maciej Fijalkowski,
	Martin KaFai Lau, Cody Haas, Lisa Watanabe



On 6/9/21 10:07 AM, Zvi Effron wrote:
> On Sat, Jun 5, 2021 at 11:19 PM Yonghong Song <yhs@fb.com> wrote:
>> On 6/4/21 3:02 PM, Zvi Effron wrote:
>>> +     opts.ctx_in = &ctx_in;
>>> +     opts.ctx_size_in = sizeof(ctx_in);
>>> +
>>> +     opts.ctx_in = &ctx_in;
>>> +     opts.ctx_size_in = sizeof(ctx_in);
>>
>> The above two assignments are redundant.
>>
> 
> Good catch.
> 
>>> +     ctx_in.data_meta = 0;
>>> +     ctx_in.data = sizeof(__u32);
>>> +     ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
>>> +     err = bpf_prog_test_run_opts(prog_fd, &opts);
>>> +     ASSERT_OK(err, "bpf_prog_test_run(test1)");
>>> +     ASSERT_EQ(opts.retval, XDP_PASS, "test1-retval");
>>> +     ASSERT_EQ(opts.data_size_out, sizeof(pkt_v4), "test1-datasize");
>>> +     ASSERT_EQ(opts.ctx_size_out, opts.ctx_size_in, "test1-ctxsize");
>>> +     ASSERT_EQ(ctx_out.data_meta, 0, "test1-datameta");
>>> +     ASSERT_EQ(ctx_out.data, ctx_out.data_meta, "test1-data");
>>
>> I suggest just testing ctx_out.data == 0. It just happens that the
>> input data - meta is 4 and the bpf program adjusts by 4. If they were
>> not the same, the result would not be equal to data_meta.
>>
> 
> Sure.
> 
>>> +     ASSERT_EQ(ctx_out.data_end, sizeof(pkt_v4), "test1-dataend");
>>> +
>>> +     /* Data past the end of the kernel's struct xdp_md must be 0 */
>>> +     bad_ctx[sizeof(bad_ctx) - 1] = 1;
>>> +     opts.ctx_in = bad_ctx;
>>> +     opts.ctx_size_in = sizeof(bad_ctx);
>>> +     err = bpf_prog_test_run_opts(prog_fd, &opts);
>>> +     ASSERT_EQ(errno, 22, "test2-errno");
>>> +     ASSERT_ERR(err, "bpf_prog_test_run(test2)");
>>
>> I suggest dropping this test. Basically, what you do here is set a
>> non-zero egress_ifindex, which is not allowed, and you already have a
>> test for that below.
>>
> 
> We think the actual correction here is that bad_ctx is supposed to be one byte
> larger than struct xdp_md. It is misdeclared. We'll correct that.
> 
>>> +
>>> +     /* The egress cannot be specified */
>>> +     ctx_in.egress_ifindex = 1;
>>> +     err = bpf_prog_test_run_opts(prog_fd, &opts);
>>> +     ASSERT_EQ(errno, 22, "test3-errno");
>>
>> Use EINVAL explicitly? The same for a few other cases below.
>>
> 
> Good suggestion.
> 
>>> +     ASSERT_ERR(err, "bpf_prog_test_run(test3)");
>>> +
>>> +     /* data_meta must reference the start of data */
>>> +     ctx_in.data_meta = sizeof(__u32);
>>> +     ctx_in.data = ctx_in.data_meta;
>>> +     ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
>>> +     ctx_in.egress_ifindex = 0;
>>> +     err = bpf_prog_test_run_opts(prog_fd, &opts);
>>> +     ASSERT_EQ(errno, 22, "test4-errno");
>>> +     ASSERT_ERR(err, "bpf_prog_test_run(test4)");
>>> +
>>> +     /* Metadata must be 32 bytes or smaller */
>>> +     ctx_in.data_meta = 0;
>>> +     ctx_in.data = sizeof(__u32)*9;
>>> +     ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
>>> +     err = bpf_prog_test_run_opts(prog_fd, &opts);
>>> +     ASSERT_EQ(errno, 22, "test5-errno");
>>> +     ASSERT_ERR(err, "bpf_prog_test_run(test5)");
>>
>> This test is not necessary if the ctx size must be
>> <= sizeof(struct xdp_md). For now, I think we can
>> require it to be exactly sizeof(struct xdp_md). If
>> the kernel struct xdp_md is extended in the future,
>> it may be changed to accept both old and new
>> xdp_md's, similar to other uapi data structures
>> like struct bpf_prog_info, if there is a desire.
>> In my opinion, the kernel should just stick
>> to sizeof(struct xdp_md) since the functionality
>> is implemented as a *testing* mechanism.
>>
> 
> You might be confusing the context (struct xdp_md) with the XDP metadata (data
> just before the frame data). XDP allows at most 32 bytes of metadata. This test
> is verifying that a metadata size >32 bytes is rejected.

Right, you can keep this test. Previously I suggested enforcing
sizeof(struct xdp_md) as the ctx size, but that may be too restrictive.

> 
>>> +     ctx_in.ingress_ifindex = 1;
>>> +     ctx_in.rx_queue_index = 1;
>>> +     err = bpf_prog_test_run_opts(prog_fd, &opts);
>>> +     ASSERT_EQ(errno, 22, "test10-errno");
>>> +     ASSERT_ERR(err, "bpf_prog_test_run(test10)");
>>
>> Why does this fail? I guess it is due to a device lookup failure, right?
>> So this test MAY succeed if the underlying host happens to have a
>> configuration with ingress_ifindex = 1 and rx_queue_index = 1,
>> right?
>>
> 
> I may be making incorrect assumptions, but my understanding is that interface
> index 1 is always the loopback interface, and the loopback interface only ever
> (in current kernels) has one rx queue. If that's not the case, we'll need to
> adjust (or remove) the test.

You could be correct. Please add some comments though.
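
Something along these lines should be enough (just a sketch of the comment):

    /* ifindex 1 is always the loopback device, which in current kernels
     * has a single rx queue, so rx_queue_index = 1 cannot be resolved
     * and the run is expected to fail with -EINVAL.
     */
    ctx_in.ingress_ifindex = 1;
    ctx_in.rx_queue_index = 1;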

> 
>>> +
>>> +     test_xdp_context_test_run__destroy(skel);
>>> +}
[...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-06-10  0:11 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-04 22:02 [PATCH bpf-next v4 0/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN Zvi Effron
2021-06-04 22:02 ` [PATCH bpf-next v4 1/3] " Zvi Effron
2021-06-06  3:17   ` Yonghong Song
2021-06-07 17:58     ` Martin KaFai Lau
2021-06-09 17:06     ` Zvi Effron
2021-06-10  0:07       ` Yonghong Song
2021-06-04 22:02 ` [PATCH bpf-next v4 2/3] bpf: support specifying ingress via " Zvi Effron
2021-06-06  3:36   ` Yonghong Song
2021-06-04 22:02 ` [PATCH bpf-next v4 3/3] selftests/bpf: Add test for " Zvi Effron
2021-06-06  4:18   ` Yonghong Song
2021-06-09 17:07     ` Zvi Effron
2021-06-10  0:11       ` Yonghong Song
2021-06-06  5:36   ` Yonghong Song
