All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling
@ 2023-03-16 17:50 Alexander Lobakin
  2023-03-16 17:50 ` [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption Alexander Lobakin
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Alexander Lobakin @ 2023-03-16 17:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau
  Cc: Alexander Lobakin, Maciej Fijalkowski, Larysa Zaremba,
	Toke Høiland-Jørgensen, Ilya Leoshkevich, Song Liu,
	Jesper Dangaard Brouer, Jakub Kicinski, bpf, netdev,
	linux-kernel

Enabling skb PP recycling revealed a couple issues in the bpf_test_run
code. Recycling broke the assumption that the headroom won't ever be
touched during the test_run execution: xdp_scrub_frame() invalidates the
XDP frame at the headroom start, while neigh xmit code overwrites 2 bytes
to the left of the Ethernet header. The first makes the kernel panic in
certain cases, while the second breaks xdp_do_redirect selftest on BE.
test_run is a limited-scope entity, so let's hope no more corner cases
will happen here or at least they will be as easy and pleasant to fix
as those two.

Alexander Lobakin (2):
  bpf, test_run: fix crashes due to XDP frame overwriting/corruption
  selftests/bpf: fix "metadata marker" getting overwritten by the
    netstack

 net/bpf/test_run.c                                   | 12 +++++++++++-
 .../selftests/bpf/prog_tests/xdp_do_redirect.c       |  7 ++++---
 .../selftests/bpf/progs/test_xdp_do_redirect.c       |  2 +-
 3 files changed, 16 insertions(+), 5 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption
  2023-03-16 17:50 [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling Alexander Lobakin
@ 2023-03-16 17:50 ` Alexander Lobakin
  2023-03-16 20:09   ` Toke Høiland-Jørgensen
  2023-03-16 21:21   ` Ilya Leoshkevich
  2023-03-16 17:50 ` [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack Alexander Lobakin
  2023-03-17  5:30 ` [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling patchwork-bot+netdevbpf
  2 siblings, 2 replies; 10+ messages in thread
From: Alexander Lobakin @ 2023-03-16 17:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau
  Cc: Alexander Lobakin, Maciej Fijalkowski, Larysa Zaremba,
	Toke Høiland-Jørgensen, Ilya Leoshkevich, Song Liu,
	Jesper Dangaard Brouer, Jakub Kicinski, bpf, netdev,
	linux-kernel, syzbot+e1d1b65f7c32f2a86a9f

syzbot and Ilya faced the splats when %XDP_PASS happens for bpf_test_run
after skb PP recycling was enabled for {__,}xdp_build_skb_from_frame():

BUG: kernel NULL pointer dereference, address: 0000000000000d28
RIP: 0010:memset_erms+0xd/0x20 arch/x86/lib/memset_64.S:66
[...]
Call Trace:
 <TASK>
 __finalize_skb_around net/core/skbuff.c:321 [inline]
 __build_skb_around+0x232/0x3a0 net/core/skbuff.c:379
 build_skb_around+0x32/0x290 net/core/skbuff.c:444
 __xdp_build_skb_from_frame+0x121/0x760 net/core/xdp.c:622
 xdp_recv_frames net/bpf/test_run.c:248 [inline]
 xdp_test_run_batch net/bpf/test_run.c:334 [inline]
 bpf_test_run_xdp_live+0x1289/0x1930 net/bpf/test_run.c:362
 bpf_prog_test_run_xdp+0xa05/0x14e0 net/bpf/test_run.c:1418
[...]

This happens due to that it calls xdp_scrub_frame(), which nullifies
xdpf->data. bpf_test_run code doesn't reinit the frame when the XDP
program doesn't adjust head or tail. Previously, %XDP_PASS meant the
page will be released from the pool and returned to the MM layer, but
now it does return to the Pool with the nullified xdpf->data, which
doesn't get reinitialized then.
So, in addition to checking whether the head and/or tail have been
adjusted, check also for a potential XDP frame corruption. xdpf->data
is 100% affected and also xdpf->flags is the field closest to the
metadata / frame start. Checking for these two should be enough for
non-extreme cases.

Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames")
Reported-by: syzbot+e1d1b65f7c32f2a86a9f@syzkaller.appspotmail.com
Link: https://lore.kernel.org/bpf/000000000000f1985705f6ef2243@google.com
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/e07dd94022ad5731705891b9487cc9ed66328b94.camel@linux.ibm.com
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 net/bpf/test_run.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 71226f68270d..8d6b31209bd6 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -208,6 +208,16 @@ static void xdp_test_run_teardown(struct xdp_test_data *xdp)
 	kfree(xdp->skbs);
 }
 
+static bool frame_was_changed(const struct xdp_page_head *head)
+{
+	/* xdp_scrub_frame() zeroes the data pointer, flags is the last field,
+	 * i.e. has the highest chances to be overwritten. If those two are
+	 * untouched, it's most likely safe to skip the context reset.
+	 */
+	return head->frm.data != head->orig_ctx.data ||
+	       head->frm.flags != head->orig_ctx.flags;
+}
+
 static bool ctx_was_changed(struct xdp_page_head *head)
 {
 	return head->orig_ctx.data != head->ctx.data ||
@@ -217,7 +227,7 @@ static bool ctx_was_changed(struct xdp_page_head *head)
 
 static void reset_ctx(struct xdp_page_head *head)
 {
-	if (likely(!ctx_was_changed(head)))
+	if (likely(!frame_was_changed(head) && !ctx_was_changed(head)))
 		return;
 
 	head->ctx.data = head->orig_ctx.data;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack
  2023-03-16 17:50 [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling Alexander Lobakin
  2023-03-16 17:50 ` [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption Alexander Lobakin
@ 2023-03-16 17:50 ` Alexander Lobakin
  2023-03-16 20:10   ` Toke Høiland-Jørgensen
  2023-03-16 21:22   ` Ilya Leoshkevich
  2023-03-17  5:30 ` [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling patchwork-bot+netdevbpf
  2 siblings, 2 replies; 10+ messages in thread
From: Alexander Lobakin @ 2023-03-16 17:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau
  Cc: Alexander Lobakin, Maciej Fijalkowski, Larysa Zaremba,
	Toke Høiland-Jørgensen, Ilya Leoshkevich, Song Liu,
	Jesper Dangaard Brouer, Jakub Kicinski, bpf, netdev,
	linux-kernel

Alexei noticed xdp_do_redirect test on BPF CI started failing on
BE systems after skb PP recycling was enabled:

test_xdp_do_redirect:PASS:prog_run 0 nsec
test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual
220 != expected 9998
test_max_pkt_size:PASS:prog_run_max_size 0 nsec
test_max_pkt_size:PASS:prog_run_too_big 0 nsec
close_netns:PASS:setns 0 nsec
 #289 xdp_do_redirect:FAIL
Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED

and it doesn't happen on LE systems.
Ilya then hunted it down to:

 #0  0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0,
skb=0x88142200) at linux/include/net/neighbour.h:503
 #1  0x0000000000ab2cda in neigh_output (skip_cache=false,
skb=0x88142200, n=<optimized out>) at linux/include/net/neighbour.h:544
 #2  ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0,
skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134
 #3  0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, sk=0x0,
net=0x88edba00) at linux/net/ipv6/ip6_output.c:195
 #4  ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at
linux/net/ipv6/ip6_output.c:206

xdp_do_redirect test places a u32 marker (0x42) right before the Ethernet
header to check it then in the XDP program and return %XDP_ABORTED if it's
not there. Neigh xmit code likes to round up hard header length to speed
up copying the header, so it overwrites two bytes in front of the Eth
header. On LE systems, 0x42 is one byte at `data - 4`, while on BE it's
`data - 1`, what explains why it happens only there.
It didn't happen previously due to that %XDP_PASS meant the page will be
discarded and replaced by a new one, but now it can be recycled as well,
while bpf_test_run code doesn't reinitialize the content of recycled
pages. This mark is limited to this particular test and its setup though,
so there's no need to predict 1000 different possible cases. Just move
it 4 bytes to the left, still keeping it 32 bit to match on more bytes.

Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging
Link: https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c | 7 ++++---
 tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
index 856cbc29e6a1..4eaa3dcaebc8 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
@@ -86,12 +86,12 @@ static void test_max_pkt_size(int fd)
 void test_xdp_do_redirect(void)
 {
 	int err, xdp_prog_fd, tc_prog_fd, ifindex_src, ifindex_dst;
-	char data[sizeof(pkt_udp) + sizeof(__u32)];
+	char data[sizeof(pkt_udp) + sizeof(__u64)];
 	struct test_xdp_do_redirect *skel = NULL;
 	struct nstoken *nstoken = NULL;
 	struct bpf_link *link;
 	LIBBPF_OPTS(bpf_xdp_query_opts, query_opts);
-	struct xdp_md ctx_in = { .data = sizeof(__u32),
+	struct xdp_md ctx_in = { .data = sizeof(__u64),
 				 .data_end = sizeof(data) };
 	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
 			    .data_in = &data,
@@ -105,8 +105,9 @@ void test_xdp_do_redirect(void)
 	DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook,
 			    .attach_point = BPF_TC_INGRESS);
 
-	memcpy(&data[sizeof(__u32)], &pkt_udp, sizeof(pkt_udp));
+	memcpy(&data[sizeof(__u64)], &pkt_udp, sizeof(pkt_udp));
 	*((__u32 *)data) = 0x42; /* metadata test value */
+	*((__u32 *)data + 4) = 0;
 
 	skel = test_xdp_do_redirect__open();
 	if (!ASSERT_OK_PTR(skel, "skel"))
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
index cd2d4e3258b8..5baaafed0d2d 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
@@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp)
 
 	*payload = MARK_IN;
 
-	if (bpf_xdp_adjust_meta(xdp, 4))
+	if (bpf_xdp_adjust_meta(xdp, sizeof(__u64)))
 		return XDP_ABORTED;
 
 	if (retcode > XDP_PASS)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption
  2023-03-16 17:50 ` [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption Alexander Lobakin
@ 2023-03-16 20:09   ` Toke Høiland-Jørgensen
  2023-03-16 21:21   ` Ilya Leoshkevich
  1 sibling, 0 replies; 10+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-03-16 20:09 UTC (permalink / raw)
  To: Alexander Lobakin, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau
  Cc: Alexander Lobakin, Maciej Fijalkowski, Larysa Zaremba,
	Ilya Leoshkevich, Song Liu, Jesper Dangaard Brouer,
	Jakub Kicinski, bpf, netdev, linux-kernel,
	syzbot+e1d1b65f7c32f2a86a9f

Alexander Lobakin <aleksander.lobakin@intel.com> writes:

> syzbot and Ilya faced the splats when %XDP_PASS happens for bpf_test_run
> after skb PP recycling was enabled for {__,}xdp_build_skb_from_frame():
>
> BUG: kernel NULL pointer dereference, address: 0000000000000d28
> RIP: 0010:memset_erms+0xd/0x20 arch/x86/lib/memset_64.S:66
> [...]
> Call Trace:
>  <TASK>
>  __finalize_skb_around net/core/skbuff.c:321 [inline]
>  __build_skb_around+0x232/0x3a0 net/core/skbuff.c:379
>  build_skb_around+0x32/0x290 net/core/skbuff.c:444
>  __xdp_build_skb_from_frame+0x121/0x760 net/core/xdp.c:622
>  xdp_recv_frames net/bpf/test_run.c:248 [inline]
>  xdp_test_run_batch net/bpf/test_run.c:334 [inline]
>  bpf_test_run_xdp_live+0x1289/0x1930 net/bpf/test_run.c:362
>  bpf_prog_test_run_xdp+0xa05/0x14e0 net/bpf/test_run.c:1418
> [...]
>
> This happens due to that it calls xdp_scrub_frame(), which nullifies
> xdpf->data. bpf_test_run code doesn't reinit the frame when the XDP
> program doesn't adjust head or tail. Previously, %XDP_PASS meant the
> page will be released from the pool and returned to the MM layer, but
> now it does return to the Pool with the nullified xdpf->data, which
> doesn't get reinitialized then.
> So, in addition to checking whether the head and/or tail have been
> adjusted, check also for a potential XDP frame corruption. xdpf->data
> is 100% affected and also xdpf->flags is the field closest to the
> metadata / frame start. Checking for these two should be enough for
> non-extreme cases.
>
> Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames")
> Reported-by: syzbot+e1d1b65f7c32f2a86a9f@syzkaller.appspotmail.com
> Link: https://lore.kernel.org/bpf/000000000000f1985705f6ef2243@google.com
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Link: https://lore.kernel.org/bpf/e07dd94022ad5731705891b9487cc9ed66328b94.camel@linux.ibm.com
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack
  2023-03-16 17:50 ` [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack Alexander Lobakin
@ 2023-03-16 20:10   ` Toke Høiland-Jørgensen
  2023-03-17 13:38     ` Alexander Lobakin
  2023-03-16 21:22   ` Ilya Leoshkevich
  1 sibling, 1 reply; 10+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-03-16 20:10 UTC (permalink / raw)
  To: Alexander Lobakin, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau
  Cc: Alexander Lobakin, Maciej Fijalkowski, Larysa Zaremba,
	Ilya Leoshkevich, Song Liu, Jesper Dangaard Brouer,
	Jakub Kicinski, bpf, netdev, linux-kernel

Alexander Lobakin <aleksander.lobakin@intel.com> writes:

> Alexei noticed xdp_do_redirect test on BPF CI started failing on
> BE systems after skb PP recycling was enabled:
>
> test_xdp_do_redirect:PASS:prog_run 0 nsec
> test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
> test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
> test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual
> 220 != expected 9998
> test_max_pkt_size:PASS:prog_run_max_size 0 nsec
> test_max_pkt_size:PASS:prog_run_too_big 0 nsec
> close_netns:PASS:setns 0 nsec
>  #289 xdp_do_redirect:FAIL
> Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED
>
> and it doesn't happen on LE systems.
> Ilya then hunted it down to:
>
>  #0  0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0,
> skb=0x88142200) at linux/include/net/neighbour.h:503
>  #1  0x0000000000ab2cda in neigh_output (skip_cache=false,
> skb=0x88142200, n=<optimized out>) at linux/include/net/neighbour.h:544
>  #2  ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0,
> skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134
>  #3  0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, sk=0x0,
> net=0x88edba00) at linux/net/ipv6/ip6_output.c:195
>  #4  ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at
> linux/net/ipv6/ip6_output.c:206
>
> xdp_do_redirect test places a u32 marker (0x42) right before the Ethernet
> header to check it then in the XDP program and return %XDP_ABORTED if it's
> not there. Neigh xmit code likes to round up hard header length to speed
> up copying the header, so it overwrites two bytes in front of the Eth
> header. On LE systems, 0x42 is one byte at `data - 4`, while on BE it's
> `data - 1`, what explains why it happens only there.
> It didn't happen previously due to that %XDP_PASS meant the page will be
> discarded and replaced by a new one, but now it can be recycled as well,
> while bpf_test_run code doesn't reinitialize the content of recycled
> pages. This mark is limited to this particular test and its setup though,
> so there's no need to predict 1000 different possible cases. Just move
> it 4 bytes to the left, still keeping it 32 bit to match on more
> bytes.

Wow, this must have been annoying to track down - nice work :)

> Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames")
> Reported-by: Alexei Starovoitov <ast@kernel.org>
> Link: https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging
> Link: https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption
  2023-03-16 17:50 ` [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption Alexander Lobakin
  2023-03-16 20:09   ` Toke Høiland-Jørgensen
@ 2023-03-16 21:21   ` Ilya Leoshkevich
  1 sibling, 0 replies; 10+ messages in thread
From: Ilya Leoshkevich @ 2023-03-16 21:21 UTC (permalink / raw)
  To: Alexander Lobakin, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau
  Cc: Maciej Fijalkowski, Larysa Zaremba,
	Toke Høiland-Jørgensen, Song Liu,
	Jesper Dangaard Brouer, Jakub Kicinski, bpf, netdev,
	linux-kernel, syzbot+e1d1b65f7c32f2a86a9f

On Thu, 2023-03-16 at 18:50 +0100, Alexander Lobakin wrote:
> syzbot and Ilya faced the splats when %XDP_PASS happens for
> bpf_test_run
> after skb PP recycling was enabled for
> {__,}xdp_build_skb_from_frame():
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000d28
> RIP: 0010:memset_erms+0xd/0x20 arch/x86/lib/memset_64.S:66
> [...]
> Call Trace:
>  <TASK>
>  __finalize_skb_around net/core/skbuff.c:321 [inline]
>  __build_skb_around+0x232/0x3a0 net/core/skbuff.c:379
>  build_skb_around+0x32/0x290 net/core/skbuff.c:444
>  __xdp_build_skb_from_frame+0x121/0x760 net/core/xdp.c:622
>  xdp_recv_frames net/bpf/test_run.c:248 [inline]
>  xdp_test_run_batch net/bpf/test_run.c:334 [inline]
>  bpf_test_run_xdp_live+0x1289/0x1930 net/bpf/test_run.c:362
>  bpf_prog_test_run_xdp+0xa05/0x14e0 net/bpf/test_run.c:1418
> [...]
> 
> This happens due to that it calls xdp_scrub_frame(), which nullifies
> xdpf->data. bpf_test_run code doesn't reinit the frame when the XDP
> program doesn't adjust head or tail. Previously, %XDP_PASS meant the
> page will be released from the pool and returned to the MM layer, but
> now it does return to the Pool with the nullified xdpf->data, which
> doesn't get reinitialized then.
> So, in addition to checking whether the head and/or tail have been
> adjusted, check also for a potential XDP frame corruption. xdpf->data
> is 100% affected and also xdpf->flags is the field closest to the
> metadata / frame start. Checking for these two should be enough for
> non-extreme cases.
> 
> Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from
> XDP frames")
> Reported-by: syzbot+e1d1b65f7c32f2a86a9f@syzkaller.appspotmail.com
> Link:
> https://lore.kernel.org/bpf/000000000000f1985705f6ef2243@google.com
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Link:
> https://lore.kernel.org/bpf/e07dd94022ad5731705891b9487cc9ed66328b94.camel@linux.ibm.com
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
>  net/bpf/test_run.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 71226f68270d..8d6b31209bd6 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -208,6 +208,16 @@ static void xdp_test_run_teardown(struct
> xdp_test_data *xdp)
>         kfree(xdp->skbs);
>  }
>  
> +static bool frame_was_changed(const struct xdp_page_head *head)
> +{
> +       /* xdp_scrub_frame() zeroes the data pointer, flags is the
> last field,
> +        * i.e. has the highest chances to be overwritten. If those
> two are
> +        * untouched, it's most likely safe to skip the context
> reset.
> +        */
> +       return head->frm.data != head->orig_ctx.data ||
> +              head->frm.flags != head->orig_ctx.flags;
> +}
> +
>  static bool ctx_was_changed(struct xdp_page_head *head)
>  {
>         return head->orig_ctx.data != head->ctx.data ||
> @@ -217,7 +227,7 @@ static bool ctx_was_changed(struct xdp_page_head
> *head)
>  
>  static void reset_ctx(struct xdp_page_head *head)
>  {
> -       if (likely(!ctx_was_changed(head)))
> +       if (likely(!frame_was_changed(head) &&
> !ctx_was_changed(head)))
>                 return;
>  
>         head->ctx.data = head->orig_ctx.data;

With this test begins to work on s390x:

# ./test_progs -t xdp_do_redirect
IPv6: ADDRCONF(NETDEV_CHANGE): veth_dst: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth_src: link becomes ready
#290     xdp_do_redirect:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Thanks!

Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack
  2023-03-16 17:50 ` [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack Alexander Lobakin
  2023-03-16 20:10   ` Toke Høiland-Jørgensen
@ 2023-03-16 21:22   ` Ilya Leoshkevich
  2023-03-17 13:40     ` Alexander Lobakin
  1 sibling, 1 reply; 10+ messages in thread
From: Ilya Leoshkevich @ 2023-03-16 21:22 UTC (permalink / raw)
  To: Alexander Lobakin, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau
  Cc: Maciej Fijalkowski, Larysa Zaremba,
	Toke Høiland-Jørgensen, Song Liu,
	Jesper Dangaard Brouer, Jakub Kicinski, bpf, netdev,
	linux-kernel

On Thu, 2023-03-16 at 18:50 +0100, Alexander Lobakin wrote:
> Alexei noticed xdp_do_redirect test on BPF CI started failing on
> BE systems after skb PP recycling was enabled:
> 
> test_xdp_do_redirect:PASS:prog_run 0 nsec
> test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
> test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
> test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc:
> actual
> 220 != expected 9998
> test_max_pkt_size:PASS:prog_run_max_size 0 nsec
> test_max_pkt_size:PASS:prog_run_too_big 0 nsec
> close_netns:PASS:setns 0 nsec
>  #289 xdp_do_redirect:FAIL
> Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED
> 
> and it doesn't happen on LE systems.
> Ilya then hunted it down to:
> 
>  #0  0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0,
> skb=0x88142200) at linux/include/net/neighbour.h:503
>  #1  0x0000000000ab2cda in neigh_output (skip_cache=false,
> skb=0x88142200, n=<optimized out>) at
> linux/include/net/neighbour.h:544
>  #2  ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0,
> skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134
>  #3  0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200,
> sk=0x0,
> net=0x88edba00) at linux/net/ipv6/ip6_output.c:195
>  #4  ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at
> linux/net/ipv6/ip6_output.c:206
> 
> xdp_do_redirect test places a u32 marker (0x42) right before the
> Ethernet
> header to check it then in the XDP program and return %XDP_ABORTED if
> it's
> not there. Neigh xmit code likes to round up hard header length to
> speed
> up copying the header, so it overwrites two bytes in front of the Eth
> header. On LE systems, 0x42 is one byte at `data - 4`, while on BE
> it's
> `data - 1`, what explains why it happens only there.
> It didn't happen previously due to that %XDP_PASS meant the page will
> be
> discarded and replaced by a new one, but now it can be recycled as
> well,
> while bpf_test_run code doesn't reinitialize the content of recycled
> pages. This mark is limited to this particular test and its setup
> though,
> so there's no need to predict 1000 different possible cases. Just
> move
> it 4 bytes to the left, still keeping it 32 bit to match on more
> bytes.
> 
> Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from
> XDP frames")
> Reported-by: Alexei Starovoitov <ast@kernel.org>
> Link:
> https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging
> Link:
> https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
>  tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c | 7 ++++---
>  tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c | 2 +-
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
> b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
> index 856cbc29e6a1..4eaa3dcaebc8 100644
> --- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
> +++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
> @@ -86,12 +86,12 @@ static void test_max_pkt_size(int fd)
>  void test_xdp_do_redirect(void)
>  {
>         int err, xdp_prog_fd, tc_prog_fd, ifindex_src, ifindex_dst;
> -       char data[sizeof(pkt_udp) + sizeof(__u32)];
> +       char data[sizeof(pkt_udp) + sizeof(__u64)];
>         struct test_xdp_do_redirect *skel = NULL;
>         struct nstoken *nstoken = NULL;
>         struct bpf_link *link;
>         LIBBPF_OPTS(bpf_xdp_query_opts, query_opts);
> -       struct xdp_md ctx_in = { .data = sizeof(__u32),
> +       struct xdp_md ctx_in = { .data = sizeof(__u64),
>                                  .data_end = sizeof(data) };
>         DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
>                             .data_in = &data,
> @@ -105,8 +105,9 @@ void test_xdp_do_redirect(void)
>         DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook,
>                             .attach_point = BPF_TC_INGRESS);
>  
> -       memcpy(&data[sizeof(__u32)], &pkt_udp, sizeof(pkt_udp));
> +       memcpy(&data[sizeof(__u64)], &pkt_udp, sizeof(pkt_udp));
>         *((__u32 *)data) = 0x42; /* metadata test value */
> +       *((__u32 *)data + 4) = 0;
>  
>         skel = test_xdp_do_redirect__open();
>         if (!ASSERT_OK_PTR(skel, "skel"))
> diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
> b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
> index cd2d4e3258b8..5baaafed0d2d 100644
> --- a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
> +++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
> @@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp)
>  
>         *payload = MARK_IN;
>  
> -       if (bpf_xdp_adjust_meta(xdp, 4))
> +       if (bpf_xdp_adjust_meta(xdp, sizeof(__u64)))
>                 return XDP_ABORTED;
>  
>         if (retcode > XDP_PASS)

Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling
  2023-03-16 17:50 [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling Alexander Lobakin
  2023-03-16 17:50 ` [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption Alexander Lobakin
  2023-03-16 17:50 ` [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack Alexander Lobakin
@ 2023-03-17  5:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-03-17  5:30 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: ast, daniel, andrii, martin.lau, maciej.fijalkowski,
	larysa.zaremba, toke, iii, song, hawk, kuba, bpf, netdev,
	linux-kernel

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Thu, 16 Mar 2023 18:50:49 +0100 you wrote:
> Enabling skb PP recycling revealed a couple issues in the bpf_test_run
> code. Recycling broke the assumption that the headroom won't ever be
> touched during the test_run execution: xdp_scrub_frame() invalidates the
> XDP frame at the headroom start, while neigh xmit code overwrites 2 bytes
> to the left of the Ethernet header. The first makes the kernel panic in
> certain cases, while the second breaks xdp_do_redirect selftest on BE.
> test_run is a limited-scope entity, so let's hope no more corner cases
> will happen here or at least they will be as easy and pleasant to fix
> as those two.
> 
> [...]

Here is the summary with links:
  - [bpf-next,1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption
    https://git.kernel.org/bpf/bpf-next/c/e5995bc7e2ba
  - [bpf-next,2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack
    https://git.kernel.org/bpf/bpf-next/c/5640b6d89434

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack
  2023-03-16 20:10   ` Toke Høiland-Jørgensen
@ 2023-03-17 13:38     ` Alexander Lobakin
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Lobakin @ 2023-03-17 13:38 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Maciej Fijalkowski, Larysa Zaremba,
	Ilya Leoshkevich, Song Liu, Jesper Dangaard Brouer,
	Jakub Kicinski, bpf, netdev, linux-kernel

From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 16 Mar 2023 21:10:02 +0100

> Alexander Lobakin <aleksander.lobakin@intel.com> writes:
> 
>> Alexei noticed xdp_do_redirect test on BPF CI started failing on
>> BE systems after skb PP recycling was enabled:
>>
>> test_xdp_do_redirect:PASS:prog_run 0 nsec
>> test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
>> test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
>> test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual
>> 220 != expected 9998
>> test_max_pkt_size:PASS:prog_run_max_size 0 nsec
>> test_max_pkt_size:PASS:prog_run_too_big 0 nsec
>> close_netns:PASS:setns 0 nsec
>>  #289 xdp_do_redirect:FAIL
>> Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED
>>
>> and it doesn't happen on LE systems.
>> Ilya then hunted it down to:
>>
>>  #0  0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0,
>> skb=0x88142200) at linux/include/net/neighbour.h:503
>>  #1  0x0000000000ab2cda in neigh_output (skip_cache=false,
>> skb=0x88142200, n=<optimized out>) at linux/include/net/neighbour.h:544
>>  #2  ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0,
>> skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134
>>  #3  0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, sk=0x0,
>> net=0x88edba00) at linux/net/ipv6/ip6_output.c:195
>>  #4  ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at
>> linux/net/ipv6/ip6_output.c:206
>>
>> xdp_do_redirect test places a u32 marker (0x42) right before the Ethernet
>> header to check it then in the XDP program and return %XDP_ABORTED if it's
>> not there. Neigh xmit code likes to round up hard header length to speed
>> up copying the header, so it overwrites two bytes in front of the Eth
>> header. On LE systems, 0x42 is one byte at `data - 4`, while on BE it's
>> `data - 1`, what explains why it happens only there.
>> It didn't happen previously due to that %XDP_PASS meant the page will be
>> discarded and replaced by a new one, but now it can be recycled as well,
>> while bpf_test_run code doesn't reinitialize the content of recycled
>> pages. This mark is limited to this particular test and its setup though,
>> so there's no need to predict 1000 different possible cases. Just move
>> it 4 bytes to the left, still keeping it 32 bit to match on more
>> bytes.
> 
> Wow, this must have been annoying to track down - nice work :)

I just blinked my eyes once and Ilya already came back with the detailed
stacktrace, so it's almost entirely his work <O

> 
>> Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames")
>> Reported-by: Alexei Starovoitov <ast@kernel.org>
>> Link: https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com
>> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging
>> Link: https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com
>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> 
> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
> 

Thanks! That was quick :D

Olek

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack
  2023-03-16 21:22   ` Ilya Leoshkevich
@ 2023-03-17 13:40     ` Alexander Lobakin
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Lobakin @ 2023-03-17 13:40 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Maciej Fijalkowski, Larysa Zaremba,
	Toke Høiland-Jørgensen, Song Liu,
	Jesper Dangaard Brouer, Jakub Kicinski, bpf, netdev,
	linux-kernel

From: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Thu, 16 Mar 2023 22:22:26 +0100

> On Thu, 2023-03-16 at 18:50 +0100, Alexander Lobakin wrote:
>> Alexei noticed xdp_do_redirect test on BPF CI started failing on
>> BE systems after skb PP recycling was enabled:

[...]

>> @@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp)
>>  
>>         *payload = MARK_IN;
>>  
>> -       if (bpf_xdp_adjust_meta(xdp, 4))
>> +       if (bpf_xdp_adjust_meta(xdp, sizeof(__u64)))
>>                 return XDP_ABORTED;
>>  
>>         if (retcode > XDP_PASS)
> 
> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>

Much thanks again for the debugging and testing! It definitely wouldn't
have been quick without you :D

Olek

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-03-17 13:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-16 17:50 [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling Alexander Lobakin
2023-03-16 17:50 ` [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption Alexander Lobakin
2023-03-16 20:09   ` Toke Høiland-Jørgensen
2023-03-16 21:21   ` Ilya Leoshkevich
2023-03-16 17:50 ` [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack Alexander Lobakin
2023-03-16 20:10   ` Toke Høiland-Jørgensen
2023-03-17 13:38     ` Alexander Lobakin
2023-03-16 21:22   ` Ilya Leoshkevich
2023-03-17 13:40     ` Alexander Lobakin
2023-03-17  5:30 ` [PATCH bpf-next 0/2] double-fix bpf_test_run + XDP_PASS recycling patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.