All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19
@ 2024-02-08 21:54 Yonghong Song
  2024-02-08 21:54 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add a negative test for stack accounting in jit mode Yonghong Song
  2024-02-13 19:21 ` [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19 Andrii Nakryiko
  0 siblings, 2 replies; 6+ messages in thread
From: Yonghong Song @ 2024-02-08 21:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau

With latest llvm19, I hit the following selftest failures with

  $ ./test_progs -j
  libbpf: prog 'on_event': BPF program load failed: Permission denied
  libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
  combined stack size of 4 calls is 544. Too large
  verification time 1344153 usec
  stack depth 24+440+0+32
  processed 51008 insns (limit 1000000) max_states_per_insn 19 total_states 1467 peak_states 303 mark_read 146
  -- END PROG LOAD LOG --
  libbpf: prog 'on_event': failed to load: -13
  libbpf: failed to load object 'strobemeta_subprogs.bpf.o'
  scale_test:FAIL:expect_success unexpected error: -13 (errno 13)
  #498     verif_scale_strobemeta_subprogs:FAIL

The verifier complains too big of the combined stack size (544 bytes) which
exceeds the maximum stack limit 512. This is a regression from llvm19 ([1]).

In the above error log, the original stack depth is 24+440+0+32.
To satisfy interpreter's need, in verifier the stack depth is adjusted to
32+448+32+32=544 which exceeds 512, hence the error. The same adjusted
stack size is also used for jit case.

But the jitted codes could use smaller stack size.

  $ egrep -r stack_depth | grep round_up
  arm64/net/bpf_jit_comp.c:       ctx->stack_size = round_up(prog->aux->stack_depth, 16);
  loongarch/net/bpf_jit.c:        bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
  powerpc/net/bpf_jit_comp.c:     cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
  riscv/net/bpf_jit_comp32.c:             round_up(ctx->prog->aux->stack_depth, STACK_ALIGN);
  riscv/net/bpf_jit_comp64.c:     bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
  s390/net/bpf_jit_comp.c:        u32 stack_depth = round_up(fp->aux->stack_depth, 8);
  sparc/net/bpf_jit_comp_64.c:            stack_needed += round_up(stack_depth, 16);
  x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8));
  x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
  x86/net/bpf_jit_comp.c:                     round_up(stack_depth, 8));
  x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
  x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xC4, round_up(stack_depth, 8));

In the above, STACK_ALIGN in riscv/net/bpf_jit_comp32.c is defined as 16.
So stack is aligned in either 8 or 16, x86/s390 having 8-byte stack alignment and
the rest having 16-byte alignment.

This patch calculates total stack depth based on 16-byte alignment if jit is requested.
For the above failing case, the new stack size will be 32+448+0+32=512 and no verification
failure. llvm19 regression will be discussed separately in llvm upstream.

The verifier change caused three test failures as these tests compared messages
with stack size. More specifically,
  - test_global_funcs/global_func1: adjusted to interpreter only since verification will
    succeed in jit mode. A new test will be added for jit mode later.
  - async_stack_depth/{pseudo_call_check, async_call_root_check}: since jit and interpreter
    will calculate different stack sizes, the failure msg is adjusted to omit those
    specific stack size numbers.

  [1] https://lore.kernel.org/bpf/32bde0f0-1881-46c9-931a-673be566c61d@linux.dev/

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 kernel/bpf/verifier.c                          | 18 +++++++++++++-----
 .../bpf/prog_tests/test_global_funcs.c         |  5 ++++-
 .../selftests/bpf/progs/async_stack_depth.c    |  4 ++--
 3 files changed, 19 insertions(+), 8 deletions(-)

Changelogs:
  v2 -> v3:
    - fix async_stack_depth test failure if jit is turned off
  v1 -> v2:
    - fix some selftest failures

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ddaf09db1175..6441a540904b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5812,6 +5812,17 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
 					   strict);
 }
 
+static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth)
+{
+	if (env->prog->jit_requested)
+		return round_up(stack_depth, 16);
+
+	/* round up to 32-bytes, since this is granularity
+	 * of interpreter stack size
+	 */
+	return round_up(max_t(u32, stack_depth, 1), 32);
+}
+
 /* starting from main bpf function walk all instructions of the function
  * and recursively walk all callees that given function can call.
  * Ignore jump and exit insns.
@@ -5855,10 +5866,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
 			depth);
 		return -EACCES;
 	}
-	/* round up to 32-bytes, since this is granularity
-	 * of interpreter stack size
-	 */
-	depth += round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
+	depth += round_up_stack_depth(env, subprog[idx].stack_depth);
 	if (depth > MAX_BPF_STACK) {
 		verbose(env, "combined stack size of %d calls is %d. Too large\n",
 			frame + 1, depth);
@@ -5952,7 +5960,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
 	 */
 	if (frame == 0)
 		return 0;
-	depth -= round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
+	depth -= round_up_stack_depth(env, subprog[idx].stack_depth);
 	frame--;
 	i = ret_insn[frame];
 	idx = ret_prog[frame];
diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
index e905cbaf6b3d..a3a41680b38e 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
@@ -138,7 +138,10 @@ static void subtest_ctx_arg_rewrite(void)
 
 void test_test_global_funcs(void)
 {
-	RUN_TESTS(test_global_func1);
+	if (!env.jit_enabled) {
+		RUN_TESTS(test_global_func1);
+	}
+
 	RUN_TESTS(test_global_func2);
 	RUN_TESTS(test_global_func3);
 	RUN_TESTS(test_global_func4);
diff --git a/tools/testing/selftests/bpf/progs/async_stack_depth.c b/tools/testing/selftests/bpf/progs/async_stack_depth.c
index 3517c0e01206..36734683acbd 100644
--- a/tools/testing/selftests/bpf/progs/async_stack_depth.c
+++ b/tools/testing/selftests/bpf/progs/async_stack_depth.c
@@ -30,7 +30,7 @@ static int bad_timer_cb(void *map, int *key, struct bpf_timer *timer)
 }
 
 SEC("tc")
-__failure __msg("combined stack size of 2 calls is 576. Too large")
+__failure __msg("combined stack size of 2 calls is")
 int pseudo_call_check(struct __sk_buff *ctx)
 {
 	struct hmap_elem *elem;
@@ -45,7 +45,7 @@ int pseudo_call_check(struct __sk_buff *ctx)
 }
 
 SEC("tc")
-__failure __msg("combined stack size of 2 calls is 608. Too large")
+__failure __msg("combined stack size of 2 calls is")
 int async_call_root_check(struct __sk_buff *ctx)
 {
 	struct hmap_elem *elem;
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH bpf-next v3 2/2] selftests/bpf: Add a negative test for stack accounting in jit mode
  2024-02-08 21:54 [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19 Yonghong Song
@ 2024-02-08 21:54 ` Yonghong Song
  2024-02-13 19:40   ` Andrii Nakryiko
  2024-02-13 19:21 ` [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19 Andrii Nakryiko
  1 sibling, 1 reply; 6+ messages in thread
From: Yonghong Song @ 2024-02-08 21:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau

The new test is very similar to test_global_func1.c, but
is modified to fail on jit mode.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 .../bpf/prog_tests/test_global_funcs.c        |  3 ++
 .../selftests/bpf/progs/test_global_func18.c  | 44 +++++++++++++++++++
 2 files changed, 47 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/test_global_func18.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
index a3a41680b38e..dccbf2213135 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
@@ -18,6 +18,7 @@
 #include "test_global_func15.skel.h"
 #include "test_global_func16.skel.h"
 #include "test_global_func17.skel.h"
+#include "test_global_func18.skel.h"
 #include "test_global_func_ctx_args.skel.h"
 
 #include "bpf/libbpf_internal.h"
@@ -140,6 +141,8 @@ void test_test_global_funcs(void)
 {
 	if (!env.jit_enabled) {
 		RUN_TESTS(test_global_func1);
+	} else {
+		RUN_TESTS(test_global_func18);
 	}
 
 	RUN_TESTS(test_global_func2);
diff --git a/tools/testing/selftests/bpf/progs/test_global_func18.c b/tools/testing/selftests/bpf/progs/test_global_func18.c
new file mode 100644
index 000000000000..d1aa3b2c68fe
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_global_func18.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <stddef.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#define MAX_STACK1 (512 - 3 * 32 + 8)
+#define MAX_STACK2 (3 * 32)
+
+__attribute__ ((noinline))
+int f1(struct __sk_buff *skb)
+{
+	return skb->len;
+}
+
+int f3(int, struct __sk_buff *skb, int);
+
+__attribute__ ((noinline))
+int f2(int val, struct __sk_buff *skb)
+{
+	volatile char buf[MAX_STACK1] = {};
+
+	__sink(buf[MAX_STACK1 - 1]);
+
+	return f1(skb) + f3(val, skb, 1);
+}
+
+__attribute__ ((noinline))
+int f3(int val, struct __sk_buff *skb, int var)
+{
+	volatile char buf[MAX_STACK2] = {};
+
+	__sink(buf[MAX_STACK2 - 1]);
+
+	return skb->ifindex * val * var;
+}
+
+SEC("tc")
+__failure __msg("combined stack size of 3 calls is 528")
+int global_func18(struct __sk_buff *skb)
+{
+	return f1(skb) + f2(2, skb) + f3(3, skb, 4);
+}
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19
  2024-02-08 21:54 [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19 Yonghong Song
  2024-02-08 21:54 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add a negative test for stack accounting in jit mode Yonghong Song
@ 2024-02-13 19:21 ` Andrii Nakryiko
  2024-02-13 20:26   ` Yonghong Song
  1 sibling, 1 reply; 6+ messages in thread
From: Andrii Nakryiko @ 2024-02-13 19:21 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau

On Thu, Feb 8, 2024 at 1:54 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
> With latest llvm19, I hit the following selftest failures with
>
>   $ ./test_progs -j
>   libbpf: prog 'on_event': BPF program load failed: Permission denied
>   libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>   combined stack size of 4 calls is 544. Too large
>   verification time 1344153 usec
>   stack depth 24+440+0+32
>   processed 51008 insns (limit 1000000) max_states_per_insn 19 total_states 1467 peak_states 303 mark_read 146
>   -- END PROG LOAD LOG --
>   libbpf: prog 'on_event': failed to load: -13
>   libbpf: failed to load object 'strobemeta_subprogs.bpf.o'
>   scale_test:FAIL:expect_success unexpected error: -13 (errno 13)
>   #498     verif_scale_strobemeta_subprogs:FAIL
>
> The verifier complains too big of the combined stack size (544 bytes) which
> exceeds the maximum stack limit 512. This is a regression from llvm19 ([1]).
>
> In the above error log, the original stack depth is 24+440+0+32.
> To satisfy interpreter's need, in verifier the stack depth is adjusted to
> 32+448+32+32=544 which exceeds 512, hence the error. The same adjusted
> stack size is also used for jit case.
>
> But the jitted codes could use smaller stack size.
>
>   $ egrep -r stack_depth | grep round_up
>   arm64/net/bpf_jit_comp.c:       ctx->stack_size = round_up(prog->aux->stack_depth, 16);
>   loongarch/net/bpf_jit.c:        bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
>   powerpc/net/bpf_jit_comp.c:     cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
>   riscv/net/bpf_jit_comp32.c:             round_up(ctx->prog->aux->stack_depth, STACK_ALIGN);
>   riscv/net/bpf_jit_comp64.c:     bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
>   s390/net/bpf_jit_comp.c:        u32 stack_depth = round_up(fp->aux->stack_depth, 8);
>   sparc/net/bpf_jit_comp_64.c:            stack_needed += round_up(stack_depth, 16);
>   x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8));
>   x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
>   x86/net/bpf_jit_comp.c:                     round_up(stack_depth, 8));
>   x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
>   x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xC4, round_up(stack_depth, 8));
>
> In the above, STACK_ALIGN in riscv/net/bpf_jit_comp32.c is defined as 16.
> So stack is aligned in either 8 or 16, x86/s390 having 8-byte stack alignment and
> the rest having 16-byte alignment.
>
> This patch calculates total stack depth based on 16-byte alignment if jit is requested.
> For the above failing case, the new stack size will be 32+448+0+32=512 and no verification
> failure. llvm19 regression will be discussed separately in llvm upstream.
>
> The verifier change caused three test failures as these tests compared messages
> with stack size. More specifically,
>   - test_global_funcs/global_func1: adjusted to interpreter only since verification will
>     succeed in jit mode. A new test will be added for jit mode later.
>   - async_stack_depth/{pseudo_call_check, async_call_root_check}: since jit and interpreter
>     will calculate different stack sizes, the failure msg is adjusted to omit those
>     specific stack size numbers.
>
>   [1] https://lore.kernel.org/bpf/32bde0f0-1881-46c9-931a-673be566c61d@linux.dev/
>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> ---
>  kernel/bpf/verifier.c                          | 18 +++++++++++++-----
>  .../bpf/prog_tests/test_global_funcs.c         |  5 ++++-
>  .../selftests/bpf/progs/async_stack_depth.c    |  4 ++--
>  3 files changed, 19 insertions(+), 8 deletions(-)
>
> Changelogs:
>   v2 -> v3:
>     - fix async_stack_depth test failure if jit is turned off
>   v1 -> v2:
>     - fix some selftest failures
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index ddaf09db1175..6441a540904b 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5812,6 +5812,17 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
>                                            strict);
>  }
>
> +static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth)
> +{
> +       if (env->prog->jit_requested)
> +               return round_up(stack_depth, 16);
> +
> +       /* round up to 32-bytes, since this is granularity
> +        * of interpreter stack size
> +        */
> +       return round_up(max_t(u32, stack_depth, 1), 32);
> +}
> +
>  /* starting from main bpf function walk all instructions of the function
>   * and recursively walk all callees that given function can call.
>   * Ignore jump and exit insns.
> @@ -5855,10 +5866,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
>                         depth);
>                 return -EACCES;
>         }
> -       /* round up to 32-bytes, since this is granularity
> -        * of interpreter stack size
> -        */
> -       depth += round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
> +       depth += round_up_stack_depth(env, subprog[idx].stack_depth);
>         if (depth > MAX_BPF_STACK) {
>                 verbose(env, "combined stack size of %d calls is %d. Too large\n",
>                         frame + 1, depth);
> @@ -5952,7 +5960,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
>          */
>         if (frame == 0)
>                 return 0;
> -       depth -= round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
> +       depth -= round_up_stack_depth(env, subprog[idx].stack_depth);
>         frame--;
>         i = ret_insn[frame];
>         idx = ret_prog[frame];
> diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
> index e905cbaf6b3d..a3a41680b38e 100644
> --- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
> +++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
> @@ -138,7 +138,10 @@ static void subtest_ctx_arg_rewrite(void)
>
>  void test_test_global_funcs(void)
>  {
> -       RUN_TESTS(test_global_func1);
> +       if (!env.jit_enabled) {
> +               RUN_TESTS(test_global_func1);
> +       }

Can we increase the amount of used stack size to make it fail
regardless of JIT? That's what I was asking on v1, actually. We have
those arbitrarily sized buf[256] and buf[300], what prevents us from
making them a few bytes bigger to not be affected by 16 vs 32 byte
rounding?


> +
>         RUN_TESTS(test_global_func2);
>         RUN_TESTS(test_global_func3);
>         RUN_TESTS(test_global_func4);
> diff --git a/tools/testing/selftests/bpf/progs/async_stack_depth.c b/tools/testing/selftests/bpf/progs/async_stack_depth.c
> index 3517c0e01206..36734683acbd 100644
> --- a/tools/testing/selftests/bpf/progs/async_stack_depth.c
> +++ b/tools/testing/selftests/bpf/progs/async_stack_depth.c
> @@ -30,7 +30,7 @@ static int bad_timer_cb(void *map, int *key, struct bpf_timer *timer)
>  }
>
>  SEC("tc")
> -__failure __msg("combined stack size of 2 calls is 576. Too large")
> +__failure __msg("combined stack size of 2 calls is")
>  int pseudo_call_check(struct __sk_buff *ctx)
>  {
>         struct hmap_elem *elem;
> @@ -45,7 +45,7 @@ int pseudo_call_check(struct __sk_buff *ctx)
>  }
>
>  SEC("tc")
> -__failure __msg("combined stack size of 2 calls is 608. Too large")
> +__failure __msg("combined stack size of 2 calls is")
>  int async_call_root_check(struct __sk_buff *ctx)
>  {
>         struct hmap_elem *elem;
> --
> 2.39.3
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v3 2/2] selftests/bpf: Add a negative test for stack accounting in jit mode
  2024-02-08 21:54 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add a negative test for stack accounting in jit mode Yonghong Song
@ 2024-02-13 19:40   ` Andrii Nakryiko
  2024-02-13 20:28     ` Yonghong Song
  0 siblings, 1 reply; 6+ messages in thread
From: Andrii Nakryiko @ 2024-02-13 19:40 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau

On Thu, Feb 8, 2024 at 1:54 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
> The new test is very similar to test_global_func1.c, but
> is modified to fail on jit mode.
>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> ---
>  .../bpf/prog_tests/test_global_funcs.c        |  3 ++
>  .../selftests/bpf/progs/test_global_func18.c  | 44 +++++++++++++++++++
>  2 files changed, 47 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/progs/test_global_func18.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
> index a3a41680b38e..dccbf2213135 100644
> --- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
> +++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
> @@ -18,6 +18,7 @@
>  #include "test_global_func15.skel.h"
>  #include "test_global_func16.skel.h"
>  #include "test_global_func17.skel.h"
> +#include "test_global_func18.skel.h"
>  #include "test_global_func_ctx_args.skel.h"
>
>  #include "bpf/libbpf_internal.h"
> @@ -140,6 +141,8 @@ void test_test_global_funcs(void)
>  {
>         if (!env.jit_enabled) {
>                 RUN_TESTS(test_global_func1);
> +       } else {
> +               RUN_TESTS(test_global_func18);
>         }
>
>         RUN_TESTS(test_global_func2);
> diff --git a/tools/testing/selftests/bpf/progs/test_global_func18.c b/tools/testing/selftests/bpf/progs/test_global_func18.c
> new file mode 100644
> index 000000000000..d1aa3b2c68fe
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_global_func18.c
> @@ -0,0 +1,44 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +#include <stddef.h>
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +
> +#define MAX_STACK1 (512 - 3 * 32 + 8)
> +#define MAX_STACK2 (3 * 32)
> +
> +__attribute__ ((noinline))

nit: we have __noinline defined, let's use it as a less verbose option?

> +int f1(struct __sk_buff *skb)
> +{
> +       return skb->len;
> +}
> +
> +int f3(int, struct __sk_buff *skb, int);
> +
> +__attribute__ ((noinline))
> +int f2(int val, struct __sk_buff *skb)
> +{
> +       volatile char buf[MAX_STACK1] = {};
> +
> +       __sink(buf[MAX_STACK1 - 1]);
> +
> +       return f1(skb) + f3(val, skb, 1);
> +}
> +
> +__attribute__ ((noinline))
> +int f3(int val, struct __sk_buff *skb, int var)
> +{
> +       volatile char buf[MAX_STACK2] = {};
> +
> +       __sink(buf[MAX_STACK2 - 1]);
> +
> +       return skb->ifindex * val * var;
> +}
> +
> +SEC("tc")
> +__failure __msg("combined stack size of 3 calls is 528")
> +int global_func18(struct __sk_buff *skb)
> +{
> +       return f1(skb) + f2(2, skb) + f3(3, skb, 4);
> +}
> --
> 2.39.3
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19
  2024-02-13 19:21 ` [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19 Andrii Nakryiko
@ 2024-02-13 20:26   ` Yonghong Song
  0 siblings, 0 replies; 6+ messages in thread
From: Yonghong Song @ 2024-02-13 20:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau


On 2/13/24 11:21 AM, Andrii Nakryiko wrote:
> On Thu, Feb 8, 2024 at 1:54 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> With latest llvm19, I hit the following selftest failures with
>>
>>    $ ./test_progs -j
>>    libbpf: prog 'on_event': BPF program load failed: Permission denied
>>    libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>>    combined stack size of 4 calls is 544. Too large
>>    verification time 1344153 usec
>>    stack depth 24+440+0+32
>>    processed 51008 insns (limit 1000000) max_states_per_insn 19 total_states 1467 peak_states 303 mark_read 146
>>    -- END PROG LOAD LOG --
>>    libbpf: prog 'on_event': failed to load: -13
>>    libbpf: failed to load object 'strobemeta_subprogs.bpf.o'
>>    scale_test:FAIL:expect_success unexpected error: -13 (errno 13)
>>    #498     verif_scale_strobemeta_subprogs:FAIL
>>
>> The verifier complains too big of the combined stack size (544 bytes) which
>> exceeds the maximum stack limit 512. This is a regression from llvm19 ([1]).
>>
>> In the above error log, the original stack depth is 24+440+0+32.
>> To satisfy interpreter's need, in verifier the stack depth is adjusted to
>> 32+448+32+32=544 which exceeds 512, hence the error. The same adjusted
>> stack size is also used for jit case.
>>
>> But the jitted codes could use smaller stack size.
>>
>>    $ egrep -r stack_depth | grep round_up
>>    arm64/net/bpf_jit_comp.c:       ctx->stack_size = round_up(prog->aux->stack_depth, 16);
>>    loongarch/net/bpf_jit.c:        bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
>>    powerpc/net/bpf_jit_comp.c:     cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
>>    riscv/net/bpf_jit_comp32.c:             round_up(ctx->prog->aux->stack_depth, STACK_ALIGN);
>>    riscv/net/bpf_jit_comp64.c:     bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
>>    s390/net/bpf_jit_comp.c:        u32 stack_depth = round_up(fp->aux->stack_depth, 8);
>>    sparc/net/bpf_jit_comp_64.c:            stack_needed += round_up(stack_depth, 16);
>>    x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8));
>>    x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
>>    x86/net/bpf_jit_comp.c:                     round_up(stack_depth, 8));
>>    x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
>>    x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xC4, round_up(stack_depth, 8));
>>
>> In the above, STACK_ALIGN in riscv/net/bpf_jit_comp32.c is defined as 16.
>> So stack is aligned in either 8 or 16, x86/s390 having 8-byte stack alignment and
>> the rest having 16-byte alignment.
>>
>> This patch calculates total stack depth based on 16-byte alignment if jit is requested.
>> For the above failing case, the new stack size will be 32+448+0+32=512 and no verification
>> failure. llvm19 regression will be discussed separately in llvm upstream.
>>
>> The verifier change caused three test failures as these tests compared messages
>> with stack size. More specifically,
>>    - test_global_funcs/global_func1: adjusted to interpreter only since verification will
>>      succeed in jit mode. A new test will be added for jit mode later.
>>    - async_stack_depth/{pseudo_call_check, async_call_root_check}: since jit and interpreter
>>      will calculate different stack sizes, the failure msg is adjusted to omit those
>>      specific stack size numbers.
>>
>>    [1] https://lore.kernel.org/bpf/32bde0f0-1881-46c9-931a-673be566c61d@linux.dev/
>>
>> Suggested-by: Alexei Starovoitov <ast@kernel.org>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>>   kernel/bpf/verifier.c                          | 18 +++++++++++++-----
>>   .../bpf/prog_tests/test_global_funcs.c         |  5 ++++-
>>   .../selftests/bpf/progs/async_stack_depth.c    |  4 ++--
>>   3 files changed, 19 insertions(+), 8 deletions(-)
>>
>> Changelogs:
>>    v2 -> v3:
>>      - fix async_stack_depth test failure if jit is turned off
>>    v1 -> v2:
>>      - fix some selftest failures
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index ddaf09db1175..6441a540904b 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -5812,6 +5812,17 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
>>                                             strict);
>>   }
>>
>> +static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth)
>> +{
>> +       if (env->prog->jit_requested)
>> +               return round_up(stack_depth, 16);
>> +
>> +       /* round up to 32-bytes, since this is granularity
>> +        * of interpreter stack size
>> +        */
>> +       return round_up(max_t(u32, stack_depth, 1), 32);
>> +}
>> +
>>   /* starting from main bpf function walk all instructions of the function
>>    * and recursively walk all callees that given function can call.
>>    * Ignore jump and exit insns.
>> @@ -5855,10 +5866,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
>>                          depth);
>>                  return -EACCES;
>>          }
>> -       /* round up to 32-bytes, since this is granularity
>> -        * of interpreter stack size
>> -        */
>> -       depth += round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
>> +       depth += round_up_stack_depth(env, subprog[idx].stack_depth);
>>          if (depth > MAX_BPF_STACK) {
>>                  verbose(env, "combined stack size of %d calls is %d. Too large\n",
>>                          frame + 1, depth);
>> @@ -5952,7 +5960,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
>>           */
>>          if (frame == 0)
>>                  return 0;
>> -       depth -= round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
>> +       depth -= round_up_stack_depth(env, subprog[idx].stack_depth);
>>          frame--;
>>          i = ret_insn[frame];
>>          idx = ret_prog[frame];
>> diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
>> index e905cbaf6b3d..a3a41680b38e 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
>> @@ -138,7 +138,10 @@ static void subtest_ctx_arg_rewrite(void)
>>
>>   void test_test_global_funcs(void)
>>   {
>> -       RUN_TESTS(test_global_func1);
>> +       if (!env.jit_enabled) {
>> +               RUN_TESTS(test_global_func1);
>> +       }
> Can we increase the amount of used stack size to make it fail
> regardless of JIT? That's what I was asking on v1, actually. We have
> those arbitrarily sized buf[256] and buf[300], what prevents us from
> making them a few bytes bigger to not be affected by 16 vs 32 byte
> rounding?

Yes, we can do this. Will do it in v4.

>
>
>> +
>>          RUN_TESTS(test_global_func2);
>>          RUN_TESTS(test_global_func3);
>>          RUN_TESTS(test_global_func4);
>> diff --git a/tools/testing/selftests/bpf/progs/async_stack_depth.c b/tools/testing/selftests/bpf/progs/async_stack_depth.c
>> index 3517c0e01206..36734683acbd 100644
>> --- a/tools/testing/selftests/bpf/progs/async_stack_depth.c
>> +++ b/tools/testing/selftests/bpf/progs/async_stack_depth.c
>> @@ -30,7 +30,7 @@ static int bad_timer_cb(void *map, int *key, struct bpf_timer *timer)
>>   }
>>
>>   SEC("tc")
>> -__failure __msg("combined stack size of 2 calls is 576. Too large")
>> +__failure __msg("combined stack size of 2 calls is")
>>   int pseudo_call_check(struct __sk_buff *ctx)
>>   {
>>          struct hmap_elem *elem;
>> @@ -45,7 +45,7 @@ int pseudo_call_check(struct __sk_buff *ctx)
>>   }
>>
>>   SEC("tc")
>> -__failure __msg("combined stack size of 2 calls is 608. Too large")
>> +__failure __msg("combined stack size of 2 calls is")
>>   int async_call_root_check(struct __sk_buff *ctx)
>>   {
>>          struct hmap_elem *elem;
>> --
>> 2.39.3
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v3 2/2] selftests/bpf: Add a negative test for stack accounting in jit mode
  2024-02-13 19:40   ` Andrii Nakryiko
@ 2024-02-13 20:28     ` Yonghong Song
  0 siblings, 0 replies; 6+ messages in thread
From: Yonghong Song @ 2024-02-13 20:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau


On 2/13/24 11:40 AM, Andrii Nakryiko wrote:
> On Thu, Feb 8, 2024 at 1:54 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> The new test is very similar to test_global_func1.c, but
>> is modified to fail on jit mode.
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>>   .../bpf/prog_tests/test_global_funcs.c        |  3 ++
>>   .../selftests/bpf/progs/test_global_func18.c  | 44 +++++++++++++++++++
>>   2 files changed, 47 insertions(+)
>>   create mode 100644 tools/testing/selftests/bpf/progs/test_global_func18.c
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
>> index a3a41680b38e..dccbf2213135 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c
>> @@ -18,6 +18,7 @@
>>   #include "test_global_func15.skel.h"
>>   #include "test_global_func16.skel.h"
>>   #include "test_global_func17.skel.h"
>> +#include "test_global_func18.skel.h"
>>   #include "test_global_func_ctx_args.skel.h"
>>
>>   #include "bpf/libbpf_internal.h"
>> @@ -140,6 +141,8 @@ void test_test_global_funcs(void)
>>   {
>>          if (!env.jit_enabled) {
>>                  RUN_TESTS(test_global_func1);
>> +       } else {
>> +               RUN_TESTS(test_global_func18);
>>          }
>>
>>          RUN_TESTS(test_global_func2);
>> diff --git a/tools/testing/selftests/bpf/progs/test_global_func18.c b/tools/testing/selftests/bpf/progs/test_global_func18.c
>> new file mode 100644
>> index 000000000000..d1aa3b2c68fe
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/test_global_func18.c
>> @@ -0,0 +1,44 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
>> +#include <stddef.h>
>> +#include <linux/bpf.h>
>> +#include <bpf/bpf_helpers.h>
>> +#include "bpf_misc.h"
>> +
>> +#define MAX_STACK1 (512 - 3 * 32 + 8)
>> +#define MAX_STACK2 (3 * 32)
>> +
>> +__attribute__ ((noinline))
> nit: we have __noinline defined, let's use it as a less verbose option?

A copy-paste issue. In v4, I will drop this patch
since I will increase the stack size to accommodate
both jit and !jit cases.

>
>> +int f1(struct __sk_buff *skb)
>> +{
>> +       return skb->len;
>> +}
>> +
>> +int f3(int, struct __sk_buff *skb, int);
>> +
>> +__attribute__ ((noinline))
>> +int f2(int val, struct __sk_buff *skb)
>> +{
>> +       volatile char buf[MAX_STACK1] = {};
>> +
>> +       __sink(buf[MAX_STACK1 - 1]);
>> +
>> +       return f1(skb) + f3(val, skb, 1);
>> +}
>> +
>> +__attribute__ ((noinline))
>> +int f3(int val, struct __sk_buff *skb, int var)
>> +{
>> +       volatile char buf[MAX_STACK2] = {};
>> +
>> +       __sink(buf[MAX_STACK2 - 1]);
>> +
>> +       return skb->ifindex * val * var;
>> +}
>> +
>> +SEC("tc")
>> +__failure __msg("combined stack size of 3 calls is 528")
>> +int global_func18(struct __sk_buff *skb)
>> +{
>> +       return f1(skb) + f2(2, skb) + f3(3, skb, 4);
>> +}
>> --
>> 2.39.3
>>
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-02-13 20:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-08 21:54 [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19 Yonghong Song
2024-02-08 21:54 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add a negative test for stack accounting in jit mode Yonghong Song
2024-02-13 19:40   ` Andrii Nakryiko
2024-02-13 20:28     ` Yonghong Song
2024-02-13 19:21 ` [PATCH bpf-next v3 1/2] bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19 Andrii Nakryiko
2024-02-13 20:26   ` Yonghong Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.