bpf.vger.kernel.org archive mirror
* [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
@ 2021-02-16 14:19 Brendan Jackman
  2021-02-16 16:30 ` KP Singh
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Brendan Jackman @ 2021-02-16 14:19 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
	Florent Revest, Ilya Leoshkevich, Brendan Jackman

As pointed out by Ilya and explained in the new comment, there's a
discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
the value from memory into r0, while x86 only does so when r0 and the
value in memory are different. The same issue affects s390.

At first this might sound like pure semantics, but it makes a real
difference when the comparison is 32-bit, since the load will
zero-extend r0/rax.

The fix is to explicitly zero-extend rax after doing such a
CMPXCHG. Since this problem affects multiple archs, this is done in
the verifier by patching in a BPF_ZEXT_REG instruction after every
32-bit cmpxchg. Any archs that don't need such manual zero-extension
can do a look-ahead with insn_is_zext to skip the unnecessary mov.
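
As a rough C model of the two semantics (illustrative only, not kernel
code; the function names are made up), the gap looks like this:

	/* BPF semantics: the old value is always loaded into r0, so a
	 * BPF_W cmpxchg always zero-extends r0.
	 */
	u64 bpf_cmpxchg32(u32 *ptr, u64 r0, u32 src)
	{
		u32 old = *ptr;		/* the whole body is one atomic op */

		if (old == (u32)r0)
			*ptr = src;
		return (u64)old;	/* always written back: zero-extends */
	}

	/* x86 semantics: on success rax is left untouched, so any stale
	 * upper 32 bits survive; only the failure path does the
	 * zero-extending 32-bit load into eax.
	 */
	u64 x86_cmpxchg32(u32 *ptr, u64 rax, u32 src)
	{
		u32 old = *ptr;		/* the whole body is one atomic op */

		if (old == (u32)rax) {
			*ptr = src;
			return rax;	/* upper 32 bits NOT cleared */
		}
		return (u64)old;	/* zero-extends */
	}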

Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---

Difference from v1[1]: Now solved centrally in the verifier instead of
  specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!

[1] https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t

 kernel/bpf/verifier.c                         | 36 +++++++++++++++++++
 .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 +++++++++++++
 .../selftests/bpf/verifier/atomic_or.c        | 26 ++++++++++++++
 3 files changed, 87 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 16ba43352a5f..7f4a83d62acc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11889,6 +11889,39 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 	return 0;
 }

+/* BPF_CMPXCHG always loads a value into R0, therefore always zero-extends.
+ * However some archs' equivalent instruction only does this load when the
+ * comparison is successful. So here we add a BPF_ZEXT_REG after every 32-bit
+ * CMPXCHG, so that such archs' JITs don't need to deal with the issue. Archs
+ * that don't face this issue may use insn_is_zext to detect and skip the added
+ * instruction.
+ */
+static int add_zext_after_cmpxchg(struct bpf_verifier_env *env)
+{
+	struct bpf_insn zext_patch[2] = { [1] = BPF_ZEXT_REG(BPF_REG_0) };
+	struct bpf_insn *insn = env->prog->insnsi;
+	int insn_cnt = env->prog->len;
+	struct bpf_prog *new_prog;
+	int delta = 0; /* Number of instructions added */
+	int i;
+
+	for (i = 0; i < insn_cnt; i++, insn++) {
+		if (insn->code != (BPF_STX | BPF_W | BPF_ATOMIC) || insn->imm != BPF_CMPXCHG)
+			continue;
+
+		zext_patch[0] = *insn;
+		new_prog = bpf_patch_insn_data(env, i + delta, zext_patch, 2);
+		if (!new_prog)
+			return -ENOMEM;
+
+		delta++;
+		env->prog = new_prog;
+		insn = new_prog->insnsi + i + delta;
+	}
+
+	return 0;
+}
+
 static void free_states(struct bpf_verifier_env *env)
 {
 	struct bpf_verifier_state_list *sl, *sln;
@@ -12655,6 +12688,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (ret == 0)
 		ret = fixup_call_args(env);

+	if (ret == 0)
+		ret = add_zext_after_cmpxchg(env);
+
 	env->verification_time = ktime_get_ns() - start_time;
 	print_verification_stats(env);

diff --git a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
index 2efd8bcf57a1..6e52dfc64415 100644
--- a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
+++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
@@ -94,3 +94,28 @@
 	.result = REJECT,
 	.errstr = "invalid read from stack",
 },
+{
+	"BPF_W cmpxchg should zero top 32 bits",
+	.insns = {
+		/* r0 = U64_MAX; */
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1),
+		/* u64 val = r0; */
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
+		/* r0 = (u32)atomic_cmpxchg((u32 *)&val, r0, 1); */
+		BPF_MOV32_IMM(BPF_REG_1, 1),
+		BPF_ATOMIC_OP(BPF_W, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -8),
+		/* r1 = 0x00000000FFFFFFFFull; */
+		BPF_MOV64_IMM(BPF_REG_1, 1),
+		BPF_ALU64_IMM(BPF_LSH, BPF_REG_1, 32),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
+		/* if (r0 != r1) exit(1); */
+		BPF_JMP_REG(BPF_JEQ, BPF_REG_0, BPF_REG_1, 2),
+		BPF_MOV32_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		/* exit(0); */
+		BPF_MOV32_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+},
diff --git a/tools/testing/selftests/bpf/verifier/atomic_or.c b/tools/testing/selftests/bpf/verifier/atomic_or.c
index 70f982e1f9f0..0a08b99e6ddd 100644
--- a/tools/testing/selftests/bpf/verifier/atomic_or.c
+++ b/tools/testing/selftests/bpf/verifier/atomic_or.c
@@ -75,3 +75,29 @@
 	},
 	.result = ACCEPT,
 },
+{
+	"BPF_W atomic_fetch_or should zero top 32 bits",
+	.insns = {
+		/* r1 = U64_MAX; */
+		BPF_MOV64_IMM(BPF_REG_1, 0),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
+		/* u64 val = r1; */
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+		/* r1 = (u32)atomic_fetch_or((u32 *)&val, 2); */
+		BPF_MOV32_IMM(BPF_REG_1, 2),
+		BPF_ATOMIC_OP(BPF_W, BPF_OR | BPF_FETCH, BPF_REG_10, BPF_REG_1, -8),
+		/* r2 = 0x00000000FFFFFFFF; */
+		BPF_MOV64_IMM(BPF_REG_2, 1),
+		BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 32),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 1),
+		/* if (r2 != r1) exit(1); */
+		BPF_JMP_REG(BPF_JEQ, BPF_REG_2, BPF_REG_1, 2),
+		/* exit(1); */
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		/* exit(0); */
+		BPF_MOV32_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+},

base-commit: 45159b27637b0fef6d5ddb86fc7c46b13c77960f
--
2.30.0.478.g8a0d178c01-goog



* Re: [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-02-16 14:19 [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg Brendan Jackman
@ 2021-02-16 16:30 ` KP Singh
  2021-02-16 19:55 ` Ilya Leoshkevich
  2021-02-17  0:50 ` Daniel Borkmann
  2 siblings, 0 replies; 8+ messages in thread
From: KP Singh @ 2021-02-16 16:30 UTC (permalink / raw)
  To: Brendan Jackman
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Florent Revest, Ilya Leoshkevich

On Tue, Feb 16, 2021 at 3:19 PM Brendan Jackman <jackmanb@google.com> wrote:
>
> As pointed out by Ilya and explained in the new comment, there's a
> discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> the value from memory into r0, while x86 only does so when r0 and the
> value in memory are different. The same issue affects s390.
>
> At first this might sound like pure semantics, but it makes a real
> difference when the comparison is 32-bit, since the load will
> zero-extend r0/rax.
>
> The fix is to explicitly zero-extend rax after doing such a
> CMPXCHG. Since this problem affects multiple archs, this is done in
> the verifier by patching in a BPF_ZEXT_REG instruction after every
> 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> can do a look-ahead with insn_is_zext to skip the unnecessary mov.
>
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> Signed-off-by: Brendan Jackman <jackmanb@google.com>

Acked-by: KP Singh <kpsingh@kernel.org>

> ---
>
> Difference from v1[1]: Now solved centrally in the verifier instead of
>   specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!
>
> [1] https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t
>
>  kernel/bpf/verifier.c                         | 36 +++++++++++++++++++
>  .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 +++++++++++++
>  .../selftests/bpf/verifier/atomic_or.c        | 26 ++++++++++++++
>  3 files changed, 87 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 16ba43352a5f..7f4a83d62acc 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -11889,6 +11889,39 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
>         return 0;
>  }
>
> +/* BPF_CMPXCHG always loads a value into R0, therefore always zero-extends.
> + * However some archs' equivalent instruction only does this load when the
> + * comparison is successful. So here we add a BPF_ZEXT_REG after every 32-bit
> + * CMPXCHG, so that such archs' JITs don't need to deal with the issue. Archs
> + * that don't face this issue may use insn_is_zext to detect and skip the added
> + * instruction.
> + */
> +static int add_zext_after_cmpxchg(struct bpf_verifier_env *env)
> +{
> +       struct bpf_insn zext_patch[2] = { [1] = BPF_ZEXT_REG(BPF_REG_0) };

I was initially confused as to why we have 2 instructions here for the patch.

> +       struct bpf_insn *insn = env->prog->insnsi;
> +       int insn_cnt = env->prog->len;
> +       struct bpf_prog *new_prog;
> +       int delta = 0; /* Number of instructions added */
> +       int i;
> +
> +       for (i = 0; i < insn_cnt; i++, insn++) {
> +               if (insn->code != (BPF_STX | BPF_W | BPF_ATOMIC) || insn->imm != BPF_CMPXCHG)
> +                       continue;
> +
> +               zext_patch[0] = *insn;

But the patch also needs to have the original instruction, so it makes sense.
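
(For anyone else reading along: bpf_patch_insn_data() replaces the single
instruction at the given index with the whole patch buffer, so the
instruction stream roughly goes from

	insn[i]  : BPF_STX | BPF_W | BPF_ATOMIC, imm=BPF_CMPXCHG
	insn[i+1]: <next insn>

to

	insn[i]  : BPF_STX | BPF_W | BPF_ATOMIC, imm=BPF_CMPXCHG
	insn[i+1]: BPF_ZEXT_REG(BPF_REG_0)
	insn[i+2]: <next insn>

which is why patch[0] has to carry a copy of the original cmpxchg.)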


> +               new_prog = bpf_patch_insn_data(env, i + delta, zext_patch, 2);
> +               if (!new_prog)
> +                       return -ENOMEM;
> +
> +               delta++;
> +               env->prog = new_prog;
> +               insn = new_prog->insnsi + i + delta;
> +       }
> +
> +       return 0;
> +}
> +
>  static void free_states(struct bpf_verifier_env *env)
>  {
>         struct bpf_verifier_state_list *sl, *sln;
> @@ -12655,6 +12688,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
>         if (ret == 0)
>                 ret = fixup_call_args(env);
>
> +       if (ret == 0)
> +               ret = add_zext_after_cmpxchg(env);
> +
>         env->verification_time = ktime_get_ns() - start_time;
>         print_verification_stats(env);
>
> diff --git a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> index 2efd8bcf57a1..6e52dfc64415 100644
> --- a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> +++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> @@ -94,3 +94,28 @@
>         .result = REJECT,
>         .errstr = "invalid read from stack",
>

[...]

>
> base-commit: 45159b27637b0fef6d5ddb86fc7c46b13c77960f
> --
> 2.30.0.478.g8a0d178c01-goog
>


* Re: [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-02-16 14:19 [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg Brendan Jackman
  2021-02-16 16:30 ` KP Singh
@ 2021-02-16 19:55 ` Ilya Leoshkevich
  2021-02-17  7:51   ` Brendan Jackman
  2021-02-17  0:50 ` Daniel Borkmann
  2 siblings, 1 reply; 8+ messages in thread
From: Ilya Leoshkevich @ 2021-02-16 19:55 UTC (permalink / raw)
  To: Brendan Jackman, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
	Florent Revest

On Tue, 2021-02-16 at 14:19 +0000, Brendan Jackman wrote:
> As pointed out by Ilya and explained in the new comment, there's a
> discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> the value from memory into r0, while x86 only does so when r0 and the
> value in memory are different. The same issue affects s390.
> 
> At first this might sound like pure semantics, but it makes a real
> difference when the comparison is 32-bit, since the load will
> zero-extend r0/rax.
> 
> The fix is to explicitly zero-extend rax after doing such a
> CMPXCHG. Since this problem affects multiple archs, this is done in
> the verifier by patching in a BPF_ZEXT_REG instruction after every
> 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> can do a look-ahead with insn_is_zext to skip the unnecessary mov.
> 
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
> 
> Difference from v1[1]: Now solved centrally in the verifier instead of
>   specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!
> 
> [1] https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t
> 
>  kernel/bpf/verifier.c                         | 36 +++++++++++++++++++
>  .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 +++++++++++++
>  .../selftests/bpf/verifier/atomic_or.c        | 26 ++++++++++++++
>  3 files changed, 87 insertions(+)

I tried this with my s390 atomics patch, and it's working, thanks!

I was wondering whether this could go through the existing zext_dst
flag infrastructure, but it probably won't play too nicely with the
x86_64 JIT, which doesn't override bpf_jit_needs_zext().

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>

[...]



* Re: [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-02-16 14:19 [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg Brendan Jackman
  2021-02-16 16:30 ` KP Singh
  2021-02-16 19:55 ` Ilya Leoshkevich
@ 2021-02-17  0:50 ` Daniel Borkmann
  2021-02-17  1:43   ` KP Singh
  2 siblings, 1 reply; 8+ messages in thread
From: Daniel Borkmann @ 2021-02-17  0:50 UTC (permalink / raw)
  To: Brendan Jackman, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, KP Singh, Florent Revest,
	Ilya Leoshkevich

On 2/16/21 3:19 PM, Brendan Jackman wrote:
> As pointed out by Ilya and explained in the new comment, there's a
> discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> the value from memory into r0, while x86 only does so when r0 and the
> value in memory are different. The same issue affects s390.
> 
> At first this might sound like pure semantics, but it makes a real
> difference when the comparison is 32-bit, since the load will
> zero-extend r0/rax.
> 
> The fix is to explicitly zero-extend rax after doing such a
> CMPXCHG. Since this problem affects multiple archs, this is done in
> the verifier by patching in a BPF_ZEXT_REG instruction after every
> 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> can do a look-ahead with insn_is_zext to skip the unnecessary mov.
> 
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
[...]
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 16ba43352a5f..7f4a83d62acc 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -11889,6 +11889,39 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
>   	return 0;
>   }
> 
> +/* BPF_CMPXCHG always loads a value into R0, therefore always zero-extends.
> + * However some archs' equivalent instruction only does this load when the
> + * comparison is successful. So here we add a BPF_ZEXT_REG after every 32-bit
> + * CMPXCHG, so that such archs' JITs don't need to deal with the issue. Archs
> + * that don't face this issue may use insn_is_zext to detect and skip the added
> + * instruction.
> + */
> +static int add_zext_after_cmpxchg(struct bpf_verifier_env *env)
> +{
> +	struct bpf_insn zext_patch[2] = { [1] = BPF_ZEXT_REG(BPF_REG_0) };
> +	struct bpf_insn *insn = env->prog->insnsi;
> +	int insn_cnt = env->prog->len;
> +	struct bpf_prog *new_prog;
> +	int delta = 0; /* Number of instructions added */
> +	int i;
> +
> +	for (i = 0; i < insn_cnt; i++, insn++) {
> +		if (insn->code != (BPF_STX | BPF_W | BPF_ATOMIC) || insn->imm != BPF_CMPXCHG)
> +			continue;
> +
> +		zext_patch[0] = *insn;
> +		new_prog = bpf_patch_insn_data(env, i + delta, zext_patch, 2);
> +		if (!new_prog)
> +			return -ENOMEM;
> +
> +		delta++;
> +		env->prog = new_prog;
> +		insn = new_prog->insnsi + i + delta;
> +	}

Looks good overall, one small nit ... is it possible to move this into fixup_bpf_calls()
where we walk the prog insns & handle most of the rewrites already?

> +
> +	return 0;
> +}


* Re: [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-02-17  0:50 ` Daniel Borkmann
@ 2021-02-17  1:43   ` KP Singh
  2021-02-17  7:59     ` Brendan Jackman
  0 siblings, 1 reply; 8+ messages in thread
From: KP Singh @ 2021-02-17  1:43 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Brendan Jackman, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Florent Revest, Ilya Leoshkevich

On Wed, Feb 17, 2021 at 1:50 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 2/16/21 3:19 PM, Brendan Jackman wrote:
> > As pointed out by Ilya and explained in the new comment, there's a
> > discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> > the value from memory into r0, while x86 only does so when r0 and the
> > value in memory are different. The same issue affects s390.
> >
> > At first this might sound like pure semantics, but it makes a real
> > difference when the comparison is 32-bit, since the load will
> > zero-extend r0/rax.
> >
> > The fix is to explicitly zero-extend rax after doing such a
> > CMPXCHG. Since this problem affects multiple archs, this is done in
> > the verifier by patching in a BPF_ZEXT_REG instruction after every
> > 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> > can do a look-ahead with insn_is_zext to skip the unnecessary mov.
> >
> > Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> > Signed-off-by: Brendan Jackman <jackmanb@google.com>
> [...]
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 16ba43352a5f..7f4a83d62acc 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -11889,6 +11889,39 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
> >       return 0;
> >   }
> >
> > +/* BPF_CMPXCHG always loads a value into R0, therefore always zero-extends.
> > + * However some archs' equivalent instruction only does this load when the
> > + * comparison is successful. So here we add a BPF_ZEXT_REG after every 32-bit
> > + * CMPXCHG, so that such archs' JITs don't need to deal with the issue. Archs
> > + * that don't face this issue may use insn_is_zext to detect and skip the added
> > + * instruction.
> > + */
> > +static int add_zext_after_cmpxchg(struct bpf_verifier_env *env)
> > +{
> > +     struct bpf_insn zext_patch[2] = { [1] = BPF_ZEXT_REG(BPF_REG_0) };
> > +     struct bpf_insn *insn = env->prog->insnsi;
> > +     int insn_cnt = env->prog->len;
> > +     struct bpf_prog *new_prog;
> > +     int delta = 0; /* Number of instructions added */
> > +     int i;
> > +
> > +     for (i = 0; i < insn_cnt; i++, insn++) {
> > +             if (insn->code != (BPF_STX | BPF_W | BPF_ATOMIC) || insn->imm != BPF_CMPXCHG)
> > +                     continue;
> > +
> > +             zext_patch[0] = *insn;
> > +             new_prog = bpf_patch_insn_data(env, i + delta, zext_patch, 2);
> > +             if (!new_prog)
> > +                     return -ENOMEM;
> > +
> > +             delta++;
> > +             env->prog = new_prog;
> > +             insn = new_prog->insnsi + i + delta;
> > +     }
>
> Looks good overall, one small nit ... is it possible to move this into fixup_bpf_calls()
> where we walk the prog insns & handle most of the rewrites already?

Ah, so I thought fixup_bpf_calls was for "calls", but looking at the
function now, it does more than just fix up calls. I guess we could also
rename it and update the comment on the function.

- KP

>
> > +
> > +     return 0;
> > +}


* Re: [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-02-16 19:55 ` Ilya Leoshkevich
@ 2021-02-17  7:51   ` Brendan Jackman
  0 siblings, 0 replies; 8+ messages in thread
From: Brendan Jackman @ 2021-02-17  7:51 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	KP Singh, Florent Revest

On Tue, 16 Feb 2021 at 20:55, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Tue, 2021-02-16 at 14:19 +0000, Brendan Jackman wrote:
> > As pointed out by Ilya and explained in the new comment, there's a
> > discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> > the value from memory into r0, while x86 only does so when r0 and the
> > value in memory are different. The same issue affects s390.
> >
> > At first this might sound like pure semantics, but it makes a real
> > difference when the comparison is 32-bit, since the load will
> > zero-extend r0/rax.
> >
> > The fix is to explicitly zero-extend rax after doing such a
> > CMPXCHG. Since this problem affects multiple archs, this is done in
> > the verifier by patching in a BPF_ZEXT_REG instruction after every
> > 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> > can do a look-ahead with insn_is_zext to skip the unnecessary mov.
> >
> > Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> > Signed-off-by: Brendan Jackman <jackmanb@google.com>
> > ---
> >
> > Difference from v1[1]: Now solved centrally in the verifier instead of
> >   specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!
> >
> > [1] https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t
> >
> >  kernel/bpf/verifier.c                         | 36 +++++++++++++++++++
> >  .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 +++++++++++++
> >  .../selftests/bpf/verifier/atomic_or.c        | 26 ++++++++++++++
> >  3 files changed, 87 insertions(+)
>
> I tried this with my s390 atomics patch, and it's working, thanks!
>
> I was thinking whether this could go through the existing zext_dst
> flag infrastructure, but it probably won't play too nicely with the
> x86_64 JIT, which doesn't override bpf_jit_needs_zext().

Ah right, I actually didn't understand what opt_subreg_zext_lo32_rnd_hi32
was doing until now, so I didn't consider this.

But yeah I think cmpxchg is properly special here because the zext is
sometimes (e.g. on x86_64) needed even on architectures that don't
_generally_ need explicit zext.

I think I'll update some comments to reflect these learnings, thanks.
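
For reference, the skip on an arch that doesn't need the extra mov could
look something like this (a minimal sketch, not taken from any in-tree
JIT: jit_ctx and emit_native_cmpxchg32() are hypothetical, while
insn_is_zext() is the existing helper from include/linux/filter.h):

	/* Returns how many extra insns were consumed beyond this one. */
	static int jit_cmpxchg32(struct jit_ctx *ctx, const struct bpf_insn *insn)
	{
		/* Assumption: the native 32-bit cmpxchg already
		 * zero-extends the register backing r0.
		 */
		emit_native_cmpxchg32(ctx, insn);

		/* The verifier appends BPF_ZEXT_REG(BPF_REG_0) after every
		 * BPF_W cmpxchg; look ahead and consume it here so no
		 * redundant mov32 gets emitted.
		 */
		return insn_is_zext(&insn[1]) ? 1 : 0;
	}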

> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
>
> [...]
>


* Re: [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-02-17  1:43   ` KP Singh
@ 2021-02-17  7:59     ` Brendan Jackman
  2021-02-17  8:59       ` Daniel Borkmann
  0 siblings, 1 reply; 8+ messages in thread
From: Brendan Jackman @ 2021-02-17  7:59 UTC (permalink / raw)
  To: KP Singh
  Cc: Daniel Borkmann, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Florent Revest, Ilya Leoshkevich

On Wed, 17 Feb 2021 at 02:43, KP Singh <kpsingh@kernel.org> wrote:
>
> On Wed, Feb 17, 2021 at 1:50 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > On 2/16/21 3:19 PM, Brendan Jackman wrote:
[...]
> > Looks good overall, one small nit ... is it possible to move this into fixup_bpf_calls()
> > where we walk the prog insns & handle most of the rewrites already?
>
> Ah, so I thought fixup_bpf_calls was for "calls", but looking at the
> function now, it does more than just fix up calls. I guess we could
> also rename it and update the comment on the function.

Ah yes. Looks like we have:

- Some division-by-zero related stuff
- Implementation of LD_ABS/LD_IND
- Some spectre mitigation
- Tail calls
- Fixups for map and jiffies helper calls

How about I rename this function to do_misc_fixups and add a short
comment to each of the above sections outlining what they're doing?


* Re: [PATCH v2 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-02-17  7:59     ` Brendan Jackman
@ 2021-02-17  8:59       ` Daniel Borkmann
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel Borkmann @ 2021-02-17  8:59 UTC (permalink / raw)
  To: Brendan Jackman, KP Singh
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Florent Revest,
	Ilya Leoshkevich

On 2/17/21 8:59 AM, Brendan Jackman wrote:
> On Wed, 17 Feb 2021 at 02:43, KP Singh <kpsingh@kernel.org> wrote:
>> On Wed, Feb 17, 2021 at 1:50 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>> On 2/16/21 3:19 PM, Brendan Jackman wrote:
> [...]
>>> Looks good overall, one small nit ... is it possible to move this into fixup_bpf_calls()
>>> where we walk the prog insns & handle most of the rewrites already?
>>
>> Ah, so I thought fixup_bpf_calls was for "calls" but now looking at
>> the function it does
>> more than just fixing up calls. I guess we could also rename it and
>> update the comment
>> on the function.
> 
> Ah yes. Looks like we have:
> 
> - Some division-by-zero related stuff
> - Implementation of LD_ABS/LD_IND
> - Some spectre mitigation
> - Tail calls
> - Fixups for map and jiffies helper calls
> 
> How about I rename this function to do_misc_fixups and add a short
> comment to each of the above sections outlining what they're doing?

Sounds good to me; I would probably make the rename & comments a separate
patch from the 32-bit cmpxchg fix, though.

Thanks,
Daniel

