bpf.vger.kernel.org archive mirror
* [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
@ 2021-03-02 10:54 Brendan Jackman
  2021-03-02 10:56 ` Brendan Jackman
  2021-03-02 18:43 ` Martin KaFai Lau
  0 siblings, 2 replies; 6+ messages in thread
From: Brendan Jackman @ 2021-03-02 10:54 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
	Florent Revest, Ilya Leoshkevich, Martin KaFai Lau,
	Brendan Jackman

As pointed out by Ilya and explained in the new comment, there's a
discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
the value from memory into r0, while x86 only does so when r0 and the
value in memory are different. The same issue affects s390.

At first this might sound like pure semantics, but it makes a real
difference when the comparison is 32-bit, since the load will
zero-extend r0/rax.
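For example (an illustrative sketch, not part of the patch; the values
and the cmpxchg32() pseudo-helper are invented):

  r0 = 0x0123456789abcdef;     /* upper 32 bits nonzero */
  *(u32 *)mem = 0x89abcdef;    /* equal to the lower 32 bits of r0 */
  r0 = cmpxchg32(mem, r0, 1);  /* BPF_W BPF_CMPXCHG */

Per BPF semantics r0 must now be 0x0000000089abcdef: the old 32-bit
value, zero-extended. On x86 the comparison succeeds, eax is never
written back, and rax keeps its stale upper bits, so r0 is still
0x0123456789abcdef.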

The fix is to explicitly zero-extend rax after doing such a
CMPXCHG. Since this problem affects multiple archs, this is done in
the verifier by patching in a BPF_ZEXT_REG instruction after every
32-bit cmpxchg. Any archs that don't need such manual zero-extension
can do a look-ahead with insn_is_zext to skip the unnecessary mov.
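Concretely, the patched sequence looks roughly like this (a sketch
using the insn macros from include/linux/filter.h; the operand names
are illustrative):

  BPF_ATOMIC_OP(BPF_W, BPF_CMPXCHG, dst_reg, src_reg, off),
  BPF_ZEXT_REG(BPF_REG_0), /* patched in by the verifier */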

Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---


Differences v4->v5[1]:
 - Moved the logic entirely into opt_subreg_zext_lo32_rnd_hi32, thanks to Martin
   for suggesting this.

Differences v3->v4[1]:
 - Moved the optimization against pointless zext into the correct place:
   opt_subreg_zext_lo32_rnd_hi32 is called _after_ fixup_bpf_calls.

Differences v2->v3[1]:
 - Moved patching into fixup_bpf_calls (patch incoming to rename this function)
 - Added extra commentary on bpf_jit_needs_zext
 - Added check to avoid adding a pointless zext(r0) if there's already one there.

Difference v1->v2[1]: Now solved centrally in the verifier instead of
  specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!

[1] v4: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
    v3: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
    v2: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
    v1: https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t


 kernel/bpf/core.c                             |  4 +++
 kernel/bpf/verifier.c                         | 17 +++++++++++-
 .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 ++++++++++++++++++
 .../selftests/bpf/verifier/atomic_or.c        | 26 +++++++++++++++++++
 4 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 0ae015ad1e05..dcf18612841b 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2342,6 +2342,10 @@ bool __weak bpf_helper_changes_pkt_data(void *func)
 /* Return TRUE if the JIT backend wants verifier to enable sub-register usage
  * analysis code and wants explicit zero extension inserted by verifier.
  * Otherwise, return FALSE.
+ *
+ * The verifier inserts an explicit zero extension after BPF_CMPXCHGs even if
+ * you don't override this. JITs that don't want these extra insns can detect
+ * them using insn_is_zext.
  */
 bool __weak bpf_jit_needs_zext(void)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4c373589273b..37076e4c6175 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11237,6 +11237,11 @@ static int opt_remove_nops(struct bpf_verifier_env *env)
 	return 0;
 }

+static inline bool is_cmpxchg(struct bpf_insn *insn)
+{
+	return (BPF_MODE(insn->code) == BPF_ATOMIC && insn->imm == BPF_CMPXCHG);
+}
+
 static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
 					 const union bpf_attr *attr)
 {
@@ -11296,7 +11301,17 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
 			goto apply_patch_buffer;
 		}

-		if (!bpf_jit_needs_zext())
+		/* Add in a zero-extend instruction if a) the JIT has requested
+		 * it or b) it's a CMPXCHG.
+		 *
+		 * The latter is because: BPF_CMPXCHG always loads a value into
+		 * R0, therefore always zero-extends. However some archs'
+		 * equivalent instruction only does this load when the
+		 * comparison is successful. This detail of CMPXCHG is
+		 * orthogonal to the general zero-extension behaviour of the
+		 * CPU, so it's treated independently of bpf_jit_needs_zext.
+		 */
+		if (!bpf_jit_needs_zext() && !is_cmpxchg(&insn))
 			continue;

 		if (WARN_ON_ONCE(load_reg == -1)) {
diff --git a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
index 2efd8bcf57a1..6e52dfc64415 100644
--- a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
+++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
@@ -94,3 +94,28 @@
 	.result = REJECT,
 	.errstr = "invalid read from stack",
 },
+{
+	"BPF_W cmpxchg should zero top 32 bits",
+	.insns = {
+		/* r0 = U64_MAX; */
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1),
+		/* u64 val = r0; */
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
+		/* r0 = (u32)atomic_cmpxchg((u32 *)&val, r0, 1); */
+		BPF_MOV32_IMM(BPF_REG_1, 1),
+		BPF_ATOMIC_OP(BPF_W, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -8),
+		/* r1 = 0x00000000FFFFFFFFull; */
+		BPF_MOV64_IMM(BPF_REG_1, 1),
+		BPF_ALU64_IMM(BPF_LSH, BPF_REG_1, 32),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
+		/* if (r0 != r1) exit(1); */
+		BPF_JMP_REG(BPF_JEQ, BPF_REG_0, BPF_REG_1, 2),
+		BPF_MOV32_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		/* exit(0); */
+		BPF_MOV32_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+},
diff --git a/tools/testing/selftests/bpf/verifier/atomic_or.c b/tools/testing/selftests/bpf/verifier/atomic_or.c
index 70f982e1f9f0..0a08b99e6ddd 100644
--- a/tools/testing/selftests/bpf/verifier/atomic_or.c
+++ b/tools/testing/selftests/bpf/verifier/atomic_or.c
@@ -75,3 +75,29 @@
 	},
 	.result = ACCEPT,
 },
+{
+	"BPF_W atomic_fetch_or should zero top 32 bits",
+	.insns = {
+		/* r1 = U64_MAX; */
+		BPF_MOV64_IMM(BPF_REG_1, 0),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
+		/* u64 val = r0; */
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+		/* r1 = (u32)atomic_sub((u32 *)&val, 1); */
+		BPF_MOV32_IMM(BPF_REG_1, 2),
+		BPF_ATOMIC_OP(BPF_W, BPF_OR | BPF_FETCH, BPF_REG_10, BPF_REG_1, -8),
+		/* r2 = 0x00000000FFFFFFFF; */
+		BPF_MOV64_IMM(BPF_REG_2, 1),
+		BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 32),
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 1),
+		/* if (r2 != r1) exit(1); */
+		BPF_JMP_REG(BPF_JEQ, BPF_REG_2, BPF_REG_1, 2),
+		/* BPF_MOV32_IMM(BPF_REG_0, 1), */
+		BPF_MOV64_REG(BPF_REG_0, BPF_REG_1),
+		BPF_EXIT_INSN(),
+		/* exit(0); */
+		BPF_MOV32_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+},

base-commit: f2cfe32e8a965a86e512dcb2e6251371d4a60c63
--
2.30.1.766.gb4fecdf3b7-goog
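
For archs whose native 32-bit cmpxchg already zero-extends, the
insn_is_zext() look-ahead mentioned in the commit message might look
roughly like this in a JIT's per-insn switch (a sketch modeled on
existing JITs, not code from this patch; emit_cmpxchg32() and the
convention of returning 1 to consume the following insn are
illustrative):

	case BPF_STX | BPF_ATOMIC | BPF_W:
		if (insn->imm == BPF_CMPXCHG) {
			/* This arch's 32-bit cmpxchg zero-extends r0 itself. */
			emit_cmpxchg32(ctx, insn);
			/* The verifier appends a zext of r0 after the cmpxchg;
			 * it is redundant here, so tell the loop to skip it.
			 */
			if (insn_is_zext(&insn[1]))
				return 1;
		}
		break;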



* Re: [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-03-02 10:54 [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg Brendan Jackman
@ 2021-03-02 10:56 ` Brendan Jackman
  2021-03-02 10:58   ` Daniel Borkmann
  2021-03-02 18:43 ` Martin KaFai Lau
  1 sibling, 1 reply; 6+ messages in thread
From: Brendan Jackman @ 2021-03-02 10:56 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
	Florent Revest, Ilya Leoshkevich, Martin KaFai Lau

On Tue, 2 Mar 2021 at 11:54, Brendan Jackman <jackmanb@google.com> wrote:
> base-commit: f2cfe32e8a965a86e512dcb2e6251371d4a60c63

Oh yeah, this is based on Ilya's patch [1]. Is that OK or should I
just resend it once that one is merged?

[1] https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com/T/#u


* Re: [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-03-02 10:56 ` Brendan Jackman
@ 2021-03-02 10:58   ` Daniel Borkmann
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Borkmann @ 2021-03-02 10:58 UTC (permalink / raw)
  To: Brendan Jackman, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, KP Singh, Florent Revest,
	Ilya Leoshkevich, Martin KaFai Lau

On 3/2/21 11:56 AM, Brendan Jackman wrote:
> On Tue, 2 Mar 2021 at 11:54, Brendan Jackman <jackmanb@google.com> wrote:
>> base-commit: f2cfe32e8a965a86e512dcb2e6251371d4a60c63
> 
> Oh yeah, this is based on Ilya's patch [1]. Is that OK or should I
> just resend it once that one is merged?
> 
> [1] https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com/T/#u

No need to resend, enough to have it stated here.

Thanks,
Daniel


* Re: [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-03-02 10:54 [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg Brendan Jackman
  2021-03-02 10:56 ` Brendan Jackman
@ 2021-03-02 18:43 ` Martin KaFai Lau
  2021-03-03  7:29   ` Yonghong Song
  1 sibling, 1 reply; 6+ messages in thread
From: Martin KaFai Lau @ 2021-03-02 18:43 UTC (permalink / raw)
  To: Brendan Jackman
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	KP Singh, Florent Revest, Ilya Leoshkevich

On Tue, Mar 02, 2021 at 10:54:00AM +0000, Brendan Jackman wrote:
> As pointed out by Ilya and explained in the new comment, there's a
> discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> the value from memory into r0, while x86 only does so when r0 and the
> value in memory are different. The same issue affects s390.
> 
> At first this might sound like pure semantics, but it makes a real
> difference when the comparison is 32-bit, since the load will
> zero-extend r0/rax.
> 
> The fix is to explicitly zero-extend rax after doing such a
> CMPXCHG. Since this problem affects multiple archs, this is done in
> the verifier by patching in a BPF_ZEXT_REG instruction after every
> 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> can do a look-ahead with insn_is_zext to skip the unnecessary mov.
> 
> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
> 
> 
> Differences v4->v5[1]:
>  - Moved the logic entirely into opt_subreg_zext_lo32_rnd_hi32, thanks to Martin
>    for suggesting this.
> 
> Differences v3->v4[1]:
>  - Moved the optimization against pointless zext into the correct place:
>    opt_subreg_zext_lo32_rnd_hi32 is called _after_ fixup_bpf_calls.
> 
> Differences v2->v3[1]:
>  - Moved patching into fixup_bpf_calls (patch incoming to rename this function)
>  - Added extra commentary on bpf_jit_needs_zext
>  - Added check to avoid adding a pointless zext(r0) if there's already one there.
> 
> Difference v1->v2[1]: Now solved centrally in the verifier instead of
>   specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!
> 
> [1] v4: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
>     v3: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
>     v2: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
>     v1: https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t
> 
> 
>  kernel/bpf/core.c                             |  4 +++
>  kernel/bpf/verifier.c                         | 17 +++++++++++-
>  .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 ++++++++++++++++++
>  .../selftests/bpf/verifier/atomic_or.c        | 26 +++++++++++++++++++
>  4 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 0ae015ad1e05..dcf18612841b 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -2342,6 +2342,10 @@ bool __weak bpf_helper_changes_pkt_data(void *func)
>  /* Return TRUE if the JIT backend wants verifier to enable sub-register usage
>   * analysis code and wants explicit zero extension inserted by verifier.
>   * Otherwise, return FALSE.
> + *
> + * The verifier inserts an explicit zero extension after BPF_CMPXCHGs even if
> + * you don't override this. JITs that don't want these extra insns can detect
> + * them using insn_is_zext.
>   */
>  bool __weak bpf_jit_needs_zext(void)
>  {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4c373589273b..37076e4c6175 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -11237,6 +11237,11 @@ static int opt_remove_nops(struct bpf_verifier_env *env)
>  	return 0;
>  }
> 
> +static inline bool is_cmpxchg(struct bpf_insn *insn)
nit. "const" struct bpf_insn *insn.

> +{
> +	return (BPF_MODE(insn->code) == BPF_ATOMIC && insn->imm == BPF_CMPXCHG);
I think it is better to also check BPF_CLASS(insn->code) == BPF_STX, in
case this helper is reused in the future before do_check() has had a
chance to verify the instructions.

> +}
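With the const nit and the class check folded in, the helper might look
something like this (an illustrative, untested sketch):

static inline bool is_cmpxchg(const struct bpf_insn *insn)
{
	return BPF_CLASS(insn->code) == BPF_STX &&
	       BPF_MODE(insn->code) == BPF_ATOMIC &&
	       insn->imm == BPF_CMPXCHG;
}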
> +
>  static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>  					 const union bpf_attr *attr)
>  {
> @@ -11296,7 +11301,17 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>  			goto apply_patch_buffer;
>  		}
> 
> -		if (!bpf_jit_needs_zext())
> +		/* Add in a zero-extend instruction if a) the JIT has requested
> +		 * it or b) it's a CMPXCHG.
> +		 *
> +		 * The latter is because: BPF_CMPXCHG always loads a value into
> +		 * R0, therefore always zero-extends. However some archs'
> +		 * equivalent instruction only does this load when the
> +		 * comparison is successful. This detail of CMPXCHG is
> +		 * orthogonal to the general zero-extension behaviour of the
> +		 * CPU, so it's treated independently of bpf_jit_needs_zext.
> +		 */
> +		if (!bpf_jit_needs_zext() && !is_cmpxchg(&insn))
>  			continue;
> 
>  		if (WARN_ON_ONCE(load_reg == -1)) {
> diff --git a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> index 2efd8bcf57a1..6e52dfc64415 100644
> --- a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> +++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> @@ -94,3 +94,28 @@
>  	.result = REJECT,
>  	.errstr = "invalid read from stack",
>  },
> +{
> +	"BPF_W cmpxchg should zero top 32 bits",
> +	.insns = {
> +		/* r0 = U64_MAX; */
> +		BPF_MOV64_IMM(BPF_REG_0, 0),
> +		BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1),
> +		/* u64 val = r0; */
> +		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
> +		/* r0 = (u32)atomic_cmpxchg((u32 *)&val, r0, 1); */
> +		BPF_MOV32_IMM(BPF_REG_1, 1),
> +		BPF_ATOMIC_OP(BPF_W, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -8),
> +		/* r1 = 0x00000000FFFFFFFFull; */
> +		BPF_MOV64_IMM(BPF_REG_1, 1),
> +		BPF_ALU64_IMM(BPF_LSH, BPF_REG_1, 32),
> +		BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
> +		/* if (r0 != r1) exit(1); */
> +		BPF_JMP_REG(BPF_JEQ, BPF_REG_0, BPF_REG_1, 2),
> +		BPF_MOV32_IMM(BPF_REG_0, 1),
> +		BPF_EXIT_INSN(),
> +		/* exit(0); */
> +		BPF_MOV32_IMM(BPF_REG_0, 0),
> +		BPF_EXIT_INSN(),
> +	},
> +	.result = ACCEPT,
> +},
> diff --git a/tools/testing/selftests/bpf/verifier/atomic_or.c b/tools/testing/selftests/bpf/verifier/atomic_or.c
> index 70f982e1f9f0..0a08b99e6ddd 100644
> --- a/tools/testing/selftests/bpf/verifier/atomic_or.c
> +++ b/tools/testing/selftests/bpf/verifier/atomic_or.c
> @@ -75,3 +75,29 @@
>  	},
>  	.result = ACCEPT,
>  },
> +{
> +	"BPF_W atomic_fetch_or should zero top 32 bits",
> +	.insns = {
> +		/* r1 = U64_MAX; */
> +		BPF_MOV64_IMM(BPF_REG_1, 0),
> +		BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
> +		/* u64 val = r0; */
s/r0/r1/

> +		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
> +		/* r1 = (u32)atomic_sub((u32 *)&val, 1); */
		   r1 = (u32)atomic_fetch_or((u32 *)&val, 2)
		   
> +		BPF_MOV32_IMM(BPF_REG_1, 2),
> +		BPF_ATOMIC_OP(BPF_W, BPF_OR | BPF_FETCH, BPF_REG_10, BPF_REG_1, -8),


* Re: [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-03-02 18:43 ` Martin KaFai Lau
@ 2021-03-03  7:29   ` Yonghong Song
  2021-03-03 10:41     ` Brendan Jackman
  0 siblings, 1 reply; 6+ messages in thread
From: Yonghong Song @ 2021-03-03  7:29 UTC (permalink / raw)
  To: Martin KaFai Lau, Brendan Jackman
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	KP Singh, Florent Revest, Ilya Leoshkevich



On 3/2/21 10:43 AM, Martin KaFai Lau wrote:
> On Tue, Mar 02, 2021 at 10:54:00AM +0000, Brendan Jackman wrote:
[...]
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 4c373589273b..37076e4c6175 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -11237,6 +11237,11 @@ static int opt_remove_nops(struct bpf_verifier_env *env)
>>   	return 0;
>>   }
>>
>> +static inline bool is_cmpxchg(struct bpf_insn *insn)
> nit. "const" struct bpf_insn *insn.
> 
>> +{
>> +	return (BPF_MODE(insn->code) == BPF_ATOMIC && insn->imm == BPF_CMPXCHG);
> I think it is better to also check BPF_CLASS(insn->code) == BPF_STX, in
> case this helper is reused in the future before do_check() has had a
> chance to verify the instructions.

If this is the case, I would suggest moving is_cmpxchg() earlier in
verifier.c, so that reusing it later does not require moving the
function. Also, the outermost () in the return statement are
unnecessary.



* Re: [PATCH v5 bpf-next] bpf: Explicitly zero-extend R0 after 32-bit cmpxchg
  2021-03-03  7:29   ` Yonghong Song
@ 2021-03-03 10:41     ` Brendan Jackman
  0 siblings, 0 replies; 6+ messages in thread
From: Brendan Jackman @ 2021-03-03 10:41 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Martin KaFai Lau, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Florent Revest, Ilya Leoshkevich

Thanks once again for the reviews.

On Wed, 3 Mar 2021 at 08:29, Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 3/2/21 10:43 AM, Martin KaFai Lau wrote:
> > On Tue, Mar 02, 2021 at 10:54:00AM +0000, Brendan Jackman wrote:
[...]
> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >> index 4c373589273b..37076e4c6175 100644
> >> --- a/kernel/bpf/verifier.c
> >> +++ b/kernel/bpf/verifier.c
> >> @@ -11237,6 +11237,11 @@ static int opt_remove_nops(struct bpf_verifier_env *env)
> >>      return 0;
> >>   }
> >>
> >> +static inline bool is_cmpxchg(struct bpf_insn *insn)
> > nit. "const" struct bpf_insn *insn.
> >
> >> +{
> >> +    return (BPF_MODE(insn->code) == BPF_ATOMIC && insn->imm == BPF_CMPXCHG);
> > I think it is better to also check BPF_CLASS(insn->code) == BPF_STX, in
> > case this helper is reused in the future before do_check() has had a
> > chance to verify the instructions.
>
> If this is the case, I would suggest moving is_cmpxchg() earlier in
> verifier.c, so that reusing it later does not require moving the
> function. Also, the outermost () in the return statement are
> unnecessary.

Yep, all good points, thanks. Spinning v6 now.


