All of lore.kernel.org
* [PATCH v2 0/5] Atomics support for eBPF on powerpc
@ 2022-06-10 15:55 ` Hari Bathini
  0 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-10 15:55 UTC (permalink / raw)
  To: bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, Christophe Leroy, netdev,
	Benjamin Herrenschmidt, Paul Mackerras, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Jordan Niethe,
	Russell Currey

This patchset adds atomic operations to the eBPF instruction set on
powerpc. The instructions that are added here can be summarised with
this list of kernel operations for ppc64:

* atomic[64]_[fetch_]add
* atomic[64]_[fetch_]and
* atomic[64]_[fetch_]or
* atomic[64]_[fetch_]xor
* atomic[64]_xchg
* atomic[64]_cmpxchg

and this list of kernel operations for ppc32:

* atomic_[fetch_]add
* atomic_[fetch_]and
* atomic_[fetch_]or
* atomic_[fetch_]xor
* atomic_xchg
* atomic_cmpxchg

The following are left out of scope for this effort:

* 64 bit operations on ppc32.
* Explicit memory barriers, 16 and 8 bit operations on both ppc32
  & ppc64.

The first patch adds support for bitwise atomic operations on ppc64.
The second patch adds fetch variant support for these instructions.
The third patch adds support for xchg and cmpxchg atomic operations
on ppc64. Patch #4 adds support for 32-bit atomic bitwise operations
on ppc32, and patch #5 adds support for xchg and cmpxchg atomic
operations on ppc32.

Validated these changes successfully with the atomics test cases in
the test_bpf testsuite and the test_verifier & test_progs selftests.
With test_bpf testsuite:

  all 147 atomics-related test cases (both 32-bit & 64-bit) JIT'ed
  successfully on ppc64:

    test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed]

  all 76 atomics-related test cases (32-bit) JIT'ed successfully
  on ppc32:

    test_bpf: Summary: 1027 PASSED, 0 FAILED, [915/1015 JIT'ed]

Changes in v2:
* Moved variable declaration to avoid late declaration error on
  some compilers. Thanks to Russell for pointing this out.
* For ppc64, added an optimization for 32-bit cmpxchg with regard
  to commit 39491867ace5.
* For ppc32, used an additional register (BPF_REG_AX):
    - to avoid clobbering src_reg.
    - to keep the lwarx reservation as intended.
    - to avoid the odd switch/goto construct.
* For ppc32, zeroed out the higher 32 bits explicitly when required.


Hari Bathini (5):
  bpf ppc64: add support for BPF_ATOMIC bitwise operations
  bpf ppc64: add support for atomic fetch operations
  bpf ppc64: Add instructions for atomic_[cmp]xchg
  bpf ppc32: add support for BPF_ATOMIC bitwise operations
  bpf ppc32: Add instructions for atomic_[cmp]xchg

 arch/powerpc/net/bpf_jit_comp32.c | 72 +++++++++++++++++++----
 arch/powerpc/net/bpf_jit_comp64.c | 96 ++++++++++++++++++++++---------
 2 files changed, 129 insertions(+), 39 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/5] bpf ppc64: add support for BPF_ATOMIC bitwise operations
  2022-06-10 15:55 ` Hari Bathini
@ 2022-06-10 15:55   ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-10 15:55 UTC (permalink / raw)
  To: bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, Christophe Leroy, netdev,
	Benjamin Herrenschmidt, Paul Mackerras, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Jordan Niethe,
	Russell Currey

Adding instructions for ppc64 for

atomic[64]_and
atomic[64]_or
atomic[64]_xor

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---

* No changes in v2.


 arch/powerpc/net/bpf_jit_comp64.c | 57 ++++++++++++++++---------------
 1 file changed, 29 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 594c54931e20..c421bedd0e98 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -777,41 +777,42 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		 * BPF_STX ATOMIC (atomic ops)
 		 */
 		case BPF_STX | BPF_ATOMIC | BPF_W:
-			if (imm != BPF_ADD) {
-				pr_err_ratelimited(
-					"eBPF filter atomic op code %02x (@%d) unsupported\n",
-					code, i);
-				return -ENOTSUPP;
-			}
-
-			/* *(u32 *)(dst + off) += src */
-
-			/* Get EA into TMP_REG_1 */
-			EMIT(PPC_RAW_ADDI(tmp1_reg, dst_reg, off));
+		case BPF_STX | BPF_ATOMIC | BPF_DW:
+			/* Get offset into TMP_REG_1 */
+			EMIT(PPC_RAW_LI(tmp1_reg, off));
 			tmp_idx = ctx->idx * 4;
 			/* load value from memory into TMP_REG_2 */
-			EMIT(PPC_RAW_LWARX(tmp2_reg, 0, tmp1_reg, 0));
-			/* add value from src_reg into this */
-			EMIT(PPC_RAW_ADD(tmp2_reg, tmp2_reg, src_reg));
-			/* store result back */
-			EMIT(PPC_RAW_STWCX(tmp2_reg, 0, tmp1_reg));
-			/* we're done if this succeeded */
-			PPC_BCC_SHORT(COND_NE, tmp_idx);
-			break;
-		case BPF_STX | BPF_ATOMIC | BPF_DW:
-			if (imm != BPF_ADD) {
+			if (size == BPF_DW)
+				EMIT(PPC_RAW_LDARX(tmp2_reg, tmp1_reg, dst_reg, 0));
+			else
+				EMIT(PPC_RAW_LWARX(tmp2_reg, tmp1_reg, dst_reg, 0));
+
+			switch (imm) {
+			case BPF_ADD:
+				EMIT(PPC_RAW_ADD(tmp2_reg, tmp2_reg, src_reg));
+				break;
+			case BPF_AND:
+				EMIT(PPC_RAW_AND(tmp2_reg, tmp2_reg, src_reg));
+				break;
+			case BPF_OR:
+				EMIT(PPC_RAW_OR(tmp2_reg, tmp2_reg, src_reg));
+				break;
+			case BPF_XOR:
+				EMIT(PPC_RAW_XOR(tmp2_reg, tmp2_reg, src_reg));
+				break;
+			default:
 				pr_err_ratelimited(
 					"eBPF filter atomic op code %02x (@%d) unsupported\n",
 					code, i);
-				return -ENOTSUPP;
+				return -EOPNOTSUPP;
 			}
-			/* *(u64 *)(dst + off) += src */
 
-			EMIT(PPC_RAW_ADDI(tmp1_reg, dst_reg, off));
-			tmp_idx = ctx->idx * 4;
-			EMIT(PPC_RAW_LDARX(tmp2_reg, 0, tmp1_reg, 0));
-			EMIT(PPC_RAW_ADD(tmp2_reg, tmp2_reg, src_reg));
-			EMIT(PPC_RAW_STDCX(tmp2_reg, 0, tmp1_reg));
+			/* store result back */
+			if (size == BPF_DW)
+				EMIT(PPC_RAW_STDCX(tmp2_reg, tmp1_reg, dst_reg));
+			else
+				EMIT(PPC_RAW_STWCX(tmp2_reg, tmp1_reg, dst_reg));
+			/* we're done if this succeeded */
 			PPC_BCC_SHORT(COND_NE, tmp_idx);
 			break;
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/5] bpf ppc64: add support for atomic fetch operations
  2022-06-10 15:55 ` Hari Bathini
@ 2022-06-10 15:55   ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-10 15:55 UTC (permalink / raw)
  To: bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, Christophe Leroy, netdev,
	Benjamin Herrenschmidt, Paul Mackerras, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Jordan Niethe,
	Russell Currey

Adding instructions for ppc64 for

atomic[64]_fetch_add
atomic[64]_fetch_and
atomic[64]_fetch_or
atomic[64]_fetch_xor

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---

* No changes in v2.


 arch/powerpc/net/bpf_jit_comp64.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index c421bedd0e98..c53236b3a8b1 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -787,17 +787,25 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 			else
 				EMIT(PPC_RAW_LWARX(tmp2_reg, tmp1_reg, dst_reg, 0));
 
+			/* Save old value in _R0 */
+			if (imm & BPF_FETCH)
+				EMIT(PPC_RAW_MR(_R0, tmp2_reg));
+
 			switch (imm) {
 			case BPF_ADD:
+			case BPF_ADD | BPF_FETCH:
 				EMIT(PPC_RAW_ADD(tmp2_reg, tmp2_reg, src_reg));
 				break;
 			case BPF_AND:
+			case BPF_AND | BPF_FETCH:
 				EMIT(PPC_RAW_AND(tmp2_reg, tmp2_reg, src_reg));
 				break;
 			case BPF_OR:
+			case BPF_OR | BPF_FETCH:
 				EMIT(PPC_RAW_OR(tmp2_reg, tmp2_reg, src_reg));
 				break;
 			case BPF_XOR:
+			case BPF_XOR | BPF_FETCH:
 				EMIT(PPC_RAW_XOR(tmp2_reg, tmp2_reg, src_reg));
 				break;
 			default:
@@ -807,13 +815,17 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 				return -EOPNOTSUPP;
 			}
 
-			/* store result back */
+			/* store new value */
 			if (size == BPF_DW)
 				EMIT(PPC_RAW_STDCX(tmp2_reg, tmp1_reg, dst_reg));
 			else
 				EMIT(PPC_RAW_STWCX(tmp2_reg, tmp1_reg, dst_reg));
 			/* we're done if this succeeded */
 			PPC_BCC_SHORT(COND_NE, tmp_idx);
+
+			/* For the BPF_FETCH variant, get old value into src_reg */
+			if (imm & BPF_FETCH)
+				EMIT(PPC_RAW_MR(src_reg, _R0));
 			break;
 
 		/*
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/5] bpf ppc64: Add instructions for atomic_[cmp]xchg
  2022-06-10 15:55 ` Hari Bathini
@ 2022-06-10 15:55   ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-10 15:55 UTC (permalink / raw)
  To: bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, Christophe Leroy, netdev,
	Benjamin Herrenschmidt, Paul Mackerras, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Jordan Niethe,
	Russell Currey

This adds two atomic opcodes, BPF_XCHG and BPF_CMPXCHG, on ppc64,
both of which include the BPF_FETCH flag. The kernel's atomic_cmpxchg
operation fundamentally has 3 operands, but we only have two register
fields, so the operand we compare against (the kernel's API calls it
'old') is hard-coded to be BPF_REG_R0. Also, the kernel's
atomic_cmpxchg returns the previous value at dst_reg + off; the JIT
does the same for BPF, with the return value placed in BPF_REG_0.

  BPF_REG_R0 = atomic_cmpxchg(dst_reg + off, BPF_REG_R0, src_reg);

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---

Changes in v2:
* Moved variable declaration to avoid late declaration error on
  some compilers.
* Added an optimization for 32-bit cmpxchg with regard to
  commit 39491867ace5.


 arch/powerpc/net/bpf_jit_comp64.c | 39 +++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index c53236b3a8b1..29ee306d6302 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -360,6 +360,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		u32 size = BPF_SIZE(code);
 		u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
 		u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
+		u32 save_reg, ret_reg;
 		s16 off = insn[i].off;
 		s32 imm = insn[i].imm;
 		bool func_addr_fixed;
@@ -778,6 +779,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		 */
 		case BPF_STX | BPF_ATOMIC | BPF_W:
 		case BPF_STX | BPF_ATOMIC | BPF_DW:
+			save_reg = tmp2_reg;
+			ret_reg = src_reg;
+
 			/* Get offset into TMP_REG_1 */
 			EMIT(PPC_RAW_LI(tmp1_reg, off));
 			tmp_idx = ctx->idx * 4;
@@ -808,6 +812,24 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 			case BPF_XOR | BPF_FETCH:
 				EMIT(PPC_RAW_XOR(tmp2_reg, tmp2_reg, src_reg));
 				break;
+			case BPF_CMPXCHG:
+				/*
+				 * Return old value in BPF_REG_0 for BPF_CMPXCHG &
+				 * in src_reg for other cases.
+				 */
+				ret_reg = bpf_to_ppc(BPF_REG_0);
+
+				/* Compare with old value in BPF_R0 */
+				if (size == BPF_DW)
+					EMIT(PPC_RAW_CMPD(bpf_to_ppc(BPF_REG_0), tmp2_reg));
+				else
+					EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), tmp2_reg));
+				/* Don't set if different from old value */
+				PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
+				fallthrough;
+			case BPF_XCHG:
+				save_reg = src_reg;
+				break;
 			default:
 				pr_err_ratelimited(
 					"eBPF filter atomic op code %02x (@%d) unsupported\n",
@@ -817,15 +839,22 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 
 			/* store new value */
 			if (size == BPF_DW)
-				EMIT(PPC_RAW_STDCX(tmp2_reg, tmp1_reg, dst_reg));
+				EMIT(PPC_RAW_STDCX(save_reg, tmp1_reg, dst_reg));
 			else
-				EMIT(PPC_RAW_STWCX(tmp2_reg, tmp1_reg, dst_reg));
+				EMIT(PPC_RAW_STWCX(save_reg, tmp1_reg, dst_reg));
 			/* we're done if this succeeded */
 			PPC_BCC_SHORT(COND_NE, tmp_idx);
 
-			/* For the BPF_FETCH variant, get old value into src_reg */
-			if (imm & BPF_FETCH)
-				EMIT(PPC_RAW_MR(src_reg, _R0));
+			if (imm & BPF_FETCH) {
+				EMIT(PPC_RAW_MR(ret_reg, _R0));
+				/*
+				 * Skip unnecessary zero-extension for 32-bit cmpxchg.
+				 * For context, see commit 39491867ace5.
+				 */
+				if (size != BPF_DW && imm == BPF_CMPXCHG &&
+				    insn_is_zext(&insn[i + 1]))
+					addrs[++i] = ctx->idx * 4;
+			}
 			break;
 
 		/*
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/5] bpf ppc32: add support for BPF_ATOMIC bitwise operations
  2022-06-10 15:55 ` Hari Bathini
@ 2022-06-10 15:55   ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-10 15:55 UTC (permalink / raw)
  To: bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, Christophe Leroy, netdev,
	Benjamin Herrenschmidt, Paul Mackerras, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Jordan Niethe,
	Russell Currey

Adding instructions for ppc32 for

atomic_and
atomic_or
atomic_xor
atomic_fetch_add
atomic_fetch_and
atomic_fetch_or
atomic_fetch_xor

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---

Changes in v2:
* Used an additional register (BPF_REG_AX)
    - to avoid clobbering src_reg.
    - to keep the lwarx reservation as intended.
    - to avoid the odd switch/goto construct.
* Zeroed out the higher 32 bits explicitly when required.

 arch/powerpc/net/bpf_jit_comp32.c | 53 ++++++++++++++++++++++++-------
 1 file changed, 41 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index e46ed1e8c6ca..28dc6a1a8f2f 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -294,6 +294,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		u32 dst_reg_h = dst_reg - 1;
 		u32 src_reg = bpf_to_ppc(insn[i].src_reg);
 		u32 src_reg_h = src_reg - 1;
+		u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
 		u32 tmp_reg = bpf_to_ppc(TMP_REG);
 		u32 size = BPF_SIZE(code);
 		s16 off = insn[i].off;
@@ -798,25 +799,53 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		 * BPF_STX ATOMIC (atomic ops)
 		 */
 		case BPF_STX | BPF_ATOMIC | BPF_W:
-			if (imm != BPF_ADD) {
-				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
-						   code, i);
-				return -ENOTSUPP;
-			}
-
-			/* *(u32 *)(dst + off) += src */
-
 			bpf_set_seen_register(ctx, tmp_reg);
+			bpf_set_seen_register(ctx, ax_reg);
+
 			/* Get offset into TMP_REG */
 			EMIT(PPC_RAW_LI(tmp_reg, off));
+			tmp_idx = ctx->idx * 4;
 			/* load value from memory into r0 */
 			EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));
-			/* add value from src_reg into this */
-			EMIT(PPC_RAW_ADD(_R0, _R0, src_reg));
-			/* store result back */
+
+			/* Save old value in BPF_REG_AX */
+			if (imm & BPF_FETCH)
+				EMIT(PPC_RAW_MR(ax_reg, _R0));
+
+			switch (imm) {
+			case BPF_ADD:
+			case BPF_ADD | BPF_FETCH:
+				EMIT(PPC_RAW_ADD(_R0, _R0, src_reg));
+				break;
+			case BPF_AND:
+			case BPF_AND | BPF_FETCH:
+				EMIT(PPC_RAW_AND(_R0, _R0, src_reg));
+				break;
+			case BPF_OR:
+			case BPF_OR | BPF_FETCH:
+				EMIT(PPC_RAW_OR(_R0, _R0, src_reg));
+				break;
+			case BPF_XOR:
+			case BPF_XOR | BPF_FETCH:
+				EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
+				break;
+			default:
+				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
+						   code, i);
+				return -EOPNOTSUPP;
+			}
+
+			/* store new value */
 			EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
 			/* we're done if this succeeded */
-			PPC_BCC_SHORT(COND_NE, (ctx->idx - 3) * 4);
+			PPC_BCC_SHORT(COND_NE, tmp_idx);
+
+			/* For the BPF_FETCH variant, get old data into src_reg */
+			if (imm & BPF_FETCH) {
+				EMIT(PPC_RAW_MR(src_reg, ax_reg));
+				if (!fp->aux->verifier_zext)
+					EMIT(PPC_RAW_LI(src_reg_h, 0));
+			}
 			break;
 
 		case BPF_STX | BPF_ATOMIC | BPF_DW: /* *(u64 *)(dst + off) += src */
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/5] bpf ppc32: add support for BPF_ATOMIC bitwise operations
@ 2022-06-10 15:55   ` Hari Bathini
  0 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-10 15:55 UTC (permalink / raw)
  To: bpf, linuxppc-dev
  Cc: Song Liu, Daniel Borkmann, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, netdev, Paul Mackerras, Naveen N. Rao,
	Yonghong Song, KP Singh, Jordan Niethe, Martin KaFai Lau

Adding instructions for ppc32 for

atomic_and
atomic_or
atomic_xor
atomic_fetch_add
atomic_fetch_and
atomic_fetch_or
atomic_fetch_xor

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---

Changes in v2:
* Used an additional register (BPF_REG_AX)
    - to avoid clobbering src_reg.
    - to keep the lwarx reservation as intended.
    - to avoid the odd switch/goto construct.
* Zero'ed out the higher 32-bit explicitly when required.

 arch/powerpc/net/bpf_jit_comp32.c | 53 ++++++++++++++++++++++++-------
 1 file changed, 41 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index e46ed1e8c6ca..28dc6a1a8f2f 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -294,6 +294,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		u32 dst_reg_h = dst_reg - 1;
 		u32 src_reg = bpf_to_ppc(insn[i].src_reg);
 		u32 src_reg_h = src_reg - 1;
+		u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
 		u32 tmp_reg = bpf_to_ppc(TMP_REG);
 		u32 size = BPF_SIZE(code);
 		s16 off = insn[i].off;
@@ -798,25 +799,53 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		 * BPF_STX ATOMIC (atomic ops)
 		 */
 		case BPF_STX | BPF_ATOMIC | BPF_W:
-			if (imm != BPF_ADD) {
-				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
-						   code, i);
-				return -ENOTSUPP;
-			}
-
-			/* *(u32 *)(dst + off) += src */
-
 			bpf_set_seen_register(ctx, tmp_reg);
+			bpf_set_seen_register(ctx, ax_reg);
+
 			/* Get offset into TMP_REG */
 			EMIT(PPC_RAW_LI(tmp_reg, off));
+			tmp_idx = ctx->idx * 4;
 			/* load value from memory into r0 */
 			EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));
-			/* add value from src_reg into this */
-			EMIT(PPC_RAW_ADD(_R0, _R0, src_reg));
-			/* store result back */
+
+			/* Save old value in BPF_REG_AX */
+			if (imm & BPF_FETCH)
+				EMIT(PPC_RAW_MR(ax_reg, _R0));
+
+			switch (imm) {
+			case BPF_ADD:
+			case BPF_ADD | BPF_FETCH:
+				EMIT(PPC_RAW_ADD(_R0, _R0, src_reg));
+				break;
+			case BPF_AND:
+			case BPF_AND | BPF_FETCH:
+				EMIT(PPC_RAW_AND(_R0, _R0, src_reg));
+				break;
+			case BPF_OR:
+			case BPF_OR | BPF_FETCH:
+				EMIT(PPC_RAW_OR(_R0, _R0, src_reg));
+				break;
+			case BPF_XOR:
+			case BPF_XOR | BPF_FETCH:
+				EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
+				break;
+			default:
+				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
+						   code, i);
+				return -EOPNOTSUPP;
+			}
+
+			/* store new value */
 			EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
 			/* we're done if this succeeded */
-			PPC_BCC_SHORT(COND_NE, (ctx->idx - 3) * 4);
+			PPC_BCC_SHORT(COND_NE, tmp_idx);
+
+			/* For the BPF_FETCH variant, get old data into src_reg */
+			if (imm & BPF_FETCH) {
+				EMIT(PPC_RAW_MR(src_reg, ax_reg));
+				if (!fp->aux->verifier_zext)
+					EMIT(PPC_RAW_LI(src_reg_h, 0));
+			}
 			break;
 
 		case BPF_STX | BPF_ATOMIC | BPF_DW: /* *(u64 *)(dst + off) += src */
-- 
2.35.3



* [PATCH v2 5/5] bpf ppc32: Add instructions for atomic_[cmp]xchg
  2022-06-10 15:55 ` Hari Bathini
@ 2022-06-10 15:55   ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-10 15:55 UTC (permalink / raw)
  To: bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, Christophe Leroy, netdev,
	Benjamin Herrenschmidt, Paul Mackerras, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Jordan Niethe,
	Russell Currey

This adds two atomic opcodes, BPF_XCHG and BPF_CMPXCHG, on ppc32, both
of which include the BPF_FETCH flag. The kernel's atomic_cmpxchg
operation fundamentally has three operands, but we only have two
register fields. Therefore the operand we compare against (the kernel's
API calls it 'old') is hard-coded to be BPF_REG_R0. Also, the kernel's
atomic_cmpxchg returns the previous value at dst_reg + off. JIT the
same for BPF too, with the return value placed in BPF_REG_0.

  BPF_REG_R0 = atomic_cmpxchg(dst_reg + off, BPF_REG_R0, src_reg);

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---

Changes in v2:
* Moved variable declaration to avoid late declaration error on
  some compilers.
* Tried to make code readable and compact.


 arch/powerpc/net/bpf_jit_comp32.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 28dc6a1a8f2f..43f1c76d48ce 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -297,6 +297,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
 		u32 tmp_reg = bpf_to_ppc(TMP_REG);
 		u32 size = BPF_SIZE(code);
+		u32 save_reg, ret_reg;
 		s16 off = insn[i].off;
 		s32 imm = insn[i].imm;
 		bool func_addr_fixed;
@@ -799,6 +800,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		 * BPF_STX ATOMIC (atomic ops)
 		 */
 		case BPF_STX | BPF_ATOMIC | BPF_W:
+			save_reg = _R0;
+			ret_reg = src_reg;
+
 			bpf_set_seen_register(ctx, tmp_reg);
 			bpf_set_seen_register(ctx, ax_reg);
 
@@ -829,6 +833,21 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 			case BPF_XOR | BPF_FETCH:
 				EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
 				break;
+			case BPF_CMPXCHG:
+				/*
+				 * Return old value in BPF_REG_0 for BPF_CMPXCHG &
+				 * in src_reg for other cases.
+				 */
+				ret_reg = bpf_to_ppc(BPF_REG_0);
+
+				/* Compare with old value in BPF_REG_0 */
+				EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), _R0));
+				/* Don't set if different from old value */
+				PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
+				fallthrough;
+			case BPF_XCHG:
+				save_reg = src_reg;
+				break;
 			default:
 				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
 						   code, i);
@@ -836,15 +855,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 			}
 
 			/* store new value */
-			EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
+			EMIT(PPC_RAW_STWCX(save_reg, tmp_reg, dst_reg));
 			/* we're done if this succeeded */
 			PPC_BCC_SHORT(COND_NE, tmp_idx);
 
 			/* For the BPF_FETCH variant, get old data into src_reg */
 			if (imm & BPF_FETCH) {
-				EMIT(PPC_RAW_MR(src_reg, ax_reg));
+				EMIT(PPC_RAW_MR(ret_reg, ax_reg));
 				if (!fp->aux->verifier_zext)
-					EMIT(PPC_RAW_LI(src_reg_h, 0));
+					EMIT(PPC_RAW_LI(ret_reg - 1, 0)); /* higher 32-bit */
 			}
 			break;
 
-- 
2.35.3



* Re: [PATCH v2 4/5] bpf ppc32: add support for BPF_ATOMIC bitwise operations
  2022-06-10 15:55   ` Hari Bathini
@ 2022-06-11 17:14     ` Christophe Leroy
  -1 siblings, 0 replies; 28+ messages in thread
From: Christophe Leroy @ 2022-06-11 17:14 UTC (permalink / raw)
  To: Hari Bathini, bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, netdev, Benjamin Herrenschmidt,
	Paul Mackerras, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Jordan Niethe, Russell Currey



On 10/06/2022 at 17:55, Hari Bathini wrote:
> Adding instructions for ppc32 for
> 
> atomic_and
> atomic_or
> atomic_xor
> atomic_fetch_add
> atomic_fetch_and
> atomic_fetch_or
> atomic_fetch_xor
> 
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> ---
> 
> Changes in v2:
> * Used an additional register (BPF_REG_AX)
>      - to avoid clobbering src_reg.
>      - to keep the lwarx reservation as intended.
>      - to avoid the odd switch/goto construct.

Might be a stupid question as I don't know the internals of BPF: are we
sure BPF_REG_AX cannot be the src reg or the dst reg?


> * Zero'ed out the higher 32-bit explicitly when required.
> 
>   arch/powerpc/net/bpf_jit_comp32.c | 53 ++++++++++++++++++++++++-------
>   1 file changed, 41 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index e46ed1e8c6ca..28dc6a1a8f2f 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -294,6 +294,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		u32 dst_reg_h = dst_reg - 1;
>   		u32 src_reg = bpf_to_ppc(insn[i].src_reg);
>   		u32 src_reg_h = src_reg - 1;
> +		u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
>   		u32 tmp_reg = bpf_to_ppc(TMP_REG);
>   		u32 size = BPF_SIZE(code);
>   		s16 off = insn[i].off;
> @@ -798,25 +799,53 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		 * BPF_STX ATOMIC (atomic ops)
>   		 */
>   		case BPF_STX | BPF_ATOMIC | BPF_W:
> -			if (imm != BPF_ADD) {
> -				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
> -						   code, i);
> -				return -ENOTSUPP;
> -			}
> -
> -			/* *(u32 *)(dst + off) += src */
> -
>   			bpf_set_seen_register(ctx, tmp_reg);
> +			bpf_set_seen_register(ctx, ax_reg);
> +
>   			/* Get offset into TMP_REG */
>   			EMIT(PPC_RAW_LI(tmp_reg, off));
> +			tmp_idx = ctx->idx * 4;
>   			/* load value from memory into r0 */
>   			EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));
> -			/* add value from src_reg into this */
> -			EMIT(PPC_RAW_ADD(_R0, _R0, src_reg));
> -			/* store result back */
> +
> +			/* Save old value in BPF_REG_AX */
> +			if (imm & BPF_FETCH)
> +				EMIT(PPC_RAW_MR(ax_reg, _R0));
> +
> +			switch (imm) {
> +			case BPF_ADD:
> +			case BPF_ADD | BPF_FETCH:
> +				EMIT(PPC_RAW_ADD(_R0, _R0, src_reg));
> +				break;
> +			case BPF_AND:
> +			case BPF_AND | BPF_FETCH:
> +				EMIT(PPC_RAW_AND(_R0, _R0, src_reg));
> +				break;
> +			case BPF_OR:
> +			case BPF_OR | BPF_FETCH:
> +				EMIT(PPC_RAW_OR(_R0, _R0, src_reg));
> +				break;
> +			case BPF_XOR:
> +			case BPF_XOR | BPF_FETCH:
> +				EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
> +				break;
> +			default:
> +				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
> +						   code, i);
> +				return -EOPNOTSUPP;
> +			}
> +
> +			/* store new value */
>   			EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
>   			/* we're done if this succeeded */
> -			PPC_BCC_SHORT(COND_NE, (ctx->idx - 3) * 4);
> +			PPC_BCC_SHORT(COND_NE, tmp_idx);
> +
> +			/* For the BPF_FETCH variant, get old data into src_reg */
> +			if (imm & BPF_FETCH) {
> +				EMIT(PPC_RAW_MR(src_reg, ax_reg));
> +				if (!fp->aux->verifier_zext)
> +					EMIT(PPC_RAW_LI(src_reg_h, 0));
> +			}
>   			break;
>   
>   		case BPF_STX | BPF_ATOMIC | BPF_DW: /* *(u64 *)(dst + off) += src */


* Re: [PATCH v2 5/5] bpf ppc32: Add instructions for atomic_[cmp]xchg
  2022-06-10 15:55   ` Hari Bathini
@ 2022-06-11 17:34     ` Christophe Leroy
  -1 siblings, 0 replies; 28+ messages in thread
From: Christophe Leroy @ 2022-06-11 17:34 UTC (permalink / raw)
  To: Hari Bathini, bpf, linuxppc-dev
  Cc: Michael Ellerman, Naveen N. Rao, netdev, Benjamin Herrenschmidt,
	Paul Mackerras, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Jordan Niethe, Russell Currey



On 10/06/2022 at 17:55, Hari Bathini wrote:
> This adds two atomic opcodes BPF_XCHG and BPF_CMPXCHG on ppc32, both
> of which include the BPF_FETCH flag.  The kernel's atomic_cmpxchg
> operation fundamentally has 3 operands, but we only have two register
> fields. Therefore the operand we compare against (the kernel's API
> calls it 'old') is hard-coded to be BPF_REG_R0. Also, kernel's
> atomic_cmpxchg returns the previous value at dst_reg + off. JIT the
> same for BPF too with return value put in BPF_REG_0.
> 
>    BPF_REG_R0 = atomic_cmpxchg(dst_reg + off, BPF_REG_R0, src_reg);
> 
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> ---
> 
> Changes in v2:
> * Moved variable declaration to avoid late declaration error on
>    some compilers.
> * Tried to make code readable and compact.
> 
> 
>   arch/powerpc/net/bpf_jit_comp32.c | 25 ++++++++++++++++++++++---
>   1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index 28dc6a1a8f2f..43f1c76d48ce 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -297,6 +297,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
>   		u32 tmp_reg = bpf_to_ppc(TMP_REG);
>   		u32 size = BPF_SIZE(code);
> +		u32 save_reg, ret_reg;
>   		s16 off = insn[i].off;
>   		s32 imm = insn[i].imm;
>   		bool func_addr_fixed;
> @@ -799,6 +800,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		 * BPF_STX ATOMIC (atomic ops)
>   		 */
>   		case BPF_STX | BPF_ATOMIC | BPF_W:
> +			save_reg = _R0;
> +			ret_reg = src_reg;
> +
>   			bpf_set_seen_register(ctx, tmp_reg);
>   			bpf_set_seen_register(ctx, ax_reg);
>   
> @@ -829,6 +833,21 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   			case BPF_XOR | BPF_FETCH:
>   				EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
>   				break;
> +			case BPF_CMPXCHG:
> +				/*
> +				 * Return old value in BPF_REG_0 for BPF_CMPXCHG &
> +				 * in src_reg for other cases.
> +				 */
> +				ret_reg = bpf_to_ppc(BPF_REG_0);
> +
> +				/* Compare with old value in BPF_REG_0 */
> +				EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), _R0));
> +				/* Don't set if different from old value */
> +				PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
> +				fallthrough;
> +			case BPF_XCHG:
> +				save_reg = src_reg;

I'm a bit lost: when save_reg is src_reg, don't we expect the upper part
(i.e. src_reg - 1) to be explicitly zeroised?

> +				break;
>   			default:
>   				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
>   						   code, i);
> @@ -836,15 +855,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   			}
>   
>   			/* store new value */
> -			EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
> +			EMIT(PPC_RAW_STWCX(save_reg, tmp_reg, dst_reg));
>   			/* we're done if this succeeded */
>   			PPC_BCC_SHORT(COND_NE, tmp_idx);
>   
>   			/* For the BPF_FETCH variant, get old data into src_reg */
>   			if (imm & BPF_FETCH) {
> -				EMIT(PPC_RAW_MR(src_reg, ax_reg));
> +				EMIT(PPC_RAW_MR(ret_reg, ax_reg));
>   				if (!fp->aux->verifier_zext)
> -					EMIT(PPC_RAW_LI(src_reg_h, 0));
> +					EMIT(PPC_RAW_LI(ret_reg - 1, 0)); /* higher 32-bit */
>   			}
>   			break;
>   


* Re: [PATCH v2 4/5] bpf ppc32: add support for BPF_ATOMIC bitwise operations
  2022-06-11 17:14     ` Christophe Leroy
@ 2022-06-13 19:00       ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-13 19:00 UTC (permalink / raw)
  To: Christophe Leroy, bpf, linuxppc-dev
  Cc: Song Liu, Daniel Borkmann, netdev, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Paul Mackerras,
	Jordan Niethe, Naveen N. Rao, Yonghong Song, KP Singh,
	Martin KaFai Lau



On 11/06/22 10:44 pm, Christophe Leroy wrote:
> 
> 
> On 10/06/2022 at 17:55, Hari Bathini wrote:
>> Adding instructions for ppc32 for
>>
>> atomic_and
>> atomic_or
>> atomic_xor
>> atomic_fetch_add
>> atomic_fetch_and
>> atomic_fetch_or
>> atomic_fetch_xor
>>
>> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
>> ---
>>
>> Changes in v2:
>> * Used an additional register (BPF_REG_AX)
>>       - to avoid clobbering src_reg.
>>       - to keep the lwarx reservation as intended.
>>       - to avoid the odd switch/goto construct.
> 
> Might be a stupid question as I don't know the internals of BPF: are we
> sure BPF_REG_AX cannot be the src reg or the dst reg?
> 

AFAICS, BPF_REG_AX wouldn't be used as src_reg or dst_reg unless this
code is reused internally, by arch-specific code, for JIT'ing some other
instruction(s) using BPF_REG_AX as either src or dst reg.

Thanks
Hari


* Re: [PATCH v2 5/5] bpf ppc32: Add instructions for atomic_[cmp]xchg
  2022-06-11 17:34     ` Christophe Leroy
@ 2022-06-13 19:11       ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-13 19:11 UTC (permalink / raw)
  To: Christophe Leroy, bpf, linuxppc-dev
  Cc: Song Liu, Daniel Borkmann, netdev, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Paul Mackerras,
	Jordan Niethe, Naveen N. Rao, Yonghong Song, KP Singh,
	Martin KaFai Lau



On 11/06/22 11:04 pm, Christophe Leroy wrote:
> 
> 
> On 10/06/2022 at 17:55, Hari Bathini wrote:
>> This adds two atomic opcodes BPF_XCHG and BPF_CMPXCHG on ppc32, both
>> of which include the BPF_FETCH flag.  The kernel's atomic_cmpxchg
>> operation fundamentally has 3 operands, but we only have two register
>> fields. Therefore the operand we compare against (the kernel's API
>> calls it 'old') is hard-coded to be BPF_REG_R0. Also, kernel's
>> atomic_cmpxchg returns the previous value at dst_reg + off. JIT the
>> same for BPF too with return value put in BPF_REG_0.
>>
>>     BPF_REG_R0 = atomic_cmpxchg(dst_reg + off, BPF_REG_R0, src_reg);
>>
>> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
>> ---
>>
>> Changes in v2:
>> * Moved variable declaration to avoid late declaration error on
>>     some compilers.
>> * Tried to make code readable and compact.
>>
>>
>>    arch/powerpc/net/bpf_jit_comp32.c | 25 ++++++++++++++++++++++---
>>    1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
>> index 28dc6a1a8f2f..43f1c76d48ce 100644
>> --- a/arch/powerpc/net/bpf_jit_comp32.c
>> +++ b/arch/powerpc/net/bpf_jit_comp32.c
>> @@ -297,6 +297,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>    		u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
>>    		u32 tmp_reg = bpf_to_ppc(TMP_REG);
>>    		u32 size = BPF_SIZE(code);
>> +		u32 save_reg, ret_reg;
>>    		s16 off = insn[i].off;
>>    		s32 imm = insn[i].imm;
>>    		bool func_addr_fixed;
>> @@ -799,6 +800,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>    		 * BPF_STX ATOMIC (atomic ops)
>>    		 */
>>    		case BPF_STX | BPF_ATOMIC | BPF_W:
>> +			save_reg = _R0;
>> +			ret_reg = src_reg;
>> +
>>    			bpf_set_seen_register(ctx, tmp_reg);
>>    			bpf_set_seen_register(ctx, ax_reg);
>>    
>> @@ -829,6 +833,21 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>    			case BPF_XOR | BPF_FETCH:
>>    				EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
>>    				break;
>> +			case BPF_CMPXCHG:
>> +				/*
>> +				 * Return old value in BPF_REG_0 for BPF_CMPXCHG &
>> +				 * in src_reg for other cases.
>> +				 */
>> +				ret_reg = bpf_to_ppc(BPF_REG_0);
>> +
>> +				/* Compare with old value in BPF_REG_0 */
>> +				EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), _R0));
>> +				/* Don't set if different from old value */
>> +				PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
>> +				fallthrough;
>> +			case BPF_XCHG:
>> +				save_reg = src_reg;
> 
> I'm a bit lost: when save_reg is src_reg, don't we expect the upper part
> (i.e. src_reg - 1) to be explicitly zeroised?
> 

For BPF_FETCH variants, the old value is returned in src_reg (ret_reg).
In all such cases, the higher 32 bits are zeroed. But in the case of
BPF_CMPXCHG, src_reg is untouched as BPF_REG_0 is used instead. So, the
higher 32 bits remain untouched for that case alone.


>> +				break;
>>    			default:
>>    				pr_err_ratelimited("eBPF filter atomic op code %02x (@%d) unsupported\n",
>>    						   code, i);
>> @@ -836,15 +855,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>    			}
>>    
>>    			/* store new value */
>> -			EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
>> +			EMIT(PPC_RAW_STWCX(save_reg, tmp_reg, dst_reg));
>>    			/* we're done if this succeeded */
>>    			PPC_BCC_SHORT(COND_NE, tmp_idx);
>>    

>>    			/* For the BPF_FETCH variant, get old data into src_reg */

With this commit, this comment is not true for BPF_CMPXCHG. So, this
comment should not be removed.

Thanks
Hari


* Re: [PATCH v2 5/5] bpf ppc32: Add instructions for atomic_[cmp]xchg
  2022-06-13 19:11       ` Hari Bathini
@ 2022-06-13 19:14         ` Hari Bathini
  -1 siblings, 0 replies; 28+ messages in thread
From: Hari Bathini @ 2022-06-13 19:14 UTC (permalink / raw)
  To: Christophe Leroy, bpf, linuxppc-dev
  Cc: Song Liu, Daniel Borkmann, netdev, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Paul Mackerras,
	Jordan Niethe, Naveen N. Rao, Yonghong Song, KP Singh,
	Martin KaFai Lau



On 14/06/22 12:41 am, Hari Bathini wrote:
> 
> 
> On 11/06/22 11:04 pm, Christophe Leroy wrote:
>>
>>
>> Le 10/06/2022 à 17:55, Hari Bathini a écrit :
>>> This adds two atomic opcodes BPF_XCHG and BPF_CMPXCHG on ppc32, both
>>> of which include the BPF_FETCH flag.  The kernel's atomic_cmpxchg
>>> operation fundamentally has 3 operands, but we only have two register
>>> fields. Therefore the operand we compare against (the kernel's API
>>> calls it 'old') is hard-coded to be BPF_REG_R0. Also, kernel's
>>> atomic_cmpxchg returns the previous value at dst_reg + off. JIT the
>>> same for BPF too with return value put in BPF_REG_0.
>>>
>>>     BPF_REG_R0 = atomic_cmpxchg(dst_reg + off, BPF_REG_R0, src_reg);
>>>
>>> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
>>> ---
>>>
>>> Changes in v2:
>>> * Moved variable declaration to avoid late declaration error on
>>>     some compilers.
>>> * Tried to make code readable and compact.
>>>
>>>
>>>    arch/powerpc/net/bpf_jit_comp32.c | 25 ++++++++++++++++++++++---
>>>    1 file changed, 22 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/net/bpf_jit_comp32.c 
>>> b/arch/powerpc/net/bpf_jit_comp32.c
>>> index 28dc6a1a8f2f..43f1c76d48ce 100644
>>> --- a/arch/powerpc/net/bpf_jit_comp32.c
>>> +++ b/arch/powerpc/net/bpf_jit_comp32.c
>>> @@ -297,6 +297,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>> *image, struct codegen_context *
>>>            u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
>>>            u32 tmp_reg = bpf_to_ppc(TMP_REG);
>>>            u32 size = BPF_SIZE(code);
>>> +        u32 save_reg, ret_reg;
>>>            s16 off = insn[i].off;
>>>            s32 imm = insn[i].imm;
>>>            bool func_addr_fixed;
>>> @@ -799,6 +800,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>> *image, struct codegen_context *
>>>             * BPF_STX ATOMIC (atomic ops)
>>>             */
>>>            case BPF_STX | BPF_ATOMIC | BPF_W:
>>> +            save_reg = _R0;
>>> +            ret_reg = src_reg;
>>> +
>>>                bpf_set_seen_register(ctx, tmp_reg);
>>>                bpf_set_seen_register(ctx, ax_reg);
>>> @@ -829,6 +833,21 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>> *image, struct codegen_context *
>>>                case BPF_XOR | BPF_FETCH:
>>>                    EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
>>>                    break;
>>> +            case BPF_CMPXCHG:
>>> +                /*
>>> +                 * Return old value in BPF_REG_0 for BPF_CMPXCHG &
>>> +                 * in src_reg for other cases.
>>> +                 */
>>> +                ret_reg = bpf_to_ppc(BPF_REG_0);
>>> +
>>> +                /* Compare with old value in BPF_REG_0 */
>>> +                EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), _R0));
>>> +                /* Don't set if different from old value */
>>> +                PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
>>> +                fallthrough;
>>> +            case BPF_XCHG:
>>> +                save_reg = src_reg;
>>
>> I'm a bit lost, when save_reg is src_reg, don't we expect the upper part
>> (ie src_reg - 1) to be explicitly zeroised?
>>
> 
> For BPF_FETCH variants, old value is returned in src_reg (ret_reg).
> In all such cases, higher 32-bit is zero'ed. But in case of BPF_CMPXCHG,
> src_reg is untouched as BPF_REG_0 is used instead. So, higher 32-bit
> remains untouched for that case alone..
> 
> 
>>> +                break;
>>>                default:
>>>                    pr_err_ratelimited("eBPF filter atomic op code 
>>> %02x (@%d) unsupported\n",
>>>                               code, i);
>>> @@ -836,15 +855,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>> *image, struct codegen_context *
>>>                }
>>>                /* store new value */
>>> -            EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
>>> +            EMIT(PPC_RAW_STWCX(save_reg, tmp_reg, dst_reg));
>>>                /* we're done if this succeeded */
>>>                PPC_BCC_SHORT(COND_NE, tmp_idx);

> 
>>>                /* For the BPF_FETCH variant, get old data into 
>>> src_reg */
> 
> With this commit, this comment is not true for BPF_CMPXCHG. So, this
> comment should not be removed..

Sorry, the above should read:
   "should be removed" instead of "should not be removed"..
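Putting the corrected behaviour together, the register flow under discussion
can be sketched as plain C (a single-threaded model of the semantics,
ignoring the reservation loop; the helper name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* BPF_CMPXCHG on ppc32: the 'old' operand comes from BPF_REG_0 and the
 * previous value at dst_reg + off goes back into BPF_REG_0; src_reg only
 * supplies the new value and is left untouched, which is why its upper
 * 32 bits are not zeroed in this one case. */
static uint32_t bpf_cmpxchg32(uint32_t *addr, uint32_t r0_old, uint32_t src)
{
	uint32_t prev = *addr;		/* lwarx */

	if (prev == r0_old)		/* cmpw; bne skips the store */
		*addr = src;		/* stwcx. */
	return prev;			/* returned in BPF_REG_0 */
}
```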

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/5] Atomics support for eBPF on powerpc
  2022-06-10 15:55 ` Hari Bathini
@ 2022-06-24 10:37   ` Naveen N. Rao
  -1 siblings, 0 replies; 28+ messages in thread
From: Naveen N. Rao @ 2022-06-24 10:37 UTC (permalink / raw)
  To: bpf, Hari Bathini, linuxppc-dev
  Cc: Andrii Nakryiko, Alexei Starovoitov, Benjamin Herrenschmidt,
	Christophe Leroy, Daniel Borkmann, Jordan Niethe, John Fastabend,
	Martin KaFai Lau, KP Singh, Michael Ellerman, netdev,
	Paul Mackerras, Russell Currey, Song Liu, Yonghong Song

Hi Hari,

Hari Bathini wrote:
> This patchset adds atomic operations to the eBPF instruction set on
> powerpc. The instructions that are added here can be summarised with
> this list of kernel operations for ppc64:
> 
> * atomic[64]_[fetch_]add
> * atomic[64]_[fetch_]and
> * atomic[64]_[fetch_]or
> * atomic[64]_[fetch_]xor
> * atomic[64]_xchg
> * atomic[64]_cmpxchg
> 
> and this list of kernel operations for ppc32:
> 
> * atomic_[fetch_]add
> * atomic_[fetch_]and
> * atomic_[fetch_]or
> * atomic_[fetch_]xor
> * atomic_xchg
> * atomic_cmpxchg

Thanks for your work on this. For this series:
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

> 
> The following are left out of scope for this effort:
> 
> * 64 bit operations on ppc32.
> * Explicit memory barriers, 16 and 8 bit operations on both ppc32
>   & ppc64.

The latter is a limitation of the eBPF instruction set itself today, 
rather than a powerpc-specific limitation.
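For context, an eBPF atomic instruction is encoded as BPF_STX | BPF_ATOMIC
with the operation carried in the immediate field, and the size field only
offers BPF_W and BPF_DW, so 8- and 16-bit atomics cannot be expressed in the
instruction set today. A sketch, with the opcode values mirrored from the
kernel's uapi/linux/bpf.h:

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values mirrored from the kernel's uapi/linux/bpf.h */
enum {
	BPF_STX     = 0x03,	/* class: store from register */
	BPF_ATOMIC  = 0xc0,	/* mode: atomic memory operation */
	BPF_W       = 0x00,	/* size: 32-bit word */
	BPF_DW      = 0x18,	/* size: 64-bit double word (out of scope on ppc32) */
	BPF_FETCH   = 0x01,	/* imm flag: return the old value */
	BPF_XCHG    = 0xe0 | BPF_FETCH,
	BPF_CMPXCHG = 0xf0 | BPF_FETCH,
};

/* First byte of an eBPF atomic instruction; the operation itself
 * (BPF_ADD, BPF_XCHG, ...) travels in the 32-bit immediate field. */
static inline uint8_t bpf_atomic_opcode(uint8_t size)
{
	return BPF_STX | BPF_ATOMIC | size;
}
```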

> 
> The first patch adds support for bitwise atomic operations on ppc64.
> The next patch adds fetch variant support for these instructions. The
> third patch adds support for xchg and cmpxchg atomic operations on
> ppc64. Patch #4 adds support for 32-bit atomic bitwise operations on
> ppc32. Patch #5 adds support for xchg and cmpxchg atomic operations
> on ppc32.
> 
> Validated these changes successfully with atomics test cases in
> test_bpf testsuite and test_verifier & test_progs selftests.
> With test_bpf testsuite:
> 
>   all 147 atomics related test cases (both 32-bit & 64-bit) JIT'ed
>   successfully on ppc64:
> 
>     test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed]
> 
>   all 76 atomics related test cases (32-bit) JIT'ed successfully
>   on ppc32:
> 
>     test_bpf: Summary: 1027 PASSED, 0 FAILED, [915/1015 JIT'ed]

Indeed. In my tests, before this series, with CONFIG_BPF_JIT_ALWAYS_ON=y:
test_bpf: Summary: 894 PASSED, 132 FAILED, [882/882 JIT'ed]
test_progs --name=atomic: Summary: 0/0 PASSED, 0 SKIPPED, 2 FAILED
test_verifier 0 100: Summary: 46 PASSED, 151 SKIPPED, 0 FAILED

With your patches:
test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed]
test_progs --name=atomic: Summary: 2/7 PASSED, 0 SKIPPED, 0 FAILED
test_verifier 0 100: Summary: 101 PASSED, 96 SKIPPED, 0 FAILED

It is nice to see all the test_bpf tests pass again on ppc64le!

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> (ppc64le)


- Naveen


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/5] bpf ppc32: Add instructions for atomic_[cmp]xchg
  2022-06-13 19:14         ` Hari Bathini
@ 2022-06-24 10:41           ` Naveen N. Rao
  -1 siblings, 0 replies; 28+ messages in thread
From: Naveen N. Rao @ 2022-06-24 10:41 UTC (permalink / raw)
  To: bpf, Christophe Leroy, Hari Bathini, linuxppc-dev
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Jordan Niethe, John Fastabend, Martin KaFai Lau, KP Singh,
	netdev, Paul Mackerras, Song Liu, Yonghong Song

Hari Bathini wrote:
> 
> 
> On 14/06/22 12:41 am, Hari Bathini wrote:
>> 
>> 
>> On 11/06/22 11:04 pm, Christophe Leroy wrote:
>>>
>>>
>>> Le 10/06/2022 à 17:55, Hari Bathini a écrit :
>>>> This adds two atomic opcodes BPF_XCHG and BPF_CMPXCHG on ppc32, both
>>>> of which include the BPF_FETCH flag.  The kernel's atomic_cmpxchg
>>>> operation fundamentally has 3 operands, but we only have two register
>>>> fields. Therefore the operand we compare against (the kernel's API
>>>> calls it 'old') is hard-coded to be BPF_REG_R0. Also, kernel's
>>>> atomic_cmpxchg returns the previous value at dst_reg + off. JIT the
>>>> same for BPF too with return value put in BPF_REG_0.
>>>>
>>>>     BPF_REG_R0 = atomic_cmpxchg(dst_reg + off, BPF_REG_R0, src_reg);
>>>>
>>>> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
>>>> ---
>>>>
>>>> Changes in v2:
>>>> * Moved variable declaration to avoid late declaration error on
>>>>     some compilers.
>>>> * Tried to make code readable and compact.
>>>>
>>>>
>>>>    arch/powerpc/net/bpf_jit_comp32.c | 25 ++++++++++++++++++++++---
>>>>    1 file changed, 22 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/net/bpf_jit_comp32.c 
>>>> b/arch/powerpc/net/bpf_jit_comp32.c
>>>> index 28dc6a1a8f2f..43f1c76d48ce 100644
>>>> --- a/arch/powerpc/net/bpf_jit_comp32.c
>>>> +++ b/arch/powerpc/net/bpf_jit_comp32.c
>>>> @@ -297,6 +297,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>>> *image, struct codegen_context *
>>>>            u32 ax_reg = bpf_to_ppc(BPF_REG_AX);
>>>>            u32 tmp_reg = bpf_to_ppc(TMP_REG);
>>>>            u32 size = BPF_SIZE(code);
>>>> +        u32 save_reg, ret_reg;
>>>>            s16 off = insn[i].off;
>>>>            s32 imm = insn[i].imm;
>>>>            bool func_addr_fixed;
>>>> @@ -799,6 +800,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>>> *image, struct codegen_context *
>>>>             * BPF_STX ATOMIC (atomic ops)
>>>>             */
>>>>            case BPF_STX | BPF_ATOMIC | BPF_W:
>>>> +            save_reg = _R0;
>>>> +            ret_reg = src_reg;
>>>> +
>>>>                bpf_set_seen_register(ctx, tmp_reg);
>>>>                bpf_set_seen_register(ctx, ax_reg);
>>>> @@ -829,6 +833,21 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>>> *image, struct codegen_context *
>>>>                case BPF_XOR | BPF_FETCH:
>>>>                    EMIT(PPC_RAW_XOR(_R0, _R0, src_reg));
>>>>                    break;
>>>> +            case BPF_CMPXCHG:
>>>> +                /*
>>>> +                 * Return old value in BPF_REG_0 for BPF_CMPXCHG &
>>>> +                 * in src_reg for other cases.
>>>> +                 */
>>>> +                ret_reg = bpf_to_ppc(BPF_REG_0);
>>>> +
>>>> +                /* Compare with old value in BPF_REG_0 */
>>>> +                EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), _R0));
>>>> +                /* Don't set if different from old value */
>>>> +                PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
>>>> +                fallthrough;
>>>> +            case BPF_XCHG:
>>>> +                save_reg = src_reg;
>>>
>>> I'm a bit lost, when save_reg is src_reg, don't we expect the upper part
>>> (ie src_reg - 1) to be explicitly zeroised?
>>>
>> 
>> For BPF_FETCH variants, old value is returned in src_reg (ret_reg).
>> In all such cases, higher 32-bit is zero'ed. But in case of BPF_CMPXCHG,
>> src_reg is untouched as BPF_REG_0 is used instead. So, higher 32-bit
>> remains untouched for that case alone..
>> 
>> 
>>>> +                break;
>>>>                default:
>>>>                    pr_err_ratelimited("eBPF filter atomic op code 
>>>> %02x (@%d) unsupported\n",
>>>>                               code, i);
>>>> @@ -836,15 +855,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 
>>>> *image, struct codegen_context *
>>>>                }
>>>>                /* store new value */
>>>> -            EMIT(PPC_RAW_STWCX(_R0, tmp_reg, dst_reg));
>>>> +            EMIT(PPC_RAW_STWCX(save_reg, tmp_reg, dst_reg));
>>>>                /* we're done if this succeeded */
>>>>                PPC_BCC_SHORT(COND_NE, tmp_idx);
> 
>> 
>>>>                /* For the BPF_FETCH variant, get old data into 
>>>> src_reg */
>> 
>> With this commit, this comment is not true for BPF_CMPXCHG. So, this
>> comment should not be removed..
> 
> Sorry, the above should read:
>    "should be removed" instead of "should not be removed"..
> 

Or, just add BPF_REG_0 at the end:
  /* For the BPF_FETCH variant, get old data into src_reg/BPF_REG_0 */

The comment in CMPXCHG anyway details the difference. In any case, we 
can clean this up subsequently.


- Naveen


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/5] Atomics support for eBPF on powerpc
  2022-06-10 15:55 ` Hari Bathini
@ 2022-07-04 11:33   ` Michael Ellerman
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2022-07-04 11:33 UTC (permalink / raw)
  To: bpf, Hari Bathini, linuxppc-dev
  Cc: Jordan Niethe, Alexei Starovoitov, Yonghong Song, Naveen N. Rao,
	netdev, John Fastabend, Christophe Leroy, Michael Ellerman,
	KP Singh, Paul Mackerras, Song Liu, Martin KaFai Lau,
	Daniel Borkmann, Russell Currey, Andrii Nakryiko,
	Benjamin Herrenschmidt

On Fri, 10 Jun 2022 21:25:47 +0530, Hari Bathini wrote:
> This patchset adds atomic operations to the eBPF instruction set on
> powerpc. The instructions that are added here can be summarised with
> this list of kernel operations for ppc64:
> 
> * atomic[64]_[fetch_]add
> * atomic[64]_[fetch_]and
> * atomic[64]_[fetch_]or
> * atomic[64]_[fetch_]xor
> * atomic[64]_xchg
> * atomic[64]_cmpxchg
> 
> [...]

Applied to powerpc/next.

[1/5] bpf ppc64: add support for BPF_ATOMIC bitwise operations
      https://git.kernel.org/powerpc/c/65112709115f48f16d7082bcabf173d08622e69f
[2/5] bpf ppc64: add support for atomic fetch operations
      https://git.kernel.org/powerpc/c/dbe6e2456fb0263a5a961a92836d2cebdbca979c
[3/5] bpf ppc64: Add instructions for atomic_[cmp]xchg
      https://git.kernel.org/powerpc/c/1e82dfaa7819f03f0b0022be7ca15bbc83090da1
[4/5] bpf ppc32: add support for BPF_ATOMIC bitwise operations
      https://git.kernel.org/powerpc/c/aea7ef8a82c0ea13ff20b65ff2edf8a38a17eda8
[5/5] bpf ppc32: Add instructions for atomic_[cmp]xchg
      https://git.kernel.org/powerpc/c/2d9206b227434912582049c49af1085660fa1e50

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2022-07-04 11:41 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-10 15:55 [PATCH v2 0/5] Atomics support for eBPF on powerpc Hari Bathini
2022-06-10 15:55 ` Hari Bathini
2022-06-10 15:55 ` [PATCH v2 1/5] bpf ppc64: add support for BPF_ATOMIC bitwise operations Hari Bathini
2022-06-10 15:55   ` Hari Bathini
2022-06-10 15:55 ` [PATCH v2 2/5] bpf ppc64: add support for atomic fetch operations Hari Bathini
2022-06-10 15:55   ` Hari Bathini
2022-06-10 15:55 ` [PATCH v2 3/5] bpf ppc64: Add instructions for atomic_[cmp]xchg Hari Bathini
2022-06-10 15:55   ` Hari Bathini
2022-06-10 15:55 ` [PATCH v2 4/5] bpf ppc32: add support for BPF_ATOMIC bitwise operations Hari Bathini
2022-06-10 15:55   ` Hari Bathini
2022-06-11 17:14   ` Christophe Leroy
2022-06-11 17:14     ` Christophe Leroy
2022-06-13 19:00     ` Hari Bathini
2022-06-13 19:00       ` Hari Bathini
2022-06-10 15:55 ` [PATCH v2 5/5] bpf ppc32: Add instructions for atomic_[cmp]xchg Hari Bathini
2022-06-10 15:55   ` Hari Bathini
2022-06-11 17:34   ` Christophe Leroy
2022-06-11 17:34     ` Christophe Leroy
2022-06-13 19:11     ` Hari Bathini
2022-06-13 19:11       ` Hari Bathini
2022-06-13 19:14       ` Hari Bathini
2022-06-13 19:14         ` Hari Bathini
2022-06-24 10:41         ` Naveen N. Rao
2022-06-24 10:41           ` Naveen N. Rao
2022-06-24 10:37 ` [PATCH v2 0/5] Atomics support for eBPF on powerpc Naveen N. Rao
2022-06-24 10:37   ` Naveen N. Rao
2022-07-04 11:33 ` Michael Ellerman
2022-07-04 11:33   ` Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.