* [PATCH 00/13] powerpc/bpf: Some fixes and updates
@ 2022-01-06 11:45 Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack() Naveen N. Rao
                   ` (14 more replies)
  0 siblings, 15 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

A set of fixes and updates to powerpc BPF JIT:
- Patches 1-3 fix issues with the existing powerpc JIT and are tagged 
  for -stable.
- Patch 4 fixes a build issue with bpf selftests on powerpc.
- Patches 5-9 handle some corner cases and make some small improvements.
- Patches 10-13 optimize how function calls are handled in ppc64. 

Patches 7 and 8 were previously posted, and while patch 8 has no
changes, patch 7 has been reworked to handle BPF_EXIT differently.


- Naveen


Naveen N. Rao (13):
  bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
  powerpc32/bpf: Fix codegen for bpf-to-bpf calls
  powerpc/bpf: Update ldimm64 instructions during extra pass
  tools/bpf: Rename 'struct event' to avoid naming conflict
  powerpc/bpf: Skip branch range validation during first pass
  powerpc/bpf: Emit a single branch instruction for known short branch
    ranges
  powerpc/bpf: Handle large branch ranges with BPF_EXIT
  powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06
  powerpc64/bpf: Do not save/restore LR on each call to
    bpf_stf_barrier()
  powerpc64/bpf: Use r12 for constant blinding
  powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
  powerpc64/bpf elfv1: Do not load TOC before calling functions
  powerpc64/bpf: Optimize instruction sequence used for function calls

 arch/powerpc/include/asm/ppc-opcode.h |   1 +
 arch/powerpc/net/bpf_jit.h            |   8 +-
 arch/powerpc/net/bpf_jit64.h          |   2 +-
 arch/powerpc/net/bpf_jit_comp.c       |  55 ++++++++++--
 arch/powerpc/net/bpf_jit_comp32.c     |  32 +++++--
 arch/powerpc/net/bpf_jit_comp64.c     | 124 ++++++++++++++------------
 kernel/bpf/stackmap.c                 |   5 +-
 tools/bpf/runqslower/runqslower.bpf.c |   2 +-
 tools/bpf/runqslower/runqslower.c     |   2 +-
 tools/bpf/runqslower/runqslower.h     |   2 +-
 10 files changed, 153 insertions(+), 80 deletions(-)


base-commit: bdcf18e133f656b2c97390a594fc95e37849e682
-- 
2.34.1



* [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-07 10:21   ` Daniel Borkmann
  2022-01-10  8:57   ` Christophe Leroy
  2022-01-06 11:45 ` [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls Naveen N. Rao
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

task_pt_regs() can return NULL on powerpc for kernel threads. That NULL
pointer is then dereferenced in __bpf_get_stack() when checking for user
mode, resulting in a kernel oops. Guard against this by checking the
return value of task_pt_regs() before trying to obtain the call chain.

Fixes: fa28dcb82a38f8 ("bpf: Introduce helper bpf_get_task_stack()")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 kernel/bpf/stackmap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 6e75bbee39f0b5..0dcaed4d3f4cec 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -525,13 +525,14 @@ BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf,
 	   u32, size, u64, flags)
 {
 	struct pt_regs *regs;
-	long res;
+	long res = -EINVAL;
 
 	if (!try_get_task_stack(task))
 		return -EFAULT;
 
 	regs = task_pt_regs(task);
-	res = __bpf_get_stack(regs, task, NULL, buf, size, flags);
+	if (regs)
+		res = __bpf_get_stack(regs, task, NULL, buf, size, flags);
 	put_task_stack(task);
 
 	return res;
-- 
2.34.1



* [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack() Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-10  9:06   ` Christophe Leroy
  2022-01-06 11:45 ` [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass Naveen N. Rao
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

Pad instructions emitted for BPF_CALL so that the number of instructions
generated does not change for different function addresses. This is
especially important for calls to other bpf functions, whose address
will only be known during the extra pass.

Fixes: 51c66ad849a703 ("powerpc/bpf: Implement extended BPF on PPC32")
Cc: stable@vger.kernel.org # v5.13+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
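A rough sketch of the resulting call sequence, for illustration only (the
far-call tail past the first instruction shown in the hunk below is
assumed to be the usual ori/mtctr/bctrl pattern and is not part of this
patch):

  if (image && rel < 0x2000000 && rel >= -0x2000000) {
      /* near call: one branch plus three nops, 4 instructions total */
      PPC_BL_ABS(func);
      EMIT(PPC_RAW_NOP());
      EMIT(PPC_RAW_NOP());
      EMIT(PPC_RAW_NOP());
  } else {
      /* far call: load the address into r0 and branch via ctr, also 4 */
      EMIT(PPC_RAW_LIS(_R0, IMM_H(func)));
      EMIT(PPC_RAW_ORI(_R0, _R0, IMM_L(func)));
      EMIT(PPC_RAW_MTCTR(_R0));
      EMIT(PPC_RAW_BCTRL());
  }

Both paths now occupy the same number of instructions, so offsets computed
in the initial pass stay valid when the real bpf-to-bpf call target is
patched in during the extra pass.
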
 arch/powerpc/net/bpf_jit_comp32.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index d3a52cd42f5346..997a47fa615b30 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -191,6 +191,9 @@ void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 fun
 
 	if (image && rel < 0x2000000 && rel >= -0x2000000) {
 		PPC_BL_ABS(func);
+		EMIT(PPC_RAW_NOP());
+		EMIT(PPC_RAW_NOP());
+		EMIT(PPC_RAW_NOP());
 	} else {
 		/* Load function address into r0 */
 		EMIT(PPC_RAW_LIS(_R0, IMM_H(func)));
-- 
2.34.1



* [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack() Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-08 14:45   ` Jiri Olsa
  2022-01-10  9:27   ` Christophe Leroy
  2022-01-06 11:45 ` [PATCH 04/13] tools/bpf: Rename 'struct event' to avoid naming conflict Naveen N. Rao
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

These instructions are updated after the initial JIT, so redo codegen
during the extra pass. Rename bpf_jit_fixup_subprog_calls() to clarify
that this is more than just subprog calls.

Fixes: 69c087ba6225b5 ("bpf: Add bpf_for_each_map_elem() helper")
Cc: stable@vger.kernel.org # v5.15
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
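For reference, a 'ldimm64' load spans two consecutive 'struct bpf_insn'
slots, which is why the fixup reads both halves of the immediate and then
skips the following slot. A simplified sketch (illustrative only, 'dst' is
a placeholder):

  /* dst = 0x1234567890abcdef is encoded across two slots */
  struct bpf_insn ldimm64[2] = {
      { .code = BPF_LD | BPF_IMM | BPF_DW, .dst_reg = dst, .imm = 0x90abcdef },
      { .imm = 0x12345678 },  /* second slot carries only the upper 32 bits */
  };

  /* the JIT recombines the two halves, as in the hunks below */
  u64 imm64 = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32);
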
 arch/powerpc/net/bpf_jit_comp.c   | 29 +++++++++++++++++++++++------
 arch/powerpc/net/bpf_jit_comp32.c |  6 ++++++
 arch/powerpc/net/bpf_jit_comp64.c |  7 ++++++-
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index d6ffdd0f2309d0..56dd1f4e3e4447 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -23,15 +23,15 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
 	memset32(area, BREAKPOINT_INSTRUCTION, size / 4);
 }
 
-/* Fix the branch target addresses for subprog calls */
-static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
-				       struct codegen_context *ctx, u32 *addrs)
+/* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */
+static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image,
+				   struct codegen_context *ctx, u32 *addrs)
 {
 	const struct bpf_insn *insn = fp->insnsi;
 	bool func_addr_fixed;
 	u64 func_addr;
 	u32 tmp_idx;
-	int i, ret;
+	int i, j, ret;
 
 	for (i = 0; i < fp->len; i++) {
 		/*
@@ -66,6 +66,23 @@ static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
 			 * of the JITed sequence remains unchanged.
 			 */
 			ctx->idx = tmp_idx;
+		} else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) {
+			tmp_idx = ctx->idx;
+			ctx->idx = addrs[i] / 4;
+#ifdef CONFIG_PPC32
+			PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + 1].imm);
+			PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm);
+			for (j = ctx->idx - addrs[i] / 4; j < 4; j++)
+				EMIT(PPC_RAW_NOP());
+#else
+			func_addr = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32);
+			PPC_LI64(b2p[insn[i].dst_reg], func_addr);
+			/* overwrite rest with nops */
+			for (j = ctx->idx - addrs[i] / 4; j < 5; j++)
+				EMIT(PPC_RAW_NOP());
+#endif
+			ctx->idx = tmp_idx;
+			i++;
 		}
 	}
 
@@ -200,13 +217,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 		/*
 		 * Do not touch the prologue and epilogue as they will remain
 		 * unchanged. Only fix the branch target address for subprog
-		 * calls in the body.
+		 * calls in the body, and ldimm64 instructions.
 		 *
 		 * This does not change the offsets and lengths of the subprog
 		 * call instruction sequences and hence, the size of the JITed
 		 * image as well.
 		 */
-		bpf_jit_fixup_subprog_calls(fp, code_base, &cgctx, addrs);
+		bpf_jit_fixup_addresses(fp, code_base, &cgctx, addrs);
 
 		/* There is no need to perform the usual passes. */
 		goto skip_codegen_passes;
diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 997a47fa615b30..2258d3886d02ec 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -293,6 +293,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		bool func_addr_fixed;
 		u64 func_addr;
 		u32 true_cond;
+		u32 tmp_idx;
+		int j;
 
 		/*
 		 * addrs[] maps a BPF bytecode address into a real offset from
@@ -908,8 +910,12 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		 * 16 byte instruction that uses two 'struct bpf_insn'
 		 */
 		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
+			tmp_idx = ctx->idx;
 			PPC_LI32(dst_reg_h, (u32)insn[i + 1].imm);
 			PPC_LI32(dst_reg, (u32)insn[i].imm);
+			/* padding to allow full 4 instructions for later patching */
+			for (j = ctx->idx - tmp_idx; j < 4; j++)
+				EMIT(PPC_RAW_NOP());
 			/* Adjust for two bpf instructions */
 			addrs[++i] = ctx->idx * 4;
 			break;
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 472d4a551945dd..3d018ecc475b2b 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -319,6 +319,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		u64 imm64;
 		u32 true_cond;
 		u32 tmp_idx;
+		int j;
 
 		/*
 		 * addrs[] maps a BPF bytecode address into a real offset from
@@ -848,9 +849,13 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
 			imm64 = ((u64)(u32) insn[i].imm) |
 				    (((u64)(u32) insn[i+1].imm) << 32);
+			tmp_idx = ctx->idx;
+			PPC_LI64(dst_reg, imm64);
+			/* padding to allow full 5 instructions for later patching */
+			for (j = ctx->idx - tmp_idx; j < 5; j++)
+				EMIT(PPC_RAW_NOP());
 			/* Adjust for two bpf instructions */
 			addrs[++i] = ctx->idx * 4;
-			PPC_LI64(dst_reg, imm64);
 			break;
 
 		/*
-- 
2.34.1



* [PATCH 04/13] tools/bpf: Rename 'struct event' to avoid naming conflict
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (2 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-07 10:21   ` Daniel Borkmann
  2022-01-06 11:45 ` [PATCH 05/13] powerpc/bpf: Skip branch range validation during first pass Naveen N. Rao
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

On ppc64le, trying to build the bpf selftests throws the below error:
  In file included from runqslower.bpf.c:5:
  ./runqslower.h:7:8: error: redefinition of 'event'
  struct event {
	 ^
  /home/naveen/linux/tools/testing/selftests/bpf/tools/build/runqslower/vmlinux.h:156602:8:
  note: previous definition is here
  struct event {
	 ^

This happens since 'struct event' is also defined in
drivers/net/ethernet/alteon/acenic.h. Rename the one in runqslower to the
more appropriate 'runq_event' to avoid the naming conflict.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 tools/bpf/runqslower/runqslower.bpf.c | 2 +-
 tools/bpf/runqslower/runqslower.c     | 2 +-
 tools/bpf/runqslower/runqslower.h     | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/bpf/runqslower/runqslower.bpf.c b/tools/bpf/runqslower/runqslower.bpf.c
index ab9353f2fd46ab..9a5c1f008fe603 100644
--- a/tools/bpf/runqslower/runqslower.bpf.c
+++ b/tools/bpf/runqslower/runqslower.bpf.c
@@ -68,7 +68,7 @@ int handle__sched_switch(u64 *ctx)
 	 */
 	struct task_struct *prev = (struct task_struct *)ctx[1];
 	struct task_struct *next = (struct task_struct *)ctx[2];
-	struct event event = {};
+	struct runq_event event = {};
 	u64 *tsp, delta_us;
 	long state;
 	u32 pid;
diff --git a/tools/bpf/runqslower/runqslower.c b/tools/bpf/runqslower/runqslower.c
index d8971584495213..ff7f5e8485e937 100644
--- a/tools/bpf/runqslower/runqslower.c
+++ b/tools/bpf/runqslower/runqslower.c
@@ -100,7 +100,7 @@ static int bump_memlock_rlimit(void)
 
 void handle_event(void *ctx, int cpu, void *data, __u32 data_sz)
 {
-	const struct event *e = data;
+	const struct runq_event *e = data;
 	struct tm *tm;
 	char ts[32];
 	time_t t;
diff --git a/tools/bpf/runqslower/runqslower.h b/tools/bpf/runqslower/runqslower.h
index 9db225425e5ff9..4f70f07200c23d 100644
--- a/tools/bpf/runqslower/runqslower.h
+++ b/tools/bpf/runqslower/runqslower.h
@@ -4,7 +4,7 @@
 
 #define TASK_COMM_LEN 16
 
-struct event {
+struct runq_event {
 	char task[TASK_COMM_LEN];
 	__u64 delta_us;
 	pid_t pid;
-- 
2.34.1



* [PATCH 05/13] powerpc/bpf: Skip branch range validation during first pass
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (3 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 04/13] tools/bpf: Rename 'struct event' to avoid naming conflict Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 06/13] powerpc/bpf: Emit a single branch instruction for known short branch ranges Naveen N. Rao
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

During the first pass, addrs[] is still being populated, so all branches
to subsequent instructions will appear to target the start of the JIT
program (offset 0). Skip branch range validation for such branches and
assume them to be in range. Branch range validation will happen during
the second pass, after addrs[] has been set up properly.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
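To illustrate with hypothetical values (not from the patch): in the first
pass, addrs[] entries for instructions that have not been JITed yet are
still zero, so a forward branch sees a bogus target of 0:

  /* first pass: 'dest' for a forward branch is addrs[<later insn>] == 0 */
  long offset = (long)(dest) - (ctx->idx * 4);  /* == -(ctx->idx * 4) */

For a sufficiently large program, that negative offset falls outside the
conditional branch range even though the final branch will be fine, hence
the new '(dest) != 0' check.
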
 arch/powerpc/net/bpf_jit.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index b20a2a83a6e75b..9cdd33d6be4cc0 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -27,7 +27,7 @@
 #define PPC_JMP(dest)							      \
 	do {								      \
 		long offset = (long)(dest) - (ctx->idx * 4);		      \
-		if (!is_offset_in_branch_range(offset)) {		      \
+		if ((dest) != 0 && !is_offset_in_branch_range(offset)) {		      \
 			pr_err_ratelimited("Branch offset 0x%lx (@%u) out of range\n", offset, ctx->idx);			\
 			return -ERANGE;					      \
 		}							      \
@@ -41,7 +41,7 @@
 #define PPC_BCC_SHORT(cond, dest)					      \
 	do {								      \
 		long offset = (long)(dest) - (ctx->idx * 4);		      \
-		if (!is_offset_in_cond_branch_range(offset)) {		      \
+		if ((dest) != 0 && !is_offset_in_cond_branch_range(offset)) {		      \
 			pr_err_ratelimited("Conditional branch offset 0x%lx (@%u) out of range\n", offset, ctx->idx);		\
 			return -ERANGE;					      \
 		}							      \
-- 
2.34.1



* [PATCH 06/13] powerpc/bpf: Emit a single branch instruction for known short branch ranges
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (4 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 05/13] powerpc/bpf: Skip branch range validation during first pass Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 07/13] powerpc/bpf: Handle large branch ranges with BPF_EXIT Naveen N. Rao
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

PPC_BCC() emits two instructions to accommodate scenarios where we need
to branch outside the range of a conditional branch. PPC_BCC_SHORT()
emits a single branch instruction and can be used when the branch is
known to be within a conditional branch range.

Convert some of the uses of PPC_BCC() in the powerpc BPF JIT over to
PPC_BCC_SHORT() where the branch target is known to be within the range
of a conditional branch.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
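For context, a rough sketch of the two forms (this paraphrases the macros
in bpf_jit.h and is not taken verbatim from them):

  /* short form: a single conditional branch, reach is +/- 32KB */
  PPC_BCC_SHORT(COND_GE, out);                  /* bge out */

  /* long form: invert the condition and hop over an unconditional branch */
  PPC_BCC_SHORT(COND_LT, (ctx->idx + 2) * 4);   /* blt . + 8 */
  PPC_JMP(out);                                 /* b out, reach is +/- 32MB */

Where the target is provably close, such as the tail call checks converted
below, the second instruction is wasted, so use the short form there.
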
 arch/powerpc/net/bpf_jit_comp32.c | 8 ++++----
 arch/powerpc/net/bpf_jit_comp64.c | 8 ++++----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 2258d3886d02ec..72c2c47612964d 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -221,7 +221,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
 	EMIT(PPC_RAW_LWZ(_R0, b2p_bpf_array, offsetof(struct bpf_array, map.max_entries)));
 	EMIT(PPC_RAW_CMPLW(b2p_index, _R0));
 	EMIT(PPC_RAW_LWZ(_R0, _R1, bpf_jit_stack_offsetof(ctx, BPF_PPC_TC)));
-	PPC_BCC(COND_GE, out);
+	PPC_BCC_SHORT(COND_GE, out);
 
 	/*
 	 * if (tail_call_cnt > MAX_TAIL_CALL_CNT)
@@ -230,7 +230,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
 	EMIT(PPC_RAW_CMPLWI(_R0, MAX_TAIL_CALL_CNT));
 	/* tail_call_cnt++; */
 	EMIT(PPC_RAW_ADDIC(_R0, _R0, 1));
-	PPC_BCC(COND_GT, out);
+	PPC_BCC_SHORT(COND_GT, out);
 
 	/* prog = array->ptrs[index]; */
 	EMIT(PPC_RAW_RLWINM(_R3, b2p_index, 2, 0, 29));
@@ -243,7 +243,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
 	 *   goto out;
 	 */
 	EMIT(PPC_RAW_CMPLWI(_R3, 0));
-	PPC_BCC(COND_EQ, out);
+	PPC_BCC_SHORT(COND_EQ, out);
 
 	/* goto *(prog->bpf_func + prologue_size); */
 	EMIT(PPC_RAW_LWZ(_R3, _R3, offsetof(struct bpf_prog, bpf_func)));
@@ -834,7 +834,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 			if (BPF_MODE(code) == BPF_PROBE_MEM) {
 				PPC_LI32(_R0, TASK_SIZE - off);
 				EMIT(PPC_RAW_CMPLW(src_reg, _R0));
-				PPC_BCC(COND_GT, (ctx->idx + 5) * 4);
+				PPC_BCC_SHORT(COND_GT, (ctx->idx + 4) * 4);
 				EMIT(PPC_RAW_LI(dst_reg, 0));
 				/*
 				 * For BPF_DW case, "li reg_h,0" would be needed when
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 3d018ecc475b2b..2b291d435d2e26 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -225,7 +225,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
 	EMIT(PPC_RAW_LWZ(b2p[TMP_REG_1], b2p_bpf_array, offsetof(struct bpf_array, map.max_entries)));
 	EMIT(PPC_RAW_RLWINM(b2p_index, b2p_index, 0, 0, 31));
 	EMIT(PPC_RAW_CMPLW(b2p_index, b2p[TMP_REG_1]));
-	PPC_BCC(COND_GE, out);
+	PPC_BCC_SHORT(COND_GE, out);
 
 	/*
 	 * if (tail_call_cnt > MAX_TAIL_CALL_CNT)
@@ -233,7 +233,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
 	 */
 	PPC_BPF_LL(b2p[TMP_REG_1], 1, bpf_jit_stack_tailcallcnt(ctx));
 	EMIT(PPC_RAW_CMPLWI(b2p[TMP_REG_1], MAX_TAIL_CALL_CNT));
-	PPC_BCC(COND_GT, out);
+	PPC_BCC_SHORT(COND_GT, out);
 
 	/*
 	 * tail_call_cnt++;
@@ -251,7 +251,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
 	 *   goto out;
 	 */
 	EMIT(PPC_RAW_CMPLDI(b2p[TMP_REG_1], 0));
-	PPC_BCC(COND_EQ, out);
+	PPC_BCC_SHORT(COND_EQ, out);
 
 	/* goto *(prog->bpf_func + prologue_size); */
 	PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_prog, bpf_func));
@@ -803,7 +803,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 				else /* BOOK3S_64 */
 					PPC_LI64(b2p[TMP_REG_2], PAGE_OFFSET);
 				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
-				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
+				PPC_BCC_SHORT(COND_GT, (ctx->idx + 3) * 4);
 				EMIT(PPC_RAW_LI(dst_reg, 0));
 				/*
 				 * Check if 'off' is word aligned because PPC_BPF_LL()
-- 
2.34.1



* [PATCH 07/13] powerpc/bpf: Handle large branch ranges with BPF_EXIT
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (5 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 06/13] powerpc/bpf: Emit a single branch instruction for known short branch ranges Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 08/13] powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06 Naveen N. Rao
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

In some scenarios, it is possible that the program epilogue is outside
the branch range for a BPF_EXIT instruction. Instead of rejecting such
programs, emit the epilogue as an alternate exit point from the program.
Track its location so that subsequent exits can take whichever of the
two paths is in range.

Reported-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h        |  2 ++
 arch/powerpc/net/bpf_jit_comp.c   | 22 +++++++++++++++++++++-
 arch/powerpc/net/bpf_jit_comp32.c |  7 +++++--
 arch/powerpc/net/bpf_jit_comp64.c |  7 +++++--
 4 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 9cdd33d6be4cc0..3b5c44c0b6638d 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -151,6 +151,7 @@ struct codegen_context {
 	unsigned int stack_size;
 	int b2p[ARRAY_SIZE(b2p)];
 	unsigned int exentry_idx;
+	unsigned int alt_exit_addr;
 };
 
 #ifdef CONFIG_PPC32
@@ -186,6 +187,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx);
 void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx);
 void bpf_jit_realloc_regs(struct codegen_context *ctx);
+int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, long exit_addr);
 
 int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct codegen_context *ctx,
 			  int insn_idx, int jmp_off, int dst_reg);
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 56dd1f4e3e4447..141e64585b6458 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -89,6 +89,22 @@ static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image,
 	return 0;
 }
 
+int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, long exit_addr)
+{
+	if (!exit_addr || is_offset_in_branch_range(exit_addr - (ctx->idx * 4))) {
+		PPC_JMP(exit_addr);
+	} else if (ctx->alt_exit_addr) {
+		if (WARN_ON(!is_offset_in_branch_range((long)ctx->alt_exit_addr - (ctx->idx * 4))))
+			return -1;
+		PPC_JMP(ctx->alt_exit_addr);
+	} else {
+		ctx->alt_exit_addr = ctx->idx * 4;
+		bpf_jit_build_epilogue(image, ctx);
+	}
+
+	return 0;
+}
+
 struct powerpc64_jit_data {
 	struct bpf_binary_header *header;
 	u32 *addrs;
@@ -177,8 +193,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	 * If we have seen a tail call, we need a second pass.
 	 * This is because bpf_jit_emit_common_epilogue() is called
 	 * from bpf_jit_emit_tail_call() with a not yet stable ctx->seen.
+	 * We also need a second pass if we ended up with too large
+	 * a program so as to ensure BPF_EXIT branches are in range.
 	 */
-	if (cgctx.seen & SEEN_TAILCALL) {
+	if (cgctx.seen & SEEN_TAILCALL || !is_offset_in_branch_range((long)cgctx.idx * 4)) {
 		cgctx.idx = 0;
 		if (bpf_jit_build_body(fp, 0, &cgctx, addrs, 0)) {
 			fp = org_fp;
@@ -193,6 +211,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	 * calculate total size from idx.
 	 */
 	bpf_jit_build_prologue(0, &cgctx);
+	addrs[fp->len] = cgctx.idx * 4;
 	bpf_jit_build_epilogue(0, &cgctx);
 
 	fixup_len = fp->aux->num_exentries * BPF_FIXUP_LEN * 4;
@@ -233,6 +252,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	for (pass = 1; pass < 3; pass++) {
 		/* Now build the prologue, body code & epilogue for real. */
 		cgctx.idx = 0;
+		cgctx.alt_exit_addr = 0;
 		bpf_jit_build_prologue(code_base, &cgctx);
 		if (bpf_jit_build_body(fp, code_base, &cgctx, addrs, pass)) {
 			bpf_jit_binary_free(bpf_hdr);
diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 72c2c47612964d..8c918db4c2c486 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -929,8 +929,11 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 			 * the epilogue. If we _are_ the last instruction,
 			 * we'll just fall through to the epilogue.
 			 */
-			if (i != flen - 1)
-				PPC_JMP(exit_addr);
+			if (i != flen - 1) {
+				ret = bpf_jit_emit_exit_insn(image, ctx, _R0, exit_addr);
+				if (ret)
+					return ret;
+			}
 			/* else fall through to the epilogue */
 			break;
 
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 2b291d435d2e26..48d2ca3fe126dd 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -867,8 +867,11 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 			 * the epilogue. If we _are_ the last instruction,
 			 * we'll just fall through to the epilogue.
 			 */
-			if (i != flen - 1)
-				PPC_JMP(exit_addr);
+			if (i != flen - 1) {
+				ret = bpf_jit_emit_exit_insn(image, ctx, b2p[TMP_REG_1], exit_addr);
+				if (ret)
+					return ret;
+			}
 			/* else fall through to the epilogue */
 			break;
 
-- 
2.34.1



* [PATCH 08/13] powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (6 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 07/13] powerpc/bpf: Handle large branch ranges with BPF_EXIT Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 09/13] powerpc64/bpf: Do not save/restore LR on each call to bpf_stf_barrier() Naveen N. Rao
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

Johan reported the below crash with test_bpf on ppc64 e5500:

  test_bpf: #296 ALU_END_FROM_LE 64: 0x0123456789abcdef -> 0x67452301 jited:1
  Oops: Exception in kernel mode, sig: 4 [#1]
  BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
  Modules linked in: test_bpf(+)
  CPU: 0 PID: 76 Comm: insmod Not tainted 5.14.0-03771-g98c2059e008a-dirty #1
  NIP:  8000000000061c3c LR: 80000000006dea64 CTR: 8000000000061c18
  REGS: c0000000032d3420 TRAP: 0700   Not tainted (5.14.0-03771-g98c2059e008a-dirty)
  MSR:  0000000080089000 <EE,ME>  CR: 88002822  XER: 20000000 IRQMASK: 0
  <...>
  NIP [8000000000061c3c] 0x8000000000061c3c
  LR [80000000006dea64] .__run_one+0x104/0x17c [test_bpf]
  Call Trace:
   .__run_one+0x60/0x17c [test_bpf] (unreliable)
   .test_bpf_init+0x6a8/0xdc8 [test_bpf]
   .do_one_initcall+0x6c/0x28c
   .do_init_module+0x68/0x28c
   .load_module+0x2460/0x2abc
   .__do_sys_init_module+0x120/0x18c
   .system_call_exception+0x110/0x1b8
   system_call_common+0xf0/0x210
  --- interrupt: c00 at 0x101d0acc
  <...>
  ---[ end trace 47b2bf19090bb3d0 ]---

  Illegal instruction

The illegal instruction turned out to be 'ldbrx' emitted for
BPF_FROM_[L|B]E, which was only introduced in ISA v2.06. Guard its use
with a CPU feature check and implement an alternative approach for older
processors.

Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Fixes: 156d0e290e969c ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
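A worked example of the fallback sequence, for illustration (big-endian
case, dst = 0x0123456789abcdef):

  stack bytes after the store:  01 23 45 67 89 ab cd ef
  lwbrx of the first word   ->  0x67452301               (kept in dst)
  lwbrx of the second word  ->  0xefcdab89, shifted <<32 (in TMP_REG_2)
  or of the two             ->  0xefcdab8967452301 == bswap64(dst)

On little-endian, the shift is applied to the first load instead, which
yields the same final value.
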
 arch/powerpc/include/asm/ppc-opcode.h |  1 +
 arch/powerpc/net/bpf_jit_comp64.c     | 22 +++++++++++++---------
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index efad07081cc0e5..9675303b724e93 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -500,6 +500,7 @@
 #define PPC_RAW_LDX(r, base, b)		(0x7c00002a | ___PPC_RT(r) | ___PPC_RA(base) | ___PPC_RB(b))
 #define PPC_RAW_LHZ(r, base, i)		(0xa0000000 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_L(i))
 #define PPC_RAW_LHBRX(r, base, b)	(0x7c00062c | ___PPC_RT(r) | ___PPC_RA(base) | ___PPC_RB(b))
+#define PPC_RAW_LWBRX(r, base, b)	(0x7c00042c | ___PPC_RT(r) | ___PPC_RA(base) | ___PPC_RB(b))
 #define PPC_RAW_LDBRX(r, base, b)	(0x7c000428 | ___PPC_RT(r) | ___PPC_RA(base) | ___PPC_RB(b))
 #define PPC_RAW_STWCX(s, a, b)		(0x7c00012d | ___PPC_RS(s) | ___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_CMPWI(a, i)		(0x2c000000 | ___PPC_RA(a) | IMM_L(i))
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 48d2ca3fe126dd..7d38b4be26c3a5 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -634,17 +634,21 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 				EMIT(PPC_RAW_MR(dst_reg, b2p[TMP_REG_1]));
 				break;
 			case 64:
-				/*
-				 * Way easier and faster(?) to store the value
-				 * into stack and then use ldbrx
-				 *
-				 * ctx->seen will be reliable in pass2, but
-				 * the instructions generated will remain the
-				 * same across all passes
-				 */
+				/* Store the value to stack and then use byte-reverse loads */
 				PPC_BPF_STL(dst_reg, 1, bpf_jit_stack_local(ctx));
 				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], 1, bpf_jit_stack_local(ctx)));
-				EMIT(PPC_RAW_LDBRX(dst_reg, 0, b2p[TMP_REG_1]));
+				if (cpu_has_feature(CPU_FTR_ARCH_206)) {
+					EMIT(PPC_RAW_LDBRX(dst_reg, 0, b2p[TMP_REG_1]));
+				} else {
+					EMIT(PPC_RAW_LWBRX(dst_reg, 0, b2p[TMP_REG_1]));
+					if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN))
+						EMIT(PPC_RAW_SLDI(dst_reg, dst_reg, 32));
+					EMIT(PPC_RAW_LI(b2p[TMP_REG_2], 4));
+					EMIT(PPC_RAW_LWBRX(b2p[TMP_REG_2], b2p[TMP_REG_2], b2p[TMP_REG_1]));
+					if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
+						EMIT(PPC_RAW_SLDI(b2p[TMP_REG_2], b2p[TMP_REG_2], 32));
+					EMIT(PPC_RAW_OR(dst_reg, dst_reg, b2p[TMP_REG_2]));
+				}
 				break;
 			}
 			break;
-- 
2.34.1



* [PATCH 09/13] powerpc64/bpf: Do not save/restore LR on each call to bpf_stf_barrier()
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (7 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 08/13] powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06 Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 10/13] powerpc64/bpf: Use r12 for constant blinding Naveen N. Rao
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

Instead of saving and restoring LR around each invocation of
bpf_stf_barrier(), set the SEEN_FUNC flag so that we save/restore LR in
the prologue/epilogue.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 7d38b4be26c3a5..ce4fc59bbd6a92 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -690,11 +690,10 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 				EMIT(PPC_RAW_ORI(_R31, _R31, 0));
 				break;
 			case STF_BARRIER_FALLBACK:
-				EMIT(PPC_RAW_MFLR(b2p[TMP_REG_1]));
+				ctx->seen |= SEEN_FUNC;
 				PPC_LI64(12, dereference_kernel_function_descriptor(bpf_stf_barrier));
 				EMIT(PPC_RAW_MTCTR(12));
 				EMIT(PPC_RAW_BCTRL());
-				EMIT(PPC_RAW_MTLR(b2p[TMP_REG_1]));
 				break;
 			case STF_BARRIER_NONE:
 				break;
-- 
2.34.1



* [PATCH 10/13] powerpc64/bpf: Use r12 for constant blinding
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (8 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 09/13] powerpc64/bpf: Do not save/restore LR on each call to bpf_stf_barrier() Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry Naveen N. Rao
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

In preparation for preserving the kernel TOC in r2, switch BPF_REG_AX
from r2 to r12. r12 is not used by the BPF JIT except during external
helper/bpf calls, or with BPF_NOSPEC. These sequences aren't emitted
when BPF_REG_AX is used for constant blinding and other purposes.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit64.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index b63b35e45e558c..82cdfee412784a 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -56,7 +56,7 @@ const int b2p[MAX_BPF_JIT_REG + 2] = {
 	/* frame pointer aka BPF_REG_10 */
 	[BPF_REG_FP] = 31,
 	/* eBPF jit internal registers */
-	[BPF_REG_AX] = 2,
+	[BPF_REG_AX] = 12,
 	[TMP_REG_1] = 9,
 	[TMP_REG_2] = 10
 };
-- 
2.34.1



* [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (9 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 10/13] powerpc64/bpf: Use r12 for constant blinding Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-10  9:20   ` Christophe Leroy
  2022-01-06 11:45 ` [PATCH 12/13] powerpc64/bpf elfv1: Do not load TOC before calling functions Naveen N. Rao
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

In preparation for using the kernel TOC, load it into r2 on entry. With
elfv1, the kernel TOC is already set up by our caller, so we just emit a
nop. Adjust the number of instructions to skip on a tail call
accordingly.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index ce4fc59bbd6a92..e05b577d95bf11 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -73,6 +73,12 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
 {
 	int i;
 
+#ifdef PPC64_ELF_ABI_v2
+	PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
+#else
+	EMIT(PPC_RAW_NOP());
+#endif
+
 	/*
 	 * Initialize tail_call_cnt if we do tail calls.
 	 * Otherwise, put in NOPs so that it can be skipped when we are
@@ -87,7 +93,7 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
 		EMIT(PPC_RAW_NOP());
 	}
 
-#define BPF_TAILCALL_PROLOGUE_SIZE	8
+#define BPF_TAILCALL_PROLOGUE_SIZE	12
 
 	if (bpf_has_stack_frame(ctx)) {
 		/*
-- 
2.34.1



* [PATCH 12/13] powerpc64/bpf elfv1: Do not load TOC before calling functions
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (10 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 11:45 ` [PATCH 13/13] powerpc64/bpf: Optimize instruction sequence used for function calls Naveen N. Rao
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

BPF helpers always reside in the core kernel and all BPF programs use the
kernel TOC. As such, there is no need to load the TOC before calling
helpers or other BPF functions. Drop the code that does so.

Add a check to ensure we don't proceed if this assumption ever changes
in the future.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/powerpc/net/bpf_jit.h        |  2 +-
 arch/powerpc/net/bpf_jit_comp.c   |  4 +++-
 arch/powerpc/net/bpf_jit_comp32.c |  8 +++++--
 arch/powerpc/net/bpf_jit_comp64.c | 39 ++++++++++++++++---------------
 4 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 3b5c44c0b6638d..5cb3efd76715a9 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -181,7 +181,7 @@ static inline void bpf_clear_seen_register(struct codegen_context *ctx, int i)
 	ctx->seen &= ~(1 << (31 - i));
 }
 
-void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func);
+int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func);
 int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *ctx,
 		       u32 *addrs, int pass);
 void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx);
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 141e64585b6458..635f7448ff7952 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -59,7 +59,9 @@ static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image,
 			 */
 			tmp_idx = ctx->idx;
 			ctx->idx = addrs[i] / 4;
-			bpf_jit_emit_func_call_rel(image, ctx, func_addr);
+			ret = bpf_jit_emit_func_call_rel(image, ctx, func_addr);
+			if (ret)
+				return ret;
 
 			/*
 			 * Restore ctx->idx here. This is safe as the length
diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 8c918db4c2c486..ce753aca5b3321 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -185,7 +185,7 @@ void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
 	EMIT(PPC_RAW_BLR());
 }
 
-void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func)
+int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func)
 {
 	s32 rel = (s32)func - (s32)(image + ctx->idx);
 
@@ -201,6 +201,8 @@ void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 fun
 		EMIT(PPC_RAW_MTCTR(_R0));
 		EMIT(PPC_RAW_BCTRL());
 	}
+
+	return 0;
 }
 
 static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 out)
@@ -953,7 +955,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 				EMIT(PPC_RAW_STW(bpf_to_ppc(ctx, BPF_REG_5), _R1, 12));
 			}
 
-			bpf_jit_emit_func_call_rel(image, ctx, func_addr);
+			ret = bpf_jit_emit_func_call_rel(image, ctx, func_addr);
+			if (ret)
+				return ret;
 
 			EMIT(PPC_RAW_MR(bpf_to_ppc(ctx, BPF_REG_0) - 1, _R3));
 			EMIT(PPC_RAW_MR(bpf_to_ppc(ctx, BPF_REG_0), _R4));
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index e05b577d95bf11..5da8e54d4d70b6 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -152,9 +152,13 @@ void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
 	EMIT(PPC_RAW_BLR());
 }
 
-static void bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx,
-				       u64 func)
+static int bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx, u64 func)
 {
+	unsigned long func_addr = func ? ppc_function_entry((void *)func) : 0;
+
+	if (WARN_ON_ONCE(!core_kernel_text(func_addr)))
+		return -EINVAL;
+
 #ifdef PPC64_ELF_ABI_v1
 	/* func points to the function descriptor */
 	PPC_LI64(b2p[TMP_REG_2], func);
@@ -162,25 +166,23 @@ static void bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx,
 	PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0);
 	/* ... and move it to CTR */
 	EMIT(PPC_RAW_MTCTR(b2p[TMP_REG_1]));
-	/*
-	 * Load TOC from function descriptor at offset 8.
-	 * We can clobber r2 since we get called through a
-	 * function pointer (so caller will save/restore r2)
-	 * and since we don't use a TOC ourself.
-	 */
-	PPC_BPF_LL(2, b2p[TMP_REG_2], 8);
 #else
 	/* We can clobber r12 */
 	PPC_FUNC_ADDR(12, func);
 	EMIT(PPC_RAW_MTCTR(12));
 #endif
 	EMIT(PPC_RAW_BCTRL());
+
+	return 0;
 }
 
-void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func)
+int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func)
 {
 	unsigned int i, ctx_idx = ctx->idx;
 
+	if (WARN_ON_ONCE(func && is_module_text_address(func)))
+		return -EINVAL;
+
 	/* Load function address into r12 */
 	PPC_LI64(12, func);
 
@@ -198,19 +200,14 @@ void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 fun
 		EMIT(PPC_RAW_NOP());
 
 #ifdef PPC64_ELF_ABI_v1
-	/*
-	 * Load TOC from function descriptor at offset 8.
-	 * We can clobber r2 since we get called through a
-	 * function pointer (so caller will save/restore r2)
-	 * and since we don't use a TOC ourself.
-	 */
-	PPC_BPF_LL(2, 12, 8);
 	/* Load actual entry point from function descriptor */
 	PPC_BPF_LL(12, 12, 0);
 #endif
 
 	EMIT(PPC_RAW_MTCTR(12));
 	EMIT(PPC_RAW_BCTRL());
+
+	return 0;
 }
 
 static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 out)
@@ -896,9 +893,13 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 				return ret;
 
 			if (func_addr_fixed)
-				bpf_jit_emit_func_call_hlp(image, ctx, func_addr);
+				ret = bpf_jit_emit_func_call_hlp(image, ctx, func_addr);
 			else
-				bpf_jit_emit_func_call_rel(image, ctx, func_addr);
+				ret = bpf_jit_emit_func_call_rel(image, ctx, func_addr);
+
+			if (ret)
+				return ret;
+
 			/* move return value from r3 to BPF_REG_0 */
 			EMIT(PPC_RAW_MR(b2p[BPF_REG_0], 3));
 			break;
-- 
2.34.1



* [PATCH 13/13] powerpc64/bpf: Optimize instruction sequence used for function calls
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (11 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 12/13] powerpc64/bpf elfv1: Do not load TOC before calling functions Naveen N. Rao
@ 2022-01-06 11:45 ` Naveen N. Rao
  2022-01-06 21:46 ` [PATCH 00/13] powerpc/bpf: Some fixes and updates Daniel Borkmann
  2022-01-16 10:41 ` Michael Ellerman
  14 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-06 11:45 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

When calling BPF helpers, we load the function address to call into a
register. This can take up to 5 instructions. Optimize this by instead
using the kernel TOC in r2 and materializing the helper address as a
32-bit offset from it. This works since all BPF helpers are part of
kernel text, and all BPF programs/functions utilize the kernel TOC.

Furthermore:
- load the actual function entry address with elf v1, rather than
  loading it through the function descriptor address.
- load the Local Entry Point (LEP) with elf v2, skipping TOC setup.
- consolidate code across elf abi v1 and v2 by using r12 on both.
- skip the initial instruction setting up r2 in the case of BPF function
  calls, since we already have the kernel TOC set up in r2.

Reported-by: Anton Blanchard <anton@ozlabs.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
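For illustration, the before/after sequences for a helper call (the
expansion of PPC_LI64() into lis/ori/sldi/oris/ori is approximate, and
this sketch is not part of the patch itself):

  /* before: materialize the 64-bit absolute address, then branch */
  PPC_LI64(b2p[TMP_REG_2], func);                 /* up to 5 instructions */
  PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0);  /* elf v1: entry point from descriptor */
  EMIT(PPC_RAW_MTCTR(b2p[TMP_REG_1]));
  EMIT(PPC_RAW_BCTRL());

  /* after: helpers live in core kernel text, so a 32-bit offset from the
   * kernel TOC in r2 is enough to form the address
   */
  reladdr = func_addr - kernel_toc_addr();
  EMIT(PPC_RAW_ADDIS(_R12, _R2, PPC_HA(reladdr)));
  EMIT(PPC_RAW_ADDI(_R12, _R12, PPC_LO(reladdr)));
  EMIT(PPC_RAW_MTCTR(_R12));
  EMIT(PPC_RAW_BCTRL());
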
 arch/powerpc/net/bpf_jit_comp64.c | 30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 5da8e54d4d70b6..72179295f8e8c8 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -155,22 +155,20 @@ void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
 static int bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx, u64 func)
 {
 	unsigned long func_addr = func ? ppc_function_entry((void *)func) : 0;
+	long reladdr;
 
 	if (WARN_ON_ONCE(!core_kernel_text(func_addr)))
 		return -EINVAL;
 
-#ifdef PPC64_ELF_ABI_v1
-	/* func points to the function descriptor */
-	PPC_LI64(b2p[TMP_REG_2], func);
-	/* Load actual entry point from function descriptor */
-	PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0);
-	/* ... and move it to CTR */
-	EMIT(PPC_RAW_MTCTR(b2p[TMP_REG_1]));
-#else
-	/* We can clobber r12 */
-	PPC_FUNC_ADDR(12, func);
-	EMIT(PPC_RAW_MTCTR(12));
-#endif
+	reladdr = func_addr - kernel_toc_addr();
+	if (reladdr > 0x7FFFFFFF || reladdr < -(0x80000000L)) {
+		pr_err("eBPF: address of %ps out of range of kernel_toc.\n", (void *)func);
+		return -ERANGE;
+	}
+
+	EMIT(PPC_RAW_ADDIS(_R12, _R2, PPC_HA(reladdr)));
+	EMIT(PPC_RAW_ADDI(_R12, _R12, PPC_LO(reladdr)));
+	EMIT(PPC_RAW_MTCTR(_R12));
 	EMIT(PPC_RAW_BCTRL());
 
 	return 0;
@@ -183,6 +181,9 @@ int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func
 	if (WARN_ON_ONCE(func && is_module_text_address(func)))
 		return -EINVAL;
 
+	/* skip past descriptor (elf v1) and toc load (elf v2) */
+	func += FUNCTION_DESCR_SIZE + 4;
+
 	/* Load function address into r12 */
 	PPC_LI64(12, func);
 
@@ -199,11 +200,6 @@ int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func
 	for (i = ctx->idx - ctx_idx; i < 5; i++)
 		EMIT(PPC_RAW_NOP());
 
-#ifdef PPC64_ELF_ABI_v1
-	/* Load actual entry point from function descriptor */
-	PPC_BPF_LL(12, 12, 0);
-#endif
-
 	EMIT(PPC_RAW_MTCTR(12));
 	EMIT(PPC_RAW_BCTRL());
 
-- 
2.34.1



* Re: [PATCH 00/13] powerpc/bpf: Some fixes and updates
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (12 preceding siblings ...)
  2022-01-06 11:45 ` [PATCH 13/13] powerpc64/bpf: Optimize instruction sequence used for function calls Naveen N. Rao
@ 2022-01-06 21:46 ` Daniel Borkmann
  2022-01-07  7:36   ` Naveen N. Rao
  2022-01-16 10:41 ` Michael Ellerman
  14 siblings, 1 reply; 33+ messages in thread
From: Daniel Borkmann @ 2022-01-06 21:46 UTC (permalink / raw)
  To: Naveen N. Rao, Michael Ellerman, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

Hi Naveen,

On 1/6/22 12:45 PM, Naveen N. Rao wrote:
> A set of fixes and updates to powerpc BPF JIT:
> - Patches 1-3 fix issues with the existing powerpc JIT and are tagged
>    for -stable.
> - Patch 4 fixes a build issue with bpf selftests on powerpc.
> - Patches 5-9 handle some corner cases and make some small improvements.
> - Patches 10-13 optimize how function calls are handled in ppc64.
> 
> Patches 7 and 8 were previously posted, and while patch 8 has no
> changes, patch 7 has been reworked to handle BPF_EXIT differently.

Is the plan to route these via ppc trees? Fwiw, patch 1 and 4 look generic
and in general good to me, we could also take these two via bpf-next tree
given outside of arch/powerpc/? Whichever works best.

Thanks,
Daniel


* Re: [PATCH 00/13] powerpc/bpf: Some fixes and updates
  2022-01-06 21:46 ` [PATCH 00/13] powerpc/bpf: Some fixes and updates Daniel Borkmann
@ 2022-01-07  7:36   ` Naveen N. Rao
  2022-01-07 10:20     ` Daniel Borkmann
  0 siblings, 1 reply; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-07  7:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta

Hi Daniel,

Daniel Borkmann wrote:
> Hi Naveen,
> 
> On 1/6/22 12:45 PM, Naveen N. Rao wrote:
>> A set of fixes and updates to powerpc BPF JIT:
>> - Patches 1-3 fix issues with the existing powerpc JIT and are tagged
>>    for -stable.
>> - Patch 4 fixes a build issue with bpf selftests on powerpc.
>> - Patches 5-9 handle some corner cases and make some small improvements.
>> - Patches 10-13 optimize how function calls are handled in ppc64.
>> 
>> Patches 7 and 8 were previously posted, and while patch 8 has no
>> changes, patch 7 has been reworked to handle BPF_EXIT differently.
> 
> Is the plan to route these via ppc trees? Fwiw, patch 1 and 4 look generic
> and in general good to me, we could also take these two via bpf-next tree
> given outside of arch/powerpc/? Whichever works best.

Yes, I would like to route this through the powerpc tree. Though patches 
1 and 4 are generic, they primarily affect powerpc and I do not see 
conflicting changes in bpf-next. Request you to please ack those patches 
so that Michael can take it through the powerpc tree.


Thanks!
- Naveen


* Re: [PATCH 00/13] powerpc/bpf: Some fixes and updates
  2022-01-07  7:36   ` Naveen N. Rao
@ 2022-01-07 10:20     ` Daniel Borkmann
  2022-01-10  3:47       ` Michael Ellerman
  0 siblings, 1 reply; 33+ messages in thread
From: Daniel Borkmann @ 2022-01-07 10:20 UTC (permalink / raw)
  To: Naveen N. Rao, Alexei Starovoitov, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta

On 1/7/22 8:36 AM, Naveen N. Rao wrote:
> Daniel Borkmann wrote:
>> On 1/6/22 12:45 PM, Naveen N. Rao wrote:
>>> A set of fixes and updates to powerpc BPF JIT:
>>> - Patches 1-3 fix issues with the existing powerpc JIT and are tagged
>>>    for -stable.
>>> - Patch 4 fixes a build issue with bpf selftests on powerpc.
>>> - Patches 5-9 handle some corner cases and make some small improvements.
>>> - Patches 10-13 optimize how function calls are handled in ppc64.
>>>
>>> Patches 7 and 8 were previously posted, and while patch 8 has no
>>> changes, patch 7 has been reworked to handle BPF_EXIT differently.
>>
>> Is the plan to route these via ppc trees? Fwiw, patch 1 and 4 look generic
>> and in general good to me, we could also take these two via bpf-next tree
>> given outside of arch/powerpc/? Whichever works best.
> 
> Yes, I would like to route this through the powerpc tree. Though patches 1 and 4 are generic, they primarily affect powerpc and I do not see conflicting changes in bpf-next. Request you to please ack those patches so that Michael can take it through the powerpc tree.

Ok, works for me. I presume this will end up in the upcoming merge window
anyway, so not too long time until we can sync these back to bpf/bpf-next
trees then.

Thanks!
Daniel


* Re: [PATCH 04/13] tools/bpf: Rename 'struct event' to avoid naming conflict
  2022-01-06 11:45 ` [PATCH 04/13] tools/bpf: Rename 'struct event' to avoid naming conflict Naveen N. Rao
@ 2022-01-07 10:21   ` Daniel Borkmann
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Borkmann @ 2022-01-07 10:21 UTC (permalink / raw)
  To: Naveen N. Rao, Michael Ellerman, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

On 1/6/22 12:45 PM, Naveen N. Rao wrote:
> On ppc64le, trying to build the bpf selftests throws the below error:
>    In file included from runqslower.bpf.c:5:
>    ./runqslower.h:7:8: error: redefinition of 'event'
>    struct event {
> 	 ^
>    /home/naveen/linux/tools/testing/selftests/bpf/tools/build/runqslower/vmlinux.h:156602:8:
>    note: previous definition is here
>    struct event {
> 	 ^
> 
> This happens since 'struct event' is also defined in
> drivers/net/ethernet/alteon/acenic.h. Rename the one in runqslower to the
> more appropriate 'runq_event' to avoid the naming conflict.
> 
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>


* Re: [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
  2022-01-06 11:45 ` [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack() Naveen N. Rao
@ 2022-01-07 10:21   ` Daniel Borkmann
  2022-01-10  8:57   ` Christophe Leroy
  1 sibling, 0 replies; 33+ messages in thread
From: Daniel Borkmann @ 2022-01-07 10:21 UTC (permalink / raw)
  To: Naveen N. Rao, Michael Ellerman, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, Christophe Leroy, song, johan.almbladh,
	Hari Bathini, bpf, linuxppc-dev

On 1/6/22 12:45 PM, Naveen N. Rao wrote:
> task_pt_regs() can return NULL on powerpc for kernel threads. That NULL
> pointer is then dereferenced in __bpf_get_stack() when checking for user
> mode, resulting in a kernel oops. Guard against this by checking the
> return value of task_pt_regs() before trying to obtain the call chain.
> 
> Fixes: fa28dcb82a38f8 ("bpf: Introduce helper bpf_get_task_stack()")
> Cc: stable@vger.kernel.org # v5.9+
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass
  2022-01-06 11:45 ` [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass Naveen N. Rao
@ 2022-01-08 14:45   ` Jiri Olsa
  2022-01-10  9:27   ` Christophe Leroy
  1 sibling, 0 replies; 33+ messages in thread
From: Jiri Olsa @ 2022-01-08 14:45 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov, ykaliuta,
	Christophe Leroy, song, johan.almbladh, Hari Bathini, bpf,
	linuxppc-dev

On Thu, Jan 06, 2022 at 05:15:07PM +0530, Naveen N. Rao wrote:
> These instructions are updated after the initial JIT, so redo codegen
> during the extra pass. Rename bpf_jit_fixup_subprog_calls() to clarify
> that this is more than just subprog calls.
> 
> Fixes: 69c087ba6225b5 ("bpf: Add bpf_for_each_map_elem() helper")
> Cc: stable@vger.kernel.org # v5.15
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

Tested-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka

> ---
>  arch/powerpc/net/bpf_jit_comp.c   | 29 +++++++++++++++++++++++------
>  arch/powerpc/net/bpf_jit_comp32.c |  6 ++++++
>  arch/powerpc/net/bpf_jit_comp64.c |  7 ++++++-
>  3 files changed, 35 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index d6ffdd0f2309d0..56dd1f4e3e4447 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -23,15 +23,15 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
>  	memset32(area, BREAKPOINT_INSTRUCTION, size / 4);
>  }
>  
> -/* Fix the branch target addresses for subprog calls */
> -static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
> -				       struct codegen_context *ctx, u32 *addrs)
> +/* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */
> +static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image,
> +				   struct codegen_context *ctx, u32 *addrs)
>  {
>  	const struct bpf_insn *insn = fp->insnsi;
>  	bool func_addr_fixed;
>  	u64 func_addr;
>  	u32 tmp_idx;
> -	int i, ret;
> +	int i, j, ret;
>  
>  	for (i = 0; i < fp->len; i++) {
>  		/*
> @@ -66,6 +66,23 @@ static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
>  			 * of the JITed sequence remains unchanged.
>  			 */
>  			ctx->idx = tmp_idx;
> +		} else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) {
> +			tmp_idx = ctx->idx;
> +			ctx->idx = addrs[i] / 4;
> +#ifdef CONFIG_PPC32
> +			PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + 1].imm);
> +			PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm);
> +			for (j = ctx->idx - addrs[i] / 4; j < 4; j++)
> +				EMIT(PPC_RAW_NOP());
> +#else
> +			func_addr = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32);
> +			PPC_LI64(b2p[insn[i].dst_reg], func_addr);
> +			/* overwrite rest with nops */
> +			for (j = ctx->idx - addrs[i] / 4; j < 5; j++)
> +				EMIT(PPC_RAW_NOP());
> +#endif
> +			ctx->idx = tmp_idx;
> +			i++;
>  		}
>  	}
>  
> @@ -200,13 +217,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
>  		/*
>  		 * Do not touch the prologue and epilogue as they will remain
>  		 * unchanged. Only fix the branch target address for subprog
> -		 * calls in the body.
> +		 * calls in the body, and ldimm64 instructions.
>  		 *
>  		 * This does not change the offsets and lengths of the subprog
>  		 * call instruction sequences and hence, the size of the JITed
>  		 * image as well.
>  		 */
> -		bpf_jit_fixup_subprog_calls(fp, code_base, &cgctx, addrs);
> +		bpf_jit_fixup_addresses(fp, code_base, &cgctx, addrs);
>  
>  		/* There is no need to perform the usual passes. */
>  		goto skip_codegen_passes;
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index 997a47fa615b30..2258d3886d02ec 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -293,6 +293,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		bool func_addr_fixed;
>  		u64 func_addr;
>  		u32 true_cond;
> +		u32 tmp_idx;
> +		int j;
>  
>  		/*
>  		 * addrs[] maps a BPF bytecode address into a real offset from
> @@ -908,8 +910,12 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		 * 16 byte instruction that uses two 'struct bpf_insn'
>  		 */
>  		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
> +			tmp_idx = ctx->idx;
>  			PPC_LI32(dst_reg_h, (u32)insn[i + 1].imm);
>  			PPC_LI32(dst_reg, (u32)insn[i].imm);
> +			/* padding to allow full 4 instructions for later patching */
> +			for (j = ctx->idx - tmp_idx; j < 4; j++)
> +				EMIT(PPC_RAW_NOP());
>  			/* Adjust for two bpf instructions */
>  			addrs[++i] = ctx->idx * 4;
>  			break;
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 472d4a551945dd..3d018ecc475b2b 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -319,6 +319,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		u64 imm64;
>  		u32 true_cond;
>  		u32 tmp_idx;
> +		int j;
>  
>  		/*
>  		 * addrs[] maps a BPF bytecode address into a real offset from
> @@ -848,9 +849,13 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
>  			imm64 = ((u64)(u32) insn[i].imm) |
>  				    (((u64)(u32) insn[i+1].imm) << 32);
> +			tmp_idx = ctx->idx;
> +			PPC_LI64(dst_reg, imm64);
> +			/* padding to allow full 5 instructions for later patching */
> +			for (j = ctx->idx - tmp_idx; j < 5; j++)
> +				EMIT(PPC_RAW_NOP());
>  			/* Adjust for two bpf instructions */
>  			addrs[++i] = ctx->idx * 4;
> -			PPC_LI64(dst_reg, imm64);
>  			break;
>  
>  		/*
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/13] powerpc/bpf: Some fixes and updates
  2022-01-07 10:20     ` Daniel Borkmann
@ 2022-01-10  3:47       ` Michael Ellerman
  0 siblings, 0 replies; 33+ messages in thread
From: Michael Ellerman @ 2022-01-10  3:47 UTC (permalink / raw)
  To: Daniel Borkmann, Naveen N. Rao, Alexei Starovoitov
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta

Daniel Borkmann <daniel@iogearbox.net> writes:
> On 1/7/22 8:36 AM, Naveen N. Rao wrote:
>> Daniel Borkmann wrote:
>>> On 1/6/22 12:45 PM, Naveen N. Rao wrote:
>>>> A set of fixes and updates to powerpc BPF JIT:
>>>> - Patches 1-3 fix issues with the existing powerpc JIT and are tagged
>>>>    for -stable.
>>>> - Patch 4 fixes a build issue with bpf selftests on powerpc.
>>>> - Patches 5-9 handle some corner cases and make some small improvements.
>>>> - Patches 10-13 optimize how function calls are handled in ppc64.
>>>>
>>>> Patches 7 and 8 were previously posted, and while patch 7 has no
>>>> changes, patch 8 has been reworked to handle BPF_EXIT differently.
>>>
>>> Is the plan to route these via ppc trees? Fwiw, patch 1 and 4 look generic
>>> and in general good to me, we could also take these two via bpf-next tree
>>> given outside of arch/powerpc/? Whichever works best.
>> 
>> Yes, I would like to route this through the powerpc tree. Though patches 1 and 4 are generic, they primarily affect powerpc and I do not see conflicting changes in bpf-next. Request you to please ack those patches so that Michael can take it through the powerpc tree.
>
> Ok, works for me. I presume this will end up in the upcoming merge window
> anyway, so it won't be too long until we can sync these back to bpf/bpf-next
> trees then.

Hmm. This series landed a little late for me to get it into linux-next
before the merge window opened.

It's mostly small and includes some bug fixes, so I'm not saying it
needs to wait for the next merge window, but I would like it to get some
testing in linux-next before I ask Linus to pull it.

When would you need it all merged into Linus' tree in order to sync up
with the bpf tree for the next cycle? I assume as long as it's merged
before rc1 that would be sufficient?

cheers

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
  2022-01-06 11:45 ` [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack() Naveen N. Rao
  2022-01-07 10:21   ` Daniel Borkmann
@ 2022-01-10  8:57   ` Christophe Leroy
  2022-01-10 10:36     ` Naveen N. Rao
  1 sibling, 1 reply; 33+ messages in thread
From: Christophe Leroy @ 2022-01-10  8:57 UTC (permalink / raw)
  To: Naveen N. Rao, Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, song, johan.almbladh, Hari Bathini, bpf,
	linuxppc-dev



Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
> task_pt_regs() can return NULL on powerpc for kernel threads. This is
> then used in __bpf_get_stack() to check for user mode, resulting in a
> kernel oops. Guard against this by checking return value of
> task_pt_regs() before trying to obtain the call chain.

I started looking at that some time ago, and I'm wondering whether it is 
worth keeping that powerpc particularity.

We used to have a potentially different pt_regs depending on how we 
entered the kernel, especially on PPC32, but since the following commits 
that is no longer the case.

06d67d54741a ("powerpc: make process.c suitable for both 32-bit and 64-bit")
db297c3b07af ("powerpc/32: Don't save thread.regs on interrupt entry")
b5cfc9cd7b04 ("powerpc/32: Fix critical and debug interrupts on BOOKE")

We could therefore just do like other architectures, define

#define task_pt_regs(p) ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)

And then remove the regs field we have in thread_struct.
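
For reference, a side-by-side sketch of the two definitions being
compared (the current powerpc definition is quoted from memory from
arch/powerpc/include/asm/processor.h, so treat the exact spelling as an
assumption):

	/* powerpc today: NULL for kernel threads, which is what this
	 * patch has to guard against in __bpf_get_stack()
	 */
	#define task_pt_regs(tsk)	((tsk)->thread.regs)

	/* generic style suggested above: always points at the pt_regs
	 * area at the top of the task's stack page
	 */
	#define task_pt_regs(p) \
		((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)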


> 
> Fixes: fa28dcb82a38f8 ("bpf: Introduce helper bpf_get_task_stack()")
> Cc: stable@vger.kernel.org # v5.9+
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>   kernel/bpf/stackmap.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 6e75bbee39f0b5..0dcaed4d3f4cec 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -525,13 +525,14 @@ BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf,
>   	   u32, size, u64, flags)
>   {
>   	struct pt_regs *regs;
> -	long res;
> +	long res = -EINVAL;
>   
>   	if (!try_get_task_stack(task))
>   		return -EFAULT;
>   
>   	regs = task_pt_regs(task);
> -	res = __bpf_get_stack(regs, task, NULL, buf, size, flags);
> +	if (regs)
> +		res = __bpf_get_stack(regs, task, NULL, buf, size, flags);

Should there be a comment explaining that on powerpc, 'regs' can be NULL 
for a kernel thread ?
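
Something along these lines, perhaps (sketch only):

	regs = task_pt_regs(task);
	/* powerpc: kernel threads carry no user pt_regs, so this can be NULL */
	if (regs)
		res = __bpf_get_stack(regs, task, NULL, buf, size, flags);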

>   	put_task_stack(task);
>   
>   	return res;

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls
  2022-01-06 11:45 ` [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls Naveen N. Rao
@ 2022-01-10  9:06   ` Christophe Leroy
  2022-01-10 10:52     ` Naveen N. Rao
  0 siblings, 1 reply; 33+ messages in thread
From: Christophe Leroy @ 2022-01-10  9:06 UTC (permalink / raw)
  To: Naveen N. Rao, Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, song, johan.almbladh, Hari Bathini, bpf,
	linuxppc-dev



Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
> Pad instructions emitted for BPF_CALL so that the number of instructions
> generated does not change for different function addresses. This is
> especially important for calls to other bpf functions, whose address
> will only be known during extra pass.

In the first pass, 'image' is NULL and we already emit the 4-instruction 
sequence, so the code won't grow after the first pass; it can only shrink.

On PPC32, a huge effort is made to minimise the situations where 'bl' 
cannot be used, see commit 2ec13df16704 ("powerpc/modules: Load modules 
closer to kernel text")

And if you take the 8xx for instance, a NOP is just like any other 
instruction: it takes one cycle.

If it is absolutely needed, then I'd prefer we use an out-of-line 
trampoline for the unlikely case and use 'bl' to that trampoline.

> 
> Fixes: 51c66ad849a703 ("powerpc/bpf: Implement extended BPF on PPC32")
> Cc: stable@vger.kernel.org # v5.13+
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>   arch/powerpc/net/bpf_jit_comp32.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index d3a52cd42f5346..997a47fa615b30 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -191,6 +191,9 @@ void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 fun
>   
>   	if (image && rel < 0x2000000 && rel >= -0x2000000) {
>   		PPC_BL_ABS(func);
> +		EMIT(PPC_RAW_NOP());
> +		EMIT(PPC_RAW_NOP());
> +		EMIT(PPC_RAW_NOP());
>   	} else {
>   		/* Load function address into r0 */
>   		EMIT(PPC_RAW_LIS(_R0, IMM_H(func)));

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
  2022-01-06 11:45 ` [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry Naveen N. Rao
@ 2022-01-10  9:20   ` Christophe Leroy
  2022-01-11 10:31     ` Naveen N. Rao
  0 siblings, 1 reply; 33+ messages in thread
From: Christophe Leroy @ 2022-01-10  9:20 UTC (permalink / raw)
  To: Naveen N. Rao, Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, song, johan.almbladh, Hari Bathini, bpf,
	linuxppc-dev



Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
> In preparation for using kernel TOC, load the same in r2 on entry. With
> elfv1, the kernel TOC is already setup by our caller so we just emit a
> nop. We adjust the number of instructions to skip on a tail call
> accordingly.
> 
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>   arch/powerpc/net/bpf_jit_comp64.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index ce4fc59bbd6a92..e05b577d95bf11 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -73,6 +73,12 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
>   {
>   	int i;
>   
> +#ifdef PPC64_ELF_ABI_v2
> +	PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
> +#else
> +	EMIT(PPC_RAW_NOP());
> +#endif

Can we avoid the #ifdef, using

	if (__is_defined(PPC64_ELF_ABI_v2))
		PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
	else
		EMIT(PPC_RAW_NOP());

> +
>   	/*
>   	 * Initialize tail_call_cnt if we do tail calls.
>   	 * Otherwise, put in NOPs so that it can be skipped when we are
> @@ -87,7 +93,7 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
>   		EMIT(PPC_RAW_NOP());
>   	}
>   
> -#define BPF_TAILCALL_PROLOGUE_SIZE	8
> +#define BPF_TAILCALL_PROLOGUE_SIZE	12

Why not change that for v2 ABI only instead of adding a NOP ? ABI won't 
change during runtime AFAIU
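
As a sketch of that suggestion (making the prologue length a per-ABI
compile-time constant instead of padding with a NOP; untested, names as
in the current code):

	#ifdef PPC64_ELF_ABI_v2
	#define BPF_TAILCALL_PROLOGUE_SIZE	12	/* kernel TOC load + tail_call_cnt setup */
	#else
	#define BPF_TAILCALL_PROLOGUE_SIZE	8	/* tail_call_cnt setup only */
	#endif

with the elfv1 side then emitting nothing at all where the TOC load goes
on elfv2.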

>   
>   	if (bpf_has_stack_frame(ctx)) {
>   		/*

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass
  2022-01-06 11:45 ` [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass Naveen N. Rao
  2022-01-08 14:45   ` Jiri Olsa
@ 2022-01-10  9:27   ` Christophe Leroy
  2022-01-10 10:56     ` Naveen N. Rao
  1 sibling, 1 reply; 33+ messages in thread
From: Christophe Leroy @ 2022-01-10  9:27 UTC (permalink / raw)
  To: Naveen N. Rao, Michael Ellerman, Daniel Borkmann, Alexei Starovoitov
  Cc: Jiri Olsa, ykaliuta, song, johan.almbladh, Hari Bathini, bpf,
	linuxppc-dev



Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
> These instructions are updated after the initial JIT, so redo codegen
> during the extra pass. Rename bpf_jit_fixup_subprog_calls() to clarify
> that this is more than just subprog calls.
> 
> Fixes: 69c087ba6225b5 ("bpf: Add bpf_for_each_map_elem() helper")
> Cc: stable@vger.kernel.org # v5.15
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>   arch/powerpc/net/bpf_jit_comp.c   | 29 +++++++++++++++++++++++------
>   arch/powerpc/net/bpf_jit_comp32.c |  6 ++++++
>   arch/powerpc/net/bpf_jit_comp64.c |  7 ++++++-
>   3 files changed, 35 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index d6ffdd0f2309d0..56dd1f4e3e4447 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -23,15 +23,15 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
>   	memset32(area, BREAKPOINT_INSTRUCTION, size / 4);
>   }
>   
> -/* Fix the branch target addresses for subprog calls */
> -static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
> -				       struct codegen_context *ctx, u32 *addrs)
> +/* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */
> +static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image,
> +				   struct codegen_context *ctx, u32 *addrs)
>   {
>   	const struct bpf_insn *insn = fp->insnsi;
>   	bool func_addr_fixed;
>   	u64 func_addr;
>   	u32 tmp_idx;
> -	int i, ret;
> +	int i, j, ret;
>   
>   	for (i = 0; i < fp->len; i++) {
>   		/*
> @@ -66,6 +66,23 @@ static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
>   			 * of the JITed sequence remains unchanged.
>   			 */
>   			ctx->idx = tmp_idx;
> +		} else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) {
> +			tmp_idx = ctx->idx;
> +			ctx->idx = addrs[i] / 4;
> +#ifdef CONFIG_PPC32
> +			PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + 1].imm);
> +			PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm);
> +			for (j = ctx->idx - addrs[i] / 4; j < 4; j++)
> +				EMIT(PPC_RAW_NOP());
> +#else
> +			func_addr = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32);
> +			PPC_LI64(b2p[insn[i].dst_reg], func_addr);
> +			/* overwrite rest with nops */
> +			for (j = ctx->idx - addrs[i] / 4; j < 5; j++)
> +				EMIT(PPC_RAW_NOP());
> +#endif

#ifdefs should be avoided as much as possible.

Here it seems we could easily do an

	if (IS_ENABLED(CONFIG_PPC32)) {
	} else {
	}

And it looks like the CONFIG_PPC64 alternative would in fact also work 
on PPC32, wouldn't it ?
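
For reference, a sketch of the IS_ENABLED() form (untested; note that,
unlike the #ifdef version, both arms must then compile on both PPC32 and
PPC64, so PPC_LI64() and friends need to at least build on 32-bit):

		} else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) {
			tmp_idx = ctx->idx;
			ctx->idx = addrs[i] / 4;
			if (IS_ENABLED(CONFIG_PPC32)) {
				PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + 1].imm);
				PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm);
				for (j = ctx->idx - addrs[i] / 4; j < 4; j++)
					EMIT(PPC_RAW_NOP());
			} else {
				func_addr = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32);
				PPC_LI64(b2p[insn[i].dst_reg], func_addr);
				/* overwrite rest with nops */
				for (j = ctx->idx - addrs[i] / 4; j < 5; j++)
					EMIT(PPC_RAW_NOP());
			}
			ctx->idx = tmp_idx;
			i++;
		}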


> +			ctx->idx = tmp_idx;
> +			i++;
>   		}
>   	}
>   
> @@ -200,13 +217,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
>   		/*
>   		 * Do not touch the prologue and epilogue as they will remain
>   		 * unchanged. Only fix the branch target address for subprog
> -		 * calls in the body.
> +		 * calls in the body, and ldimm64 instructions.
>   		 *
>   		 * This does not change the offsets and lengths of the subprog
>   		 * call instruction sequences and hence, the size of the JITed
>   		 * image as well.
>   		 */
> -		bpf_jit_fixup_subprog_calls(fp, code_base, &cgctx, addrs);
> +		bpf_jit_fixup_addresses(fp, code_base, &cgctx, addrs);
>   
>   		/* There is no need to perform the usual passes. */
>   		goto skip_codegen_passes;
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index 997a47fa615b30..2258d3886d02ec 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -293,6 +293,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		bool func_addr_fixed;
>   		u64 func_addr;
>   		u32 true_cond;
> +		u32 tmp_idx;
> +		int j;
>   
>   		/*
>   		 * addrs[] maps a BPF bytecode address into a real offset from
> @@ -908,8 +910,12 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		 * 16 byte instruction that uses two 'struct bpf_insn'
>   		 */
>   		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
> +			tmp_idx = ctx->idx;
>   			PPC_LI32(dst_reg_h, (u32)insn[i + 1].imm);
>   			PPC_LI32(dst_reg, (u32)insn[i].imm);
> +			/* padding to allow full 4 instructions for later patching */
> +			for (j = ctx->idx - tmp_idx; j < 4; j++)
> +				EMIT(PPC_RAW_NOP());
>   			/* Adjust for two bpf instructions */
>   			addrs[++i] = ctx->idx * 4;
>   			break;
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 472d4a551945dd..3d018ecc475b2b 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -319,6 +319,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		u64 imm64;
>   		u32 true_cond;
>   		u32 tmp_idx;
> +		int j;
>   
>   		/*
>   		 * addrs[] maps a BPF bytecode address into a real offset from
> @@ -848,9 +849,13 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>   		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
>   			imm64 = ((u64)(u32) insn[i].imm) |
>   				    (((u64)(u32) insn[i+1].imm) << 32);
> +			tmp_idx = ctx->idx;
> +			PPC_LI64(dst_reg, imm64);
> +			/* padding to allow full 5 instructions for later patching */
> +			for (j = ctx->idx - tmp_idx; j < 5; j++)
> +				EMIT(PPC_RAW_NOP());
>   			/* Adjust for two bpf instructions */
>   			addrs[++i] = ctx->idx * 4;
> -			PPC_LI64(dst_reg, imm64);
>   			break;
>   
>   		/*

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
  2022-01-10  8:57   ` Christophe Leroy
@ 2022-01-10 10:36     ` Naveen N. Rao
  0 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-10 10:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Christophe Leroy, Daniel Borkmann, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta

Christophe Leroy wrote:
> 
> 
> Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
>> task_pt_regs() can return NULL on powerpc for kernel threads. This is
>> then used in __bpf_get_stack() to check for user mode, resulting in a
>> kernel oops. Guard against this by checking return value of
>> task_pt_regs() before trying to obtain the call chain.
> 
> I started looking at that some time ago, and I'm wondering whether it is 
> worth keeping that powerpc particularity.
> 
> We used to have a potentially different pt_regs depending on how we 
> entered the kernel, especially on PPC32, but since the following commits 
> that is no longer the case.
> 
> 06d67d54741a ("powerpc: make process.c suitable for both 32-bit and 64-bit")
> db297c3b07af ("powerpc/32: Don't save thread.regs on interrupt entry")
> b5cfc9cd7b04 ("powerpc/32: Fix critical and debug interrupts on BOOKE")
> 
> We could therefore just do like other architectures, define
> 
> #define task_pt_regs(p) ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)
> 
> And then remove the regs field we have in thread_struct.

Sure, I don't have an opinion on that, but I think this patch will still 
be needed for -stable.

> 
> 
>> 
>> Fixes: fa28dcb82a38f8 ("bpf: Introduce helper bpf_get_task_stack()")
>> Cc: stable@vger.kernel.org # v5.9+
>> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
>> ---
>>   kernel/bpf/stackmap.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>> index 6e75bbee39f0b5..0dcaed4d3f4cec 100644
>> --- a/kernel/bpf/stackmap.c
>> +++ b/kernel/bpf/stackmap.c
>> @@ -525,13 +525,14 @@ BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf,
>>   	   u32, size, u64, flags)
>>   {
>>   	struct pt_regs *regs;
>> -	long res;
>> +	long res = -EINVAL;
>>   
>>   	if (!try_get_task_stack(task))
>>   		return -EFAULT;
>>   
>>   	regs = task_pt_regs(task);
>> -	res = __bpf_get_stack(regs, task, NULL, buf, size, flags);
>> +	if (regs)
>> +		res = __bpf_get_stack(regs, task, NULL, buf, size, flags);
> 
> Should there be a comment explaining that on powerpc, 'regs' can be NULL 
> for a kernel thread ?

I guess this won't be required if we end up with the change you are 
proposing above?


- Naveen


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls
  2022-01-10  9:06   ` Christophe Leroy
@ 2022-01-10 10:52     ` Naveen N. Rao
  0 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-10 10:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Christophe Leroy, Daniel Borkmann, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta

Christophe Leroy wrote:
> 
> 
> Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
>> Pad instructions emitted for BPF_CALL so that the number of instructions
>> generated does not change for different function addresses. This is
>> especially important for calls to other bpf functions, whose address
>> will only be known during extra pass.
> 
> In the first pass, 'image' is NULL and we already emit the 4-instruction 
> sequence, so the code won't grow after the first pass; it can only shrink.

Right, but this patch addresses the scenario where the function address 
is only provided during the extra pass. So, even though we will not 
write past the end of the BPF image, the emitted instructions can still 
be wrong.
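
To illustrate, on PPC32 bpf_jit_emit_func_call_rel() can emit either of
two sequences (a sketch based on the hunk quoted above; the exact
instructions of the indirect branch-and-link at the end of the long form
are elided here):

	/* target unknown (initial pass) or out of 'bl' range: 4 instructions */
	EMIT(PPC_RAW_LIS(_R0, IMM_H(func)));
	EMIT(PPC_RAW_ORI(_R0, _R0, IMM_L(func)));
	/* ... move r0 to a branch register and branch-and-link: 2 more */

	/* target known and in range: 1 instruction, padded to the same
	 * 4 slots by this patch so the offsets recorded in addrs[]
	 * stay valid
	 */
	PPC_BL_ABS(func);
	EMIT(PPC_RAW_NOP());
	EMIT(PPC_RAW_NOP());
	EMIT(PPC_RAW_NOP());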

> 
> On PPC32, a huge effort is made to minimise the situations where 'bl' 
> cannot be used, see commit 2ec13df16704 ("powerpc/modules: Load modules 
> closer to kernel text")
> 
> And if you take the 8xx for instance, a NOP is just like any other 
> instruction: it takes one cycle.
> 
> If it is absolutely needed, then I'd prefer we use an out-of-line 
> trampoline for the unlikely case and use 'bl' to that trampoline.

Yes, something like that will be nice to do, but we will still need this 
patch for -stable.

The other option is to redo the whole JIT during the extra pass, but 
only if we can ensure that we have provisioned for the maximum image 
size.


- Naveen


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass
  2022-01-10  9:27   ` Christophe Leroy
@ 2022-01-10 10:56     ` Naveen N. Rao
  0 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-10 10:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Christophe Leroy, Daniel Borkmann, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta

Christophe Leroy wrote:
> 
> 
> Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
>> These instructions are updated after the initial JIT, so redo codegen
>> during the extra pass. Rename bpf_jit_fixup_subprog_calls() to clarify
>> that this is more than just subprog calls.
>> 
>> Fixes: 69c087ba6225b5 ("bpf: Add bpf_for_each_map_elem() helper")
>> Cc: stable@vger.kernel.org # v5.15
>> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/net/bpf_jit_comp.c   | 29 +++++++++++++++++++++++------
>>   arch/powerpc/net/bpf_jit_comp32.c |  6 ++++++
>>   arch/powerpc/net/bpf_jit_comp64.c |  7 ++++++-
>>   3 files changed, 35 insertions(+), 7 deletions(-)
>> 
>> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
>> index d6ffdd0f2309d0..56dd1f4e3e4447 100644
>> --- a/arch/powerpc/net/bpf_jit_comp.c
>> +++ b/arch/powerpc/net/bpf_jit_comp.c
>> @@ -23,15 +23,15 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
>>   	memset32(area, BREAKPOINT_INSTRUCTION, size / 4);
>>   }
>>   
>> -/* Fix the branch target addresses for subprog calls */
>> -static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
>> -				       struct codegen_context *ctx, u32 *addrs)
>> +/* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */
>> +static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image,
>> +				   struct codegen_context *ctx, u32 *addrs)
>>   {
>>   	const struct bpf_insn *insn = fp->insnsi;
>>   	bool func_addr_fixed;
>>   	u64 func_addr;
>>   	u32 tmp_idx;
>> -	int i, ret;
>> +	int i, j, ret;
>>   
>>   	for (i = 0; i < fp->len; i++) {
>>   		/*
>> @@ -66,6 +66,23 @@ static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
>>   			 * of the JITed sequence remains unchanged.
>>   			 */
>>   			ctx->idx = tmp_idx;
>> +		} else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) {
>> +			tmp_idx = ctx->idx;
>> +			ctx->idx = addrs[i] / 4;
>> +#ifdef CONFIG_PPC32
>> +			PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + 1].imm);
>> +			PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm);
>> +			for (j = ctx->idx - addrs[i] / 4; j < 4; j++)
>> +				EMIT(PPC_RAW_NOP());
>> +#else
>> +			func_addr = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32);
>> +			PPC_LI64(b2p[insn[i].dst_reg], func_addr);
>> +			/* overwrite rest with nops */
>> +			for (j = ctx->idx - addrs[i] / 4; j < 5; j++)
>> +				EMIT(PPC_RAW_NOP());
>> +#endif
> 
> #ifdefs should be avoided as much as possible.
> 
> Here it seems we could easily do an
> 
> 	if (IS_ENABLED(CONFIG_PPC32)) {
> 	} else {
> 	}
> 
> And it looks like the CONFIG_PPC64 alternative would in fact also work 
> on PPC32, wouldn't it ?

We never implemented PPC_LI64() for ppc32:
  /linux/arch/powerpc/net/bpf_jit_comp.c: In function 'bpf_jit_fixup_addresses':
  /linux/arch/powerpc/net/bpf_jit_comp.c:81:5: error: this decimal constant is unsigned only in ISO C90 [-Werror]
     81 |     PPC_LI64(b2p[insn[i].dst_reg], func_addr);
	|     ^~~~~~~~
  /linux/arch/powerpc/net/bpf_jit_comp.c:81:5: error: this decimal constant is unsigned only in ISO C90 [-Werror]
  In file included from /linux/arch/powerpc/net/bpf_jit_comp.c:19:
  /linux/arch/powerpc/net/bpf_jit.h:78:40: error: right shift count >= width of type [-Werror=shift-count-overflow]
     78 |     EMIT(PPC_RAW_LI(d, ((uintptr_t)(i) >> 32) &   \
	|                                        ^~


We should move that out from bpf_jit.h


- Naveen


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
  2022-01-10  9:20   ` Christophe Leroy
@ 2022-01-11 10:31     ` Naveen N. Rao
  2022-01-11 14:35       ` Christophe Leroy
  0 siblings, 1 reply; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-11 10:31 UTC (permalink / raw)
  To: Alexei Starovoitov, Christophe Leroy, Daniel Borkmann, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta

Christophe Leroy wrote:
> 
> 
> Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
>> In preparation for using kernel TOC, load the same in r2 on entry. With
>> elfv1, the kernel TOC is already setup by our caller so we just emit a
>> nop. We adjust the number of instructions to skip on a tail call
>> accordingly.
>> 
>> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/net/bpf_jit_comp64.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
>> index ce4fc59bbd6a92..e05b577d95bf11 100644
>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>> @@ -73,6 +73,12 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
>>   {
>>   	int i;
>>   
>> +#ifdef PPC64_ELF_ABI_v2
>> +	PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
>> +#else
>> +	EMIT(PPC_RAW_NOP());
>> +#endif
> 
> Can we avoid the #ifdef, using
> 
> 	if (__is_defined(PPC64_ELF_ABI_v2))
> 		PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
> 	else
> 		EMIT(PPC_RAW_NOP());

Hmm... that doesn't work for me. Is __is_defined() expected to work with 
macros other than CONFIG options?

> 
>> +
>>   	/*
>>   	 * Initialize tail_call_cnt if we do tail calls.
>>   	 * Otherwise, put in NOPs so that it can be skipped when we are
>> @@ -87,7 +93,7 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
>>   		EMIT(PPC_RAW_NOP());
>>   	}
>>   
>> -#define BPF_TAILCALL_PROLOGUE_SIZE	8
>> +#define BPF_TAILCALL_PROLOGUE_SIZE	12
> 
> Why not change that for v2 ABI only instead of adding a NOP ? ABI won't 
> change during runtime AFAIU

Yeah, I wanted to keep this simple and I felt an additional nop 
shouldn't matter too much. But, I guess we can get rid of 
BPF_TAILCALL_PROLOGUE_SIZE since the only user is the function emitting 
a tail call. I will submit that as a separate cleanup unless I need to 
redo this series.

Thanks for the reviews!
- Naveen


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
  2022-01-11 10:31     ` Naveen N. Rao
@ 2022-01-11 14:35       ` Christophe Leroy
  2022-01-11 14:43         ` Christophe Leroy
  0 siblings, 1 reply; 33+ messages in thread
From: Christophe Leroy @ 2022-01-11 14:35 UTC (permalink / raw)
  To: Naveen N. Rao, Alexei Starovoitov, Daniel Borkmann, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta



Le 11/01/2022 à 11:31, Naveen N. Rao a écrit :
> Christophe Leroy wrote:
>>
>>
>> Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
>>> In preparation for using kernel TOC, load the same in r2 on entry. With
>>> elfv1, the kernel TOC is already setup by our caller so we just emit a
>>> nop. We adjust the number of instructions to skip on a tail call
>>> accordingly.
>>>
>>> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
>>> ---
>>>   arch/powerpc/net/bpf_jit_comp64.c | 8 +++++++-
>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
>>> b/arch/powerpc/net/bpf_jit_comp64.c
>>> index ce4fc59bbd6a92..e05b577d95bf11 100644
>>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>>> @@ -73,6 +73,12 @@ void bpf_jit_build_prologue(u32 *image, struct 
>>> codegen_context *ctx)
>>>   {
>>>       int i;
>>> +#ifdef PPC64_ELF_ABI_v2
>>> +    PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
>>> +#else
>>> +    EMIT(PPC_RAW_NOP());
>>> +#endif
>>
>> Can we avoid the #ifdef, using
>>
>>     if (__is_defined(PPC64_ELF_ABI_v2))
>>         PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
>>     else
>>         EMIT(PPC_RAW_NOP());
> 
> Hmm... that doesn't work for me. Is __is_defined() expected to work with 
> macros other than CONFIG options?

Yes, __is_defined() should work with any item.

It is IS_ENABLED() which is supposed to work only with CONFIG options.

See commit 5c189c523e78 ("powerpc/time: Fix mftb()/get_tb() for use with 
the compat VDSO")

Or commit ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")


> 
>>
>>> +
>>>       /*
>>>        * Initialize tail_call_cnt if we do tail calls.
>>>        * Otherwise, put in NOPs so that it can be skipped when we are
>>> @@ -87,7 +93,7 @@ void bpf_jit_build_prologue(u32 *image, struct 
>>> codegen_context *ctx)
>>>           EMIT(PPC_RAW_NOP());
>>>       }
>>> -#define BPF_TAILCALL_PROLOGUE_SIZE    8
>>> +#define BPF_TAILCALL_PROLOGUE_SIZE    12
>>
>> Why not change that for v2 ABI only instead of adding a NOP ? ABI 
>> won't change during runtime AFAIU
> 
> Yeah, I wanted to keep this simple and I felt an additional nop 
> shouldn't matter too much. But, I guess we can get rid of 
> BPF_TAILCALL_PROLOGUE_SIZE since the only user is the function emitting 
> a tail call. I will submit that as a separate cleanup unless I need to 
> redo this series.
> 

All this makes me think about a discussion I had some time ago with Nic, 
and we ended up trying to get rid of the PPC64_ELF_ABI_v2 macro and use a 
CONFIG item instead.

For the result see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/ad0eb12f6e3f49b4a3284fc54c4c4d70c496609e.1634457599.git.christophe.leroy@csgroup.eu/

For the discussion see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fda65cda906e56aa87806b658e0828c64792403.1634190022.git.christophe.leroy@csgroup.eu/

Christophe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
  2022-01-11 14:35       ` Christophe Leroy
@ 2022-01-11 14:43         ` Christophe Leroy
  2022-01-14 11:17           ` Naveen N. Rao
  0 siblings, 1 reply; 33+ messages in thread
From: Christophe Leroy @ 2022-01-11 14:43 UTC (permalink / raw)
  To: Naveen N. Rao, Alexei Starovoitov, Daniel Borkmann, Michael Ellerman
  Cc: ykaliuta, johan.almbladh, Jiri Olsa, song, bpf, linuxppc-dev,
	Hari Bathini



Le 11/01/2022 à 15:35, Christophe Leroy a écrit :
> 
> 
> Le 11/01/2022 à 11:31, Naveen N. Rao a écrit :
>> Christophe Leroy wrote:
>>>
>>>
>>> Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
>>>> In preparation for using kernel TOC, load the same in r2 on entry. With
>>>> elfv1, the kernel TOC is already setup by our caller so we just emit a
>>>> nop. We adjust the number of instructions to skip on a tail call
>>>> accordingly.
>>>>
>>>> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
>>>> ---
>>>>    arch/powerpc/net/bpf_jit_comp64.c | 8 +++++++-
>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c
>>>> b/arch/powerpc/net/bpf_jit_comp64.c
>>>> index ce4fc59bbd6a92..e05b577d95bf11 100644
>>>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>>>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>>>> @@ -73,6 +73,12 @@ void bpf_jit_build_prologue(u32 *image, struct
>>>> codegen_context *ctx)
>>>>    {
>>>>        int i;
>>>> +#ifdef PPC64_ELF_ABI_v2
>>>> +    PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
>>>> +#else
>>>> +    EMIT(PPC_RAW_NOP());
>>>> +#endif
>>>
>>> Can we avoid the #ifdef, using
>>>
>>>      if (__is_defined(PPC64_ELF_ABI_v2))
>>>          PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
>>>      else
>>>          EMIT(PPC_RAW_NOP());
>>
>> Hmm... that doesn't work for me. Is __is_defined() expected to work with
>> macros other than CONFIG options?
> 
> Yes, __is_defined() should work with any item.
> 
> It is IS_ENABLED() which is supposed to work only with CONFIG options.
> 
> See commit 5c189c523e78 ("powerpc/time: Fix mftb()/get_tb() for use with
> the compat VDSO")
> 
> Or commit ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")

Ah ... wait.

It seems it expects a macro set to 1.

So it would require arch/powerpc/include/asm/types.h to be modified to 
define PPC64_ELF_ABI_v2 or PPC64_ELF_ABI_v1 as 1

Christophe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
  2022-01-11 14:43         ` Christophe Leroy
@ 2022-01-14 11:17           ` Naveen N. Rao
  0 siblings, 0 replies; 33+ messages in thread
From: Naveen N. Rao @ 2022-01-14 11:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Christophe Leroy, Daniel Borkmann, Michael Ellerman
  Cc: bpf, Hari Bathini, johan.almbladh, Jiri Olsa, linuxppc-dev, song,
	ykaliuta, masahiroy

Christophe Leroy wrote:
> 
> 
> Le 11/01/2022 à 15:35, Christophe Leroy a écrit :
>> 
>> 
>> Le 11/01/2022 à 11:31, Naveen N. Rao a écrit :
>>> Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 06/01/2022 à 12:45, Naveen N. Rao a écrit :
>>>>> In preparation for using kernel TOC, load the same in r2 on entry. With
>>>>> elfv1, the kernel TOC is already setup by our caller so we just emit a
>>>>> nop. We adjust the number of instructions to skip on a tail call
>>>>> accordingly.
>>>>>
>>>>> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
>>>>> ---
>>>>>    arch/powerpc/net/bpf_jit_comp64.c | 8 +++++++-
>>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c
>>>>> b/arch/powerpc/net/bpf_jit_comp64.c
>>>>> index ce4fc59bbd6a92..e05b577d95bf11 100644
>>>>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>>>>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>>>>> @@ -73,6 +73,12 @@ void bpf_jit_build_prologue(u32 *image, struct
>>>>> codegen_context *ctx)
>>>>>    {
>>>>>        int i;
>>>>> +#ifdef PPC64_ELF_ABI_v2
>>>>> +    PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
>>>>> +#else
>>>>> +    EMIT(PPC_RAW_NOP());
>>>>> +#endif
>>>>
>>>> Can we avoid the #ifdef, using
>>>>
>>>>      if (__is_defined(PPC64_ELF_ABI_v2))
>>>>          PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc));
>>>>      else
>>>>          EMIT(PPC_RAW_NOP());
>>>
>>> Hmm... that doesn't work for me. Is __is_defined() expected to work with
>>> macros other than CONFIG options?
>> 
>> Yes, __is_defined() should work with any item.
>> 
>> It is IS_ENABLED() which is supposed to work only with CONFIG options.

I suppose you are saying that due to the name? Since IS_ENABLED() and 
IS_BUILTIN() seem to work fine too, once I define the macro as 1.

Along those lines, it would have been nice to have IS_DEFINED().
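
For reference, the reason the macro has to expand to exactly 1 is the
argument-placeholder trick in include/linux/kconfig.h, roughly as below
(sketch from memory):

	#define __ARG_PLACEHOLDER_1			0,
	#define __take_second_arg(__ignored, val, ...)	val

	#define __is_defined(x)			___is_defined(x)
	#define ___is_defined(val)		____is_defined(__ARG_PLACEHOLDER_##val)
	#define ____is_defined(arg1_or_junk)	__take_second_arg(arg1_or_junk 1, 0)

If the macro is defined to 1, the token paste produces
__ARG_PLACEHOLDER_1, which expands to "0," and makes the trailing 1 the
second argument; anything else (including an empty definition) leaves a
single argument, so the 0 is picked instead.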

>> 
>> See commit 5c189c523e78 ("powerpc/time: Fix mftb()/get_tb() for use with
>> the compat VDSO")
>> 
>> Or commit ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")
> 
> Ah ... wait.
> 
> It seems it expects a macro set to 1.
> 
> So it would require arch/powerpc/include/asm/types.h to be modified to 
> define PPC64_ELF_ABI_v2 or PPC64_ELF_ABI_v1 as 1

Thanks, that works.


- Naveen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/13] powerpc/bpf: Some fixes and updates
  2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
                   ` (13 preceding siblings ...)
  2022-01-06 21:46 ` [PATCH 00/13] powerpc/bpf: Some fixes and updates Daniel Borkmann
@ 2022-01-16 10:41 ` Michael Ellerman
  14 siblings, 0 replies; 33+ messages in thread
From: Michael Ellerman @ 2022-01-16 10:41 UTC (permalink / raw)
  To: Michael Ellerman, Daniel Borkmann, Alexei Starovoitov, Naveen N. Rao
  Cc: ykaliuta, Jiri Olsa, Hari Bathini, linuxppc-dev,
	Christophe Leroy, bpf, song, johan.almbladh

On Thu, 6 Jan 2022 17:15:04 +0530, Naveen N. Rao wrote:
> A set of fixes and updates to powerpc BPF JIT:
> - Patches 1-3 fix issues with the existing powerpc JIT and are tagged
>   for -stable.
> - Patch 4 fixes a build issue with bpf selftests on powerpc.
> - Patches 5-9 handle some corner cases and make some small improvements.
> - Patches 10-13 optimize how function calls are handled in ppc64.
> 
> [...]

Patches 1-4, and 8 applied to powerpc/fixes.

[01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
        https://git.kernel.org/powerpc/c/b992f01e66150fc5e90be4a96f5eb8e634c8249e
[02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls
        https://git.kernel.org/powerpc/c/fab07611fb2e6a15fac05c4583045ca5582fd826
[03/13] powerpc/bpf: Update ldimm64 instructions during extra pass
        https://git.kernel.org/powerpc/c/f9320c49993ca3c0ec0f9a7026b313735306bb8b
[04/13] tools/bpf: Rename 'struct event' to avoid naming conflict
        https://git.kernel.org/powerpc/c/88a71086c48ae98e93c0208044827621e9717f7e
[08/13] powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06
        https://git.kernel.org/powerpc/c/3f5f766d5f7f95a69a630da3544a1a0cee1cdddf

cheers

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-01-16 10:42 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-06 11:45 [PATCH 00/13] powerpc/bpf: Some fixes and updates Naveen N. Rao
2022-01-06 11:45 ` [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack() Naveen N. Rao
2022-01-07 10:21   ` Daniel Borkmann
2022-01-10  8:57   ` Christophe Leroy
2022-01-10 10:36     ` Naveen N. Rao
2022-01-06 11:45 ` [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls Naveen N. Rao
2022-01-10  9:06   ` Christophe Leroy
2022-01-10 10:52     ` Naveen N. Rao
2022-01-06 11:45 ` [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass Naveen N. Rao
2022-01-08 14:45   ` Jiri Olsa
2022-01-10  9:27   ` Christophe Leroy
2022-01-10 10:56     ` Naveen N. Rao
2022-01-06 11:45 ` [PATCH 04/13] tools/bpf: Rename 'struct event' to avoid naming conflict Naveen N. Rao
2022-01-07 10:21   ` Daniel Borkmann
2022-01-06 11:45 ` [PATCH 05/13] powerpc/bpf: Skip branch range validation during first pass Naveen N. Rao
2022-01-06 11:45 ` [PATCH 06/13] powerpc/bpf: Emit a single branch instruction for known short branch ranges Naveen N. Rao
2022-01-06 11:45 ` [PATCH 07/13] powerpc/bpf: Handle large branch ranges with BPF_EXIT Naveen N. Rao
2022-01-06 11:45 ` [PATCH 08/13] powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06 Naveen N. Rao
2022-01-06 11:45 ` [PATCH 09/13] powerpc64/bpf: Do not save/restore LR on each call to bpf_stf_barrier() Naveen N. Rao
2022-01-06 11:45 ` [PATCH 10/13] powerpc64/bpf: Use r12 for constant blinding Naveen N. Rao
2022-01-06 11:45 ` [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry Naveen N. Rao
2022-01-10  9:20   ` Christophe Leroy
2022-01-11 10:31     ` Naveen N. Rao
2022-01-11 14:35       ` Christophe Leroy
2022-01-11 14:43         ` Christophe Leroy
2022-01-14 11:17           ` Naveen N. Rao
2022-01-06 11:45 ` [PATCH 12/13] powerpc64/bpf elfv1: Do not load TOC before calling functions Naveen N. Rao
2022-01-06 11:45 ` [PATCH 13/13] powerpc64/bpf: Optimize instruction sequence used for function calls Naveen N. Rao
2022-01-06 21:46 ` [PATCH 00/13] powerpc/bpf: Some fixes and updates Daniel Borkmann
2022-01-07  7:36   ` Naveen N. Rao
2022-01-07 10:20     ` Daniel Borkmann
2022-01-10  3:47       ` Michael Ellerman
2022-01-16 10:41 ` Michael Ellerman
