* [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc @ 2022-11-10 18:43 Hari Bathini 2022-11-10 18:43 ` [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy Hari Bathini ` (3 more replies) 0 siblings, 4 replies; 19+ messages in thread From: Hari Bathini @ 2022-11-10 18:43 UTC (permalink / raw) To: linuxppc-dev, bpf Cc: Naveen N. Rao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu, Michael Ellerman Most BPF programs are small, but they consume a page each. For systems with busy traffic and many BPF programs, this may also add significant pressure on the instruction TLB. High iTLB pressure usually slows down the whole system, causing visible performance degradation for production workloads. bpf_prog_pack, a customized allocator that packs multiple bpf programs into preallocated memory chunks, was proposed [1] to address it. This series extends this support to powerpc. Patches 1 & 2 add the arch specific functions needed to support this feature. Patch 3 enables the support for powerpc. The last patch ensures cleanup is handled gracefully. Tested the changes successfully on a PowerVM. patch_instruction(), needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it. Posting the patches in the meanwhile for feedback on these changes. [1] https://lore.kernel.org/bpf/20220204185742.271030-1-song@kernel.org/ Hari Bathini (3): powerpc/bpf: implement bpf_arch_text_copy powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] arch/powerpc/net/bpf_jit.h | 18 +-- arch/powerpc/net/bpf_jit_comp.c | 194 ++++++++++++++++++++++++------ arch/powerpc/net/bpf_jit_comp32.c | 26 ++-- arch/powerpc/net/bpf_jit_comp64.c | 32 ++--- 4 files changed, 198 insertions(+), 72 deletions(-) -- 2.37.3 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy 2022-11-10 18:43 [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Hari Bathini @ 2022-11-10 18:43 ` Hari Bathini 2022-11-13 13:17 ` Christophe Leroy 2022-11-10 18:43 ` [RFC PATCH 2/3] powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack Hari Bathini ` (2 subsequent siblings) 3 siblings, 1 reply; 19+ messages in thread From: Hari Bathini @ 2022-11-10 18:43 UTC (permalink / raw) To: linuxppc-dev, bpf Cc: Naveen N. Rao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu, Michael Ellerman bpf_arch_text_copy is used to dump JITed binary to RX page, allowing multiple BPF programs to share the same page. Using patch_instruction to implement it. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> --- arch/powerpc/net/bpf_jit_comp.c | 39 ++++++++++++++++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index 43e634126514..7383e0effad2 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -13,9 +13,12 @@ #include <linux/netdevice.h> #include <linux/filter.h> #include <linux/if_vlan.h> -#include <asm/kprobes.h> +#include <linux/memory.h> #include <linux/bpf.h> +#include <asm/kprobes.h> +#include <asm/code-patching.h> + #include "bpf_jit.h" static void bpf_jit_fill_ill_insns(void *area, unsigned int size) @@ -23,6 +26,35 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size) memset32(area, BREAKPOINT_INSTRUCTION, size / 4); } +/* + * Patch 'len' bytes of instructions from opcode to addr, one instruction + * at a time. Returns addr on success. ERR_PTR(-EINVAL), otherwise. 
+ */ +static void *bpf_patch_instructions(void *addr, void *opcode, size_t len) +{ + void *ret = ERR_PTR(-EINVAL); + size_t patched = 0; + u32 *inst = opcode; + u32 *start = addr; + + if (WARN_ON_ONCE(core_kernel_text((unsigned long)addr))) + return ret; + + mutex_lock(&text_mutex); + while (patched < len) { + if (patch_instruction(start++, ppc_inst(*inst))) + goto error; + + inst++; + patched += 4; + } + + ret = addr; +error: + mutex_unlock(&text_mutex); + return ret; +} + /* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */ static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, struct codegen_context *ctx, u32 *addrs) @@ -357,3 +389,8 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct code ctx->exentry_idx++; return 0; } + +void *bpf_arch_text_copy(void *dst, void *src, size_t len) +{ + return bpf_patch_instructions(dst, src, len); +} -- 2.37.3 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy 2022-11-10 18:43 ` [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy Hari Bathini @ 2022-11-13 13:17 ` Christophe Leroy 2022-11-14 14:54 ` Hari Bathini 0 siblings, 1 reply; 19+ messages in thread From: Christophe Leroy @ 2022-11-13 13:17 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Song Liu, Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko, Naveen N. Rao Le 10/11/2022 à 19:43, Hari Bathini a écrit : > bpf_arch_text_copy is used to dump JITed binary to RX page, allowing > multiple BPF programs to share the same page. Using patch_instruction > to implement it. Using patch_instruction() is nice for a quick implementation, but it is probably suboptimal. Due to the amount of data to be copied, it is worth a dedicated function that maps a RW copy of the page to be updated then does the copy at once with memcpy() then unmaps the page. > > Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> > --- > arch/powerpc/net/bpf_jit_comp.c | 39 ++++++++++++++++++++++++++++++++- > 1 file changed, 38 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c > index 43e634126514..7383e0effad2 100644 > --- a/arch/powerpc/net/bpf_jit_comp.c > +++ b/arch/powerpc/net/bpf_jit_comp.c > @@ -13,9 +13,12 @@ > #include <linux/netdevice.h> > #include <linux/filter.h> > #include <linux/if_vlan.h> > -#include <asm/kprobes.h> > +#include <linux/memory.h> > #include <linux/bpf.h> > > +#include <asm/kprobes.h> > +#include <asm/code-patching.h> > + > #include "bpf_jit.h" > > static void bpf_jit_fill_ill_insns(void *area, unsigned int size) > @@ -23,6 +26,35 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size) > memset32(area, BREAKPOINT_INSTRUCTION, size / 4); > } > > +/* > + * Patch 'len' bytes of instructions from opcode to addr, one instruction > + * at a time. Returns addr on success. ERR_PTR(-EINVAL), otherwise. 
> + */ > +static void *bpf_patch_instructions(void *addr, void *opcode, size_t len) > +{ > + void *ret = ERR_PTR(-EINVAL); > + size_t patched = 0; > + u32 *inst = opcode; > + u32 *start = addr; > + > + if (WARN_ON_ONCE(core_kernel_text((unsigned long)addr))) > + return ret; > + > + mutex_lock(&text_mutex); > + while (patched < len) { > + if (patch_instruction(start++, ppc_inst(*inst))) > + goto error; > + > + inst++; > + patched += 4; > + } > + > + ret = addr; > +error: > + mutex_unlock(&text_mutex); > + return ret; > +} > + > /* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */ > static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, > struct codegen_context *ctx, u32 *addrs) > @@ -357,3 +389,8 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct code > ctx->exentry_idx++; > return 0; > } > + > +void *bpf_arch_text_copy(void *dst, void *src, size_t len) > +{ > + return bpf_patch_instructions(dst, src, len); > +} I can't see the added value of having two functions when the first one just calls the second one and is the only user of it. Why not have implemented bpf_patch_instructions() directly inside bpf_arch_text_copy() ? By the way, it can be nice to have two functions, but split them differently, to avoid the goto: etc .... I also prefer using for loops instead of while loops. 
It could have looked like below (untested): static void *bpf_patch_instructions(void *addr, void *opcode, size_t len) { u32 *inst = opcode; u32 *start = addr; u32 *end = addr + len; for (inst = opcode, start = addr; start < end; inst++, start++) { if (patch_instruction(start, ppc_inst(*inst))) return ERR_PTR(-EINVAL); } return addr; } void *bpf_arch_text_copy(void *dst, void *src, size_t len) { void *ret; if (WARN_ON_ONCE(core_kernel_text((unsigned long)dst))) return ERR_PTR(-EINVAL); mutex_lock(&text_mutex); ret = bpf_patch_instructions(dst, src, len); mutex_unlock(&text_mutex); return ret; } ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy 2022-11-13 13:17 ` Christophe Leroy @ 2022-11-14 14:54 ` Hari Bathini 0 siblings, 0 replies; 19+ messages in thread From: Hari Bathini @ 2022-11-14 14:54 UTC (permalink / raw) To: Christophe Leroy, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko On 13/11/22 6:47 pm, Christophe Leroy wrote: > Le 10/11/2022 à 19:43, Hari Bathini a écrit : >> bpf_arch_text_copy is used to dump JITed binary to RX page, allowing >> multiple BPF programs to share the same page. Using patch_instruction >> to implement it. > > Using patch_instruction() is nice for a quick implementation, but it is > probably suboptimal. Due to the amount of data to be copied, it is worth Yeah. > a dedicated function that maps a RW copy of the page to be updated then > does the copy at once with memcpy() then unmaps the page. I will see if I can come up with such implementation for the respin. > >> >> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> >> --- >> arch/powerpc/net/bpf_jit_comp.c | 39 ++++++++++++++++++++++++++++++++- >> 1 file changed, 38 insertions(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c >> index 43e634126514..7383e0effad2 100644 >> --- a/arch/powerpc/net/bpf_jit_comp.c >> +++ b/arch/powerpc/net/bpf_jit_comp.c >> @@ -13,9 +13,12 @@ >> #include <linux/netdevice.h> >> #include <linux/filter.h> >> #include <linux/if_vlan.h> >> -#include <asm/kprobes.h> >> +#include <linux/memory.h> >> #include <linux/bpf.h> >> >> +#include <asm/kprobes.h> >> +#include <asm/code-patching.h> >> + >> #include "bpf_jit.h" >> >> static void bpf_jit_fill_ill_insns(void *area, unsigned int size) >> @@ -23,6 +26,35 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size) >> memset32(area, BREAKPOINT_INSTRUCTION, size / 4); >> } >> >> +/* >> + * Patch 'len' bytes of instructions from opcode to addr, one instruction >> + * at a 
time. Returns addr on success. ERR_PTR(-EINVAL), otherwise. >> + */ >> +static void *bpf_patch_instructions(void *addr, void *opcode, size_t len) >> +{ >> + void *ret = ERR_PTR(-EINVAL); >> + size_t patched = 0; >> + u32 *inst = opcode; >> + u32 *start = addr; >> + >> + if (WARN_ON_ONCE(core_kernel_text((unsigned long)addr))) >> + return ret; >> + >> + mutex_lock(&text_mutex); >> + while (patched < len) { >> + if (patch_instruction(start++, ppc_inst(*inst))) >> + goto error; >> + >> + inst++; >> + patched += 4; >> + } >> + >> + ret = addr; >> +error: >> + mutex_unlock(&text_mutex); >> + return ret; >> +} >> + >> /* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */ >> static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, >> struct codegen_context *ctx, u32 *addrs) >> @@ -357,3 +389,8 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct code >> ctx->exentry_idx++; >> return 0; >> } >> + >> +void *bpf_arch_text_copy(void *dst, void *src, size_t len) >> +{ >> + return bpf_patch_instructions(dst, src, len); >> +} > > I can't see the added value of having two functions when the first one > just calls the second one and is the only user of it. Why not have > implemented bpf_patch_instructions() directly inside bpf_arch_text_copy() ? > > By the way, it can be nice to have two functions, but split them > differently, to avoid the goto: etc .... > > I also prefer using for loops instead of while loops. 
> > It could have looked like below (untested): > > static void *bpf_patch_instructions(void *addr, void *opcode, size_t len) > { > u32 *inst = opcode; > u32 *start = addr; > u32 *end = addr + len; > > for (inst = opcode, start = addr; start < end; inst++, start++) { > if (patch_instruction(start, ppc_inst(*inst))) > return ERR_PTR(-EINVAL); > } > > return addr; > } > > void *bpf_arch_text_copy(void *dst, void *src, size_t len) > { > void *ret; > > if (WARN_ON_ONCE(core_kernel_text((unsigned long)dst))) > return ERR_PTR(-EINVAL); > > mutex_lock(&text_mutex); > > ret = bpf_patch_instructions(dst, src, len); > > mutex_unlock(&text_mutex); > > return ret; > } > > Sure. Will use this. Thanks Hari ^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 2/3] powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack 2022-11-10 18:43 [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Hari Bathini 2022-11-10 18:43 ` [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy Hari Bathini @ 2022-11-10 18:43 ` Hari Bathini 2022-11-13 18:00 ` Christophe Leroy 2022-11-10 18:43 ` [RFC PATCH 3/3] powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] Hari Bathini 2022-11-11 11:25 ` [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Christophe Leroy 3 siblings, 1 reply; 19+ messages in thread From: Hari Bathini @ 2022-11-10 18:43 UTC (permalink / raw) To: linuxppc-dev, bpf Cc: Naveen N. Rao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu, Michael Ellerman Implement bpf_arch_text_invalidate and use it to fill unused part of the bpf_prog_pack with trap instructions when a BPF program is freed. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> --- arch/powerpc/net/bpf_jit_comp.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index 7383e0effad2..f925755cd249 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -26,6 +26,33 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size) memset32(area, BREAKPOINT_INSTRUCTION, size / 4); } +/* + * Patch 'len' bytes with trap instruction at addr, one instruction + * at a time. Returns addr on success. ERR_PTR(-EINVAL), otherwise. 
+ */ +static void *bpf_patch_ill_insns(void *addr, size_t len) +{ + void *ret = ERR_PTR(-EINVAL); + size_t patched = 0; + u32 *start = addr; + + if (WARN_ON_ONCE(core_kernel_text((unsigned long)addr))) + return ret; + + mutex_lock(&text_mutex); + while (patched < len) { + if (patch_instruction(start++, ppc_inst(PPC_RAW_TRAP()))) + goto error; + + patched += 4; + } + + ret = addr; +error: + mutex_unlock(&text_mutex); + return ret; +} + /* * Patch 'len' bytes of instructions from opcode to addr, one instruction * at a time. Returns addr on success. ERR_PTR(-EINVAL), otherwise. @@ -394,3 +421,8 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len) { return bpf_patch_instructions(dst, src, len); } + +int bpf_arch_text_invalidate(void *dst, size_t len) +{ + return IS_ERR(bpf_patch_ill_insns(dst, len)); +} -- 2.37.3 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 2/3] powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack 2022-11-10 18:43 ` [RFC PATCH 2/3] powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack Hari Bathini @ 2022-11-13 18:00 ` Christophe Leroy 0 siblings, 0 replies; 19+ messages in thread From: Christophe Leroy @ 2022-11-13 18:00 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Song Liu, Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko, Naveen N. Rao Le 10/11/2022 à 19:43, Hari Bathini a écrit : > Implement bpf_arch_text_invalidate and use it to fill unused part of > the bpf_prog_pack with trap instructions when a BPF program is freed. Same here, although patch_instruction() is nice for a first try, it is not the solution in the long run. Same as with previous patch, it should just map the necessary page by allocating a vma area then mapping the associated physical pages over it using map_kernel_page(), then use bpf_jit_fill_ill_insns() over that page. > > Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> > --- > arch/powerpc/net/bpf_jit_comp.c | 32 ++++++++++++++++++++++++++++++++ > 1 file changed, 32 insertions(+) > > diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c > index 7383e0effad2..f925755cd249 100644 > --- a/arch/powerpc/net/bpf_jit_comp.c > +++ b/arch/powerpc/net/bpf_jit_comp.c > @@ -26,6 +26,33 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size) > memset32(area, BREAKPOINT_INSTRUCTION, size / 4); > } > > +/* > + * Patch 'len' bytes with trap instruction at addr, one instruction > + * at a time. Returns addr on success. ERR_PTR(-EINVAL), otherwise.
> + */ > +static void *bpf_patch_ill_insns(void *addr, size_t len) > +{ > + void *ret = ERR_PTR(-EINVAL); > + size_t patched = 0; > + u32 *start = addr; > + > + if (WARN_ON_ONCE(core_kernel_text((unsigned long)addr))) > + return ret; > + > + mutex_lock(&text_mutex); > + while (patched < len) { > + if (patch_instruction(start++, ppc_inst(PPC_RAW_TRAP()))) Use BREAKPOINT_INSTRUCTION instead of PPC_RAW_TRAP() > + goto error; > + > + patched += 4; > + } > + > + ret = addr; > +error: > + mutex_unlock(&text_mutex); > + return ret; > +} > + > /* > * Patch 'len' bytes of instructions from opcode to addr, one instruction > * at a time. Returns addr on success. ERR_PTR(-EINVAL), otherwise. > @@ -394,3 +421,8 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len) > { > return bpf_patch_instructions(dst, src, len); > } > + > +int bpf_arch_text_invalidate(void *dst, size_t len) > +{ > + return IS_ERR(bpf_patch_ill_insns(dst, len)); > +} The exact same split between bpf_arch_text_invalidate() and bpf_patch_ill_insns() as previous patch could be done here. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 3/3] powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] 2022-11-10 18:43 [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Hari Bathini 2022-11-10 18:43 ` [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy Hari Bathini 2022-11-10 18:43 ` [RFC PATCH 2/3] powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack Hari Bathini @ 2022-11-10 18:43 ` Hari Bathini 2022-11-13 18:36 ` Christophe Leroy 2022-11-11 11:25 ` [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Christophe Leroy 3 siblings, 1 reply; 19+ messages in thread From: Hari Bathini @ 2022-11-10 18:43 UTC (permalink / raw) To: linuxppc-dev, bpf Cc: Naveen N. Rao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu, Michael Ellerman Use bpf_jit_binary_pack_alloc in powerpc jit. The jit engine first writes the program to the rw buffer. When the jit is done, the program is copied to the final location with bpf_jit_binary_pack_finalize. With multiple jit_subprogs, bpf_jit_free is called on some subprograms that haven't gone through bpf_jit_binary_pack_finalize() yet. Implement custom bpf_jit_free() like in commit 1d5f82d9dd47 ("bpf, x86: fix freeing of not-finalized bpf_prog_pack") to call bpf_jit_binary_pack_finalize(), if necessary. While here, correct the misnomer powerpc64_jit_data to powerpc_jit_data as it is meant for both ppc32 and ppc64.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> --- arch/powerpc/net/bpf_jit.h | 18 +++-- arch/powerpc/net/bpf_jit_comp.c | 123 +++++++++++++++++++++--------- arch/powerpc/net/bpf_jit_comp32.c | 26 +++---- arch/powerpc/net/bpf_jit_comp64.c | 32 ++++---- 4 files changed, 128 insertions(+), 71 deletions(-) diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h index a4f7880f959d..e314d6a23bce 100644 --- a/arch/powerpc/net/bpf_jit.h +++ b/arch/powerpc/net/bpf_jit.h @@ -21,7 +21,7 @@ #define PLANT_INSTR(d, idx, instr) \ do { if (d) { (d)[idx] = instr; } idx++; } while (0) -#define EMIT(instr) PLANT_INSTR(image, ctx->idx, instr) +#define EMIT(instr) PLANT_INSTR(rw_image, ctx->idx, instr) /* Long jump; (unconditional 'branch') */ #define PPC_JMP(dest) \ @@ -167,16 +167,18 @@ static inline void bpf_clear_seen_register(struct codegen_context *ctx, int i) } void bpf_jit_init_reg_mapping(struct codegen_context *ctx); -int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func); -int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *ctx, +int bpf_jit_emit_func_call_rel(u32 *image, u32 *rw_image, struct codegen_context *ctx, u64 func); +int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *rw_image, struct codegen_context *ctx, u32 *addrs, int pass); -void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx); -void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx); +void bpf_jit_build_prologue(u32 *image, u32 *rw_image, struct codegen_context *ctx); +void bpf_jit_build_epilogue(u32 *image, u32 *rw_image, struct codegen_context *ctx); void bpf_jit_realloc_regs(struct codegen_context *ctx); -int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, long exit_addr); +int bpf_jit_emit_exit_insn(u32 *image, u32 *rw_image, struct codegen_context *ctx, + int tmp_reg, long exit_addr); -int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct 
codegen_context *ctx, - int insn_idx, int jmp_off, int dst_reg); +int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *rw_image, int pass, + struct codegen_context *ctx, int insn_idx, + int jmp_off, int dst_reg); #endif diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index f925755cd249..c4c1f7a21d89 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -83,7 +83,7 @@ static void *bpf_patch_instructions(void *addr, void *opcode, size_t len) } /* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */ -static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, +static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, u32 *rw_image, struct codegen_context *ctx, u32 *addrs) { const struct bpf_insn *insn = fp->insnsi; @@ -118,7 +118,7 @@ static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, */ tmp_idx = ctx->idx; ctx->idx = addrs[i] / 4; - ret = bpf_jit_emit_func_call_rel(image, ctx, func_addr); + ret = bpf_jit_emit_func_call_rel(image, rw_image, ctx, func_addr); if (ret) return ret; @@ -150,7 +150,8 @@ static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, return 0; } -int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, long exit_addr) +int bpf_jit_emit_exit_insn(u32 *image, u32 *rw_image, struct codegen_context *ctx, + int tmp_reg, long exit_addr) { if (!exit_addr || is_offset_in_branch_range(exit_addr - (ctx->idx * 4))) { PPC_JMP(exit_addr); @@ -160,13 +161,14 @@ int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, PPC_JMP(ctx->alt_exit_addr); } else { ctx->alt_exit_addr = ctx->idx * 4; - bpf_jit_build_epilogue(image, ctx); + bpf_jit_build_epilogue(image, rw_image, ctx); } return 0; } -struct powerpc64_jit_data { +struct powerpc_jit_data { + struct bpf_binary_header *rw_header; struct bpf_binary_header *header; u32 *addrs; u8 *image; @@ -181,22 +183,25 @@ bool 
bpf_jit_needs_zext(void) struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) { - u32 proglen; - u32 alloclen; - u8 *image = NULL; - u32 *code_base; - u32 *addrs; - struct powerpc64_jit_data *jit_data; - struct codegen_context cgctx; - int pass; - int flen; + struct bpf_binary_header *rw_header = NULL; + struct powerpc_jit_data *jit_data; struct bpf_binary_header *bpf_hdr; + struct codegen_context cgctx; struct bpf_prog *org_fp = fp; struct bpf_prog *tmp_fp; bool bpf_blinded = false; bool extra_pass = false; + u8 *rw_image = NULL; + u32 *rw_code_base; + u8 *image = NULL; u32 extable_len; + u32 *code_base; u32 fixup_len; + u32 alloclen; + u32 proglen; + u32 *addrs; + int pass; + int flen; if (!fp->jit_requested) return org_fp; @@ -227,6 +232,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) image = jit_data->image; bpf_hdr = jit_data->header; proglen = jit_data->proglen; + rw_header = jit_data->rw_header; + rw_image = (void *)rw_header + ((void *)image - (void *)bpf_hdr); extra_pass = true; goto skip_init_ctx; } @@ -244,7 +251,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) cgctx.stack_size = round_up(fp->aux->stack_depth, 16); /* Scouting faux-generate pass 0 */ - if (bpf_jit_build_body(fp, 0, &cgctx, addrs, 0)) { + if (bpf_jit_build_body(fp, 0, 0, &cgctx, addrs, 0)) { /* We hit something illegal or unsupported. */ fp = org_fp; goto out_addrs; @@ -259,7 +266,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) */ if (cgctx.seen & SEEN_TAILCALL || !is_offset_in_branch_range((long)cgctx.idx * 4)) { cgctx.idx = 0; - if (bpf_jit_build_body(fp, 0, &cgctx, addrs, 0)) { + if (bpf_jit_build_body(fp, 0, 0, &cgctx, addrs, 0)) { fp = org_fp; goto out_addrs; } @@ -271,9 +278,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) * update ctgtx.idx as it pretends to output instructions, then we can * calculate total size from idx. 
*/ - bpf_jit_build_prologue(0, &cgctx); + bpf_jit_build_prologue(0, 0, &cgctx); addrs[fp->len] = cgctx.idx * 4; - bpf_jit_build_epilogue(0, &cgctx); + bpf_jit_build_epilogue(0, 0, &cgctx); fixup_len = fp->aux->num_exentries * BPF_FIXUP_LEN * 4; extable_len = fp->aux->num_exentries * sizeof(struct exception_table_entry); @@ -281,7 +288,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) proglen = cgctx.idx * 4; alloclen = proglen + FUNCTION_DESCR_SIZE + fixup_len + extable_len; - bpf_hdr = bpf_jit_binary_alloc(alloclen, &image, 4, bpf_jit_fill_ill_insns); + bpf_hdr = bpf_jit_binary_pack_alloc(alloclen, &image, 4, &rw_header, &rw_image, + bpf_jit_fill_ill_insns); if (!bpf_hdr) { fp = org_fp; goto out_addrs; @@ -292,6 +300,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) skip_init_ctx: code_base = (u32 *)(image + FUNCTION_DESCR_SIZE); + rw_code_base = (u32 *)(rw_image + FUNCTION_DESCR_SIZE); if (extra_pass) { /* @@ -303,7 +312,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) * call instruction sequences and hence, the size of the JITed * image as well. */ - bpf_jit_fixup_addresses(fp, code_base, &cgctx, addrs); + bpf_jit_fixup_addresses(fp, code_base, rw_code_base, &cgctx, addrs); /* There is no need to perform the usual passes. */ goto skip_codegen_passes; @@ -314,13 +323,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) /* Now build the prologue, body code & epilogue for real. 
*/ cgctx.idx = 0; cgctx.alt_exit_addr = 0; - bpf_jit_build_prologue(code_base, &cgctx); - if (bpf_jit_build_body(fp, code_base, &cgctx, addrs, pass)) { - bpf_jit_binary_free(bpf_hdr); + bpf_jit_build_prologue(code_base, rw_code_base, &cgctx); + if (bpf_jit_build_body(fp, code_base, rw_code_base, &cgctx, addrs, pass)) { + bpf_arch_text_copy(&bpf_hdr->size, &rw_header->size, + sizeof(rw_header->size)); + bpf_jit_binary_pack_free(bpf_hdr, rw_header); fp = org_fp; goto out_addrs; } - bpf_jit_build_epilogue(code_base, &cgctx); + bpf_jit_build_epilogue(code_base, rw_code_base, &cgctx); if (bpf_jit_enable > 1) pr_info("Pass %d: shrink = %d, seen = 0x%x\n", pass, @@ -337,17 +348,26 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) #ifdef CONFIG_PPC64_ELF_ABI_V1 /* Function descriptor nastiness: Address + TOC */ - ((u64 *)image)[0] = (u64)code_base; - ((u64 *)image)[1] = local_paca->kernel_toc; + ((u64 *)rw_image)[0] = (u64)code_base; + ((u64 *)rw_image)[1] = local_paca->kernel_toc; #endif fp->bpf_func = (void *)image; fp->jited = 1; fp->jited_len = proglen + FUNCTION_DESCR_SIZE; - bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + bpf_hdr->size); if (!fp->is_func || extra_pass) { - bpf_jit_binary_lock_ro(bpf_hdr); + /* + * bpf_jit_binary_pack_finalize fails in two scenarios: + * 1) header is not pointing to proper module memory; + * 2) the arch doesn't support bpf_arch_text_copy(). + * + * Both cases are serious bugs that justify WARN_ON. 
+ */ + if (WARN_ON(bpf_jit_binary_pack_finalize(fp, bpf_hdr, rw_header))) { + fp = org_fp; + goto out_addrs; + } bpf_prog_fill_jited_linfo(fp, addrs); out_addrs: kfree(addrs); @@ -359,6 +379,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) jit_data->proglen = proglen; jit_data->image = image; jit_data->header = bpf_hdr; + jit_data->rw_header = rw_header; } out: @@ -372,12 +393,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) * The caller should check for (BPF_MODE(code) == BPF_PROBE_MEM) before calling * this function, as this only applies to BPF_PROBE_MEM, for now. */ -int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct codegen_context *ctx, - int insn_idx, int jmp_off, int dst_reg) +int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *rw_image, int pass, + struct codegen_context *ctx, int insn_idx, + int jmp_off, int dst_reg) { - off_t offset; + struct exception_table_entry *ex, *rw_extable; unsigned long pc; - struct exception_table_entry *ex; + off_t offset; u32 *fixup; /* Populate extable entries only in the last pass */ @@ -388,9 +410,16 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct code WARN_ON_ONCE(ctx->exentry_idx >= fp->aux->num_exentries)) return -EINVAL; - pc = (unsigned long)&image[insn_idx]; + /* + * Program is first written to rw_image before copying to the + * final location. Accordingly, update in the rw_image first. + * As all offsets used are relative, copying as is to the + * final location should be alright.
+ */ + pc = (unsigned long)&rw_image[insn_idx]; + rw_extable = (void *)fp->aux->extable - (void *)image + (void *)rw_image; - fixup = (void *)fp->aux->extable - + fixup = (void *)rw_extable - (fp->aux->num_exentries * BPF_FIXUP_LEN * 4) + (ctx->exentry_idx * BPF_FIXUP_LEN * 4); @@ -401,7 +430,7 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct code fixup[BPF_FIXUP_LEN - 1] = PPC_RAW_BRANCH((long)(pc + jmp_off) - (long)&fixup[BPF_FIXUP_LEN - 1]); - ex = &fp->aux->extable[ctx->exentry_idx]; + ex = &rw_extable[ctx->exentry_idx]; offset = pc - (long)&ex->insn; if (WARN_ON_ONCE(offset >= 0 || offset < INT_MIN)) @@ -426,3 +455,27 @@ int bpf_arch_text_invalidate(void *dst, size_t len) { return IS_ERR(bpf_patch_ill_insns(dst, len)); } + +void bpf_jit_free(struct bpf_prog *fp) +{ + if (fp->jited) { + struct powerpc_jit_data *jit_data = fp->aux->jit_data; + struct bpf_binary_header *hdr; + + /* + * If we fail the final pass of JIT (from jit_subprogs), + * the program may not be finalized yet. Call finalize here + * before freeing it. 
+ */ + if (jit_data) { + bpf_jit_binary_pack_finalize(fp, jit_data->header, jit_data->rw_header); + kvfree(jit_data->addrs); + kfree(jit_data); + } + hdr = bpf_jit_binary_pack_hdr(fp); + bpf_jit_binary_pack_free(hdr, NULL); + WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(fp)); + } + + bpf_prog_unlock_free(fp); +} diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c index 43f1c76d48ce..047ef627c2aa 100644 --- a/arch/powerpc/net/bpf_jit_comp32.c +++ b/arch/powerpc/net/bpf_jit_comp32.c @@ -109,7 +109,7 @@ void bpf_jit_realloc_regs(struct codegen_context *ctx) } } -void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx) +void bpf_jit_build_prologue(u32 *image, u32 *rw_image, struct codegen_context *ctx) { int i; @@ -162,7 +162,7 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx) EMIT(PPC_RAW_STW(_R0, _R1, BPF_PPC_STACKFRAME(ctx) + PPC_LR_STKOFF)); } -static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx) +static void bpf_jit_emit_common_epilogue(u32 *image, u32 *rw_image, struct codegen_context *ctx) { int i; @@ -172,11 +172,11 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx EMIT(PPC_RAW_LWZ(i, _R1, bpf_jit_stack_offsetof(ctx, i))); } -void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) +void bpf_jit_build_epilogue(u32 *image, u32 *rw_image, struct codegen_context *ctx) { EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_0))); - bpf_jit_emit_common_epilogue(image, ctx); + bpf_jit_emit_common_epilogue(image, rw_image, ctx); /* Tear down our stack frame */ @@ -191,7 +191,7 @@ void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) EMIT(PPC_RAW_BLR()); } -int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func) +int bpf_jit_emit_func_call_rel(u32 *image, u32 *rw_image, struct codegen_context *ctx, u64 func) { s32 rel = (s32)func - (s32)(image + ctx->idx); @@ -211,7 +211,7 @@ int 
bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func return 0; } -static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 out) +static int bpf_jit_emit_tail_call(u32 *image, u32 *rw_image, struct codegen_context *ctx, u32 out) { /* * By now, the eBPF program has already setup parameters in r3-r6 @@ -269,7 +269,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_1))); /* tear restore NVRs, ... */ - bpf_jit_emit_common_epilogue(image, ctx); + bpf_jit_emit_common_epilogue(image, rw_image, ctx); EMIT(PPC_RAW_BCTR()); @@ -278,7 +278,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o } /* Assemble the body code between the prologue & epilogue */ -int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *ctx, +int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *rw_image, struct codegen_context *ctx, u32 *addrs, int pass) { const struct bpf_insn *insn = fp->insnsi; @@ -954,8 +954,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * jmp_off += 4; } - ret = bpf_add_extable_entry(fp, image, pass, ctx, insn_idx, - jmp_off, dst_reg); + ret = bpf_add_extable_entry(fp, image, rw_image, pass, ctx, + insn_idx, jmp_off, dst_reg); if (ret) return ret; } @@ -986,7 +986,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * * we'll just fall through to the epilogue. 
*/ if (i != flen - 1) { - ret = bpf_jit_emit_exit_insn(image, ctx, _R0, exit_addr); + ret = bpf_jit_emit_exit_insn(image, rw_image, ctx, _R0, exit_addr); if (ret) return ret; } @@ -1009,7 +1009,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * EMIT(PPC_RAW_STW(bpf_to_ppc(BPF_REG_5), _R1, 12)); } - ret = bpf_jit_emit_func_call_rel(image, ctx, func_addr); + ret = bpf_jit_emit_func_call_rel(image, rw_image, ctx, func_addr); if (ret) return ret; @@ -1231,7 +1231,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * */ case BPF_JMP | BPF_TAIL_CALL: ctx->seen |= SEEN_TAILCALL; - ret = bpf_jit_emit_tail_call(image, ctx, addrs[i + 1]); + ret = bpf_jit_emit_tail_call(image, rw_image, ctx, addrs[i + 1]); if (ret < 0) return ret; break; diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c index 29ee306d6302..f15bc20909d8 100644 --- a/arch/powerpc/net/bpf_jit_comp64.c +++ b/arch/powerpc/net/bpf_jit_comp64.c @@ -122,7 +122,7 @@ void bpf_jit_realloc_regs(struct codegen_context *ctx) { } -void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx) +void bpf_jit_build_prologue(u32 *image, u32 *rw_image, struct codegen_context *ctx) { int i; @@ -171,7 +171,7 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx) STACK_FRAME_MIN_SIZE + ctx->stack_size)); } -static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx) +static void bpf_jit_emit_common_epilogue(u32 *image, u32 *rw_image, struct codegen_context *ctx) { int i; @@ -190,9 +190,9 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx } } -void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) +void bpf_jit_build_epilogue(u32 *image, u32 *rw_image, struct codegen_context *ctx) { - bpf_jit_emit_common_epilogue(image, ctx); + bpf_jit_emit_common_epilogue(image, rw_image, ctx); /* Move result to r3 */ EMIT(PPC_RAW_MR(_R3, 
bpf_to_ppc(BPF_REG_0))); @@ -200,7 +200,8 @@ void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) EMIT(PPC_RAW_BLR()); } -static int bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx, u64 func) +static int bpf_jit_emit_func_call_hlp(u32 *image, u32 *rw_image, struct codegen_context *ctx, + u64 func) { unsigned long func_addr = func ? ppc_function_entry((void *)func) : 0; long reladdr; @@ -222,7 +223,7 @@ static int bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx, u return 0; } -int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func) +int bpf_jit_emit_func_call_rel(u32 *image, u32 *rw_image, struct codegen_context *ctx, u64 func) { unsigned int i, ctx_idx = ctx->idx; @@ -254,7 +255,7 @@ int bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, u64 func return 0; } -static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 out) +static int bpf_jit_emit_tail_call(u32 *image, u32 *rw_image, struct codegen_context *ctx, u32 out) { /* * By now, the eBPF program has already setup parameters in r3, r4 and r5 @@ -311,7 +312,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o EMIT(PPC_RAW_MTCTR(bpf_to_ppc(TMP_REG_1))); /* tear down stack, restore NVRs, ... 
*/ - bpf_jit_emit_common_epilogue(image, ctx); + bpf_jit_emit_common_epilogue(image, rw_image, ctx); EMIT(PPC_RAW_BCTR()); @@ -342,7 +343,7 @@ asm ( ); /* Assemble the body code between the prologue & epilogue */ -int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *ctx, +int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *rw_image, struct codegen_context *ctx, u32 *addrs, int pass) { enum stf_barrier_type stf_barrier = stf_barrier_type_get(); @@ -921,8 +922,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * addrs[++i] = ctx->idx * 4; if (BPF_MODE(code) == BPF_PROBE_MEM) { - ret = bpf_add_extable_entry(fp, image, pass, ctx, ctx->idx - 1, - 4, dst_reg); + ret = bpf_add_extable_entry(fp, image, rw_image, pass, ctx, + ctx->idx - 1, 4, dst_reg); if (ret) return ret; } @@ -954,7 +955,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * * we'll just fall through to the epilogue. */ if (i != flen - 1) { - ret = bpf_jit_emit_exit_insn(image, ctx, tmp1_reg, exit_addr); + ret = bpf_jit_emit_exit_insn(image, rw_image, ctx, + tmp1_reg, exit_addr); if (ret) return ret; } @@ -973,9 +975,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * return ret; if (func_addr_fixed) - ret = bpf_jit_emit_func_call_hlp(image, ctx, func_addr); + ret = bpf_jit_emit_func_call_hlp(image, rw_image, ctx, func_addr); else - ret = bpf_jit_emit_func_call_rel(image, ctx, func_addr); + ret = bpf_jit_emit_func_call_rel(image, rw_image, ctx, func_addr); if (ret) return ret; @@ -1184,7 +1186,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * */ case BPF_JMP | BPF_TAIL_CALL: ctx->seen |= SEEN_TAILCALL; - ret = bpf_jit_emit_tail_call(image, ctx, addrs[i + 1]); + ret = bpf_jit_emit_tail_call(image, rw_image, ctx, addrs[i + 1]); if (ret < 0) return ret; break; -- 2.37.3 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 3/3] powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] 2022-11-10 18:43 ` [RFC PATCH 3/3] powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] Hari Bathini @ 2022-11-13 18:36 ` Christophe Leroy 0 siblings, 0 replies; 19+ messages in thread From: Christophe Leroy @ 2022-11-13 18:36 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Song Liu, Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko, Naveen N. Rao Le 10/11/2022 à 19:43, Hari Bathini a écrit : > Use bpf_jit_binary_pack_alloc in powerpc jit. The jit engine first > writes the program to the rw buffer. When the jit is done, the program > is copied to the final location with bpf_jit_binary_pack_finalize. > With multiple jit_subprogs, bpf_jit_free is called on some subprograms > that haven't got bpf_jit_binary_pack_finalize() yet. Implement custom > bpf_jit_free() like in commit 1d5f82d9dd47 ("bpf, x86: fix freeing of > not-finalized bpf_prog_pack") to call bpf_jit_binary_pack_finalize(), > if necessary. While here, correct the misnomer powerpc64_jit_data to > powerpc_jit_data as it is meant for both ppc32 and ppc64. This patch looks heavy compared to x86 commit 1022a5498f6f. I didn't look into details, is there really a need to carry that rw_image over all functions you changed ? As far as I can see, ok you need it for the EMIT macro. But then some of the functions that use EMIT will now use rw_image instead of image, so why do they need both image and rw_image ? Maybe you'd have less churn if you leave image, and add a ro_image wherever necessary but not everywhere.
> > Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> > --- > arch/powerpc/net/bpf_jit.h | 18 +++-- > arch/powerpc/net/bpf_jit_comp.c | 123 +++++++++++++++++++++--------- > arch/powerpc/net/bpf_jit_comp32.c | 26 +++---- > arch/powerpc/net/bpf_jit_comp64.c | 32 ++++---- > 4 files changed, 128 insertions(+), 71 deletions(-) > > diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c > index f925755cd249..c4c1f7a21d89 100644 > --- a/arch/powerpc/net/bpf_jit_comp.c > +++ b/arch/powerpc/net/bpf_jit_comp.c > @@ -181,22 +183,25 @@ bool bpf_jit_needs_zext(void) > > struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) > { > - u32 proglen; > - u32 alloclen; > - u8 *image = NULL; > - u32 *code_base; > - u32 *addrs; > - struct powerpc64_jit_data *jit_data; > - struct codegen_context cgctx; > - int pass; > - int flen; > + struct bpf_binary_header *rw_header = NULL; > + struct powerpc_jit_data *jit_data; > struct bpf_binary_header *bpf_hdr; > + struct codegen_context cgctx; > struct bpf_prog *org_fp = fp; > struct bpf_prog *tmp_fp; > bool bpf_blinded = false; > bool extra_pass = false; > + u8 *rw_image = NULL; > + u32 *rw_code_base; > + u8 *image = NULL; > u32 extable_len; > + u32 *code_base; > u32 fixup_len; > + u32 alloclen; > + u32 proglen; > + u32 *addrs; > + int pass; > + int flen; Why so many changes here? A lot of items seem to have only moved without any modification. Why that churn ?
> > if (!fp->jit_requested) > return org_fp; > @@ -227,6 +232,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) > image = jit_data->image; > bpf_hdr = jit_data->header; > proglen = jit_data->proglen; > + rw_header = jit_data->rw_header; > + rw_image = (void *)rw_header + ((void *)image - (void *)bpf_hdr); > extra_pass = true; > goto skip_init_ctx; > } > @@ -244,7 +251,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) > cgctx.stack_size = round_up(fp->aux->stack_depth, 16); > > /* Scouting faux-generate pass 0 */ > - if (bpf_jit_build_body(fp, 0, &cgctx, addrs, 0)) { > + if (bpf_jit_build_body(fp, 0, 0, &cgctx, addrs, 0)) { Some of the 0s in this call are pointers. You should use NULL instead. This comment applies to several other lines you have changed. > /* We hit something illegal or unsupported. */ > fp = org_fp; > goto out_addrs; > @@ -259,7 +266,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) > */ > if (cgctx.seen & SEEN_TAILCALL || !is_offset_in_branch_range((long)cgctx.idx * 4)) { > cgctx.idx = 0; > - if (bpf_jit_build_body(fp, 0, &cgctx, addrs, 0)) { > + if (bpf_jit_build_body(fp, 0, 0, &cgctx, addrs, 0)) { 0 ==> NULL > fp = org_fp; > goto out_addrs; > } > @@ -271,9 +278,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) > * update ctgtx.idx as it pretends to output instructions, then we can > * calculate total size from idx. 
> */ > - bpf_jit_build_prologue(0, &cgctx); > + bpf_jit_build_prologue(0, 0, &cgctx); > addrs[fp->len] = cgctx.idx * 4; > - bpf_jit_build_epilogue(0, &cgctx); > + bpf_jit_build_epilogue(0, 0, &cgctx); 0 ==> NULL > > fixup_len = fp->aux->num_exentries * BPF_FIXUP_LEN * 4; > extable_len = fp->aux->num_exentries * sizeof(struct exception_table_entry); > @@ -337,17 +348,26 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) > > #ifdef CONFIG_PPC64_ELF_ABI_V1 > /* Function descriptor nastiness: Address + TOC */ > - ((u64 *)image)[0] = (u64)code_base; > - ((u64 *)image)[1] = local_paca->kernel_toc; > + ((u64 *)rw_image)[0] = (u64)code_base; > + ((u64 *)rw_image)[1] = local_paca->kernel_toc; Would be better to use 'struct func_desc'. And the #ifdef is not necessary, IS_ENABLED() would be enough. > #endif > > fp->bpf_func = (void *)image; > fp->jited = 1; > fp->jited_len = proglen + FUNCTION_DESCR_SIZE; > > - bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + bpf_hdr->size); > if (!fp->is_func || extra_pass) { > - bpf_jit_binary_lock_ro(bpf_hdr); > + /* > + * bpf_jit_binary_pack_finalize fails in two scenarios: > + * 1) header is not pointing to proper module memory; Can that really happen ? > + * 2) the arch doesn't support bpf_arch_text_copy(). The above cannot happen, you added support for bpf_arch_text_copy() in patch 1. So why this comment ? > + * > + * Both cases are serious bugs that justify WARN_ON. > + */ Case 2 would mean a bug in the compiler; if you can't trust your compiler for that, you can't trust it for anything else. That's odd. > + if (WARN_ON(bpf_jit_binary_pack_finalize(fp, bpf_hdr, rw_header))) { > + fp = org_fp; > + goto out_addrs; > + } > bpf_prog_fill_jited_linfo(fp, addrs); > out_addrs: > kfree(addrs); ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-10 18:43 [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Hari Bathini ` (2 preceding siblings ...) 2022-11-10 18:43 ` [RFC PATCH 3/3] powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] Hari Bathini @ 2022-11-11 11:25 ` Christophe Leroy 2022-11-14 14:47 ` Hari Bathini 3 siblings, 1 reply; 19+ messages in thread From: Christophe Leroy @ 2022-11-11 11:25 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Song Liu, Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko, Naveen N. Rao Le 10/11/2022 à 19:43, Hari Bathini a écrit : > Most BPF programs are small, but they consume a page each. For systems > with busy traffic and many BPF programs, this may also add significant > pressure on instruction TLB. High iTLB pressure usually slows down the > whole system causing visible performance degradation for production > workloads. > > bpf_prog_pack, a customized allocator that packs multiple bpf programs > into preallocated memory chunks, was proposed [1] to address it. This > series extends this support on powerpc. > > Patches 1 & 2 add the arch specific functions needed to support this > feature. Patch 3 enables the support for powerpc. The last patch > ensures cleanup is handled racefully. > > Tested the changes successfully on a PowerVM. patch_instruction(), > needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it. > Posting the patches in the meanwhile for feedback on these changes. I did a quick test on ppc32; I don't get such a problem, only something wrong in the dump print, as only trap instructions are dumped, but tcpdump works as expected: [ 55.692998] bpf_jit_enable = 2 was set! NEVER use this in production, only for JIT debugging!
[ 66.279259] device eth0 entered promiscuous mode [ 67.214756] Pass 1: shrink = 0, seen = 0x1f980000 [ 67.214880] Pass 2: shrink = 0, seen = 0x1f980000 [ 67.214966] flen=5 proglen=60 pass=3 image=be7a8038 from=tcpdump pid=459 [ 67.225261] JIT code: 00000000: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.233904] JIT code: 00000010: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.242579] JIT code: 00000020: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.249694] JIT code: 00000030: 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.259255] Pass 1: shrink = 0, seen = 0x3ff801fe [ 67.259421] Pass 2: shrink = 0, seen = 0x3ff801fe [ 67.259514] flen=40 proglen=504 pass=3 image=be7a80a0 from=tcpdump pid=459 [ 67.269467] JIT code: 00000000: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.278001] JIT code: 00000010: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.286519] JIT code: 00000020: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.295041] JIT code: 00000030: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.303596] JIT code: 00000040: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.312164] JIT code: 00000050: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.319231] JIT code: 00000060: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.328822] JIT code: 00000070: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.337382] JIT code: 00000080: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.345901] JIT code: 00000090: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.354423] JIT code: 000000a0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.362941] JIT code: 000000b0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.371462] JIT code: 000000c0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.378526] JIT code: 000000d0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.388120] JIT code: 000000e0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.396680] JIT code: 000000f0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 
7f e0 00 08 [ 67.405199] JIT code: 00000100: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.413756] JIT code: 00000110: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.422324] JIT code: 00000120: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.429389] JIT code: 00000130: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.438982] JIT code: 00000140: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.447541] JIT code: 00000150: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.456059] JIT code: 00000160: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.464578] JIT code: 00000170: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.473201] JIT code: 00000180: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.481705] JIT code: 00000190: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.488770] JIT code: 000001a0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.498359] JIT code: 000001b0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.506921] JIT code: 000001c0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.515439] JIT code: 000001d0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.523998] JIT code: 000001e0: 7f e0 00 08 7f e0 00 08 7f e0 00 08 7f e0 00 08 [ 67.532565] JIT code: 000001f0: 7f e0 00 08 7f e0 00 08 [ 82.620898] device eth0 left promiscuous mode > > [1] https://lore.kernel.org/bpf/20220204185742.271030-1-song@kernel.org/ > > Hari Bathini (3): > powerpc/bpf: implement bpf_arch_text_copy > powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack > powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] > > arch/powerpc/net/bpf_jit.h | 18 +-- > arch/powerpc/net/bpf_jit_comp.c | 194 ++++++++++++++++++++++++------ > arch/powerpc/net/bpf_jit_comp32.c | 26 ++-- > arch/powerpc/net/bpf_jit_comp64.c | 32 ++--- > 4 files changed, 198 insertions(+), 72 deletions(-) > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-11 11:25 ` [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Christophe Leroy @ 2022-11-14 14:47 ` Hari Bathini [not found] ` <bf0af91e-861c-1608-7150-d31578be9b02@csgroup.eu> 0 siblings, 1 reply; 19+ messages in thread From: Hari Bathini @ 2022-11-14 14:47 UTC (permalink / raw) To: Christophe Leroy, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Hi Christophe, On 11/11/22 4:55 pm, Christophe Leroy wrote: > Le 10/11/2022 à 19:43, Hari Bathini a écrit : >> Most BPF programs are small, but they consume a page each. For systems >> with busy traffic and many BPF programs, this may also add significant >> pressure on instruction TLB. High iTLB pressure usually slows down the >> whole system causing visible performance degradation for production >> workloads. >> >> bpf_prog_pack, a customized allocator that packs multiple bpf programs >> into preallocated memory chunks, was proposed [1] to address it. This >> series extends this support on powerpc. >> >> Patches 1 & 2 add the arch specific functions needed to support this >> feature. Patch 3 enables the support for powerpc. The last patch >> ensures cleanup is handled racefully. >> >> Tested the changes successfully on a PowerVM. patch_instruction(), >> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it. >> Posting the patches in the meanwhile for feedback on these changes. > > I did a quick test on ppc32, I don't get such a problem, only something > wrong in the dump print as traps intructions only are dumped, but > tcpdump works as expected: Thanks for the quick test. Could you please share the config you used? I am probably missing a few knobs in my config... Thanks Hari ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <bf0af91e-861c-1608-7150-d31578be9b02@csgroup.eu>]
[parent not found: <e0266414-843f-db48-a56d-1d8a8944726a@csgroup.eu>]
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc [not found] ` <e0266414-843f-db48-a56d-1d8a8944726a@csgroup.eu> @ 2022-11-16 17:01 ` Hari Bathini 2022-11-16 17:16 ` Christophe Leroy 2022-11-17 6:59 ` Christophe Leroy 0 siblings, 2 replies; 19+ messages in thread From: Hari Bathini @ 2022-11-16 17:01 UTC (permalink / raw) To: Christophe Leroy, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko On 16/11/22 12:14 am, Christophe Leroy wrote: > > > Le 14/11/2022 à 18:27, Christophe Leroy a écrit : >> >> >> Le 14/11/2022 à 15:47, Hari Bathini a écrit : >>> Hi Christophe, >>> >>> On 11/11/22 4:55 pm, Christophe Leroy wrote: >>>> Le 10/11/2022 à 19:43, Hari Bathini a écrit : >>>>> Most BPF programs are small, but they consume a page each. For systems >>>>> with busy traffic and many BPF programs, this may also add significant >>>>> pressure on instruction TLB. High iTLB pressure usually slows down the >>>>> whole system causing visible performance degradation for production >>>>> workloads. >>>>> >>>>> bpf_prog_pack, a customized allocator that packs multiple bpf programs >>>>> into preallocated memory chunks, was proposed [1] to address it. This >>>>> series extends this support on powerpc. >>>>> >>>>> Patches 1 & 2 add the arch specific functions needed to support this >>>>> feature. Patch 3 enables the support for powerpc. The last patch >>>>> ensures cleanup is handled racefully. >>>>> >>> >>>>> Tested the changes successfully on a PowerVM. patch_instruction(), >>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it. >>>>> Posting the patches in the meanwhile for feedback on these changes. >>>> >>>> I did a quick test on ppc32, I don't get such a problem, only something >>>> wrong in the dump print as traps intructions only are dumped, but >>>> tcpdump works as expected: >>> >>> Thanks for the quick test. Could you please share the config you used. 
>>> I am probably missing a few knobs in my conifg... >>> >> > > I also managed to test it on QEMU. The config is based on pmac32_defconfig. I had the same config but hit this problem: # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf test_bpf: #0 TAX ------------[ cut here ]------------ WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 bpf_int_jit_compile+0x8a0/0x9f8 Modules linked in: test_bpf(+) CPU: 0 PID: 96 Comm: modprobe Not tainted 6.1.0-rc5+ #116 Hardware name: PowerMac3,1 750CL 0x87210 PowerMac NIP: c0038224 LR: c0037f74 CTR: 00000009 REGS: f1b41b10 TRAP: 0700 Not tainted (6.1.0-rc5+) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 82004882 XER: 20000000 GPR00: c0037f74 f1b41bd0 c12333c0 ffffffea c19747e0 c0123a08 c1974a00 c12333c0 GPR08: 00000000 00000b58 00009032 00000003 82004882 100d816a f2660000 00000000 GPR16: 00000000 c0c10000 00000000 c0c10000 c1913780 00000000 0000001a 100a2ee3 GPR24: 0000014c be857000 be857020 f1b41c00 c194c180 f1b46000 ffffffea f1b46000 NIP [c0038224] bpf_int_jit_compile+0x8a0/0x9f8 LR [c0037f74] bpf_int_jit_compile+0x5f0/0x9f8 Call Trace: [f1b41bd0] [c0037f74] bpf_int_jit_compile+0x5f0/0x9f8 (unreliable) [f1b41ca0] [c0123790] bpf_prog_select_runtime+0x178/0x1c4 [f1b41cd0] [c06cc4b0] bpf_prepare_filter+0x524/0x624 [f1b41d20] [c06cc63c] bpf_prog_create+0x8c/0xd0 [f1b41d40] [be85083c] test_bpf_init+0x46c/0x11b0 [test_bpf] [f1b41df0] [c0008534] do_one_initcall+0x58/0x1b8 [f1b41e60] [c00b6c38] do_init_module+0x54/0x1e4 [f1b41e80] [c00b8f80] sys_init_module+0x10c/0x174 [f1b41f10] [c0014390] system_call_exception+0x94/0x144 [f1b41f30] [c001a1ac] ret_from_syscall+0x0/0x2c --- interrupt: c00 at 0xfef48ac NIP: 0fef48ac LR: 10015de0 CTR: 00000000 REGS: f1b41f40 TRAP: 0c00 Not tainted (6.1.0-rc5+) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 88002224 XER: 20000000 GPR00: 00000080 aff34990 a7a8d380 a75c0010 00477e90 1051b9a0 0fedfdd8 0000d032 GPR08: 00000000 00000008 00478ffa 400f26fa 400f23c7 100d816a 00000000 00000000 GPR16: 
00000000 1051b950 00000000 1051b92c 100d0000 100a2eab 100a2e97 100a2ee3 GPR24: 100a2ed5 1051b980 00000001 100d01a8 1051b920 a75c0010 1051b9a0 a7a86388 NIP [0fef48ac] 0xfef48ac LR [10015de0] 0x10015de0 --- interrupt: c00 Instruction dump: 8081001c 38a00004 7f23cb78 4bfff5d1 7f23cb78 8081001c 480eba85 82410098 8261009c 82a100a4 82e100ac 4bfffce8 <0fe00000> 92010090 92410098 92e100ac ---[ end trace 0000000000000000 ]--- jited:1 kernel tried to execute exec-protected page (be857020) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0xbe857020 Vector: 400 (Instruction Access) at [f1b41c30] pc: be857020 lr: be84eaa4: __run_one+0xec/0x248 [test_bpf] sp: f1b41cf0 msr: 40009032 current = 0xc12333c0 pid = 96, comm = modprobe Linux version 6.1.0-rc5+ (root@hbathini-Standard-PC-Q35-ICH9-2009) (powerpc-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #116 Wed Nov 16 20:48:10 IST 2022 enter ? for help [link register ] be84eaa4 __run_one+0xec/0x248 [test_bpf] [f1b41cf0] be84e9dc __run_one+0x24/0x248 [test_bpf] (unreliable) [f1b41d40] be850c78 test_bpf_init+0x8a8/0x11b0 [test_bpf] [f1b41df0] c0008534 do_one_initcall+0x58/0x1b8 [f1b41e60] c00b6c38 do_init_module+0x54/0x1e4 [f1b41e80] c00b8f80 sys_init_module+0x10c/0x174 [f1b41f10] c0014390 system_call_exception+0x94/0x144 [f1b41f30] c001a1ac ret_from_syscall+0x0/0x2c --- Exception: c00 (System Call) at 0fef48ac SP (aff34990) is in userspace mon> bpf_jit_binary_pack_finalize() function failed due to patch_instruction() .. Also, I am hitting the same problem with the other config too.. Thanks Hari ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-16 17:01 ` Hari Bathini @ 2022-11-16 17:16 ` Christophe Leroy 2022-11-17 6:59 ` Christophe Leroy 1 sibling, 0 replies; 19+ messages in thread From: Christophe Leroy @ 2022-11-16 17:16 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Le 16/11/2022 à 18:01, Hari Bathini a écrit : >> >> I also managed to test it on QEMU. The config is based on >> pmac32_defconfig. > > I had the same config but hit this problem: > > # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf > test_bpf: #0 TAX > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 > bpf_int_jit_compile+0x8a0/0x9f8 Ok, I'll give it a try with test_bpf module. Christophe ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-16 17:01 ` Hari Bathini 2022-11-16 17:16 ` Christophe Leroy @ 2022-11-17 6:59 ` Christophe Leroy 2022-11-18 8:39 ` Hari Bathini 1 sibling, 1 reply; 19+ messages in thread From: Christophe Leroy @ 2022-11-17 6:59 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Le 16/11/2022 à 18:01, Hari Bathini a écrit : > > > On 16/11/22 12:14 am, Christophe Leroy wrote: >> >> >> Le 14/11/2022 à 18:27, Christophe Leroy a écrit : >>> >>> >>> Le 14/11/2022 à 15:47, Hari Bathini a écrit : >>>> Hi Christophe, >>>> >>>> On 11/11/22 4:55 pm, Christophe Leroy wrote: >>>>> Le 10/11/2022 à 19:43, Hari Bathini a écrit : >>>>>> Most BPF programs are small, but they consume a page each. For >>>>>> systems >>>>>> with busy traffic and many BPF programs, this may also add >>>>>> significant >>>>>> pressure on instruction TLB. High iTLB pressure usually slows down >>>>>> the >>>>>> whole system causing visible performance degradation for production >>>>>> workloads. >>>>>> >>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf >>>>>> programs >>>>>> into preallocated memory chunks, was proposed [1] to address it. This >>>>>> series extends this support on powerpc. >>>>>> >>>>>> Patches 1 & 2 add the arch specific functions needed to support this >>>>>> feature. Patch 3 enables the support for powerpc. The last patch >>>>>> ensures cleanup is handled racefully. >>>>>> >>>> >>>>>> Tested the changes successfully on a PowerVM. patch_instruction(), >>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it. >>>>>> Posting the patches in the meanwhile for feedback on these changes. >>>>> >>>>> I did a quick test on ppc32, I don't get such a problem, only >>>>> something >>>>> wrong in the dump print as traps intructions only are dumped, but >>>>> tcpdump works as expected: >>>> >>>> Thanks for the quick test. 
Could you please share the config you used. >>>> I am probably missing a few knobs in my conifg... >>>> >>> >> >> I also managed to test it on QEMU. The config is based on >> pmac32_defconfig. > > I had the same config but hit this problem: > > # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf > test_bpf: #0 TAX > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 > bpf_int_jit_compile+0x8a0/0x9f8 I get no such problem on QEMU, and I checked that the .config has: CONFIG_STRICT_KERNEL_RWX=y CONFIG_STRICT_MODULE_RWX=y Boot successful. / # ifconfig eth0 10.0.2.15 e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX / # tftp -g 10.0.2.2 -r test_bpf.ko / # echo 1 > /proc/sys/net/core/bpf_jit_enable / # insmod ./test_bpf.ko test_bpf: #0 TAX jited:1 216 87 86 PASS test_bpf: #1 TXA jited:1 57 27 27 PASS test_bpf: #2 ADD_SUB_MUL_K jited:1 50 PASS test_bpf: #3 DIV_MOD_KX jited:1 110 PASS test_bpf: #4 AND_OR_LSH_K jited:1 67 26 PASS test_bpf: #5 LD_IMM_0 jited:1 77 PASS ... By the way, you can note that during the boot you get: This platform has HASH MMU, STRICT_MODULE_RWX won't work See why in 0670010f3b10 ("powerpc/32s: Enable STRICT_MODULE_RWX for the 603 core") Nevertheless, it shouldn't prevent patch_instruction() from working. Could you add a pr_err() in __patch_instruction() in the failure path to print and check exec_addr and patch_addr ? > jited:1 > kernel tried to execute exec-protected page (be857020) - exploit > attempt? (uid: 0) > BUG: Unable to handle kernel instruction fetch > Faulting instruction address: 0xbe857020 I'm a bit surprised by this. On hash-based book3s/32 there is no way to exec-protect individual pages. Protection is performed at segment level: all kernel segments have the NX bit set except the segment used for module text, which is by default 0xb0000000-0xbfffffff.
Or maybe this is the first time that address is accessed, and the ISI handler does the check before loading the hash table ? > > bpf_jit_binary_pack_finalize() function failed due to > patch_instruction() .. Is there a way to tell BPF core that the JIT failed in that case, to avoid that ? Christophe ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-17 6:59 ` Christophe Leroy @ 2022-11-18 8:39 ` Hari Bathini 2022-11-18 8:51 ` Christophe Leroy 0 siblings, 1 reply; 19+ messages in thread From: Hari Bathini @ 2022-11-18 8:39 UTC (permalink / raw) To: Christophe Leroy, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko On 17/11/22 12:29 pm, Christophe Leroy wrote: > > > Le 16/11/2022 à 18:01, Hari Bathini a écrit : >> >> >> On 16/11/22 12:14 am, Christophe Leroy wrote: >>> >>> >>> Le 14/11/2022 à 18:27, Christophe Leroy a écrit : >>>> >>>> >>>> Le 14/11/2022 à 15:47, Hari Bathini a écrit : >>>>> Hi Christophe, >>>>> >>>>> On 11/11/22 4:55 pm, Christophe Leroy wrote: >>>>>> Le 10/11/2022 à 19:43, Hari Bathini a écrit : >>>>>>> Most BPF programs are small, but they consume a page each. For >>>>>>> systems >>>>>>> with busy traffic and many BPF programs, this may also add >>>>>>> significant >>>>>>> pressure on instruction TLB. High iTLB pressure usually slows down >>>>>>> the >>>>>>> whole system causing visible performance degradation for production >>>>>>> workloads. >>>>>>> >>>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf >>>>>>> programs >>>>>>> into preallocated memory chunks, was proposed [1] to address it. This >>>>>>> series extends this support on powerpc. >>>>>>> >>>>>>> Patches 1 & 2 add the arch specific functions needed to support this >>>>>>> feature. Patch 3 enables the support for powerpc. The last patch >>>>>>> ensures cleanup is handled racefully. >>>>>>> >>>>> >>>>>>> Tested the changes successfully on a PowerVM. patch_instruction(), >>>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it. >>>>>>> Posting the patches in the meanwhile for feedback on these changes. 
>>>>>> >>>>>> I did a quick test on ppc32, I don't get such a problem, only >>>>>> something >>>>>> wrong in the dump print as traps intructions only are dumped, but >>>>>> tcpdump works as expected: >>>>> >>>>> Thanks for the quick test. Could you please share the config you used. >>>>> I am probably missing a few knobs in my conifg... >>>>> >>>> >>> >>> I also managed to test it on QEMU. The config is based on >>> pmac32_defconfig. >> >> I had the same config but hit this problem: >> >> # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf >> test_bpf: #0 TAX >> ------------[ cut here ]------------ >> WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 >> bpf_int_jit_compile+0x8a0/0x9f8 > > I get no such problem, on QEMU, and I checked the .config has: > CONFIG_STRICT_KERNEL_RWX=y > CONFIG_STRICT_MODULE_RWX=y Yeah. That did the trick. These options were missing in my config and the pmac config you shared. I could not run the other config you shared on QEMU. Thanks for all the pointers. - Hari ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-18 8:39 ` Hari Bathini @ 2022-11-18 8:51 ` Christophe Leroy 2022-11-18 9:39 ` Hari Bathini 0 siblings, 1 reply; 19+ messages in thread From: Christophe Leroy @ 2022-11-18 8:51 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Le 18/11/2022 à 09:39, Hari Bathini a écrit : > > > On 17/11/22 12:29 pm, Christophe Leroy wrote: >> >> >> Le 16/11/2022 à 18:01, Hari Bathini a écrit : >>> >>> >>> On 16/11/22 12:14 am, Christophe Leroy wrote: >>>> >>>> >>>> Le 14/11/2022 à 18:27, Christophe Leroy a écrit : >>>>> >>>>> >>>>> Le 14/11/2022 à 15:47, Hari Bathini a écrit : >>>>>> Hi Christophe, >>>>>> >>>>>> On 11/11/22 4:55 pm, Christophe Leroy wrote: >>>>>>> Le 10/11/2022 à 19:43, Hari Bathini a écrit : >>>>>>>> Most BPF programs are small, but they consume a page each. For >>>>>>>> systems >>>>>>>> with busy traffic and many BPF programs, this may also add >>>>>>>> significant >>>>>>>> pressure on instruction TLB. High iTLB pressure usually slows down >>>>>>>> the >>>>>>>> whole system causing visible performance degradation for production >>>>>>>> workloads. >>>>>>>> >>>>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf >>>>>>>> programs >>>>>>>> into preallocated memory chunks, was proposed [1] to address it. >>>>>>>> This >>>>>>>> series extends this support on powerpc. >>>>>>>> >>>>>>>> Patches 1 & 2 add the arch specific functions needed to support >>>>>>>> this >>>>>>>> feature. Patch 3 enables the support for powerpc. The last patch >>>>>>>> ensures cleanup is handled racefully. >>>>>>>> >>>>>> >>>>>>>> Tested the changes successfully on a PowerVM. patch_instruction(), >>>>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging >>>>>>>> it. >>>>>>>> Posting the patches in the meanwhile for feedback on these changes. 
>>>>>>> >>>>>>> I did a quick test on ppc32, I don't get such a problem, only >>>>>>> something >>>>>>> wrong in the dump print as traps intructions only are dumped, but >>>>>>> tcpdump works as expected: >>>>>> >>>>>> Thanks for the quick test. Could you please share the config you >>>>>> used. >>>>>> I am probably missing a few knobs in my conifg... >>>>>> >>>>> >>>> >>>> I also managed to test it on QEMU. The config is based on >>>> pmac32_defconfig. >>> >>> I had the same config but hit this problem: >>> >>> # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf >>> test_bpf: #0 TAX >>> ------------[ cut here ]------------ >>> WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 >>> bpf_int_jit_compile+0x8a0/0x9f8 >> >> I get no such problem, on QEMU, and I checked the .config has: > >> CONFIG_STRICT_KERNEL_RWX=y >> CONFIG_STRICT_MODULE_RWX=y > > Yeah. That did the trick. Interesting. I guess we have to find out why it fails when those config are missing. Maybe module code plays with RO and NX flags even if CONFIG_STRICT_MODULE_RWX is not selected ? Christophe ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-18 8:51 ` Christophe Leroy @ 2022-11-18 9:39 ` Hari Bathini 2022-11-18 11:46 ` Christophe Leroy 0 siblings, 1 reply; 19+ messages in thread From: Hari Bathini @ 2022-11-18 9:39 UTC (permalink / raw) To: Christophe Leroy, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko On 18/11/22 2:21 pm, Christophe Leroy wrote: > > > Le 18/11/2022 à 09:39, Hari Bathini a écrit : >> >> >> On 17/11/22 12:29 pm, Christophe Leroy wrote: >>> >>> >>> Le 16/11/2022 à 18:01, Hari Bathini a écrit : >>>> >>>> >>>> On 16/11/22 12:14 am, Christophe Leroy wrote: >>>>> >>>>> >>>>> Le 14/11/2022 à 18:27, Christophe Leroy a écrit : >>>>>> >>>>>> >>>>>> Le 14/11/2022 à 15:47, Hari Bathini a écrit : >>>>>>> Hi Christophe, >>>>>>> >>>>>>> On 11/11/22 4:55 pm, Christophe Leroy wrote: >>>>>>>> Le 10/11/2022 à 19:43, Hari Bathini a écrit : >>>>>>>>> Most BPF programs are small, but they consume a page each. For >>>>>>>>> systems >>>>>>>>> with busy traffic and many BPF programs, this may also add >>>>>>>>> significant >>>>>>>>> pressure on instruction TLB. High iTLB pressure usually slows down >>>>>>>>> the >>>>>>>>> whole system causing visible performance degradation for production >>>>>>>>> workloads. >>>>>>>>> >>>>>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf >>>>>>>>> programs >>>>>>>>> into preallocated memory chunks, was proposed [1] to address it. >>>>>>>>> This >>>>>>>>> series extends this support on powerpc. >>>>>>>>> >>>>>>>>> Patches 1 & 2 add the arch specific functions needed to support >>>>>>>>> this >>>>>>>>> feature. Patch 3 enables the support for powerpc. The last patch >>>>>>>>> ensures cleanup is handled racefully. >>>>>>>>> >>>>>>> >>>>>>>>> Tested the changes successfully on a PowerVM. patch_instruction(), >>>>>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging >>>>>>>>> it. 
>>>>>>>>> Posting the patches in the meanwhile for feedback on these changes. >>>>>>>> >>>>>>>> I did a quick test on ppc32, I don't get such a problem, only >>>>>>>> something >>>>>>>> wrong in the dump print as traps intructions only are dumped, but >>>>>>>> tcpdump works as expected: >>>>>>> >>>>>>> Thanks for the quick test. Could you please share the config you >>>>>>> used. >>>>>>> I am probably missing a few knobs in my conifg... >>>>>>> >>>>>> >>>>> >>>>> I also managed to test it on QEMU. The config is based on >>>>> pmac32_defconfig. >>>> >>>> I had the same config but hit this problem: >>>> >>>> # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf >>>> test_bpf: #0 TAX >>>> ------------[ cut here ]------------ >>>> WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 >>>> bpf_int_jit_compile+0x8a0/0x9f8 >>> >>> I get no such problem, on QEMU, and I checked the .config has: >> >>> CONFIG_STRICT_KERNEL_RWX=y >>> CONFIG_STRICT_MODULE_RWX=y >> >> Yeah. That did the trick. > > Interesting. I guess we have to find out why it fails when those config > are missing. > > Maybe module code plays with RO and NX flags even if > CONFIG_STRICT_MODULE_RWX is not selected ? Need to look at the code closely but fwiw, observing same failure on 64-bit as well with !STRICT_RWX... Thanks Hari ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-18 9:39 ` Hari Bathini @ 2022-11-18 11:46 ` Christophe Leroy 2022-11-18 17:28 ` Song Liu 0 siblings, 1 reply; 19+ messages in thread From: Christophe Leroy @ 2022-11-18 11:46 UTC (permalink / raw) To: Hari Bathini, linuxppc-dev, bpf Cc: Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Le 18/11/2022 à 10:39, Hari Bathini a écrit : > > > On 18/11/22 2:21 pm, Christophe Leroy wrote: >>>>> >>>>> I had the same config but hit this problem: >>>>> >>>>> # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf >>>>> test_bpf: #0 TAX >>>>> ------------[ cut here ]------------ >>>>> WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 >>>>> bpf_int_jit_compile+0x8a0/0x9f8 >>>> >>>> I get no such problem, on QEMU, and I checked the .config has: >>> >>>> CONFIG_STRICT_KERNEL_RWX=y >>>> CONFIG_STRICT_MODULE_RWX=y >>> >>> Yeah. That did the trick. >> >> Interesting. I guess we have to find out why it fails when those config >> are missing. >> >> Maybe module code plays with RO and NX flags even if >> CONFIG_STRICT_MODULE_RWX is not selected ? > > Need to look at the code closely but fwiw, observing same failure on > 64-bit as well with !STRICT_RWX... The problem is in bpf_prog_pack_alloc() and in alloc_new_pack(): they do set_memory_ro() and set_memory_x() without taking CONFIG_STRICT_MODULE_RWX into account. When CONFIG_STRICT_MODULE_RWX is selected, powerpc module_alloc() allocates PAGE_KERNEL memory, that is RW memory, and expects the caller to then call set_memory_ro() and set_memory_x(). But when CONFIG_STRICT_MODULE_RWX is not selected, powerpc module_alloc() allocates PAGE_KERNEL_TEXT memory, that is RWX memory, and expects to always be able to write into it. Christophe ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-18 11:46 ` Christophe Leroy @ 2022-11-18 17:28 ` Song Liu 2022-11-18 18:05 ` Christophe Leroy 0 siblings, 1 reply; 19+ messages in thread From: Song Liu @ 2022-11-18 17:28 UTC (permalink / raw) To: Christophe Leroy Cc: Hari Bathini, linuxppc-dev, bpf, Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko On Fri, Nov 18, 2022 at 3:47 AM Christophe Leroy <christophe.leroy@csgroup.eu> wrote: > > > > Le 18/11/2022 à 10:39, Hari Bathini a écrit : > > > > > > On 18/11/22 2:21 pm, Christophe Leroy wrote: >>>>> > >>>>> I had the same config but hit this problem: > >>>>> > >>>>> # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf > >>>>> test_bpf: #0 TAX > >>>>> ------------[ cut here ]------------ > >>>>> WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 > >>>>> bpf_int_jit_compile+0x8a0/0x9f8 > >>>> > >>>> I get no such problem, on QEMU, and I checked the .config has: > >>> > >>>> CONFIG_STRICT_KERNEL_RWX=y > >>>> CONFIG_STRICT_MODULE_RWX=y > >>> > >>> Yeah. That did the trick. > >> > >> Interesting. I guess we have to find out why it fails when those config > >> are missing. > >> > >> Maybe module code plays with RO and NX flags even if > >> CONFIG_STRICT_MODULE_RWX is not selected ? > > > > Need to look at the code closely but fwiw, observing same failure on > > 64-bit as well with !STRICT_RWX... > > The problem is in bpf_prog_pack_alloc() and in alloc_new_pack() : They > do set_memory_ro() and set_memory_x() without taking into account > CONFIG_STRICT_MODULE_RWX. > > When CONFIG_STRICT_MODULE_RWX is selected, powerpc module_alloc() > allocates PAGE_KERNEL memory, that is RW memory, and expects the user to > call do set_memory_ro() and set_memory_x(). > > But when CONFIG_STRICT_MODULE_RWX is not selected, powerpc > module_alloc() allocates PAGE_KERNEL_TEXT memory, that is RWX memory, > and expects to be able to always write into it. Ah, I see. 
x86_64 requires CONFIG_STRICT_MODULE_RWX, so this hasn't been a problem yet. Thanks, Song ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc 2022-11-18 17:28 ` Song Liu @ 2022-11-18 18:05 ` Christophe Leroy 0 siblings, 0 replies; 19+ messages in thread From: Christophe Leroy @ 2022-11-18 18:05 UTC (permalink / raw) To: Song Liu, Hari Bathini Cc: linuxppc-dev, bpf, Naveen N. Rao, Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Le 18/11/2022 à 18:28, Song Liu a écrit : > On Fri, Nov 18, 2022 at 3:47 AM Christophe Leroy > <christophe.leroy@csgroup.eu> wrote: >> >> >> >> Le 18/11/2022 à 10:39, Hari Bathini a écrit : >>> >>> >>> On 18/11/22 2:21 pm, Christophe Leroy wrote: >>>>> >>>>>>> I had the same config but hit this problem: >>>>>>> >>>>>>> # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf >>>>>>> test_bpf: #0 TAX >>>>>>> ------------[ cut here ]------------ >>>>>>> WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 >>>>>>> bpf_int_jit_compile+0x8a0/0x9f8 >>>>>> >>>>>> I get no such problem, on QEMU, and I checked the .config has: >>>>> >>>>>> CONFIG_STRICT_KERNEL_RWX=y >>>>>> CONFIG_STRICT_MODULE_RWX=y >>>>> >>>>> Yeah. That did the trick. >>>> >>>> Interesting. I guess we have to find out why it fails when those config >>>> are missing. >>>> >>>> Maybe module code plays with RO and NX flags even if >>>> CONFIG_STRICT_MODULE_RWX is not selected ? >>> >>> Need to look at the code closely but fwiw, observing same failure on >>> 64-bit as well with !STRICT_RWX... >> >> The problem is in bpf_prog_pack_alloc() and in alloc_new_pack() : They >> do set_memory_ro() and set_memory_x() without taking into account >> CONFIG_STRICT_MODULE_RWX. >> >> When CONFIG_STRICT_MODULE_RWX is selected, powerpc module_alloc() >> allocates PAGE_KERNEL memory, that is RW memory, and expects the user to >> call do set_memory_ro() and set_memory_x(). 
>> >> But when CONFIG_STRICT_MODULE_RWX is not selected, powerpc >> module_alloc() allocates PAGE_KERNEL_TEXT memory, that is RWX memory, >> and expects to be able to always write into it. > > Ah, I see. x86_64 requires CONFIG_STRICT_MODULE_RWX, so this hasn't > been a problem yet. > In fact it shouldn't be a problem for BPF on powerpc either, because powerpc BPF expects RO at all times and today uses bpf_jit_binary_lock_ro(). It just means that we can't use patch_instruction() for that. Anyway, using patch_instruction() was sub-optimal. All we have to do, I think, is set up a mirror of the page using vmap(), perform a memcpy() of the code, then vunmap() it. Maybe a call to flush_tlb_kernel_range() will also be needed, unless BPF already does it. Christophe ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2022-11-18 18:05 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-11-10 18:43 [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Hari Bathini 2022-11-10 18:43 ` [RFC PATCH 1/3] powerpc/bpf: implement bpf_arch_text_copy Hari Bathini 2022-11-13 13:17 ` Christophe Leroy 2022-11-14 14:54 ` Hari Bathini 2022-11-10 18:43 ` [RFC PATCH 2/3] powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack Hari Bathini 2022-11-13 18:00 ` Christophe Leroy 2022-11-10 18:43 ` [RFC PATCH 3/3] powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free] Hari Bathini 2022-11-13 18:36 ` Christophe Leroy 2022-11-11 11:25 ` [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc Christophe Leroy 2022-11-14 14:47 ` Hari Bathini [not found] ` <bf0af91e-861c-1608-7150-d31578be9b02@csgroup.eu> [not found] ` <e0266414-843f-db48-a56d-1d8a8944726a@csgroup.eu> 2022-11-16 17:01 ` Hari Bathini 2022-11-16 17:16 ` Christophe Leroy 2022-11-17 6:59 ` Christophe Leroy 2022-11-18 8:39 ` Hari Bathini 2022-11-18 8:51 ` Christophe Leroy 2022-11-18 9:39 ` Hari Bathini 2022-11-18 11:46 ` Christophe Leroy 2022-11-18 17:28 ` Song Liu 2022-11-18 18:05 ` Christophe Leroy